RADOGOST: Where archaeological documentation goes so it doesn’t disappear

What happens to documentation after the field season ends? RADOGOST helps store and share archaeological data so it can be found, understood, and reused.

Imagine a classic scenario: fieldwork produces hundreds of photos, RTK GPS files, sketches, context descriptions, 3D models, artefact tables, and analytical notes. Then, “after the season”, when archaeologists return home and sit down at desks and in libraries, everything lands on a hard drive. The data gets sorted and usually ends up in folders named “NEW”, “FINAL”, “FINAL_2”… and starts living a life of its own. After a year, no one remembers which file is current. After five years, no one knows whose coordinates those were, what the methodology was, or whether the data may even be shown to anyone.

Now consider another perspective: science based on verification, re-analysis, and data reuse. This is exactly where RADOGOST comes in.

Illustration made in Midjourney, prompt: abstract painting of an early medieval warrior in a data center, ed. J.M. Chyla

What is RADOGOST?

RADOGOST is an open repository for archaeological research data, created so that data from research can be stored safely, described sensibly, searched easily, and shared, including with people outside the research team. The repository is available at radogost.dariah.pl and is being developed by teams from the University of Warsaw and Maria Curie-Skłodowska University (UMCS), in collaboration with experts in digital infrastructure and tools.

In practice, RADOGOST is meant to be a place where “life after excavations” matters just as much as the field season itself.

Why does archaeology need a data repository?

In archaeology it’s easy to forget that the “real” excavation is not only soil. It is also thousands of small decisions and records: where exactly an artefact was found, in which layer, in what relation to other finds and features; what the profile looked like; what the coordinates were for the site, trench, or artefact locations. Excavation also means thousands of photos, drawings, 3D models, GIS layers, notes, and tables. All of this is part of the context, and in archaeology context cannot be documented again. Once we disturb it (even in the most careful, research-driven way), it cannot be “recreated” like a laboratory experiment.

Then the season ends and a second, less spectacular part of the work begins. Data travels to hard drives, cloud folders, private accounts, and directories with names that often serve a magical function: they promise a quick finish through sheer affirmation. And that is still the ideal case. Over the following months, many things can happen that strongly affect our digital data: someone replaces a computer, someone leaves the team, links expire. With time, even the authors are no longer sure which version is the correct one, and an outsider has no chance of understanding what they are actually looking at. In the worst-case scenario, not only does the site disappear, but so does its digital trace. From the perspective of science, that is a double loss.

To prevent these serious and common research problems, an archaeological data repository is needed. Not only as a “file warehouse”, but as a place where data receives description, becomes organised, and gains meaning: it turns into information. It becomes clear who created it, how, when, under what rules it can be used, and how it should be cited.

A repository also provides something an ordinary hard drive cannot: accessibility, durability, order, and the real possibility of returning to sources. This is not only about a backup. It is about data described in a way that someone outside the team can understand: what the columns in a table mean, how GIS layers were created, what photogrammetry settings were used, what is a working version and what is final. Thanks to this, research results can be verified, compared with other projects, and interpreted anew. Just as importantly, they can be included in broader comparisons: between regions, periods, and documentation methods. This is the moment when data stops being “our files” and starts functioning as part of a shared knowledge infrastructure, useful to other researchers, heritage protection institutions, educators, and many other users.

Polish researchers and open science

This changes the logic of collaboration. Instead of exchanging data via emails, external drives, and expiring links, a model emerges in which data can be published for reuse in a controlled way, with clear access rules, licences, attribution, and citability. This matters especially in archaeology, where fieldwork is limited, research is expensive, and contexts never repeat. If data is accessible and well described, we do not have to “start from zero” with every new project and every new research question. We can build on what already exists instead of wasting time guessing what was once recorded in a file without metadata.

This is open science in practice. Knowledge does not end with a publication. It is transparently grounded in data that others can find, understand, and reuse. In a world where analytical tools change quickly, this approach is especially important. Data collected today can gain “a second life” in a few years, when new methods, better algorithms, different questions, and new ways of connecting sources appear. A repository works like a bridge between the present and the future of research. It protects data from disappearing and also prepares it so that we can extract more from it than the authors originally expected at the moment of data collection.

That is why open science requires care and consistency. It is not a trend or an administrative obligation. It is an investment in the quality and credibility of research: in the ability to check conclusions, in methodological transparency, and in meaningful collaboration between institutions, even across countries. It is also an investment in archaeology itself, a discipline that often works with irreversible and vanishing contexts, and therefore cannot afford to lose its source base.

And this is where RADOGOST enters the picture. It is a response from Polish archaeology to a growing demand for open, durable, and intuitive research data repositories. It opens the way for Polish data to circulate beyond local networks and enter the global scientific ecosystem: to become visible, comparable, and ready for use in international research, education, public outreach, and heritage protection.

What can be stored in RADOGOST?

The repository is designed for digital archaeological data. Depending on the collection and needs, this can include, for example:

  • field documentation (descriptions, tables, metadata),
  • photographs and their descriptions,
  • Geographic Information Systems (GIS) data (vector data, meaning points, lines, or areas with x, y, z coordinates and descriptive attributes; raster data, meaning images composed of rows and columns of pixels holding colour information or values such as elevation above sea level; and metadata, meaning data about data),
  • 3D models, including photogrammetric models,
  • analysis outputs (result files, charts, summaries, databases, scripts, vocabularies),
  • digitised and processed archival resources (where rights/licences allow).

One thing is crucial: files alone are not enough. We also need descriptions (metadata and documentation) that explain what something is, how it was created, at what scale, with what method, and under what licence it can be used. In RADOGOST these descriptions are tailored to archaeology, because archaeological data has its own language: space, time, and context. This is why metadata includes a 4D perspective: not only “where” (x, y, z), but also “when”, meaning chronological information without which even the best documentation loses half its meaning.
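The 4D idea can be illustrated with a small sketch. It is not RADOGOST's actual metadata schema: the `DatasetRecord` class, its field names, and the sample values are all assumptions made up for this example. It only shows how a description that carries both "where" and "when" lets a search combine a spatial filter with a chronological one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical 4D description of a dataset: place (x, y, z) plus time."""
    title: str
    x: float          # easting or longitude
    y: float          # northing or latitude
    z: float          # elevation in metres
    start_year: int   # earliest date; negative values mean BCE
    end_year: int     # latest date
    licence: str


def in_bbox(rec, min_x, min_y, max_x, max_y):
    """True if the record's point lies inside the bounding box."""
    return min_x <= rec.x <= max_x and min_y <= rec.y <= max_y


def overlaps_period(rec, start_year, end_year):
    """True if the record's date range overlaps the queried range."""
    return rec.start_year <= end_year and rec.end_year >= start_year


def search(records, bbox, period):
    """Return records that match both the "where" and the "when" filter."""
    return [r for r in records
            if in_bbox(r, *bbox) and overlaps_period(r, *period)]


# Invented sample records, for illustration only.
records = [
    DatasetRecord("Trench A photos", 21.01, 52.23, 95.0, 900, 1050, "CC BY 4.0"),
    DatasetRecord("Trench B models", 21.02, 52.24, 97.5, -400, -100, "CC BY 4.0"),
    DatasetRecord("Survey points", 19.94, 50.06, 210.0, 950, 1100, "CC BY-NC 4.0"),
]

hits = search(records, bbox=(20.5, 52.0, 21.5, 52.5), period=(800, 1000))
print([r.title for r in hits])  # ['Trench A photos']
```

Only the first record passes both filters: the second lies in the right place but the wrong period, and the third has the right period but lies outside the bounding box.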

Here the RADOGOST team drew on an international solution: PeriodO. It is a structured catalogue of historical and archaeological period definitions, used to connect different datasets despite local differences in naming and time boundaries. Records relevant to Polish chronology were then refined and expanded so that when entering data, users can directly assign it to chronologies and archaeological cultures present in the lands of present-day Poland, in a consistent way that can be compared across projects.
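A minimal sketch of why a shared period catalogue helps, under loudly stated assumptions: the identifiers, labels, and year ranges below are placeholders invented for illustration. They are not real PeriodO records and do not reflect how RADOGOST stores chronology. The point is only that different local names can resolve to one shared definition, which makes datasets comparable.

```python
from typing import Optional

# Placeholder catalogue: identifiers, labels and year ranges are invented
# for illustration and are NOT real PeriodO records.
PERIODS = {
    "ex:p1": {"label": "Example period 1", "start": -200, "end": 400},
    "ex:p2": {"label": "Example period 2", "start": 600, "end": 1050},
}

# Different projects may use different names for the same period.
ALIASES = {
    "example period 1": "ex:p1",
    "period 1 (local name)": "ex:p1",
    "example period 2": "ex:p2",
    "ex. period ii": "ex:p2",
}


def resolve_period(label: str) -> Optional[str]:
    """Map a free-text period label to a shared identifier, if known."""
    return ALIASES.get(label.strip().lower())


def same_period(label_a: str, label_b: str) -> bool:
    """True if both labels resolve to the same shared period definition."""
    a, b = resolve_period(label_a), resolve_period(label_b)
    return a is not None and a == b


print(same_period("Period 1 (local name)", " example period 1 "))  # True
print(same_period("Ex. Period II", "Example period 1"))            # False
print(resolve_period("something unlisted"))                        # None
```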

In parallel, we have also prepared vocabularies that organise what appears in archaeology almost always, but is often described with different terms: types of immovable and movable heritage, and research methods by which the data was collected. This makes searching and working with datasets easier not only within a single project, but also when we want to compare data across teams, regions, and institutions.

At the same time, RADOGOST metadata does not impose a single uniform form, method, or scope of description. It leaves plenty of room for researchers’ own commentary: clarifications, nuances, and context that cannot be squeezed into a few form fields. This information remains discoverable because the repository also supports searching within user-created descriptions, assisted by a dedicated chatbot.

What tools does the platform offer?

From the start, RADOGOST focuses on practical features:

  • One secure place for research data: you can deposit files, organise them into collections, and then search, browse, and download them without asking the author for “the right folder”.
  • DOI for a dataset: a permanent identifier, so that a dataset can be cited in articles just like a publication.
  • Flexible data description (metadata): the repository adapts the description schema to archaeology instead of forcing everything into a generic form.
  • Access control: data can be open immediately, require permission, or be under embargo (published later, for example after analysis is finished).
  • Private sharing links: including anonymous “for review” links, when you want to share data with peer reviewers before making it public.
  • Creative Commons licences at file level: making it clear what can be done with a given resource (a photo, a table, a 3D model).
  • Map view alongside the dataset list: data can be described geospatially and then searched on a map (because in archaeology “where” can be as important as “what”), also with a chronological filter.
  • 3D model preview: so users do not need to download large files just to see what they contain.
  • OCR and HTR: tools for “reading” scans and handwriting, useful for archives and older documentation.
  • Accessibility improvements (WCAG): to make the repository usable for people and institutions with different needs and levels of digital skills.
  • Integration with Pandora tools: designed for initial data analysis before moving to more advanced workflows.
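The three access modes from the list above (open, on permission, under embargo) can be sketched as a simple decision rule. This is an illustrative model only, not RADOGOST's implementation: the `Access` enum, the `Dataset` class, and the `can_download` function are hypothetical names, and the rule that an embargoed dataset opens on its release date is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional, Set


class Access(Enum):
    OPEN = "open"              # available immediately
    ON_REQUEST = "on_request"  # requires the depositor's permission
    EMBARGO = "embargo"        # published later, on a set date


@dataclass
class Dataset:
    """Hypothetical dataset entry with its access settings."""
    title: str
    access: Access
    embargo_until: Optional[date] = None
    approved_users: Set[str] = field(default_factory=set)


def can_download(ds: Dataset, user: str, today: date) -> bool:
    """Decide whether a user may download the dataset on a given day."""
    if ds.access is Access.OPEN:
        return True
    if ds.access is Access.ON_REQUEST:
        return user in ds.approved_users
    # Embargo (assumed rule): closed until the release date, open afterwards.
    return ds.embargo_until is not None and today >= ds.embargo_until


photos = Dataset("Field photos", Access.OPEN)
gis = Dataset("GIS layers", Access.ON_REQUEST, approved_users={"reviewer-1"})
models = Dataset("3D models", Access.EMBARGO, embargo_until=date(2030, 1, 1))

print(can_download(photos, "anyone", date(2026, 6, 1)))    # True
print(can_download(gis, "anyone", date(2026, 6, 1)))       # False
print(can_download(gis, "reviewer-1", date(2026, 6, 1)))   # True
print(can_download(models, "anyone", date(2026, 6, 1)))    # False
print(can_download(models, "anyone", date(2030, 1, 1)))    # True
```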

A separate feature is the chatbot, designed to help navigate resources: it searches descriptive metadata and full-text data (for example notes, descriptions, and GIS data).

RADOGOST is also meant to be visible on a broader scale thanks to integration with ARIADNE: a European portal/aggregator that collects, links, and manages archaeological data by harvesting dataset descriptions (metadata) from multiple repositories, making them easier to find and compare beyond local and regional contexts. In practice, this means Polish collections can “exit” local circulation and gain a greater chance that someone will find them and reuse them meaningfully.

FAIR: four letters that make a difference

If you have ever received a file called “results_final.xlsx” without column descriptions and without a date, you know the problem. FAIR is an attempt to reduce this chaos through four principles:

  • Findable: it can be found (not only by the author),
  • Accessible: access is possible (with clear rules),
  • Interoperable: it can be connected with other data/tools,
  • Reusable: it can be reused because we know how it was created.

RADOGOST is designed according to this logic: a repository is not a “file warehouse”, but infrastructure that supports ordering, standardisation, and reuse.
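To make the “results_final.xlsx” problem concrete, here is a rough sketch of a completeness check. The field names and the way they are assigned to the four principles are a deliberate simplification assumed for this example. This is neither an official FAIR checklist nor RADOGOST's metadata schema.

```python
# Assumed, simplified mapping from metadata fields to FAIR principles.
REQUIRED_FIELDS = {
    "identifier": "Findable",
    "title": "Findable",
    "access_rules": "Accessible",
    "file_format": "Interoperable",
    "column_descriptions": "Reusable",
    "created": "Reusable",
    "licence": "Reusable",
}


def missing_fair_fields(metadata):
    """Return {principle: [field names]} for fields that are absent or empty."""
    gaps = {}
    for field_name, principle in REQUIRED_FIELDS.items():
        if not metadata.get(field_name):
            gaps.setdefault(principle, []).append(field_name)
    return gaps


# A file shared with nothing but its name and format.
bare_file = {"title": "results_final.xlsx", "file_format": "xlsx"}

for principle, fields in missing_fair_fields(bare_file).items():
    print(f"{principle}: missing {', '.join(fields)}")
# Findable: missing identifier
# Accessible: missing access_rules
# Reusable: missing column_descriptions, created, licence
```

A check like this cannot judge whether a description is good, only whether it exists, which is why the descriptive work by the data's authors remains essential.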

Who is behind it and where does the funding come from?

RADOGOST is part of the broader DARIAH-HUB initiative financed under Poland’s National Recovery Plan (KPO). DARIAH-HUB develops a national ecosystem of tools and services for research on heritage and archaeological materials, including an archaeological module, training components, and tools for data harmonisation.

RADOGOST represents a shift in thinking: from “data as a private archive” to “data as a shared scientific resource that can be used without losing quality and context”. Today’s research toolkit is not only a trowel and a brush, but also data standards, repositories, and methodological transparency.

The project Digital Research Infrastructure for the Humanities and Art Studies DARIAH-PL, led by the Institute of Computer Science of the Polish Academy of Sciences (IPI PAN), is implemented under investment A2.4.1 of the National Recovery and Resilience Plan.

Repository link: radogost.dariah.pl
