Documentation
The DSETools Project
This documentation provides the methodological details underlying the Catalogue of Software Tools for Digital Scholarly Editing. The database was designed to map the ecosystem of software tools used within the field of digital philology and scholarly editing.
The primary goal is to provide researchers with a structured entry point to discover tools that can be integrated into different stages of an editorial workflow.
This catalogue was conceived as an appendix to my PhD thesis (http://dx.doi.org/10.25434/chiara-martignano_phd2023), and was later published on Zenodo as an Excel spreadsheet. This second version of the catalogue is the result of completely refactoring the data model behind it, following the research I conducted within the PRIN 2022 ATLAS project (https://dh-atlas.github.io/). The second version's data model is compliant with the ATLAS ontology (https://dh-atlas.github.io/deliverables/ontology/2.0/index-en) and enables in-depth description of the software tools according to the FAIR principles (https://doi.org/10.1038/sdata.2016.18).
Database Structure
Each record in the Airtable database is described by the following fields:
- Name: the name of the tool.
- Landing page: the URL of the official website, repository, or wiki that the creators of the tool offer as a showcase to learn about the tool.
- Description: a brief description of the tool, detailing its main features and system requirements.
- Category: one of the three macro-categories identified during the creation of the catalogue: 'Visualization', tools intended exclusively for the visualization (or publication) of a DSE; 'Mixed', tools that can be used for both processing and publishing a DSE; and finally, 'Production', tools that can replace or complement the philologist in a particular phase of the editorial process. The distinction between the proposed categories is not always clear-cut; tools have generally been categorized based on their main functionalities and the type of output or result they produce. A tool is categorized as 'Visualization' or 'Mixed' when it produces a result that can be defined as a digital scholarly edition.
- Research activities: one or more of the activities listed in the TaDiRAH taxonomy that best describe the main functionalities of the tools.
- Status: describes the current status of the software tool. Possible values are: 'Completed', the tool is considered to be complete, as there will be no major modifications in the future; 'Deprecated', it is recommended to no longer use the tool; 'Under development'; 'Discontinued', the tool is no longer updated; 'Withdrawn', the tool is no longer maintained.
- Access rights: describes how users can access the tool. Possible values are: 'Open access', the tool is immediately and permanently online, and free for all on the Web, without financial and technical barriers; 'Restricted access', the tool is available in a system but with some type of restriction on full open access, for example: the user must log in to the system or must send an email to the system administrator; 'Metadata only access', only the metadata about the tool are available; 'Embargoed access', the tool is metadata only access until released for open access on a certain date.
- Current version: the number of the latest version of the tool, formatted according to existing standards such as semantic versioning (e.g., 1.0.0).
- Release date: the release date of the version listed above. When only the year is available, the date is set to January 1 of that year. Similarly, when only the year and month are available, the date is set to the first day of that month.
- License: the shortened name of the license under which the tool is available (e.g., GPL-3.0 instead of GNU General Public License version 3).
- Creator: the name of the creator or creators of the tool. In the case of commercial software, the name of the company that produced it is provided. Similarly, if the tool was produced by a research center, a collective, or another type of group with an official name, the group’s name is provided. If the tool was created by one or more scholars, their respective names and affiliations are indicated.
- Contributor: the full name of the people or organizations who contributed to the creation of the tool.
- Publisher: the names of the organizations that funded the development of the tool. These can be universities, research centers, cultural institutions, public projects, and funds. Since many of these funds are time-limited, I chose to include this information as it helps to understand how stable a tool is in terms of development, maintenance, and user support.
- Research project: the name of the research project in which the tool is developed.
- Access Point: the URL to access the tool online.
- Download: the URL to download the tool.
- Documentation: the URL to the tool's documentation.
- Input format: the various formats accepted by the tool as input. In some cases, it is not possible to obtain exact formats, as official sources indicate “text” or “image” generically. In these cases, the most commonly used formats, which are likely supported, are listed: TXT for text and JPEG for images.
- Output format: the output formats in which the tool allows exporting the DSE or other types of data. It should be noted that some tools, especially among mixed or visualization-only ones, do not provide data export.
- Programming language: languages used to develop the tool.
- Research product reuse: other research products used to develop the tool.
- Based on: tools, libraries, and frameworks used to develop the tool. Some of the listed tools have been integrated into other tools (for example, OpenSeadragon and VisColl have been integrated into EVT).
- Collaborative working: indicates whether the tool is designed for collaborative work, allowing multiple users to work simultaneously on the same materials.
- Code repository URL: the URL of the repository where the source code of the tool is stored.
- Tool language: the language(s) used within the tool’s user interface, documentation and website.
- Supported language: the language(s) supported by the tool. This piece of information is particularly valuable for automatic transcription and collation tools, lemmatisers and other types of computational tools for linguistic analysis.
- Note on software tool: in this field, various pieces of information are entered, such as the type of tool (e.g. desktop software, web-based service, library or web-based application), its system requirements, its relationship with other tools, and the availability of any missing information.
- Editions: the names of DSEs that have been created using the tool. This field is very useful because it shows concretely how a DSE is presented with the tool and/or what scientific results the tool enables.
- Bibliographic references: bibliographic references to publications in which the creators present their tool or other scholars review or report their experience using the software with a concrete use case.
- Identifier: a permanent unique identifier of the tool, such as a DOI or Handle.
- Record contributor: the full name of the person who contributed to entering the tool’s information.
- Last updated: the date and time the tool entry was last updated.
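The conventions for the 'Release date' and 'Current version' fields can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the procedure actually used to populate the catalogue: the function names `normalize_release_date` and `is_semantic_version` are hypothetical, the input is assumed to be written as YYYY, YYYY-MM, or YYYY-MM-DD, and the version check covers only the plain MAJOR.MINOR.PATCH form of semantic versioning.

```python
import re
from datetime import date

# Assumed input shapes: "YYYY", "YYYY-MM", or "YYYY-MM-DD".
_PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")

# Simplified pattern: MAJOR.MINOR.PATCH only, without pre-release or
# build metadata (the full semantic versioning grammar is richer).
_SEMVER_CORE = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def normalize_release_date(raw: str) -> date:
    """Return a full date, filling a missing month or day with 1.

    Hypothetical helper: it mirrors the rule described for the
    'Release date' field (year only -> January 1 of that year;
    year and month only -> first day of that month).
    """
    match = _PARTIAL_DATE.match(raw.strip())
    if match is None:
        raise ValueError(f"Unsupported date format: {raw!r}")
    year, month, day = match.groups()
    # date() itself raises ValueError for impossible values such as month 13.
    return date(int(year), int(month or 1), int(day or 1))


def is_semantic_version(version: str) -> bool:
    """Check whether a string has the plain MAJOR.MINOR.PATCH shape."""
    return _SEMVER_CORE.match(version.strip()) is not None


if __name__ == "__main__":
    print(normalize_release_date("2023").isoformat())        # 2023-01-01
    print(normalize_release_date("2023-05").isoformat())     # 2023-05-01
    print(normalize_release_date("2023-05-17").isoformat())  # 2023-05-17
    print(is_semantic_version("1.0.0"))                      # True
    print(is_semantic_version("1.0"))                        # False
```

One consequence of this convention is that a stored date of January 1 may mean either an actual release on that day or a release known only by year, so the day and month of such dates should be read with caution.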
Selection Criteria
It was not possible to provide all the above-listed information for many tools, either because it was not available in official sources (primarily websites and repositories), or because updated and reliable bibliographic sources could not be identified. The omission of information does not imply that it cannot be obtained through further in-depth study or by contacting the creators themselves. Furthermore, the information may no longer be up to date or valid; for instance, hyperlinks to websites may be inactive.
This catalogue is not exhaustive as it mainly includes tools produced in Italy, Europe and the United States.
How to Contribute
The catalogue is a community-driven resource. If you are a developer, a researcher, or a user, you can suggest new additions or report inaccuracies.
Feel free to contact me at chiaramartignano[at]gmail[dot]com.
References
- Martignano, Chiara. 2025. “A Catalogue of Software Tools for Digital Scholarly Editing”. Umanistica Digitale 9 (19):1-15. https://doi.org/10.6092/issn.2532-8816/21093.
- Martignano, Chiara. 2023. “On Why and How We Should Build a Catalogue of Software Products for Digital Scholarly Editing”. In Memoria Digitale: Forme Del Testo e Organizzazione Della Conoscenza. Atti Del XII Convegno Annuale AIUCD (Siena), 130–33. https://doi.org/10.6092/unibo%2Famsacta%2F7721.
- Martignano, Chiara, Giorgia Rubin, Sebastiano Giacomini, Alessia Bardi, Marina Buzzoni, Marilena Daquino, Riccardo Del Gratta, Chiara Martignano et al. 2025. “ATLAS: A Data Model for Describing FAIR Digital Humanities Research Outcomes”. In Diversità, Equità e Inclusione: Sfide e Opportunità per l’Informatica Umanistica Nell’Era Dell’Intelligenza Artificiale. Verona: AIUCD. https://doi.org/10.6092/unibo/amsacta/8380.
- Martignano, Chiara, Alessia Bardi, Marina Buzzoni, Marilena Daquino, Riccardo Del Gratta, Angelo Mario Del Grosso, Franz Fischer, et al. 2025. “DH ATLAS: White Book V1.2”. Zenodo, 25 February 2025. https://doi.org/10.5281/zenodo.14925266.
Acknowledgements
I would like to thank the ATLAS project team, especially Sebastiano Giacomini and Giorgia Rubin, for our collaborative work on the FAIR description of research products in the Digital Humanities.
I would also like to thank Marcus Pöckelmann and Roberto Rosselli del Turco for their contributions to the collection and description of automatic collation software tools.