Coptic SCRIPTORIUM

Created and maintained by Caroline T. Schroeder (University of Oklahoma) and Amir Zeldes (Georgetown University).
https://copticscriptorium.org/

Coptic SCRIPTORIUM is a digital platform for interdisciplinary and computational research on texts in the Coptic language. The goal of Prof. Caroline T. Schroeder and Amir Zeldes is to make the textual production of late antique Egypt more accessible to researchers. The website brings together a large collection of databases of Coptic documents and a set of digital tools. The project is designed to support analysis of the importance of Coptic texts for biblical studies, early Christian history, and linguistics. The Advisory Board, which consists of scholars with diverse areas of expertise, contributes substantially to digital project management, data curation, and the digital editing of documents.

The structure of the main page is simple and straightforward. The website is divided into six resource sections: “Corpora”, “Coptic Dictionary”, “NLP Service”, “ANNIS Database”, “Tools”, and “About”. The lower part of the Home page highlights the latest news, which helps users keep track of recent changes and updates to the website. The website uses a web application that provides the most current version of the data in all formats using the CTS URN system.

The Home page

The Corpora portion of the site also looks like it will be a great resource for scholars and students. A particularly valuable feature is that texts are citable and accessible through stable URNs (Uniform Resource Names). A search box on the right side of the top menu bar lets the user enter the specific URN of the material being sought. The Coptic text files are available in HTML, TEI XML, PAULA XML, and ANNIS formats. All files can be downloaded from the corpus repository on the project’s GitHub site. The Corpora include works on a variety of subjects, such as works by Shenoute of Atripe, Besa’s Letter to Aphthonia, books of the Old and New Testaments, Coptic saints’ lives, and so on. The user can easily find a document by author, people, places, or MS name, or by using the advanced search. The works can be displayed in normalized, analytic, diplomatic, or chapter view. These visualizations depend on various combinations of annotations and data models. In my view, the site would be better still if the original manuscript images were presented alongside the text, giving readers a visual sense of the sources.
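To make the URN-based citation a little more concrete, here is a minimal sketch in Python of how a CTS URN can be split into its parts. It assumes the general CTS URN layout (`urn:cts:namespace:textgroup.work.version.exemplar:passage`). The function and class names are hypothetical, and the sample URN is only an illustration of the pattern, not a verified identifier from the site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CtsUrn:
    """Parts of a CTS URN (hypothetical helper type, not part of any project API)."""
    namespace: str
    textgroup: str
    work: Optional[str] = None
    version: Optional[str] = None
    exemplar: Optional[str] = None
    passage: Optional[str] = None


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into namespace, work hierarchy, and optional passage."""
    parts = urn.split(":")
    # Expected shape: urn : cts : namespace : work-component [: passage]
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")

    namespace, work_component = parts[2], parts[3]
    passage = parts[4] if len(parts) == 5 and parts[4] else None

    # The work component is a dot-separated hierarchy of up to four levels.
    levels = work_component.split(".")
    if not namespace or not all(levels) or len(levels) > 4:
        raise ValueError(f"malformed CTS URN: {urn!r}")

    # Pad missing levels with None so the unpacking below always works.
    textgroup, work, version, exemplar = levels + [None] * (4 - len(levels))
    return CtsUrn(namespace, textgroup, work, version, exemplar, passage)


if __name__ == "__main__":
    # Illustrative URN only; the identifier is an assumption, not taken from the review.
    print(parse_cts_urn("urn:cts:copticLit:shenoute.abraham.monbxl:1-3"))
```

Because every level after the text group is optional, the same function handles a URN that names a whole work as well as one that points to a specific passage in a specific edition.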

As an open-source and open-access initiative, this digital project provides a collaborative platform for researchers who want to work with Coptic texts. The user is introduced to new ways of interacting with texts. One of these is ANNIS, a search and visualization tool for annotated corpora. To guide new users, a Tutorial, a Cheat Sheet, and a Video Tutorial are included under the ANNIS Database section. The corpora offer excellent features such as lemmatization (identifying the dictionary form of each word), syntactic analysis (analyzing the grammar of clauses), and entity tagging (identifying ‘things’, usually nouns and noun groups). The visualizations are generated by ANNIS using annotations and CSS. Users can easily save their search queries and download the search results.

Coptic Universal Dependency Treebank

Another great addition is the Coptic Dictionary Online, developed with German partners in the KELLIA project. It aims to make it easy to explore Coptic words in all dialects and provides translations in English, French, and German via human- and machine-readable interfaces. The dictionary is the result of a collaboration between Coptic SCRIPTORIUM and lexicographers in Germany at the Berlin-Brandenburg and Göttingen Academies of Science, the Free University in Berlin, and the Universities of Göttingen and Leipzig. This collaboration has been funded by the National Endowment for the Humanities (NEH) and the German Research Foundation (DFG). The Comprehensive Coptic Lexicon includes over 1,267,000 tokens of searchable, linguistically analyzed Coptic data from dozens of ancient Coptic works. The Coptic Dictionary Online was the winner of Best DH Tool or Suite of Tools in 2019. The Quick Search bar is especially helpful, as it lets one search for words without returning to the main search page.

As a digital history project, it highlights data modeling and data curation for digital corpora of Coptic texts. Its use of digital and computational tools shows the scholarly potential of modern technology, giving users insightful visualizations and new means of analyzing, processing, and annotating the Coptic language. Various digital tools have been developed for annotating digital Coptic text, such as font and character converters, a tokenizer, a normalizer, a lemmatizer, a part-of-speech tagger, and a language-of-origin tagger.

In the News section, users can easily find updated information about the website. One can browse by ‘Recent Posts’, ‘Categories’, or ‘Tags’. The website is very up to date, and changes are announced promptly as blog posts. The target audience of this site is anyone who wants to research Coptic texts, such as graduate students, faculty, and researchers. Under the Documentation section, the FAQ supplies quick answers to users’ questions about new terms, tools, and corpora. The website meets the needs of its users in an accessible way. Its well-designed features should provide useful ideas on how to use data visualization and other tools for those who want to create digital history projects.

This digital project has been supported by a series of funded projects, such as Digitizing a Corpus for Interdisciplinary Research in Ancient Egyptian (NEH Preservation and Access Grant); A Corpus, Tools, and Methods for Corpus Linguistics and Computational Historical Research in Ancient Egypt (ODH Start-Up Grant); and KELLIA, A Linked Digital Environment for Coptic Studies: Integrating Heterogeneous Data with Machine Learning and Natural Language Processing (Digital Humanities Advancement Grant). These grants focus on developing more tools that deliver high-accuracy analyses with less manual intervention, including the handling of spelling variation, and on growing the collection of corpora.

Coptic SCRIPTORIUM serves as a model for Digital Humanities projects that work with historical corpora or corpora in languages outside the Indo-European and Semitic language families. A searchable, richly annotated corpus of texts using the ANNIS search and visualization architecture establishes a collaborative platform on which scholars can contribute texts and annotations as well as conduct research. Researchers can contribute to the development of the tools through the GitHub site. As it is an open-access project, anyone can reuse the digital texts and add new ones through the shared GitHub repositories. Overall, the project is well maintained, and its technology and data are constantly being updated. This is a great digital research project that opens a door for users to learn about digital applications for analyzing texts and to contribute to the study of Coptic language and literature.

Ria De
Loyola University Chicago
Reviewed: October 2022
