Preliminary Research on Computer-Assisted Transcription of Medieval Scripts in the Latin Alphabet using AI Computer Vision techniques and Machine Learning. A Romanian Exploratory Initiative

Authors

  • Adinel C. DINCĂ Babeș-Bolyai University Zetta Cloud, Cluj-Napoca; adinel.dinca@gmail.com
  • Emil ȘTEȚCO Babeș-Bolyai University Zetta Cloud, Cluj-Napoca; emil.stetco@zettacloud.ro

DOI:

https://doi.org/10.24193/subbdigitalia.2020.1.03

Keywords:

Middle Ages, Latin writing, palaeography, Artificial Intelligence, Computer Vision, automatic transcription.

Abstract

This paper introduces to a wider audience, at a very early stage of development, the initial results of a Romanian joint initiative of AI software engineers and palaeographers: an experimental project that aims to assist and improve the transcription of medieval texts with AI software designed and trained specifically for the task. The first part summarizes previous attempts in e-palaeography, a continuously growing field of interdisciplinary scholarship at the international level, and the mixed results achieved so far. The second part describes the project itself, developed by Zetta Cloud, which aims to demonstrate that state-of-the-art AI Computer Vision algorithms can automatically binarize and segment text images, with the ultimate goal of intelligently extracting the content of a sample set of medieval handwritten pages.

Published

2020-12-10

How to Cite

DINCĂ, A. C., & ȘTEȚCO, E. (2020). Preliminary Research on Computer-Assisted Transcription of Medieval Scripts in the Latin Alphabet using AI Computer Vision techniques and Machine Learning. A Romanian Exploratory Initiative. Studia Universitatis Babeș-Bolyai Digitalia, 65(1), 37–52. https://doi.org/10.24193/subbdigitalia.2020.1.03

Section

Articles
