Generative AI for Handwritten and Damaged Heritage - ICDAR 2024

French version

In August 2024, our team successfully participated in the International Conference on Document Analysis and Recognition (ICDAR 2024) held in Athens. As the first gathering for the Computer Vision community focused on handwritten document understanding, knowledge enhancement and AI for heritage, ICDAR provided an exceptional platform for researchers, professionals and institutions to exchange ideas, showcase innovations, and discuss the latest advancements in the field.

ICDAR

During the conference, Chahan, Aliénor, Marie and Boris delivered three comprehensive presentations that highlighted our latest research endeavors for heritage preservation and oriental scripts processing. The first presentation delved into our ongoing project on deciphering damaged Armenian inscriptions. These ancient texts, often marred by weathering and deterioration over centuries — or intentional destruction, as we are currently observing with Azerbaijani efforts to destroy Armenian heritage — present significant challenges for accurate interpretation: the letters are ambiguous, sometimes illegible, even for specialists. We explored and discussed the role of artificial intelligence in detecting and deciphering these inscriptions. Our models, trained on a wide range of graffitis and damaged inscriptions dating before the 9th century, successfully detect up to 90% of the content. These results provide researchers with time-saving tool for the analysis of inscriptions. This represents a first step in enhancing accessibility to image databases for the documentation of Armenian inscriptions.

The paper: Vidal-Gorène, C., Decours-Perez, A. (2024). Detecting and Deciphering Damaged Medieval Armenian Inscriptions Using YOLO and Vision Transformers. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14936. Springer, Cham. https://doi.org/10.1007/978-3-031-70642-4_2

ICDAR Armenian inscriptions
Detection of characters in Armenian inscription

In our second presentation, Chahan showcased our pioneering work on the restoration of burnt manuscripts using generative networks (GANs). This project, undertaken in collaboration with the École nationale des chartes - PSL (Paris), aims to assist in reading damaged documents and to provide a visual restoration of the original manuscript. Our experiments were conducted specifically in the context of generating fake manuscripts for training Handwritten Text Recognition (HTR) models. For ancient and rare languages, the number of pages that need to be manually transcribed to obtain a good HTR model is substantial, even though platforms like Calfa Vision can considerably accelerate this work. The use of GANs for generating training data for Optical Character Recognition (OCR) systems is crucial for recognizing scripts with limited resources. Our current experiments show a recognition gain of up to 30% on new scripts or vocabulary.

The paper: Vidal-Gorène, C., Camps, JB. (2024). Image-to-Image Translation Approach for Page Layout Analysis and Artificial Generation of Historical Manuscripts. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14936. Springer, Cham. https://doi.org/10.1007/978-3-031-70642-4_9

See also: Vidal-Gorène, C., Camps, JB., Clérice, T. (2024). Synthetic Lines from Historical Manuscripts: An Experiment Using GAN and Style Transfer. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing - ICIAP 2023 Workshops. ICIAP 2023. Lecture Notes in Computer Science, vol 14366. Springer, Cham. https://doi.org/10.1007/978-3-031-51026-7_40

ICDAR Armenian inscriptions
Using Generative networks to generate and restore manuscripts

Our third presentation focused on the task of recognizing Chinese handwriting, a writing system renowned for its complexity due to the vast number of unique characters and a very specific reading order. In collaboration with Marie Bizais-Lillig from the Université de Strasbourg, and within the scope of the ChiKnowPo project (funded by Collex Persée), of which Calfa was a partner, we explored different strategies to enhance AI's ability to accurately identify and interpret handwritten Chinese characters using generative networks once again. Through the creation of a specialized dataset, the final model achieves good recognition scores between 96% and 99%, and a correct reading order of 97.81%.

The paper: Bizais-Lillig, M., Vidal-Gorène, C., Dupin, B. (2024). Optimizing HTR and Reading Order Strategies for Chinese Imperial Editions with Few-Shot Learning. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14936. Springer, Cham. https://doi.org/10.1007/978-3-031-70642-4_3

Data and GitHub: tps://github.com/calfa-co/chi-know-po

ICDAR Armenian inscriptions
Layout detection and understanding of Chinese manuscript

Beyond our presentations, we engaged in numerous exciting discussions with fellow attendees, exchanging ideas on documentation and the preservation of written heritage, as well as the new state-of-the-art approaches for Text Recognition and document analysis. Our publications available in the proceedings of the conference not only document our advancements but also provide open-access code repositories, enabling other members of the community to utilize and build upon our work.

Calfa Team