End-to-end platform evaluation for Spanish Handwritten Text Recognition

  • Eduardo Xamena, CONICET - Universidad Nacional de Salta
  • Héctor Emanuel Barboza, Universidad Nacional de Salta
  • Carlos Ismael Orozco, Universidad Nacional de Salta
Keywords: handwritten text recognition, segmentation, end-to-end HTR, historical manuscript processing

Abstract

Automated recognition of handwritten text requires several phases and technologies, both optical and language-related. This article describes an approach that performs the task end to end, using machine learning in every phase of the process. Besides explaining the methodology, it describes how a manuscript recognition model for the Spanish language was built and evaluated. The original contribution of this article is the training and evaluation of offline HTR models for Spanish-language manuscripts, together with the evaluation of a platform that performs the complete task. It also details ongoing work to improve the models obtained and to develop new models for more complex corpora that make the HTR task harder.

Author Biographies

Eduardo Xamena, CONICET - Universidad Nacional de Salta

PhD in Computer Science, Universidad Nacional del Sur (2015). CONICET researcher. Member of the Instituto de Investigación en Ciencias Sociales y Humanidades (ICSOH) - CONICET - UNSa. University lecturer at the Universidad Nacional de Salta. Category IV Researcher (Engineering) according to the Secretaría de Políticas Universitarias (SPU). He works on topics related to Information Retrieval and carries out research involving Machine Learning, Natural Language Processing, and Data Mining applied to Argentine history and the provincial justice system.

Héctor Emanuel Barboza, Universidad Nacional de Salta

Advanced undergraduate student in the Licenciatura en Análisis de Sistemas at the Departamento de Informática, Facultad de Ciencias Exactas, Universidad Nacional de Salta, Argentina. Student researcher holding a Beca de Estímulo a la Vocación Científica (EVC) scholarship awarded by the Consejo Interuniversitario Nacional (CIN). University teaching assistant. Currently a programmer analyst at the company Nuntius IT, specializing in Scrum for software development.

Carlos Ismael Orozco, Universidad Nacional de Salta

Licenciado en Análisis de Sistemas, Universidad Nacional de Salta (2013). Category V Researcher (Engineering) according to the Secretaría de Políticas Universitarias (SPU). Currently a PhD candidate in Computer Science at the Universidad de Buenos Aires, Argentina, in the Image Processing and Computer Vision research group. His areas of interest include image processing and deep neural networks. He is a Jefe de Trabajos Prácticos (head teaching assistant) at the Departamento de Informática, Facultad de Ciencias Exactas, Universidad Nacional de Salta.

Published
2021-12-20
Section
Articles
