End-to-end platform evaluation for Spanish Handwritten Text Recognition
Abstract
Automated recognition of handwritten text requires several phases and technologies, both optical and language-related. This article describes an approach that performs the task end to end, using machine learning in every phase of the process. In addition to explaining the methodology, it describes how a manuscript recognition model for the Spanish language was built and evaluated. The original contribution is the training and evaluation of offline HTR models for Spanish-language manuscripts, together with the evaluation of a platform that performs the complete task. The article also details ongoing work to improve the models obtained and to develop new models for more complex corpora that pose greater difficulty for HTR.
The articles published in the journal Ciencia y Tecnología are the exclusive property of their authors. Their opinions and content belong to their authors, and the Universidad de Palermo declines all responsibility for the rights that may arise from reading and/or interpreting the content of the published articles.
The reproduction, use or exploitation of the published articles by any third party is not authorized. Their use is authorized only for academic and/or research purposes.