Attention-based deep learning model for Arabic handwritten text recognition

Takwa Ben Aïcha Gader
Afef Kacem Echi


Keywords: Arabic handwriting recognition, attention mechanism, BLSTM, CNN, CTC, RNN
Abstract

This work proposes a segmentation-free approach to Arabic Handwritten Text Recognition (AHTR): an attention-based deep learning architecture that combines a Convolutional Neural Network, a Recurrent Neural Network, and Connectionist Temporal Classification (CNN-RNN-CTC). The model takes an image as input and, through the CNN, produces a sequence of essential features, which is passed to an attention-based Bidirectional Long Short-Term Memory network (BLSTM). The BLSTM processes the feature sequence in order, and the attention mechanism selects the relevant information from it, so that the most relevant parts of the input sequence are used in a flexible manner. The selected information is then fed to the CTC layer, which computes the loss and predicts the transcription. The contribution lies in extending the CNN with dropout layers, batch normalization, and dropout regularization parameters to prevent over-fitting, and in passing the output of the RNN block through the attention mechanism. This solution improves on previous methods by increasing the speed and performance of the CNN and by controlling model over-fitting. The proposed system achieves a best accuracy of 97.1% on the IFN/ENIT Arabic script database, which is competitive with the current state of the art. It was also tested on the modern English handwriting of the IAM database, where it attains a Character Error Rate of 2.9%, which confirms the model's script independence.
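
The last two steps named in the abstract, turning per-frame CTC outputs into a transcription and scoring it with a Character Error Rate, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes greedy (best-path) decoding, a blank label at index 0, and a CER defined as edit distance divided by reference length. All function names and the toy alphabet are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is the CTC blank label


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path decoding: take the argmax per frame, merge repeats, drop blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    chars: List[str] = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when the label is not blank and differs
        # from the label of the previous frame.
        if label != BLANK and label != previous:
            chars.append(alphabet[label - 1])  # label k maps to alphabet[k - 1]
        previous = label
    return "".join(chars)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as edit distance normalized by the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

With a two-letter toy alphabet `"ab"`, a frame sequence whose per-frame best labels are `a, a, blank, a, b` decodes to `"aab"`: the first two frames merge into one `a`, and the blank separates it from the second `a`. A real system would typically replace the greedy step with beam search or a library decoder.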

How to Cite
Ben Aïcha Gader, T., & Kacem Echi, A. (2022). Attention-based deep learning model for Arabic handwritten text recognition. Machine Graphics and Vision, 31(1/4), 49–73. https://doi.org/10.22630/MGV.2022.31.1.3