Arabic Handwriting Recognition Using Machine Learning Approaches

While handwriting recognition for Latin-script languages has received considerable attention, far less work has been done on the Arabic script. Arabic poses some unique challenges, such as a larger character set, the presence of dots and diacritics, and intra-word whitespace regions. Machine learning approaches have the potential to significantly improve state-of-the-art Arabic handwriting recognition results. This dissertation presents several such techniques, including writer adaptation and segmentation-free processing of unconstrained text, and integrates them into novel algorithms for general recognition, word spotting, and transcript mapping. Writer adaptation, or specialization, is the adjustment of a handwriting recognition algorithm to a specific writer's style of handwriting. Such adjustment yields significantly improved recognition rates over generalized, writer-independent counterparts. Specialization is commonly used in online Latin-script handwriting applications, such as those for tablet computers or PDAs, and some rudimentary offline Latin-script adaptation methods have recently been proposed in the literature as well. Handwriting adaptation for the Arabic script, however, is unexplored. An iterative bootstrapping model is presented that adapts a writer-independent model to a writer-dependent model using a small number of words, achieving a large increase in recognition rate in the process. Furthermore, a confidence weighting method is described that generates better results by weighting words based on their length. Script features unique to Arabic are discussed, as is the way they are incorporated into the adaptation process. Even though Arabic has many more character classes than languages such as English, significant improvement is observed.
One issue common to Arabic recognition tasks is generating candidate word regions on a page. Attempting to definitively segment the document into such regions (automatic segmentation) can meet with some success, but the performance of such an algorithm is often a limiting factor in spotting performance. Another approach is to scan the page image directly, without attempting to generate a definite segmentation. Such segmentation-free approaches result in better recognition at a performance cost. The algorithms discussed are tested using a database of truthed, page-length, handwritten Arabic documents. Where applicable, the literature-standard IFN/ENIT database is used for testing as well. We validate our approaches by exploring their implications for such tasks as word spotting (finding a query word or image, and its placement, in a set of documents), transcript mapping (the automatic alignment of a handwritten document with its machine-readable transcript), and general unconstrained recognition. Novel algorithms for these tasks are also presented. Specifically, contributions of this dissertation include novel descriptions of machine learning algorithms applied to Arabic handwriting recognition problems and quantification of the improvement generated by their use. Examples include writer adaptation, versatile search, and the advantages and trade-offs of processing tasks such as word spotting in a segmentation-free rather than a segmentation-based manner.
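As a rough illustration of the iterative bootstrapping idea in the description above, the following Python sketch adapts a set of writer-independent word prototypes toward one writer's samples by self-training. Everything in it is an assumption made for illustration and is not taken from the dissertation: the nearest-prototype recognizer, the softmax-style confidence, the `length_weight` scaling, the `threshold` and `rate` parameters, the transliterated word labels, and the toy two-dimensional features.

```python
import numpy as np

def recognize(prototypes, x):
    """Return (best_label, confidence) for one word feature vector x.

    Hypothetical recognizer: nearest prototype, with a softmax over
    negative distances used as a stand-in confidence score.
    """
    labels = list(prototypes)
    dists = np.array([np.linalg.norm(x - prototypes[l]) for l in labels])
    scores = np.exp(-dists)
    probs = scores / scores.sum()
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])

def length_weight(word, max_len=10):
    """Assumed weighting: longer words are trusted more, capped at max_len."""
    return min(len(word), max_len) / max_len

def adapt(prototypes, samples, rounds=3, threshold=0.3, rate=0.5):
    """Bootstrap a writer-dependent model from a writer-independent one.

    Each round: recognize the writer's samples, keep those whose
    length-weighted confidence passes the threshold, and move each
    prototype toward the mean of the samples assigned to it.
    """
    adapted = {label: p.copy() for label, p in prototypes.items()}
    for _ in range(rounds):
        accepted = {}
        for x in samples:
            label, conf = recognize(adapted, x)
            if conf * length_weight(label) >= threshold:
                accepted.setdefault(label, []).append(x)
        if not accepted:
            break  # nothing confident enough to learn from
        for label, xs in accepted.items():
            adapted[label] = (1 - rate) * adapted[label] + rate * np.mean(xs, axis=0)
    return adapted

# Toy data: two word classes; this writer's samples are shifted by about (1, 1).
base = {"kitab": np.array([0.0, 0.0]), "madrasa": np.array([4.0, 0.0])}
samples = [np.array(v) for v in
           [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (5.0, 1.0), (5.1, 0.9), (4.9, 1.1)]]
adapted = adapt(base, samples)
```

In this toy setup the adapted prototypes drift toward the writer's own cluster centers while `base` is left unmodified, so the writer-independent model remains available for other writers. A real system would replace the prototype recognizer with an actual handwriting model and would need a held-out check that the pseudo-labels are not reinforcing early errors.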
Pattern Recognition and Image Analysis

This 2-volume set constitutes the refereed proceedings of the 9th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2019, held in Madrid, Spain, in July 2019. The 99 papers in these volumes were carefully reviewed and selected from 137 submissions. They are organized in topical sections named: Part I: best ranked papers; machine learning; pattern recognition; image processing and representation. Part II: biometrics; handwriting and document analysis; other applications.
Document Analysis and Recognition – ICDAR 2023 Workshops

This two-volume set, LNCS 14193-14194, constitutes the proceedings of the international workshops co-located with the 17th International Conference on Document Analysis and Recognition, ICDAR 2023, held in San José, CA, USA, during August 21–26, 2023. The 43 regular papers presented in this book were carefully selected from 60 submissions. Part I contains 22 regular papers that stem from the following workshops: ICDAR 2023 Workshop on Computational Paleography (IWCP); ICDAR 2023 Workshop on Camera-Based Document Analysis and Recognition (CBDAR); ICDAR 2023 International Workshop on Graphics Recognition (GREC); ICDAR 2023 Workshop on Automatically Domain-Adapted and Personalized Document Analysis (ADAPDA). Part II contains 21 regular papers that stem from the following workshops: ICDAR 2023 Workshop on Machine Vision and NLP for Document Analysis (VINALDO); ICDAR 2023 International Workshop on Machine Learning (WML).