Abdelkarim Elbaati - Academia.edu
Papers by Abdelkarim Elbaati
The ADAB database (the Arabic handwriting Data Base) was developed to advance the research and development of Arabic on-line handwriting systems. The database was developed in cooperation between the Institut fuer Nachrichtentechnik (IfN) and the Research Groups in Intelligent Machines, University of Sfax, Tunisia. The written text consists of 937 Tunisian town/village names. A pre-label assigned to each file consists of the postcode in a sequence of Numeric Character References, which is stored in the UPX file format. An InkML file including trajectory information and a plot image of the word trajectory are also generated. Additional information about the writer can also be provided. All documents and papers that use the ADAB database will acknowledge the use of the database by including an appropriate citation to the following: [1] H. Boubaker, A. Elbaati, N. Tagougui, M. Kherallah, H. Elabed, and A. M. Alimi, "Online Arabic Databases and Applications," Book chapter in: Märgner, V....
In this paper, we present a new approach to restoring the temporal order of off-line handwriting. After the preprocessing steps are applied to the word image, a suitable algorithm segments its skeleton into three types of strokes. We then develop a genetic algorithm (GA) to optimize the trajectory through these segments. The repetition of a segment is handled in a secondary algorithm so that it does not disturb the GA operations. The GA uses selection, crossover and mutation. The fitness function value depends on the right-to-left direction (the direction of Arabic writing), the repetition of segments and the angular deviation at the crossing of the occlusion stroke. To validate our approach, we tested it on the On/Off LMCA dual Arabic handwriting, the Latin IRONOFF and the off-line IFN/ENIT datasets.
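As an illustration of the selection, crossover and mutation operators named in this abstract, the following is a minimal sketch of a permutation GA, not the authors' implementation. It assumes each stroke segment is reduced to a single hypothetical x-coordinate (`xs`) and that fitness only rewards right-to-left moves; the segment-repetition and angular-deviation terms of the paper's fitness function are omitted. All function names and parameter values are illustrative.

```python
import random

def fitness(order, xs):
    """Count consecutive moves that go right-to-left (decreasing x)."""
    return sum(1 for a, b in zip(order, order[1:]) if xs[b] < xs[a])

def order_crossover(p1, p2, rng):
    """Order crossover: copy a slice of p1, fill the rest in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[i:j + 1] = p1[i:j + 1]
    kept = set(child[i:j + 1])
    fill = [g for g in p2 if g not in kept]
    for k in range(len(child)):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two positions of a copy of `order`."""
    order = order[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

def tournament(pop, xs, rng, k=3):
    """Selection: best of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=lambda o: fitness(o, xs))

def evolve(xs, pop_size=30, generations=60, seed=0):
    """Return the fittest segment ordering found (needs len(xs) >= 2)."""
    rng = random.Random(seed)
    n = len(xs)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        best = max(pop, key=lambda o: fitness(o, xs))
        children = [best]  # elitism: the current best always survives
        while len(children) < pop_size:
            child = order_crossover(tournament(pop, xs, rng),
                                    tournament(pop, xs, rng), rng)
            children.append(swap_mutation(child, rng))
        pop = children
    return max(pop, key=lambda o: fitness(o, xs))

if __name__ == "__main__":
    xs = [5.0, 1.0, 4.0, 2.0, 3.0]  # hypothetical segment x-positions
    best = evolve(xs)
    print(best, fitness(best, xs))
```

Order crossover is used here because it always yields a valid permutation, so every child visits each segment exactly once.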
In this paper, we propose an automatic analysis system for the recognition of handwritten Arabic postal addresses, using the Beta-Elliptical model. Our system is divided into several steps: analysis, pre-processing and classification. The first operation is the filtering of the image. In the second, we remove the border print, stamps and graphics. After the address is located on the envelope, address segmentation extracts the postal code and the city name separately. The pre-processing system and the modeling approach are based on two basic steps. The first step is the extraction of the temporal order from the image of the handwritten trajectory. The second step uses the Beta-Elliptical model to represent the handwritten script. The recognition system is based on a graph-matching algorithm. Our modeling and recognition approaches were validated using the postal codes and city names extracted from Tunisian postal envelope data. The recognition rate obtained is about 98%.
Deep learning-based approaches have proven highly successful in handwriting recognition, a challenging task with increasingly broad application in mobile devices. Recently, several research initiatives in the area of pattern recognition have been introduced. The challenge is more acute for Arabic scripts due to the inherent cursiveness of their characters, the existence of several groups of similarly shaped characters, the large size of their alphabets, etc. In this paper, we propose an online Arabic character recognition system based on hybrid Beta-Elliptic model (BEM) and convolutional neural network (CNN) feature extractor models, combining deep bidirectional long short-term memory (DBLSTM) and support vector machine (SVM) classifiers. First, we use the extracted online and offline features to perform the classification and compare the performance of the single classifiers. Second, we proceed by combining the two types of feature-based syste...
Online signals are rich in dynamic features such as trajectory chronology, velocity, pressure and pen up/down movements. Their offline counterparts consist of a set of pixels. Thus, online handwriting recognition accuracy is generally better than offline. In this paper, we propose an original framework for recovering temporal order and pen velocity from offline multi-lingual handwriting. Our framework is based on an integrated sequence-to-sequence attention model. The proposed system involves extracting a hidden representation from an image using a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BGRU), and decoding the encoded vectors to generate dynamic information using a BGRU with temporal attention. We validate our framework using an online recognition system applied to a benchmark Latin, Arabic and Indian On/Off dual-handwriting character database. The performance of the proposed multi-lingual system is demonstrated through a low error rate of point...
Document Analysis and Recognition – ICDAR 2021 Workshops
2019 International Conference on Document Analysis and Recognition (ICDAR)
In this paper, we present an original framework for offline handwriting recognition. Our recognition system is based on a sequence-to-sequence model employing an encoder-decoder LSTM to recover temporal order from offline handwriting. Handwriting temporal recovery consists of two parts: extracting features using a Convolutional Neural Network (CNN) followed by an LSTM layer, and decoding the encoded vectors to generate temporal information using a BLSTM. To produce a human-like velocity, we apply a sampling operation that takes the trajectory curvature into account. Our work is validated by an LSTM recognition system based on the Beta-Elliptic model, applied to an Arabic and Latin On/Off dual handwriting character database.
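The abstract does not say how the curvature-aware sampling is performed. One plausible reading, sketched here purely as an assumption, is to keep points densely where the pen changes direction (where a human writer slows down) and sparsely on straight runs. The function names, the turning-angle threshold and the stride are hypothetical choices, not values from the paper.

```python
import math

def turning_angle(p_prev, p, p_next):
    """Absolute change of direction at p, in radians (0 means straight)."""
    a1 = math.atan2(p[1] - p_prev[1], p[0] - p_prev[0])
    a2 = math.atan2(p_next[1] - p[1], p_next[0] - p[0])
    d = abs(a2 - a1)
    return min(d, 2 * math.pi - d)  # wrap into [0, pi]

def curvature_sample(points, angle_thresh=0.3, stride=4):
    """Keep endpoints, every high-curvature point, and every `stride`-th point."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        angle = turning_angle(points[i - 1], points[i], points[i + 1])
        if angle > angle_thresh or i % stride == 0:
            kept.append(points[i])
    kept.append(points[-1])
    return kept

if __name__ == "__main__":
    # An L-shaped stroke: the corner point is retained, straight parts are thinned.
    stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    print(curvature_sample(stroke))
```

Replaying the kept points at a fixed time step then gives a slower apparent pen speed around corners, which is the qualitative behaviour the abstract describes.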
2019 International Conference on Document Analysis and Recognition (ICDAR)
Deep learning-based approaches have proven highly successful in handwriting recognition, a challenging task with increasingly broad application in mobile devices. Recently, several research initiatives in the area of pattern recognition have been introduced. The challenge is more acute for Arabic scripts due to the inherent cursiveness of their characters, the existence of several groups of similarly shaped characters, the large size of their alphabets, etc. In this paper, we propose an online Arabic character recognition system based on hybrid Beta-Elliptic model (BEM) and convolutional neural network (CNN) feature extractor models, combining deep bidirectional long short-term memory (DBLSTM) and support vector machine (SVM) classifiers. First, we use the extracted online and offline features to perform the classification and compare the performance of the single classifiers. Second, we combine the two types of feature-based systems using different combination methods to enhance the discriminating power of the global system. We evaluated our system using the LMCA and Online-KHATT databases. The individual systems reach maximum recognition rates of 95.48% and 91.55% on the two databases respectively. Combining the on-line and off-line systems improves the accuracy to 99.11% and 93.98% on the same databases, which exceeds the best results of other state-of-the-art systems.
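The abstract mentions "different combination methods" without naming them. As a hedged illustration only, the sketch below shows two widely used score-level fusion rules, the weighted sum rule and the product rule, applied to per-class probability vectors from an online and an offline classifier. The rules, function names and the probability values are assumptions for illustration, not the paper's method or results.

```python
def sum_rule(p_online, p_offline, w=0.5):
    """Weighted average of the two classifiers' class probabilities."""
    return [w * a + (1 - w) * b for a, b in zip(p_online, p_offline)]

def product_rule(p_online, p_offline):
    """Element-wise product of class probabilities, renormalised to sum to 1."""
    prod = [a * b for a, b in zip(p_online, p_offline)]
    total = sum(prod)
    if total == 0:
        raise ValueError("classifiers assign zero joint probability to every class")
    return [v / total for v in prod]

def predict(scores):
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=scores.__getitem__)

if __name__ == "__main__":
    # Hypothetical outputs for a 3-class problem.
    p_online = [0.5, 0.4, 0.1]
    p_offline = [0.2, 0.7, 0.1]
    print(predict(p_online), predict(sum_rule(p_online, p_offline)),
          predict(product_rule(p_online, p_offline)))
```

In this toy case the online classifier alone picks class 0, while both fusion rules pick class 1, because the offline classifier is much more confident about it.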
IEEE International Conference on Systems, Man and Cybernetics, 2002
This paper describes the processing of multilingual documents (Arabic/Latin) extracted from Arabic scientific articles, whose pages contain Arabic lines that sometimes include one or more Latin words because these words have no exact equivalent in Arabic. To process these blocks, we need to extract the Arabic text from the multilingual blocks. We propose an original method to locate Latin words in heterogeneous blocks.
The online signal is rich in dynamic features such as trajectory chronology, velocity, pressure and pen up/down. Its offline counterpart consists of a set of pixels. Thus, online handwriting recognition accuracy is generally better than offline. In this paper, we propose an original framework for recovering temporal order and pen velocity from offline handwriting. Our framework is based on a sequence-to-sequence Gated Recurrent Unit (Seq2Seq GRU) model. The proposed system extracts a hidden representation from an image using a Convolutional Neural Network (CNN) and a Bidirectional GRU (BGRU), and decodes the encoded vectors to generate dynamic information using a BGRU. We validate our framework with an online recognition system applied to a Latin, Arabic and Indian On/Off dual handwriting character database. The performance of the proposed system is demonstrated by a low error rate on point coordinates and a high accuracy rate of the LSTM recognition system.