Jihad El-Sana | Ben Gurion University of the Negev

Papers by Jihad El-Sana

Text Line Extraction in Historical Documents Using Mask R-CNN

Writer Identification for Historical Arabic Documents

2014 22nd International Conference on Pattern Recognition, 2014

Identification of writers of handwritten historical documents is an important and challenging task. In this paper, we present several feature extraction and classification approaches for the identification of writers in historical Arabic manuscripts. The approaches are able to successfully identify writers of multipage documents. The feature extraction methods rely on different principles, such as contour-, texture-, and key point-based features, and the classification schemes are based on averaging and voting. For all experiments, a dedicated data set based on a publicly available database is used. The experiments show promising results, and the best performance was achieved using a novel feature extraction method based on key point descriptors.
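
The paper describes its pipeline only at a high level. As a rough, hypothetical illustration of descriptor-level voting (not the authors' actual features or classifier), one could match ORB key point descriptors against per-writer references and aggregate votes over the pages of a multipage document:

```python
# Hypothetical sketch: key point based writer identification with descriptor voting.
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

orb = cv2.ORB_create(nfeatures=500)

def page_descriptors(gray_image):
    """Extract local key point descriptors from a grayscale page image."""
    _, desc = orb.detectAndCompute(gray_image, None)
    return desc if desc is not None else np.empty((0, 32), dtype=np.uint8)

def fit_writer_model(training_pages):
    """training_pages: list of (gray_image, writer_id) pairs."""
    X, y = [], []
    for image, writer_id in training_pages:
        desc = page_descriptors(image)
        X.append(desc)
        y.extend([writer_id] * len(desc))
    knn = KNeighborsClassifier(n_neighbors=1, metric="hamming")
    knn.fit(np.vstack(X), y)
    return knn

def identify_writer(knn, document_pages):
    """Vote per descriptor, then aggregate votes over all pages of the document."""
    votes = []
    for image in document_pages:
        desc = page_descriptors(image)
        if len(desc):
            votes.extend(knn.predict(desc))
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]
```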

Text line segmentation for gray scale historical document images

Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, 2011

In this paper, we present a new approach for text line segmentation that works directly on gray-scale document images. Our algorithm constructs a distance transform directly on the gray-scale images, which is used to compute two types of seams: medial seams and separating seams. A medial seam is a chain of pixels that crosses the text area of a text line, and a separating seam is a path that passes between two consecutive rows. The medial seam determines a text line, and the separating seams define the upper and lower boundaries of the text line. The medial and separating seams propagate according to energy maps, which are defined based on the constructed distance transform. We have performed experiments on different datasets and obtained encouraging results.
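
To make the seam idea concrete, here is a minimal sketch, under simplifying assumptions (a crude fixed-threshold binarisation and a plain dynamic-programming seam), of how a distance-transform energy map and a single horizontal seam could be computed. In the paper's terms, a medial seam would follow a low-energy path through the text, while a separating seam would follow a high-energy path between rows:

```python
# Illustrative sketch: distance-transform energy map and a single horizontal seam.
import numpy as np
from scipy import ndimage

def energy_map(gray, threshold=128):
    """Distance to the nearest dark (ink) pixel; low energy near text, high in gaps."""
    ink = gray < threshold                      # rough foreground estimate
    return ndimage.distance_transform_edt(~ink)

def horizontal_seam(energy):
    """Dynamic-programming seam from left to right (one row index per column)."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    back = np.zeros((h, w), dtype=np.int64)
    for x in range(1, w):
        for y in range(h):
            lo, hi = max(0, y - 1), min(h, y + 2)
            j = np.argmin(cost[lo:hi, x - 1]) + lo
            back[y, x] = j
            cost[y, x] += cost[j, x - 1]
    seam = np.zeros(w, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[:, -1]))
    for x in range(w - 1, 0, -1):
        seam[x - 1] = back[seam[x], x]
    return seam
```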

SHREC'10 track: Robust shape retrieval

Eurographics Workshop on 3D Object Retrieval, EG 3DOR, 2010

The 3D Shape Retrieval Contest 2010 (SHREC'10) robust shape retrieval benchmark simulates a retrieval scenario in which the queries include multiple modifications and transformations of the same shape. The benchmark allows evaluating how well algorithms cope with certain classes of transformations and how strong a transformation they can handle. The present paper reports the results of the SHREC'10 robust shape retrieval benchmark.

Is a Deep Learning Algorithm Effective for the Classification of Medieval Hebrew Scripts?

Jewish Studies in the Digital Age

In this research, we apply deep-learning techniques to Hebrew paleography to automatically classify and process medieval Hebrew manuscripts. Our work is based on contemporary Hebrew paleography (Malachi Beit-Arié, Colette Sirat, Norman Golb, Ada Yardeni, Benjamin Richler), which recognizes fifteen subtypes of medieval Hebrew script. Automatic recognition of these scripts allows us to determine the approximate origin and date of writing for undated, fragmentary, and damaged manuscripts. To train the deep neural network, we compile a Visual Media Lab-Hebrew Paleography (VML-HP) dataset that contains 537 high-resolution manuscript page images. The images were hand-picked from the SfarData (http://sfardata.nli.org.il/) dataset; in some rare cases, we also included pages from other manuscript collections. For testing the model, we define the notions of typical and blind test sets. The typical test set consists of unseen pages of the manuscripts used in training. The blind test set, by contrast, consists of pages from unseen manuscripts, thus providing a real-life scenario. To train the model, we used patches extracted from the document pages. To filter out irrelevant patches (empty patches or patches that contain decorations), we developed a clean patch generation algorithm that produces patches containing pure text regions (for the VML-HP dataset, we generated 150K training patches). In all the experiments, we trained the network on the training set and tested it on both test sets, typical and blind. The training objective was cross-entropy loss, minimized using the Adam optimizer. Training continued until there was no improvement in validation loss within a patience of five epochs, and the model with the lowest validation loss was used for testing.
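
The training regime described above (cross-entropy loss, Adam, early stopping on validation loss with a patience of five epochs) corresponds to a fairly standard loop. A schematic PyTorch version, with the model and data loaders assumed to be given, might look like this:

```python
# Schematic training loop matching the described setup: cross-entropy loss,
# Adam optimizer, early stopping on validation loss with a patience of 5 epochs.
import copy
import torch
from torch import nn

def train(model, train_loader, val_loader, device="cpu", patience=5, max_epochs=100):
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters())
    best_loss, best_state, epochs_without_improvement = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for patches, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(patches.to(device)), labels.to(device))
            loss.backward()
            optimizer.step()

        model.eval()
        val_loss, n_batches = 0.0, 0
        with torch.no_grad():
            for patches, labels in val_loader:
                val_loss += criterion(model(patches.to(device)), labels.to(device)).item()
                n_batches += 1
        val_loss /= max(n_batches, 1)

        if val_loss < best_loss:
            best_loss, epochs_without_improvement = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break

    model.load_state_dict(best_state)   # keep the model with the least validation loss
    return model
```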

Learning Free Line Detection in Manuscripts using Distance Transform Graph

2019 International Conference on Document Analysis and Recognition (ICDAR), 2019

We present a fully automated, learning-free method for line detection in manuscripts. We begin by separating components that span multiple lines; we then remove noise and small connected components such as diacritics. We apply a distance transform to the image to create the image skeleton. The skeleton is pruned, and its vertices and edges are detected in order to generate the initial document graph. We calculate each vertex's v-score from its t-score and l-score, quantifying its distance from being an absolute link in a line. In a greedy manner, we classify each edge in the graph as either a link, a bridge, or a conflict edge. We first merge edges classified as links, and then merge the conflict edges. Finally, we remove the bridge edges from the graph, generating its final form. Each edge in the final graph corresponds to one extracted line. We applied the method to both the public and private sections of the DIVA-HisDB dataset. The public section was used in the recently conducted Layout Analysis for Challenging Medieval Manuscripts competition, and our results surpass the vast majority of the participating systems.
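
The vertex scoring and greedy edge classification are specific to the paper, but the early stages (distance-transform skeleton and vertex detection) can be sketched with standard tools. The following illustrative snippet, assuming scikit-image and SciPy and a crude fixed-threshold binarisation, covers only those pre-processing steps:

```python
# Rough sketch of the pre-processing stages only: binarise, take the distance-transform
# (medial-axis) skeleton, and locate graph vertices (end points and junctions).
import numpy as np
from scipy import ndimage
from skimage.morphology import medial_axis

def document_graph_vertices(gray, threshold=128):
    ink = gray < threshold                                    # rough binarisation
    skeleton, dist = medial_axis(ink, return_distance=True)   # distance-transform skeleton
    # A skeleton pixel is a vertex candidate when its 8-neighbour degree is not 2:
    # end points have degree 1, junctions have degree >= 3.
    degree = ndimage.convolve(skeleton.astype(int), np.ones((3, 3)), mode="constant") - skeleton
    vertices = skeleton & (degree != 2)
    return skeleton, dist, np.argwhere(vertices)
```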

View-Dependent Rendering for Polygonal Datasets

Text line extraction using fully convolutional network and energy minimization

Text lines are important parts of handwritten document images and are easier for further applications to analyze. Despite recent progress in text line detection, text line extraction from a handwritten document remains an unsolved task. This paper proposes using a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines. These blob lines assist an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. Furthermore, the extraction stage is capable of extracting the pixels of text lines with various heights and interline proximities, independent of their orientations. Moreover, it can finely split touching and overlapping text lines without an orientation assumption. We evaluate the proposed method on the VML-AHTE, VML-MOC, and DIVA-HisDB datasets. The first contains overlapping, touching, and close text lines […]
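
As an illustration of how detected blob lines can drive extraction, the following simplified stand-in assigns each connected component of the binarised page to its nearest blob line by majority vote; the paper's actual energy function is richer than this unary-only approximation, and all names here are illustrative:

```python
# Simplified stand-in for the extraction stage: assign each connected component of
# the binarised page to the nearest detected blob line (unary term only).
import numpy as np
from scipy import ndimage

def assign_components_to_blob_lines(ink_mask, blob_line_labels):
    """ink_mask: boolean foreground; blob_line_labels: int map, 0 = background,
    k > 0 = pixels of the k-th predicted blob line."""
    # For every pixel, find the index of the nearest blob-line pixel.
    _, (iy, ix) = ndimage.distance_transform_edt(blob_line_labels == 0,
                                                 return_indices=True)
    nearest_line = blob_line_labels[iy, ix]
    components, n = ndimage.label(ink_mask)
    line_of_component = {}
    for c in range(1, n + 1):
        labels_under_c = nearest_line[components == c]
        line_of_component[c] = int(np.bincount(labels_under_c).argmax())  # majority vote
    return components, line_of_component
```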

Unsupervised Learning of Text Line Segmentation by Differentiating Coarse Patterns

Document Analysis and Recognition – ICDAR 2021, 2021

Despite recent advances in the field of supervised deep learning for text line segmentation, unsupervised deep learning solutions are beginning to gain popularity. In this paper, we present an unsupervised deep learning method that embeds document image patches into a compact Euclidean space where distances correspond to coarse text line pattern similarity. Once this space has been produced, text line segmentation can be easily implemented using standard techniques on the embedded feature vectors. To train the model, we extract random pairs of document image patches under the assumption that neighbouring patches contain a similar coarse trend of text lines, whereas if one of them is rotated, they contain different coarse trends of text lines. Doing well on this task requires the model to learn to recognize the text lines and their salient parts. The benefit of our approach is zero manual labelling effort. We evaluate the method qualitatively and quantitatively on several variants of text line segmentation datasets to demonstrate its effectiveness.
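
The pair-generation rule can be sketched in a few lines. The snippet below is an illustrative reading of it (patch size and sampling details are placeholders, not the paper's): a patch and its horizontal neighbour form a "similar" pair, while a patch and a rotated neighbour form a "different" pair.

```python
# Minimal sketch of the described pair-generation rule (names are illustrative).
import random
import numpy as np

def sample_pairs(page, patch=128, n=1000):
    h, w = page.shape[:2]
    pairs = []   # (patch_a, patch_b, label) with label 1 = similar, 0 = different
    for _ in range(n):
        y = random.randint(0, h - patch)
        x = random.randint(0, w - 2 * patch)
        a = page[y:y + patch, x:x + patch]
        b = page[y:y + patch, x + patch:x + 2 * patch]       # horizontal neighbour
        if random.random() < 0.5:
            pairs.append((a, b, 1))                          # same coarse line trend
        else:
            pairs.append((a, np.rot90(b).copy(), 0))         # rotation breaks the trend
    return pairs
```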

Engaging Students in Covariational Reasoning within an Augmented Reality Environment

Augmented Reality in Educational Settings, 2019

Segmentation-Free Online Arabic Handwriting Recognition

International Journal of Pattern Recognition and Artificial Intelligence, 2011

Arabic script is naturally cursive and unconstrained and, as a result, automatic recognition of its handwriting is a challenging problem. The analysis of Arabic script is further complicated, in comparison to Latin script, by obligatory dots/strokes that are placed above or below most letters. In this paper, we introduce a new approach that performs online Arabic word recognition at the continuous word-part level, while performing training at the letter level. In addition, we appropriately handle delayed strokes by first detecting them and then integrating them into the word-part body. Our current implementation is based on Hidden Markov Models (HMMs) and correctly handles most of the difficulties of Arabic script recognition. We have tested our implementation using various dictionaries and multiple writers and have achieved encouraging results for both writer-dependent and writer-independent recognition.
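
The system itself composes word-part models from letter-level HMMs; as a much-simplified, hypothetical sketch of the recognition step only, one could score an online feature sequence against a set of already-trained word-part HMMs (here using the hmmlearn package) and return the best-scoring candidate:

```python
# Highly simplified sketch (not the authors' system): score an online stroke feature
# sequence against candidate word-part HMMs and return the best match.
import numpy as np
from hmmlearn import hmm

def train_word_part_model(feature_sequences, n_states=8):
    """feature_sequences: list of (T_i, d) arrays of point features for one word-part."""
    X = np.vstack(feature_sequences)
    lengths = [len(s) for s in feature_sequences]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model

def recognize(models, features):
    """models: dict word_part -> trained HMM; features: (T, d) query sequence."""
    return max(models, key=lambda wp: models[wp].score(features))
```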

Using Scale-Space Anisotropic Smoothing for Text Line Extraction in Historical Documents

Lecture Notes in Computer Science, 2014

Text line extraction is a vital prerequisite for various document processing tasks. This paper presents a novel approach for text line extraction that is based on Gaussian scale space and a dedicated binarization that utilizes the inherent structure of smoothed text document images. It enhances the text lines in the image using a multi-scale anisotropic second-derivative-of-Gaussian filter bank at the average height of the text lines. It then applies a binarization, based on a component tree, that is tailored towards line extraction. The final stage of the algorithm is based on an energy minimization framework for removing spurious text lines and assigning connected components to lines. We have tested our approach on various datasets written in different languages, over a range of image qualities, and obtained high detection rates that outperform state-of-the-art algorithms. Our MATLAB code is publicly available.
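
The line-enhancement stage can be approximated with standard filtering primitives. The sketch below, which assumes a rough estimate of the average text-line height, applies an anisotropic second-derivative-of-Gaussian filter bank over a few scales; the parameter choices are illustrative, not the paper's:

```python
# Sketch of the line-enhancement stage only: anisotropic second-derivative-of-Gaussian
# filter bank evaluated at a few scales around the estimated text-line height.
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance_text_lines(gray, line_height, scales=(0.75, 1.0, 1.25)):
    image = gray.astype(float)
    responses = []
    for s in scales:
        sigma_y = s * line_height / 4.0          # vertical scale tied to line height
        sigma_x = 3.0 * sigma_y                  # elongate the filter along the line
        # order=(2, 0): second derivative of the Gaussian along y, plain smoothing in x.
        # Dark text lines are intensity valleys, so the response is positive at line centres.
        responses.append(gaussian_filter(image, sigma=(sigma_y, sigma_x), order=(2, 0)))
    return np.max(responses, axis=0)             # strongest response over the scale bank
```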

Hierarchical On-line Arabic Handwriting Recognition

2009 10th International Conference on Document Analysis and Recognition, 2009

In this paper, we present a multi-level recognizer for online Arabic handwriting. In Arabic script (handwritten and printed), cursive writing is not a style; it is an inherent part of the script. In addition, letters are connected with almost no ligatures, which complicates segmenting a word into individual letters. In this work, we have adopted a holistic approach and avoided segmenting words into individual letters. To reduce the search space, we apply a series of filters in a hierarchical manner. The earlier filters perform light processing on a large number of candidates, and the later filters perform heavy processing on a small number of candidates. In the first filter, global features and delayed-stroke patterns are used to reduce the set of candidate word-part models. In the second filter, local features are used to guide a dynamic time warping (DTW) classification. The resulting k top-ranked candidates are sent to a shape-context-based classifier, which determines the recognized word-part. In this work, we have modified the classic DTW to enable different costs for the different operations and to control their behavior. We have performed several experiments and obtained encouraging results.
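
The modified DTW amounts to weighting the moves of the classic recurrence separately. A toy version (not the authors' exact cost scheme) is sketched below; it could be called, for example, as `weighted_dtw(seq_a, seq_b, lambda u, v: float(np.linalg.norm(u - v)))`.

```python
# Toy sketch of a weighted DTW: the classic recurrence, but with separate weights for
# the match, insertion, and deletion moves so their behaviour can be tuned.
import numpy as np

def weighted_dtw(a, b, dist, w_match=1.0, w_ins=1.0, w_del=1.0):
    """a, b: sequences of feature vectors; dist: pairwise distance function."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            D[i, j] = min(w_match * c + D[i - 1, j - 1],   # match / substitution
                          w_ins * c + D[i, j - 1],         # insertion
                          w_del * c + D[i - 1, j])         # deletion
    return D[n, m]
```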

Arabic Language Statistics for Efficient Script Recognition

Arabic script is naturally cursive and unconstrained. As a result, automatic recognition of its handwriting is a challenging problem. In comparison to Latin script, the analysis of Arabic script is further complicated by both cursiveness and the obligatory dots or strokes that are placed above or below most letters. The inherent cursiveness and the large number and varied positions of additional strokes discourage a segmentation-free approach to analysis because of the anticipated huge number of combinations needed to produce different words. At the same time, segmentation of Arabic script into individual characters is almost impossible and is often responsible for many misclassified items in Arabic script recognizers. This paper presents statistics on the Arabic language computed from a very large corpus of Arabic words. These statistical results, which can be used to improve the efficiency and accuracy of Arabic script recognizers, also indicate that a holistic approach is computationally […]
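
As a hedged illustration of the kind of statistics involved (not the paper's code or definitions), the sketch below splits words into word-parts (pieces of Arabic words) at letters that do not connect to the following letter and accumulates frequencies over a corpus; the letter set and counting choices are assumptions:

```python
# Illustrative sketch: split Arabic words into word-parts at non-connecting letters
# and accumulate frequency statistics over a corpus.
from collections import Counter

NON_CONNECTING = set("اأإآءدذرزوؤةى")   # letters assumed to end a connected word-part

def word_parts(word):
    parts, current = [], ""
    for ch in word:
        current += ch
        if ch in NON_CONNECTING:
            parts.append(current)
            current = ""
    if current:
        parts.append(current)
    return parts

def corpus_statistics(words):
    part_freq = Counter()      # how often each word-part occurs
    parts_per_word = Counter() # distribution of word-part counts per word
    for w in words:
        parts = word_parts(w)
        part_freq.update(parts)
        parts_per_word[len(parts)] += 1
    return part_freq, parts_per_word
```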

Out-of-core algorithms for scientific visualization and computer graphics

IEEE Visualization, 2003

This course will focus on describing techniques for handling datasets larger than main memory in scientific visualization and computer graphics. Recently, several external memory techniques have been developed for a wide variety of graphics and visualization problems, including surface simplification, volume rendering, isosurface generation, ray tracing, surface reconstruction, and so on. This work has had significant impact given that in recent years there has been a rapid increase in the raw size of datasets. Several […]

Digital Hebrew Paleography: Script Types and Modes

Journal of Imaging

Paleography is the study of ancient and medieval handwriting. It is essential for understanding, authenticating, and dating historical texts. Across many archives and libraries, many handwritten manuscripts are yet to be classified. Human experts can process only a limited number of manuscripts; therefore, there is a need for an automatic tool for script type classification. In this study, we utilize a deep-learning methodology to classify medieval Hebrew manuscripts into 14 classes based on their script style and mode. Hebrew paleography recognizes six regional styles and three graphical modes of script. We experiment with several input image representations and network architectures to determine the appropriate ones, and we explore several approaches to script classification. We obtained the highest accuracy using a hierarchical classification approach: at the first level, the regional style of the script is classified; then, the patch is passed to the corresponding model at the second level […]
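
The hierarchical scheme reduces to a dispatch between a first-level style classifier and per-style second-level models. A schematic sketch with placeholder model names and interfaces (assumed trained networks, not the paper's code) follows:

```python
# Schematic two-level classification: a first model predicts the regional style of a
# patch, then a per-style model predicts the graphical mode / subtype.
import torch

def classify_patch(patch, style_model, mode_models, style_names, mode_names):
    """patch: (1, C, H, W) tensor; mode_models: dict style_name -> second-level model."""
    with torch.no_grad():
        style_idx = int(style_model(patch).argmax(dim=1))
        style = style_names[style_idx]
        mode_idx = int(mode_models[style](patch).argmax(dim=1))
    return style, mode_names[mode_idx]
```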

Deep learning for paleographic analysis of medieval Hebrew manuscripts

In this interdisciplinary project, we apply deep learning models to classify script types and sub-types in medieval Hebrew manuscripts. The model incorporates the techniques and databases of Hebrew paleography and (with reservations) Hebrew codicology. The main theoretical base of our project is the SfarData dataset, which includes the full codicological descriptions and paleographical definitions of all dated medieval Hebrew manuscripts up to the year 1540. In some exceptional cases, we go beyond this dataset framework. The major source of data, in terms of high-definition photos of manuscripts, is the Institute of Microfilmed Hebrew Manuscripts at the National Library of Israel, which has undertaken the mission of collecting copies of all extant Hebrew manuscripts from all over the world. We mostly use manuscripts from the National Library of Israel, the British Library, and the French National Library. This multidisciplinary project brings together researchers from both fields, the Humanities and Computer Science. Currently, one professor, one lecturer, one post-doc, and two doctoral students are participating in the project. This is very exciting work in which there are no ready-made solutions for the various challenges. We collectively discuss ways to address these challenges and adapt our solutions as we go. During the presentation, we will talk about how our project functions and how we strive to achieve a common result. The inevitable difficulties that we face during this collaboration include, inter alia, different research systems in the Humanities and in Computer Science, a lack of common terminology, different technical training, different requirements for publications and conferences, etc.

Unsupervised text line segmentation

ArXiv, 2020

We present an unsupervised text line segmentation method that is inspired by the relative variance between text lines and the spaces among text lines. Handwritten text line segmentation is important for the efficiency of further processing. A common method is to train a deep learning network to embed the document image into an image of blob lines that trace the text lines. Previous methods learned such an embedding in a supervised manner, requiring the annotation of many document images. This paper presents an unsupervised embedding of document image patches without any need for annotations. The main idea is that the number of foreground pixels over the text lines is noticeably different from the number of foreground pixels over the spaces among text lines. Generating similar and dissimilar pairs relying on this principle inevitably leads to outliers. However, as the results show, the outliers do not harm the convergence, and the network learns to discriminate the text lines from the […]
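
One plausible, simplified reading of this principle is a row-wise foreground count: rows crossing text lines contain noticeably more ink than rows in the inter-line spaces. A tiny illustrative helper (an assumption, not the paper's code) is:

```python
# Illustrative helper: label each image row as text-line (1) or inter-line space (0)
# by comparing its foreground-pixel count to the page average.
import numpy as np

def row_classes(ink_mask):
    profile = ink_mask.sum(axis=1)            # foreground pixels per row
    return (profile > profile.mean()).astype(int)
```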

Experiment study on utilizing convolutional neural networks to recognize historical Arabic handwritten text

2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR), 2017

Deep learning is a form of hierarchical learning; it consists of multiple layers of representations that gradually transform data into high-level concepts. Deep learning has been providing state-of-the-art results for various computer vision problems. However, a typical deep learning algorithm needs a large amount of data to train a deep model and guarantee the model's ability to generalize. It is not easy to generate large labeled datasets, and this is one of the main barriers to applying deep learning to many problems. Data augmentation schemes were introduced to overcome this limitation by extending small available labeled datasets. In this work, we experiment with extending a small labeled dataset of Arabic continuous subwords by orders of magnitude. The labeled dataset, which consists of handwritten Arabic subwords, is used to synthesize a large collection of labeled samples. The synthesized subwords are based on one or multiple writing styles from the original labeled dataset. We also experiment with generating various printed forms of subwords. We include only the Naskh font, as most Arabic historical manuscripts were written in this style. We train several convolutional neural networks using handwritten, printed, and synthesized datasets and obtain encouraging results.
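
For the augmentation side of such a comparison, a generic sketch (not the paper's synthesis pipeline, which renders subwords from selected writing styles) could extend a small labelled set of subword images with random geometric and photometric distortions, for example with torchvision:

```python
# Generic augmentation sketch: extend a small set of labelled subword images with
# random affine and photometric distortions (parameters are illustrative).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=3, translate=(0.02, 0.02),
                            scale=(0.9, 1.1), shear=5, fill=255),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
])

def extend_dataset(samples, copies=10):
    """samples: list of (PIL.Image, label); returns the originals plus augmented copies."""
    extended = list(samples)
    for image, label in samples:
        extended.extend((augment(image), label) for _ in range(copies))
    return extended
```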

Synthesizing versus Augmentation for Arabic Word Recognition with Convolutional Neural Networks

2018 IEEE 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR), 2018

In this paper, we present a sub-word recognition method for historical Arabic manuscripts using convolutional neural networks. We investigate the benefit of extending the training set with synthetically created samples in comparison to augmentation. We show that annotating around ten pages of a manuscript and extending them is sufficient for successful sub-word recognition in the whole manuscript. In addition, we show the contribution of using different combinations of training sets and compare their sub-word recognition performance on the whole manuscript.
