Automatic Handwritten Indian Scripts Identification
Since OCR engines are usually script-dependent, automatic text recognition in multi-script documents requires a pre-processing module that identifies the scripts. Based on this motivation, in this paper, we present a word-level handwritten Indian script identification technique. Words are first segmented by applying morphological dilation followed by connected component labelling. We then employ the Radon transform, discrete wavelet transform, statistical filters and discrete cosine transform to extract directional multi-resolution spatial features. We tested the features using linear discriminant analysis, support vector machine and k-nearest neighbour classifiers over 11 major Indian scripts (including Roman) in bi-script and tri-script scenarios. In our tests, we achieved maximum accuracies of 98% and 96% for the bi-script and tri-script scenarios, respectively.
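The word segmentation step (morphological dilation followed by connected component labelling) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the structuring element, the connectivity, or the image representation, so the horizontal structuring element, the 8-connectivity, the list-of-lists binary image, and the function names `dilate_horizontal` and `word_boxes` are all assumptions made for illustration.

```python
from collections import deque


def dilate_horizontal(img, radius):
    """Binary dilation with a 1 x (2*radius + 1) horizontal structuring element.

    Assumption: a horizontal element is used so that neighbouring characters
    of the same word merge into a single blob.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[y][xx] = 1
    return out


def word_boxes(img, radius=1):
    """Label 8-connected components of the dilated image and return, for each
    component, the bounding box (top, left, bottom, right) of the original
    ink pixels it covers."""
    dil = dilate_horizontal(img, radius)
    h, w = len(dil), len(dil[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not dil[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected component.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = h, w, -1, -1
            while queue:
                cy, cx = queue.popleft()
                if img[cy][cx]:  # grow the box only over original ink
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and dil[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda b: (b[0], b[1]))


# Toy image: two "words", each made of two strokes one pixel apart.
demo = [
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
]
print(word_boxes(demo, radius=1))  # [(0, 0, 1, 2), (0, 7, 1, 9)]
```

With `radius=0` no dilation takes place and each stroke is reported as its own component, which shows why the dilation step is needed before labelling. In practice the radius would have to be tuned to the inter-character and inter-word gaps of the handwriting at hand.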