Sanjeev Budha - Academia.edu
Papers by Sanjeev Budha
Scanned documents are not always noise-free, so pre-processing is required to clean them up. Detecting text areas is also important for proper segmentation and text information extraction. Before the classifier can be trained, a training dataset must be prepared; this process involves automated dataset generation and manual labeling of character images. This report presents the work done up to the time of the second report of the Nepali OCR Project, including improvements to the pre-processing steps, dataset preparation, and the design of a basic recognition prototype.