A robust stamp detection framework on degraded documents

Extraction from Document Images Based on Hough Transform Technique

2015

Text extraction in document images has been developing rapidly since the 1990s and is an important research field in content-based information indexing and retrieval, automatic annotation, and structuring of document images. Extracting text information involves detection, localization, tracking, extraction, enhancement, and recognition of the text in a given document image. However, variations in text size, style, orientation, and alignment, as well as low image contrast and complex backgrounds, make automatic text extraction extremely challenging. A large number of techniques have been proposed to address this problem, and the purpose of this paper is to classify and review Hough Transform techniques for extracting text from document images. The Hough Transform (HT) is recognized as a powerful tool for extracting graphic elements from images because of its global vision and its robustness in noisy or degraded environments. The method herein propose...
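To illustrate why the Hough Transform tolerates noise, here is a minimal sketch of the classical line-detecting variant. It is not the method of the paper above. The function name `hough_lines`, its parameters, and the sample points are assumptions made for illustration. Each foreground point votes for every line that could pass through it, using the normal form rho = x·cos(theta) + y·sin(theta). Collinear points pile their votes into the same accumulator bin, while stray points spread theirs thinly.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Vote each (x, y) point into a (rho_bin, theta_bin) accumulator.

    Hypothetical helper: theta is sampled in n_theta steps over [0, pi),
    and rho is quantized into bins of width rho_step.
    """
    acc = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(round(rho / rho_step), t)] += 1
    return acc


# Ten collinear points on the horizontal line y = 5, plus three stray points.
points = [(x, 5) for x in range(10)] + [(2, 9), (7, 1), (4, 3)]
acc = hough_lines(points)

# The strongest bin collects one vote from every collinear point.
(rho_bin, theta_bin), votes = acc.most_common(1)[0]
print(rho_bin, theta_bin, votes)
```

The strongest bin lies close to theta = 90 degrees with rho near 5, which corresponds to the horizontal line. The three stray points never gather enough votes in a single bin to compete with it.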

A new system of Word Spotting for manuscript retrieval based on Generalized Hough Transform

The system presented here identifies all positions of a given word in a document. It was built on a subset of the Tunisian National Archive collection. Word spotting was developed as a response to the high noise levels in historical documents, the great variability in handwriting, and the failure of automatic handwriting recognizers. In this research, the GHT (Generalized Hough Transform) and word spotting are investigated for the retrieval of Tunisian historical handwritten documents. The system applies the GHT to the patterns that make up the word. These patterns vote for similar patterns and form voting clusters in Hough space. The system then merges neighboring voting clusters, a step that significantly improves word spotting. Experiments conducted on a portion of the Tunisian National Archive collection show the advantage of the proposed system.
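The voting and cluster-merging steps described above can be sketched as follows. This is a simplified, translation-only illustration and not the authors' implementation. The names `ght_vote` and `merge_clusters`, the pattern labels, the cell size, and the sample data are all assumptions. Each pattern found on the page votes for the word reference point implied by a matching pattern in the query word, and neighboring accumulator cells are then merged so that slightly misaligned votes still count toward the same occurrence.

```python
from collections import defaultdict


def ght_vote(model, features, cell=4):
    """Accumulate votes for the word's reference point.

    model:    list of (label, dx, dy), the offset of each pattern from the
              reference point of the query word (hypothetical format).
    features: list of (label, x, y), patterns detected on the page.
    cell:     side length of one accumulator cell, in pixels.
    """
    offsets = defaultdict(list)
    for label, dx, dy in model:
        offsets[label].append((dx, dy))

    acc = defaultdict(int)
    for label, x, y in features:
        for dx, dy in offsets.get(label, ()):
            acc[((x - dx) // cell, (y - dy) // cell)] += 1
    return acc


def merge_clusters(acc, min_votes=2):
    """Merge 8-connected accumulator cells into clusters.

    Returns a list of (total_votes, (cx, cy)) sorted by votes, where
    (cx, cy) is the vote-weighted centroid in cell coordinates.
    """
    seen = set()
    clusters = []
    for start in acc:
        if start in seen:
            continue
        seen.add(start)
        stack, cells = [start], []
        while stack:
            cx, cy = stack.pop()
            cells.append((cx, cy))
            for nx in (cx - 1, cx, cx + 1):
                for ny in (cy - 1, cy, cy + 1):
                    if (nx, ny) in acc and (nx, ny) not in seen:
                        seen.add((nx, ny))
                        stack.append((nx, ny))
        total = sum(acc[c] for c in cells)
        if total >= min_votes:
            wx = sum(c[0] * acc[c] for c in cells) / total
            wy = sum(c[1] * acc[c] for c in cells) / total
            clusters.append((total, (wx, wy)))
    clusters.sort(reverse=True)
    return clusters


# Query word made of three patterns laid out left to right.
model = [("a", 0, 0), ("b", 10, 0), ("c", 20, 0)]

# Two occurrences on the page (the second slightly distorted) and one stray pattern.
features = [
    ("a", 100, 50), ("b", 110, 50), ("c", 120, 50),
    ("a", 200, 80), ("b", 214, 80), ("c", 220, 80),
    ("b", 5, 5),
]

clusters = merge_clusters(ght_vote(model, features))
print(clusters)
```

In this toy data the distorted pattern of the second occurrence lands in an adjacent cell, so without merging that occurrence would score lower than the clean one. After merging, both occurrences receive the same number of votes, and the stray pattern falls below `min_votes`.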