Geometric Layout Analysis Techniques for Document Image Understanding: a Review. TR 9703-09

Geometric Layout Analysis Techniques for Document Image Understanding: a Review

1998

Document Image Understanding (DIU) is an interesting research area with a large variety of challenging applications. Researchers have worked for decades on this topic, as witnessed by the scientific literature. The main purpose of the present report is to describe the current status of DIU, with particular attention to two subprocesses: document skew angle estimation and page decomposition. Several algorithms proposed in the literature are briefly described and placed in a novel classification scheme. Some methods proposed for the evaluation of page decomposition algorithms are also described. The current status of the field and its open problems are discussed critically, and some considerations about logical layout analysis are reported.

Evaluation of Geometric Layout Analysis Techniques for Document Image Analysis

International Journal of Computer Applications, 2010

Document Image Analysis (DIA) is an interesting research area with a large variety of challenging applications. Document analysis decomposes a document image into consistent items that represent coherent components of the document, such as text lines, photographs, and graphics, without any knowledge of the specific format. A document image is composed of several blocks, each of which represents a coherent component of the document. One coherent component corresponds to a set of text lines with the same typeface and a consistent line spacing. The geometric structure is the set of geometric relationships between the blocks. This paper describes the current status of Document Image Analysis and Understanding techniques, with particular attention to the evaluation of geometric layout analysis techniques. The two evaluation methods for page decomposition described in this paper are the text-based approach and the region-based approach.
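To make the idea of a region-based evaluation concrete, here is a minimal sketch that compares detected zones with ground-truth zones by their overlap. The abstract does not state which overlap measure the paper uses; the intersection-over-union score, the 0.5 threshold, and the function names below are assumptions chosen for illustration only.

```python
from typing import List, Tuple

# A zone is an axis-aligned rectangle (x0, y0, x1, y1); x1 and y1 are exclusive.
Box = Tuple[int, int, int, int]


def area(box: Box) -> int:
    """Area of a rectangle; degenerate rectangles count as zero."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def intersection(a: Box, b: Box) -> int:
    """Area shared by two rectangles."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return area(overlap)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two rectangles (assumed overlap measure)."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


def matched_regions(truth: List[Box], detected: List[Box], threshold: float = 0.5) -> int:
    """Count ground-truth zones overlapped well enough by some detected zone."""
    return sum(1 for t in truth if any(iou(t, d) >= threshold for d in detected))


truth = [(0, 0, 100, 50), (0, 60, 100, 120)]
detected = [(0, 0, 100, 48), (0, 60, 40, 120)]
print(matched_regions(truth, detected))  # first zone matches, second is too small
```

A text-based evaluation would instead compare the recognized text of the segmented zones with a reference transcription, which is why it depends on an OCR step and this region-based comparison does not.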

Geometric Structure Analysis of Document Images: A Knowledge-Based Approach

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000

Abstract: Geometric structure analysis is a prerequisite to create electronic documents from logical components extracted from document images. This paper presents a knowledge-based method for sophisticated geometric structure analysis of technical journal pages. The proposed knowledge base encodes, in the form of rules, geometric characteristics that are not only common in technical journals but also publication-specific. The method is a hybrid of top-down and bottom-up techniques and consists of two phases: region segmentation and identification. Generally, the result of the segmentation process does not have a one-to-one matching with composite layout components. Therefore, the proposed method identifies nontext objects, such as images, drawings, and tables, as well as text objects, such as text lines and equations, by splitting or grouping segmented regions into composite layout components. Experimental results with 372 images scanned from the IEEE Transactions on Pattern Analysis and Machine Intelligence show that the proposed method has performed geometric structure analysis successfully on more than 99 percent of the test images, resulting in impressive performance compared with previous works.

Over-Splitted and Merged for Geometry Document Layout Analysis

FAIR - NGHIÊN CỨU CƠ BẢN VÀ ỨNG DỤNG CÔNG NGHỆ THÔNG TIN 2015, 2016

Automatic transformation of paper documents into electronic forms requires geometric document layout analysis as the first stage. However, variations in character font sizes, text-line spacing, and layout structures make it difficult to design a general-purpose method. The use of some parameters has therefore been unavoidable in geometric document layout analysis algorithms, which leads to over-segmentation and under-segmentation errors in previous algorithms. This paper presents a new approach to geometric document layout analysis. Our algorithm uses a set of whitespaces covering the document background to reduce the candidate zones, some of which may be over-segmented. A bottom-up step then groups the over-segmented zones with each other based on adaptive parameters. Finally, we propose context analysis at the text-line level to segment document images into paragraphs. Experimental results on the ICDAR2009 competition data set show that the proposed algorithm greatly reduces both over-segmentation and under-segmentation errors, and thus boosts performance significantly compared with state-of-the-art algorithms.
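The bottom-up grouping step can be illustrated with a small sketch. The abstract does not say which adaptive parameters the authors use, so the rule below (merge vertically adjacent zones when the gap between them is small relative to the median zone height on the page) is an assumption, as are the function name `merge_zones` and the `gap_factor` parameter.

```python
from statistics import median
from typing import List, Tuple

# A zone is reduced to its vertical span (top, bottom) within one column.
Span = Tuple[int, int]


def merge_zones(zones: List[Span], gap_factor: float = 0.5) -> List[Span]:
    """Merge over-segmented zones whose vertical gap is small.

    The merge limit is derived from the page itself (a fraction of the median
    zone height), so it adapts to the font size instead of being a fixed
    pixel value. This rule is a hypothetical stand-in for the paper's own.
    """
    if not zones:
        return []
    ordered = sorted(zones)
    limit = gap_factor * median(bottom - top for top, bottom in ordered)
    merged = [ordered[0]]
    for top, bottom in ordered[1:]:
        last_top, last_bottom = merged[-1]
        if top - last_bottom <= limit:
            # Close enough to the previous zone: extend it.
            merged[-1] = (last_top, max(last_bottom, bottom))
        else:
            merged.append((top, bottom))
    return merged


lines = [(0, 10), (12, 22), (24, 34), (50, 60), (62, 72)]
print(merge_zones(lines))  # two groups, split at the large gap
```

Deriving the limit from the page content is what keeps a single fixed threshold from over-segmenting large fonts or under-segmenting small ones.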

BINYAS: a complex document layout analysis system

Multimedia Tools and Applications, 2020

Document layout analysis (DLA) is an irreplaceable prerequisite for the development of a comprehensive document image processing and analysis system. The main purpose of DLA is to segment an input document image into its constituent and coherent regions and identify their classes. In this paper, we propose a competent DLA system, named BINYAS, based on connected component (CC) and pixel analysis. Here, we initially identify the regions and then classify them as paragraph, separator, graphic, image, table, chart, inverted text, etc. The proposed system is evaluated on four publicly available standard datasets, namely the ICDAR 2009, 2015, 2017 and 2019 page segmentation competition datasets, and its performance is compared with many contemporary methods, including some well-known software products and deep learning based methods. Experimental results show that our method performs significantly better than state-of-the-art methods in terms of the evaluation metrics considered by the research community of this domain.
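Connected component analysis, the first ingredient named above, can be sketched in a few lines. This is a generic 4-connected labeling by breadth-first search on a binary image, not the BINYAS implementation, whose details are not given in the abstract; the function name and the list-of-lists image format are assumptions.

```python
from collections import deque
from typing import List, Tuple

# Inclusive bounding box (x0, y0, x1, y1) of one connected component.
Box = Tuple[int, int, int, int]


def connected_components(image: List[List[int]]) -> List[Box]:
    """Return bounding boxes of 4-connected foreground (value 1) regions."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Start a new component and flood it with a breadth-first search.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
print(connected_components(page))
```

In a layout analysis pipeline, the resulting boxes would typically be filtered by size and grouped into lines and regions before classification.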

Adaptive Layout Analysis of Document Images

Lecture Notes in Computer Science, 2002

Layout analysis is the process of extracting a hierarchical structure describing the layout of a page. In the document processing system WISDOM++ the layout analysis is performed in two steps: firstly, the global analysis determines possible areas containing paragraphs, sections, columns, figures and tables, and secondly, the local analysis groups together blocks that possibly fall within the same area. The result of the local analysis process strongly depends on the quality of the results of the first step. In this paper we investigate the possibility of supporting the user during the correction of the results of the global analysis. This is done by allowing the user to correct the results of the global analysis and then by learning rules for layout correction from the sequence of user actions. Experimental results on a set of multi-page documents are reported.

Resolution independent skew and orientation detection for document images

2009

In large-scale scanning applications, orientation detection of the digitized page is necessary for the following procedures to work correctly. Several existing methods for orientation detection use the fact that in Roman script text, ascenders are more likely to occur than descenders. In this paper, we propose a different approach for page orientation detection that uses this information.
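The ascender/descender observation can be turned into a simple vote on a single text line. The sketch below illustrates the general idea only, not the approach proposed in the paper, which the abstract does not detail. It assumes each character is given as a (top, bottom) pair in image coordinates with y growing downward, estimates the x-line and baseline as medians, and uses a hypothetical pixel `tolerance`.

```python
from statistics import median
from typing import List, Tuple

# Vertical extent (top, bottom) of one character box; y grows downward.
Extent = Tuple[int, int]


def line_orientation(components: List[Extent], tolerance: int = 2) -> str:
    """Guess whether a text line is upright or rotated by 180 degrees."""
    tops = [top for top, _ in components]
    bottoms = [bottom for _, bottom in components]
    x_line = median(tops)       # assumed top of the x-height band
    baseline = median(bottoms)  # assumed bottom of the x-height band
    # Characters reaching above the band look like ascenders,
    # characters reaching below it look like descenders.
    ascenders = sum(1 for top in tops if top < x_line - tolerance)
    descenders = sum(1 for bottom in bottoms if bottom > baseline + tolerance)
    if ascenders == descenders:
        return "undecided"
    return "upright" if ascenders > descenders else "upside-down"


upright = [(3, 20), (10, 20), (3, 20), (10, 26), (10, 20),
           (10, 20), (10, 20), (3, 20), (10, 20)]
# Rotating the line by 180 degrees within a 30-pixel band swaps the roles.
flipped = [(30 - bottom, 30 - top) for top, bottom in upright]
print(line_orientation(upright), line_orientation(flipped))
```

A single line gives a weak vote; a practical detector would aggregate the counts over many lines on the page before deciding.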