Introduction

dhSegment

What is dhSegment?

[Figure: dhSegment system]

dhSegment is a generic approach to historical document processing. It relies on a convolutional neural network (CNN) to do the heavy lifting of predicting pixelwise characteristics. Simple image processing operations are then provided to extract the components of interest (boxes, polygons, lines, masks, …) from those predictions.
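As a rough illustration of the second step, the sketch below thresholds a per-pixel probability map and extracts the bounding box of the selected pixels. It uses plain Python lists rather than dhSegment's own post-processing helpers; the function names `binarize` and `bounding_box` and the 0.5 threshold are assumptions made for this example, not part of the library.

```python
from typing import List, Optional, Tuple

ProbMap = List[List[float]]


def binarize(probs: ProbMap, threshold: float = 0.5) -> List[List[bool]]:
    """Turn per-pixel probabilities into a boolean mask (hypothetical helper)."""
    return [[p >= threshold for p in row] for row in probs]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of the True pixels, or None if empty."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


# Toy 4x5 "network output": high probabilities mark the region of interest.
probs = [
    [0.1, 0.2, 0.1, 0.0, 0.1],
    [0.1, 0.9, 0.8, 0.2, 0.1],
    [0.0, 0.7, 0.9, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.1, 0.0],
]
box = bounding_box(binarize(probs))
print(box)  # (1, 1, 3, 2)
```

In practice the probability map would come from the trained network, and an image processing library would typically replace these hand-written loops.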

A few key facts:

What sort of training data do I need?

Each training sample consists of an image of a document and a corresponding label image marking the parts to be predicted.

[Figures: example input image, example label]

Additionally, a text file encoding the RGB values of the classes needs to be provided. For example, if we want the classes ‘background’, ‘document’ and ‘photograph’ to be classes 0, 1 and 2 respectively, we need to list their colors line by line, one RGB triple per line, in class order.
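For instance, a minimal sketch of writing and reading such a file could look like the following. The colors (black, red, blue), the file name `classes.txt` and the helper names are illustrative assumptions, not values taken from dhSegment; the only point carried over from the text is that line i of the file gives the RGB color of class i.

```python
import os
import tempfile
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# Illustrative colors only: line i holds the RGB color of class i
# (0 = background, 1 = document, 2 = photograph).
CLASS_FILE_CONTENT = """\
0 0 0
255 0 0
0 0 255
"""


def read_class_colors(path: str) -> List[RGB]:
    """Parse one 'R G B' triple per non-empty line; line order gives the class index."""
    colors: List[RGB] = []
    with open(path) as f:
        for line in f:
            if line.strip():
                r, g, b = (int(v) for v in line.split())
                colors.append((r, g, b))
    return colors


def label_to_classes(label: List[List[RGB]], colors: List[RGB]) -> List[List[int]]:
    """Map each RGB pixel of a label image to its class index."""
    index: Dict[RGB, int] = {c: i for i, c in enumerate(colors)}
    return [[index[px] for px in row] for row in label]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "classes.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write(CLASS_FILE_CONTENT)
    colors = read_class_colors(path)

# A toy 2x2 label image: background, document / photograph, document.
label = [
    [(0, 0, 0), (255, 0, 0)],
    [(0, 0, 255), (255, 0, 0)],
]
class_map = label_to_classes(label, colors)
print(class_map)  # [[0, 1], [2, 1]]
```

A label pixel whose color is not listed in the file raises a `KeyError` in this sketch, which is a simple way to catch annotation mistakes early.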

Use cases

Page Segmentation

[Figure: page extraction use case]

Dataset: READ-BAD [GruningLD+18], annotated by [TDW+17].

Layout Analysis

[Figures: DIVA use case, DIVA predictions]

Dataset: DIVA-HisDB [SSE+16].

Document Segmentation

[Figure: Cini photo collection extraction use case]

Dataset: photo collection from the Cini Foundation.

TensorBoard Integration

The TensorBoard integration lets you visualize your TensorFlow graph, plot metrics, and view the images and predictions while the graph is running.

[Figures: TensorBoard examples 1, 2 and 3]