Franck Lebourgeois | INSA Lyon

Papers by Franck Lebourgeois

Caractérisation des écritures médiévales par des méthodes statistiques basées sur les cooccurrences

Gazette du livre médiéval, 2011

This thesis develops analysis methodologies for describing and comparing ancient handwritten scripts, using global analysis that requires no segmentation. It proposes new robust descriptors based on second-order statistics. The main contribution rests on the notion of generalized cooccurrence, which measures the joint probability distribution of information extracted from images. It extends grey-level cooccurrence, used until now to characterize textures, and allowed us to build several cooccurrences: spatial ones, relating to the local orientations and curvatures of shapes, and parametric ones, which measure how an image evolves under successive transformations. Because the number of resulting descriptors is very (too) high, we propose methods for reducing it, designed from the most recent multidimensional statistical analysis methods. This led us to introduce the notion of eigen cooccurrence matrices ("matrices de cooccurrences propres"), which contain the essential information needed to describe images finely with a reduced number of descriptors. In the applied part, we propose unsupervised classification methods for medieval scripts. The number of groups and their contents depend on the parameters used and the methods applied. We also developed a search engine for similar scripts. Within the ANR-MCD Graphem project, we developed methods for analyzing and tracking the evolution of medieval scripts.

La reconnaissance des structures

HAL (Le Centre pour la Communication Scientifique Directe), Nov 1, 2006

Content based image retrieval using gradient color fields

Document Image Analysis solutions for Digital libraries

HAL (Le Centre pour la Communication Scientifique Directe), Jan 21, 2004

AColDSS: Robust Unsupervised Automatic Color Segmentation System for Noisy Heterogeneous Document Images

European Project Space on Computer Vision, Graphics, Optics and Photonics, 2015

Extraction de texte à base de segmentation colorimétrique dans les images de presse

In this article, we present a new color segmentation system suited to document images that filters out digitization noise. A simple and fast text extraction method based on the results of the color separation is then used to improve the performance of a particularly popular and effective OCR engine.

Classification of Medieval Writings by New Statistical Measures

This paper presents new techniques for discriminating medieval manuscript texts in order to help paleographers understand ancient manuscripts. One of the purposes of paleography is to cluster medieval writings into families, to find relations between them, and to determine their historical period and/or location. This work aims to confirm paleographers’ classification of medieval writings. It also takes the opportunity to show that discriminating medieval writings by image analysis is viable. In this paper, we define e-paleography as computer vision assistance to the science of paleography. Our main idea is to select writing features that do not require image segmentation or layout analysis. Our method is based on the Spatial Grey-Level Dependence (SGLD), which measures the joint probability between grey-level values of pixels for each spatial relation. We propose a statistical measure which generalizes SGLD, and we also propose the Spatial Curvature Dependence (SCD),...
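
As an illustration of the grey-level cooccurrence idea mentioned in this abstract, the following is a minimal sketch and not the authors' implementation. It estimates, for one spatial relation given by a displacement `(dx, dy)`, the joint probability that a pixel has grey level `a` while the displaced pixel has grey level `b`. The function name `cooccurrence`, its parameters, and the toy image are assumptions made for this example only.

```python
def cooccurrence(image, dx, dy, levels):
    """Estimate P(a, b): the probability that a pixel has grey level a and
    the pixel displaced by (dx, dy) has grey level b.

    `image` is a list of rows of integers in the range [0, levels).
    This is an illustrative sketch, not the method of the cited paper.
    """
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    if total == 0:
        return counts  # no valid pixel pair for this displacement
    # Normalise the counts so that the matrix sums to 1.
    return [[c / total for c in row] for row in counts]


# Hypothetical 3x3 image with three grey levels.
image = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
# Spatial relation: "right-hand neighbour" (dx=1, dy=0).
p = cooccurrence(image, dx=1, dy=0, levels=3)
```

Each choice of `(dx, dy)` gives one matrix, so a descriptor built this way is a family of matrices, one per spatial relation. Replacing the grey level with another locally measured quantity (for example a quantized orientation or curvature) would follow the same counting scheme, which is one way to read the generalization the abstract refers to.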

Multilingual Scene Character Recognition System using Sparse Auto-Encoder for Efficient Local Features Representation in Bag of Features

ArXiv, 2018

The recognition of text in camera-captured images has been an important research issue over the past few decades. This gave birth to Scene Character Recognition (SCR), which is an important step in the scene text recognition pipeline. In this paper, we extended the Bag of Features (BoF)-based model using deep learning to represent features for accurate SCR of different languages. In the feature coding step, a deep Sparse Auto-encoder (SAE)-based strategy was applied to enhance the representative and discriminative abilities of image features. This deep learning architecture provides a more efficient feature representation and therefore better recognition accuracy. Our system was evaluated extensively on all the scene character datasets of five different languages. The experimental results demonstrated the efficiency of our system for multilingual SCR.

An Efficient New PDE-based Characters Reconstruction after Graphics Removal

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016

DEBORA: Digital AccEss to BOoks of the RenAissance

International Journal of Document Analysis and Recognition (IJDAR), 2007

Resolution enhancement of textual images: a survey of single image-based methods

IET Image Processing, 2016

Super-resolution (SR) has become an important research area due to the rapidly growing interest in high-quality images in various computer vision and pattern recognition applications. This has led to the emergence of various SR approaches. According to the number of input images, two kinds of approaches can be distinguished: single-input and multi-input approaches. Processing multiple inputs can certainly lead to an interesting output, but this is mostly not the case for textual image processing. This study focuses on single image-based approaches. Most of the existing methods have been successfully applied to natural images. Nevertheless, their direct application to textual images is not efficient enough because of the specificities that distinguish these images from natural images. Therefore, SR approaches especially suited to textual images have been proposed in the literature. Previous overviews of SR methods have concentrated on natural image applications, with no real application to textual ones. This study aims to fill this gap by surveying methods that are mainly designed for enhancing low-resolution textual images. The authors further criticise these methods and discuss areas that promise improvements in this task. To the best of the authors’ knowledge, this survey is the first such investigation in the literature.

Methods and results of modeling and transmission-line calculations of the superconducting dipole chains of CERN's LHC collider

PPPS-2001 Pulsed Power Plasma Science 2001. 28th IEEE International Conference on Plasma Science and 13th IEEE International Pulsed Power Conference. Digest of Papers (Cat. No.01CH37251)

Restoring Ink Bleed-Through Degraded Document Images Using a Recursive Unsupervised Classification Technique

Lecture Notes in Computer Science, 2006

Automatic Metadata Retrieval from Ancient Manuscripts

Lecture Notes in Computer Science, 2004

Omnilingual segmentation-free word spotting for ancient manuscripts indexation

Eighth International Conference on Document Analysis and Recognition (ICDAR'05), 2005

This article introduces a new word spotting method designed for ancient manuscripts. We take advantage of the robustness of the gradient feature and propose a new segmentation-free matching algorithm that tolerates spatial variations. We test our algorithm on ancient Latin manuscripts and on George Washington's manuscripts.

Document understanding using probabilistic relaxation: application on tables of contents of periodicals

Proceedings of Sixth International Conference on Document Analysis and Recognition

This paper describes a statistical model for a document understanding system, which uses both text attributes and document layouts. Probabilistic relaxation is used as a recognition scheme to find the hierarchical structure of the logical layout. This approach, commonly used for pixel classification in image analysis, can be applied to classify text blocks into logical classes according to local compatibility

Skeletonization by Gradient Regularization and Diffusion

Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2, 2007

Serialized k-Means for Adaptative Color Image Segmentation

Lecture Notes in Computer Science, 2004

Skeletonization by Gradient Diffusion and Regularization

2007 IEEE International Conference on Image Processing, 2007

Segmentation Of Broken Characters Using Pattern Matching

Research on OCR systems now focuses on corrupted and damaged characters from printed and handwritten documents. Much research has been done on touching characters but little on broken characters. This paper presents a new method to reconstruct printed characters that were extracted as several connected components. Our approach is based on the pattern similarity between broken characters and perfect ones from the same printed document. In the first step, we use a multi-segmentation algorithm to extract all possible connected components from a document image digitized in grayscale, and then we order them by their size. The correctly segmented characters are assumed to be bigger than the parts of misrecognized ones. We compute a similarity measure between all connected components, in decreasing order of their size. Then we localize the broken characters by using the bounding box of the correct pattern that has the best match.
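
The first step described in this abstract (extract connected components, then order them by size) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's multi-segmentation algorithm: it works on a single, already binarized image, uses 8-connectivity, and the names `connected_components` and `bounding_box` are hypothetical helpers introduced for this example.

```python
from collections import deque


def connected_components(binary):
    """Return the 8-connected components of a binary image (list of rows of
    0/1 values), each as a list of (row, col) pixels, largest first."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny in (cy - 1, cy, cy + 1):
                    for nx in (cx - 1, cx, cx + 1):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    # Decreasing size: whole characters are expected before fragments.
    return sorted(components, key=len, reverse=True)


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical binary image: one 2x2 blob and two isolated pixels.
binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
components = connected_components(binary)
sizes = [len(c) for c in components]
largest_box = bounding_box(components[0])
```

Sorting by decreasing size reflects the abstract's assumption that correctly segmented characters are larger than fragments. The similarity measure and the matching against bounding boxes are not sketched here, since the abstract does not specify them.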
