Henry Baird - Academia.edu
Papers by Henry Baird
IEEE Transactions on Pattern Analysis and Machine Intelligence, Jan 1, 1987
We describe the current state of a system that recognizes printed text of various fonts and sizes for the Roman alphabet. The system combines several techniques in order to improve the overall recognition rate. Thinning and shape extraction are performed directly on a graph of the run-length encoding of a binary image. The resulting strokes and other shapes are mapped, using a shape-clustering approach, into binary features which are then fed into a statistical Bayesian classifier. Large-scale trials have shown better than 97 percent top-choice correct performance on mixtures of six dissimilar fonts, and over 99 percent on most single fonts, over a range of point sizes. Certain remaining confusion classes are disambiguated through contour analysis, and characters suspected of being merged are broken and reclassified. Finally, layout and linguistic context are applied. The results are illustrated by sample pages.
A combination of x-ray fluorescence and image processing has been shown to recover text characters written in iron gall ink on parchment, even when obscured by gold paint. Several leaves of the Archimedes Palimpsest were imaged using rapid-scan x-ray fluorescence imaging performed at the Stanford Synchrotron Radiation Lightsource of the SLAC National Accelerator Laboratory. A simple linear show-through model is shown to successfully separate different layers of text in the x-ray images, making the text easier to read by the ...