Robert Haralick | Graduate Center of the City University of New York
Papers by Robert Haralick
Machine Vision Applications, 2000
Automated left ventricle (LV) boundary delineation from contrast ventriculograms has been studied for decades. Unfortunately, no accurate methods have ever been reported. A new knowledge-based multi-stage method to automatically delineate the LV boundary at end diastole and end systole is discussed in this paper. It has a mean absolute boundary error of about 2 mm and an associated ejection...
Proceedings. 1985 IEEE International Conference on Robotics and Automation, 1985
Computer Vision, Graphics, and Image Processing, 1986
Excerpt: ... since dilation is commutative ... In the morphology case, a large structuring element is given and a morphological erosion or dilation must be performed with it on hardware ...
Lecture Notes in Computer Science, 1997
A two-stage approach is discussed for reconstructing a dense digital elevation model (DEM) of the terrain from multiple pre-calibrated images taken by distinct cameras at different times under various illumination. First, the terrain DEM and orthoimage are obtained by independent voxel-based reconstruction of the terrain points using simple relations between the corresponding image gray values. As distinct from other approaches, possible occlusions and changing shadow layouts are taken into account implicitly by evaluating a confidence for every reconstructed terrain point. Then, the reconstructed DEM is refined by excluding occlusions of more confident points by less confident ones and smoothed with due account of the confidence values. Experiments with RADIUS model-board images show that the final refined and smoothed DEM gives a feasible approximation to the desired terrain.
Lecture Notes in Computer Science, 1998
In this paper, we discuss how we use variances of gray level spatial dependencies as textural features to retrieve images having some section in them that is like the user input image. Gray level co-occurrence matrices at five distances and four orientations are computed to measure texture, which is defined as being specified by the statistical distribution of the spatial relationships of gray level properties. A likelihood ratio classifier and a nearest neighbor classifier are used to assign two images to the relevance class if they are similar and to the irrelevance class if they are not. A protocol that involves translating a K×K frame throughout every image to automatically construct groundtruth image pairs is proposed, and the performance of the algorithm is evaluated accordingly. From experiments on a database of 300 512×512 grayscale images with 9,600 groundtruth image pairs, we were able to estimate a lower bound of 80% correct classification rate for assigning sub-image pairs we were sure were relevant to the relevance class. We also argue that some of the assignments which we counted as incorrect are not in fact incorrect.
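As a rough illustration of this kind of texture feature, the sketch below builds a gray level co-occurrence matrix for a single displacement and takes a variance-style statistic from it. It assumes pure-NumPy helpers, a 16-level quantization, and a small displacement set; none of these choices come from the paper.

```python
import numpy as np

def cooccurrence_matrix(img, dx, dy, levels=16):
    """Count co-occurrences of quantized gray levels at displacement (dx, dy)."""
    q = np.clip((img.astype(np.float64) / 256.0 * levels).astype(int), 0, levels - 1)
    h, w = q.shape
    glcm = np.zeros((levels, levels))
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    glcm += glcm.T                      # make the matrix symmetric
    return glcm / glcm.sum()            # normalize to a joint probability

def glcm_variance(p):
    """Variance of gray levels weighted by the co-occurrence probabilities."""
    i, _ = np.indices(p.shape)
    mu = (i * p).sum()
    return ((i - mu) ** 2 * p).sum()

# Hypothetical usage: one feature per (distance, orientation) pair.
img = np.random.randint(0, 256, (64, 64))
features = [glcm_variance(cooccurrence_matrix(img, dx, dy))
            for dx, dy in [(1, 0), (0, 1), (1, 1), (1, -1)]]
```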
Lecture Notes in Computer Science, 1999
Feature vectors that are used to represent images exist in a very high dimensional space. Usually, a parametric characterization of the distribution of this space is impossible. It is generally assumed that the features are able to locate visually similar images close in the feature space, so that non-parametric approaches, like the k-nearest neighbor search, can be used for retrieval. This paper introduces a graph-theoretic approach to image retrieval that formulates the database search as a graph clustering problem. This increases the chances of retrieving similar images by ensuring not only that the retrieved images are close to the query image, but also that they are close to each other in the feature space. Retrieval precision with and without clustering is compared for performance characterization. The average precision after clustering was 0.78, an improvement of 6.85% over the average precision before clustering.
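The "close to each other" constraint can be pictured with the toy sketch below: it takes the usual k-nearest neighbors of the query and then keeps only the candidates that form a connected group among themselves. The thresholding rule, the use of connected components, and every parameter value are simplifications I am assuming, not the paper's clustering algorithm.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import connected_components

def retrieve_with_clustering(query, db, k=20, edge_percentile=20):
    """Return candidates that are close to the query AND close to each other."""
    d_to_query = cdist(query[None, :], db)[0]
    candidates = np.argsort(d_to_query)[:k]          # plain k-NN shortlist
    d_cand = cdist(db[candidates], db[candidates])   # distances among candidates
    threshold = np.percentile(d_cand[d_cand > 0], edge_percentile)
    adjacency = (d_cand <= threshold).astype(int)    # edge = "close in feature space"
    _, labels = connected_components(adjacency, directed=False)
    keep = labels == labels[0]                       # component of the top-ranked hit
    return candidates[keep]

# Hypothetical usage on random feature vectors.
db = np.random.rand(1000, 64)
print(retrieve_with_clustering(db[0], db))
```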
[1992] Proceedings of the IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis, 1992
A method for estimating 2D image sequences using the generalized time-frequency representation (GTFR) is presented. The performance characterization is based on 20,000 simulated noisy images. The results show that even with a signal-to-noise ratio (SNR) of 5 dB, rectangular objects of at least 4 pixels in length and width can be detected to within 2.5 pixel location accuracy. The misdetection rate is near zero, and the average false detection rate is 0.06 false objects per 64×64 frame in a 5 dB SNR environment. With -1 dB SNR, objects can be detected within 5 pixel accuracy. The misdetection rate for this case is near zero and the average false detection rate is 1.6 false objects per 64×64 frame. The method is also applied to an image sequence obtained from a 747 takeoff scene. The 747 takeoff speed was predicted to be 142 kts, which is well within the typical range of 140 to 150 kts.
ICASSP '84. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1984
A new algorithm is developed for solving the maximum entropy (ME) image reconstruction problem. This approach involves solving a system of ordinary differential equations with appropriate initial conditions which can be computed easily. Instead of searching in the (n+1)-dimensional space as required by most ME algorithms, our approach involves solving a 1-dimensional search problem along a well-defined path. Moreover, an efficient algorithm is developed to handle this search.
Proceedings 15th International Conference on Pattern Recognition. ICPR-2000, 2000
Content-based image retrieval systems use low-level features like color and texture for image representation. Given these representations as feature vectors, similarity between images is measured by computing distances in the feature space. Unfortunately, these low-level features cannot always capture the high-level concept of similarity in human perception. Relevance feedback tries to improve the performance by allowing iterative retrievals where the feedback information from the user is incorporated into the database search. We present a weighted distance approach that uses standard deviations of the features both for the whole database and among the images selected as relevant by the user. These weights are used to iteratively refine the effects of different features in the database search. Retrieval performance is evaluated using average precision and progress computed on a database of approximately 10,000 images, and an average performance improvement of 19% is obtained after the first iteration.
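One natural way to turn the two standard deviations mentioned above into feature weights is sketched below: a feature gets a large weight when the user-selected relevant images agree on it (small standard deviation among them) relative to its spread over the whole database. The ratio form of the weight, the weighted city-block distance, and the helper names are assumptions for illustration, not necessarily the exact formulas of the paper.

```python
import numpy as np

def feedback_weights(db_features, relevant_features, eps=1e-12):
    """Weight each feature by (std over the database) / (std over relevant images)."""
    sigma_db = db_features.std(axis=0)
    sigma_rel = relevant_features.std(axis=0)
    return sigma_db / (sigma_rel + eps)

def weighted_l1(query, db_features, weights):
    """Weighted city-block distance of every database image to the query."""
    return (weights * np.abs(db_features - query)).sum(axis=1)

# Hypothetical iteration: the user marks a few retrieved images as relevant,
# the weights are recomputed, and the search is rerun.
db = np.random.rand(10000, 32)
relevant = db[[0, 5, 17]]
w = feedback_weights(db, relevant)
ranking = np.argsort(weighted_l1(db[0], db, w))
```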
Lecture Notes in Computer Science, 2002
A document can be divided into zones on the basis of its content. For example, a zone can be either text or non-text. Given the segmented document zones, correctly determining the zone content type is very important for the subsequent processes within any document image understanding system. This paper describes an algorithm for determining the type of a given zone within an input document image. In our zone classification algorithm, zones are represented as feature vectors. Each feature vector consists of a set of 25 measurements of pre-defined properties. A probabilistic model, a decision tree, is used to classify each zone on the basis of its feature vector. Two methods are used to optimize the decision tree classifier to eliminate the data over-fitting problem. To enrich our probabilistic model, we incorporate context constraints for certain zones within their neighboring zones. We also model zone class context constraints as a Hidden Markov Model and use the Viterbi algorithm to obtain optimal classification results. The training, pruning, and testing data set for the algorithm includes 1,600 images drawn from the UWCDROM-III document image database. With a total of 24,177 zones within the data set, the cross-validation method was used in the performance evaluation of the classifier. The classifier is able to classify each given scientific and technical document zone into one of nine classes: two text classes (of font size 4-18 pt and 19-32 pt), math, table, halftone, map/drawing, ruling, logo, and others. A zone content classification performance evaluation protocol is proposed. Using this protocol, our algorithm's accuracy is 98.45% with a mean false alarm rate of 0.50%.
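Because the abstract mentions decoding zone-class context with the Viterbi algorithm, a generic Viterbi decoder over a one-dimensional ordering of zones is sketched below. The assumption that the zones form a simple chain, and the way the emission and transition log-probabilities would be obtained (for example, from the decision-tree outputs and training counts), are mine rather than details taken from the paper.

```python
import numpy as np

def viterbi(log_emission, log_transition, log_prior):
    """Most likely class sequence for a chain of zones.

    log_emission: (n_zones, n_classes) per-zone class log-likelihoods,
    log_transition: (n_classes, n_classes) log P(next class | current class),
    log_prior: (n_classes,) log P(first class).
    """
    n, _ = log_emission.shape
    score = log_prior + log_emission[0]
    back = np.zeros(log_emission.shape, dtype=int)
    for t in range(1, n):
        cand = score[:, None] + log_transition   # try every previous class
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emission[t]
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```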
Lecture Notes in Computer Science, 1996
This paper describes a protocol for systematically evaluating the performance of dashed-line detection algorithms. It includes a test image generator which creates random line patterns subject to prespecified constraints. The generator also outputs ground truth data for each line in the image. The output of the dashed line detection algorithm is then compared to these ground truths and evaluated using a set of criteria.
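A toy version of such a test image generator is sketched below: it draws random dashed lines with a fixed dash/gap pattern and records the endpoints of each underlying line as ground truth. The real generator enforces prespecified constraints on the line patterns; the constants and helper names here are arbitrary placeholders.

```python
import numpy as np

def random_dashed_lines(n_lines=5, size=512, dash=12, gap=6, seed=0):
    """Binary test image of random dashed lines plus per-line ground truth."""
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size), dtype=np.uint8)
    ground_truth = []
    for _ in range(n_lines):
        p0, p1 = rng.integers(0, size, 2), rng.integers(0, size, 2)
        ground_truth.append((tuple(p0), tuple(p1)))   # endpoints of the full line
        length = int(np.hypot(*(p1 - p0)))
        for t in range(length):
            if t % (dash + gap) < dash:               # draw only during the dash phase
                x, y = (p0 + (p1 - p0) * t / max(length, 1)).astype(int)
                img[y, x] = 1
    return img, ground_truth

# A detector's output would be compared against ground_truth by the protocol.
img, gt = random_dashed_lines()
```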
Series in Machine Perception and Artificial Intelligence, 1999
Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR '93), 1993
Two sources of document degradation are modeled: (i) perspective distortion that occurs while photocopying or scanning thick, bound documents, and (ii) degradation due to perturbations in the optical scanning and digitization process: speckle, blur, jitter, thresholding. Perspective distortion is modeled by studying the underlying perspective geometry of the optical system of photocopiers and scanners. An illumination model is described to account for the nonlinear intensity change occurring across a page in a perspective-distorted document. The optical distortion process is modeled morphologically. First, a distance transform on the foreground is performed, followed by a random inversion of binary pixels where the probability of flip is a function of the distance of the pixel to the boundary of the foreground. Correlating the flipped pixels is modeled by a morphological closing operation.
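The morphological part of the model can be sketched roughly as below: compute a distance transform to the foreground/background boundary, flip each pixel with a probability that decays with that distance, and correlate the flips with a morphological closing. The exponential form of the flip probability, the 0.5 scale, the parameter alpha, and the 2×2 structuring element are placeholder assumptions, not the parameters of the published model.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_closing

def degrade(binary_doc, alpha=1.0, seed=0):
    """Flip pixels with a boundary-distance-dependent probability, then close."""
    rng = np.random.default_rng(seed)
    fg = binary_doc.astype(bool)
    # distance of every pixel to the nearest pixel of the other class,
    # i.e. to the foreground/background boundary
    d = np.where(fg, distance_transform_edt(fg), distance_transform_edt(~fg))
    p_flip = 0.5 * np.exp(-alpha * d ** 2)        # assumed form of the flip probability
    flips = rng.random(binary_doc.shape) < p_flip
    noisy = np.where(flips, ~fg, fg)
    return binary_closing(noisy, structure=np.ones((2, 2)))

# Hypothetical usage on a synthetic black-rectangle "document".
doc = np.zeros((64, 64), dtype=bool)
doc[20:40, 10:50] = True
degraded = degrade(doc)
```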
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997
This paper presents a statistical approach for detecting corners from chain-encoded digital arcs. An arc point is declared a corner if the estimated parameters of the two lines fitted to the arc segments immediately to the right and left of the arc point are statistically significantly different. The corner detection algorithm consists of two steps: corner detection and corner optimization. While corner detection involves statistically identifying the most likely corner points along an arc sequence, corner optimization deals with reducing the locational errors of the detected corners.
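The sketch below captures the structure of the approach, fitting a line direction to the arc segments on each side of a point and flagging a corner when the directions disagree. It replaces the paper's statistical significance test on the estimated line parameters with a crude fixed angle threshold, and the window size and threshold are arbitrary placeholders.

```python
import numpy as np

def segment_direction(points):
    """Least-squares line direction (angle) of a short run of arc points,
    taken as the principal axis of the point scatter."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)
    direction = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
    return np.arctan2(direction[1], direction[0])

def is_corner(arc, idx, window=6, angle_threshold=np.deg2rad(30)):
    """Flag arc[idx] as a corner when the left and right fitted directions disagree."""
    left = arc[max(0, idx - window):idx + 1]
    right = arc[idx:idx + window + 1]
    if len(left) < 3 or len(right) < 3:
        return False
    diff = abs(segment_direction(left) - segment_direction(right)) % np.pi
    diff = min(diff, np.pi - diff)                # a line direction is defined modulo pi
    return diff > angle_threshold
```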
Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348), 1999
Proceedings IEEE International Symposium on Bio-Informatics and Biomedical Engineering, 2000
Automated left ventricle (LV) boundary delineation from left ventriculograms has been studied for decades. Unfortunately, no methods of sufficient accuracy in terms of volume and ejection fraction have ever been reported. A new knowledge-based multi-stage method to automatically delineate the LV boundary at end diastole and end systole is discussed in this paper. It has a mean absolute boundary...
Image retrieval algorithms use distances between feature vectors to compute similarities between images. An important step between feature extraction and distance computation is feature normalization. Popular distance measures like the Euclidean distance implicitly assign more weight to features with large ranges than to those with small ranges. This paper describes six normalization methods to make all features have approximately the same effect in the computation of similarity. We investigate the effectiveness of these normalization methods in combination with the commonly used city-block (L1) and Euclidean (L2) distances in image similarity and retrieval. Experiments on a database of approximately 10,000 images show that the normalization method indeed has a significant effect on retrieval performance, and that studying the feature distributions and using the results of this study significantly improves the results compared to making only general assumptions.
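As a minimal illustration of why normalization matters for these distances, the sketch below applies two generic normalization methods (min-max and z-score, standing in for the paper's six) and ranks a database with the city-block and Euclidean distances before and after.

```python
import numpy as np

def normalize(features, method="zscore"):
    """Min-max scaling to [0, 1] or z-score (zero mean, unit variance) per feature."""
    if method == "minmax":
        lo, hi = features.min(axis=0), features.max(axis=0)
        return (features - lo) / np.maximum(hi - lo, 1e-12)
    mu, sigma = features.mean(axis=0), features.std(axis=0)
    return (features - mu) / np.maximum(sigma, 1e-12)

def l1(q, db):
    return np.abs(db - q).sum(axis=1)              # city-block (L1) distance

def l2(q, db):
    return np.sqrt(((db - q) ** 2).sum(axis=1))    # Euclidean (L2) distance

# Hypothetical comparison: the same query ranked with and without normalization.
db = np.random.rand(10000, 32) * np.arange(1, 33)  # features with very different ranges
for feats in (db, normalize(db)):
    print(np.argsort(l2(feats[0], feats))[:5])
```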
Machine Vision Applications, 1994
Scanned documents are noisy. Recently (KHP93, KHP94, BaiSO), document degradation models were proposed that model the local distortion introduced during the scanning and digitization process: (i) inversion (from foreground to background and vice versa) that occurs independently at each pixel due to light intensity fluctuations and thresholding level, and (ii) the blurring that occurs due to the point-spread function of the optical system.
Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001
... The following parts describe the automatic table ground truth generation procedure. ... software [10] converts the DVI file to a TIFF (Tagged Image File Format) file and a so-called character ground truth file which contains the bounding box coordinates, the type and size of ...
Pattern Recognition, 1979