Alan Bovik - Academia.edu

Papers by Alan Bovik

Acknowledgement of priority spectral properties of moving L-estimates of independent data

Journal of the Franklin Institute, 1988

Professor H.A. David has kindly pointed out that an expression for the joint probability distribution function of order statistics from overlapping samples, similar to equation (3) in our paper, had been previously given in the reference cited below. This expression could also have been used in deriving the results obtained later in our paper.

Entropy estimation for segmentation of multi-spectral chromosome images

Multimodal Interactive Continuous Scoring of Subjective 3D Video Quality of Experience

IEEE Transactions on Multimedia, 2000

Maximum-likelihood techniques for joint segmentation-classification of multispectral chromosome images

IEEE Transactions on Medical Imaging, 2000

Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.

AM-FM texture segmentation in electron microscopic muscle imaging

IEEE Transactions on Medical Imaging, 2000

Smoothing low-SNR molecular images via anisotropic median-diffusion

IEEE Transactions on Medical Imaging, 2002

Snakules: A Model-Based Active Contour Algorithm for the Annotation of Spicules on Mammography

IEEE Transactions on Medical Imaging, 2000

Color Compensation of Multicolor FISH Images

IEEE Transactions on Medical Imaging, 2000

Feature Normalization via Expectation Maximization and Unsupervised Nonparametric Classification For M-FISH Chromosome Images

IEEE Transactions on Medical Imaging, 2000

Localized measurement of emergent image frequencies by Gabor wavelets

IEEE Transactions on Information Theory, 2000

The measurement of instantaneous or locally-occurring signal frequencies is focal to a wide variety of variably-dimensioned signal processing applications. The task is particularly well-motivated for analyzing globally nonstationary, locally coherent signals having a ...

Indexes for Three-Class Classification Performance Assessment—An Empirical Comparison

IEEE Transactions on Information Technology in Biomedicine, 2000

Passive Multimodal 2-D+3-D Face Recognition Using Gabor Features and Landmark Distances

IEEE Transactions on Information Forensics and Security, 2000

Modeling the Time-Varying Subjective Quality of HTTP Video Streams With Rate Adaptations

IEEE Transactions on Image Processing, 2000

Newly developed hypertext transfer protocol (HTTP)-based video streaming technologies enable flexible rate-adaptation under varying channel conditions. Accurately predicting the users' quality of experience (QoE) for rate-adaptive HTTP video streams is thus critical to achieve efficiency. An important aspect of understanding and modeling QoE is predicting the up-to-the-moment subjective quality of a video as it is played, which is difficult due to hysteresis effects and nonlinearities in human behavioral responses. This paper presents a Hammerstein-Wiener model for predicting the time-varying subjective quality (TVSQ) of rate-adaptive videos. To collect data for model parameterization and validation, a database of longer duration videos with time-varying distortions was built and the TVSQs of the videos were measured in a large-scale subjective study. The proposed method is able to reliably predict the TVSQ of rate adaptive videos. Since the Hammerstein-Wiener model has a very simple structure, the proposed method is suitable for online TVSQ prediction in HTTP-based streaming.

Saliency Prediction on Stereoscopic Videos

IEEE Transactions on Image Processing, 2000

We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt, when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames are weighted using models of human foveation and Panum's fusional area yielding a single predictor of 3D saliency.

Comments on "Subband coding of images using asymmetrical filterbanks"

IEEE Transactions on Image Processing, 2000

In the above paper, Egger and Li presented a set of two-channel filterbanks, asymmetrical filterbanks (AFB's), for image coding applications. The basic properties of these filters are linear-phase, perfect reconstruction, asymmetric lengths for dual filters, and maximum regularity. In this correspondence, we point out that the proposed AFB's are not new in the sense that the proposed construction is equivalent to the factorization of Lagrange halfband filters, which has been reported by other researchers. In addition, we correct an error in the formulation of constructing AFBs in their paper.

Color and Depth Priors in Natural Images

IEEE Transactions on Image Processing, 2000

Natural scene statistics have played an increasingly important role in both our understanding of the function and evolution of the human vision system, and in the development of modern image processing applications. Because range (egocentric distance) is arguably the most important thing a visual system must compute (from an evolutionary perspective), the joint statistics between image information (color and luminance) and range information are of particular interest. It seems obvious that where there is a depth discontinuity, there must be a higher probability of a brightness or color discontinuity too. This is true, but the more interesting case is in the other direction--because image information is much more easily computed than range information, the key conditional probabilities are those of finding a range discontinuity given an image discontinuity. Here, the intuition is much weaker; the plethora of shadows and textures in the natural environment imply that many image discontinuities must exist without corresponding changes in range. In this paper, we extend previous work in two ways--we use as our starting point a very high quality data set of coregistered color and range values collected specifically for this purpose, and we evaluate the statistics of perceptually relevant chromatic information in addition to luminance, range, and binocular disparity information. The most fundamental finding is that the probabilities of finding range changes do in fact depend in a useful and systematic way on color and luminance changes; larger range changes are associated with larger image changes. Second, we are able to parametrically model the prior marginal and conditional distributions of luminance, color, range, and (computed) binocular disparity.
Finally, we provide a proof of principle that this information is useful by showing that our distribution models improve the performance of a Bayesian stereo algorithm on an independent set of input images. To summarize, we show that there is useful information about range in very low-level luminance and color information. To a system sensitive to this statistical information, it amounts to an additional (and only recently appreciated) depth cue, and one that is trivial to compute from the image data. We are confident that this information is robust, in that we go to great effort and expense to collect very high quality raw data. Finally, we demonstrate the practical utility of these findings by using them to improve the performance of a Bayesian stereo algorithm.

Optimizing Multiscale SSIM for Compression via MLDS

IEEE Transactions on Image Processing, 2000

Visually Weighted Compressive Sensing: Measurement and Reconstruction

IEEE Transactions on Image Processing, 2000

Compressive sensing (CS) makes it possible to more naturally create compact representations of data with respect to a desired data rate. Through wavelet decomposition, smooth and piecewise smooth signals can be represented as sparse and compressible coefficients. These coefficients can then be effectively compressed via the CS. Since a wavelet transform divides image information into layered blockwise wavelet coefficients over spatial and frequency domains, visual improvement can be attained by an appropriate perceptually weighted CS scheme. We introduce such a method in this paper and compare it with the conventional CS. The resulting visual CS model is shown to deliver improved visual reconstructions.

Nonlinear image estimation using piecewise and local image models

IEEE Transactions on Image Processing, 1998

Stereoscopic ranging by matching image modulations

IEEE Transactions on Image Processing, 1999
