Alan Bovik - Academia.edu
Papers by Alan Bovik
Journal of the Franklin Institute, 1988
Professor H.A. David has kindly pointed out that an expression for the joint probability distribution function of order statistics from overlapping samples, similar to equation (3) in our paper, had been previously given in the reference cited below. This expression could also have been used in deriving the results obtained later in our paper.
IEEE Transactions on Multimedia, 2000
IEEE Transactions on Medical Imaging, 2000
Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.
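The idea of using a likelihood to choose among candidate segmentations can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes each chromosome class has an independent-Gaussian spectral model, ignores the "conventional criteria" the abstract mentions, and all names (`region_loglik`, `classify_region`, `best_segmentation`, the class labels) are hypothetical.

```python
import math

def region_loglik(pixels, mean, var):
    """Log-likelihood of a region's pixel spectra under one class model.

    Assumes independent Gaussian channels (an assumption of this sketch).
    """
    total = 0.0
    for spectrum in pixels:
        for x, m, v in zip(spectrum, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total

def classify_region(pixels, class_models):
    """Return (label, log-likelihood) of the best-fitting hypothetical class."""
    return max(
        ((label, region_loglik(pixels, mean, var))
         for label, (mean, var) in class_models.items()),
        key=lambda pair: pair[1],
    )

def best_segmentation(hypotheses, class_models):
    """Pick the candidate segmentation with the highest total log-likelihood.

    Each hypothesis is a list of regions; each region is a list of spectra.
    All hypotheses must cover the same pixels so that totals are comparable.
    """
    scored = []
    for index, regions in enumerate(hypotheses):
        results = [classify_region(region, class_models) for region in regions]
        labels = [label for label, _ in results]
        total = sum(loglik for _, loglik in results)
        scored.append((total, index, labels))
    total, index, labels = max(scored, key=lambda item: item[0])
    return index, labels, total

# Two touching objects with different (made-up) two-channel spectra.
models = {"A": ((1.0, 0.0), (0.1, 0.1)), "B": ((0.0, 1.0), (0.1, 0.1))}
part_a = [(1.0, 0.0), (1.0, 0.0)]
part_b = [(0.0, 1.0), (0.0, 1.0)]
merged = [part_a + part_b]   # hypothesis 0: one region
split = [part_a, part_b]     # hypothesis 1: two regions
index, labels, _ = best_segmentation([merged, split], models)
print(index, labels)
```

Because segmentation and classification share one score, a low best score for a region could also serve as a flag for a possible segmentation or classification error, in the spirit of point (3) of the abstract.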
IEEE Transactions on Medical Imaging, 2000
IEEE Transactions on Medical Imaging, 2002
IEEE Transactions on Medical Imaging, 2000
IEEE Transactions on Medical Imaging, 2000
IEEE Transactions on Medical Imaging, 2000
IEEE Transactions on Information Theory, 2000
The measurement of instantaneous or locally occurring signal frequencies is focal to a wide variety of variably-dimensioned signal processing applications. The task is particularly well-motivated for analyzing globally nonstationary, locally coherent signals having a ...
IEEE Transactions on Information Technology in Biomedicine, 2000
IEEE Transactions on Information Forensics and Security, 2000
IEEE Transactions on Image Processing, 2000
Newly developed hypertext transfer protocol (HTTP)-based video streaming technologies enable flexible rate-adaptation under varying channel conditions. Accurately predicting the users' quality of experience (QoE) for rate-adaptive HTTP video streams is thus critical to achieve efficiency. An important aspect of understanding and modeling QoE is predicting the up-to-the-moment subjective quality of a video as it is played, which is difficult due to hysteresis effects and nonlinearities in human behavioral responses. This paper presents a Hammerstein-Wiener model for predicting the time-varying subjective quality (TVSQ) of rate-adaptive videos. To collect data for model parameterization and validation, a database of longer duration videos with time-varying distortions was built and the TVSQs of the videos were measured in a large-scale subjective study. The proposed method is able to reliably predict the TVSQ of rate-adaptive videos. Since the Hammerstein-Wiener model has a very simple structure, the proposed method is suitable for online TVSQ prediction in HTTP-based streaming.
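For readers unfamiliar with the structure, a Hammerstein-Wiener model is a static input nonlinearity, followed by a linear dynamic filter, followed by a static output nonlinearity. The sketch below shows only that generic structure; the nonlinearities, filter coefficients, and input values are illustrative assumptions and are not taken from the paper.

```python
import math

def hammerstein_wiener(u, in_nl, b, a, out_nl):
    """Generic Hammerstein-Wiener cascade.

    u      : input sequence (e.g. hypothetical per-frame quality scores)
    in_nl  : static input nonlinearity
    b, a   : linear filter coefficients, with
             w[n] = sum_k b[k] * v[n-k] - sum_k a[k] * w[n-1-k]
    out_nl : static output nonlinearity
    """
    v = [in_nl(x) for x in u]
    w = []
    for n in range(len(v)):
        acc = sum(b[k] * v[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * w[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        w.append(acc)
    return [out_nl(x) for x in w]

# Illustrative (assumed) components: a logistic input map, a first-order
# low-pass filter that gives the output memory, and a rescaling output map.
logistic = lambda x: 1.0 / (1.0 + math.exp(-(x - 50.0) / 10.0))
rescale = lambda x: 100.0 * x

frame_quality = [80.0] * 5 + [30.0] * 5 + [80.0] * 5  # made-up input
tvsq = hammerstein_wiener(frame_quality, logistic, b=[0.3], a=[-0.7], out_nl=rescale)
print([round(q, 1) for q in tvsq])
```

The feedback term makes the predicted quality lag behind abrupt changes in the input, which is one simple way such a model can mimic the hysteresis the abstract describes; each output sample needs only a few multiply-adds, which is consistent with the claim that the structure suits online prediction.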
IEEE Transactions on Image Processing, 2000
We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energies of both segmented objects and frames are weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.
IEEE Transactions on Image Processing, 2000
In the above paper, Egger and Li presented a set of two-channel filterbanks, asymmetrical filterbanks (AFBs), for image coding applications. The basic properties of these filters are linear phase, perfect reconstruction, asymmetric lengths for dual filters, and maximum regularity. In this correspondence, we point out that the proposed AFBs are not new in the sense that the proposed construction is equivalent to the factorization of Lagrange halfband filters, which has been reported by other researchers. In addition, we correct an error in the formulation of constructing AFBs in their paper.
IEEE Transactions on Image Processing, 2000
Natural scene statistics have played an increasingly important role in both our understanding of the function and evolution of the human vision system, and in the development of modern image processing applications. Because range (egocentric distance) is arguably the most important thing a visual system must compute (from an evolutionary perspective), the joint statistics between image information (color and luminance) and range information are of particular interest. It seems obvious that where there is a depth discontinuity, there must be a higher probability of a brightness or color discontinuity too. This is true, but the more interesting case is in the other direction: because image information is much more easily computed than range information, the key conditional probabilities are those of finding a range discontinuity given an image discontinuity. Here, the intuition is much weaker; the plethora of shadows and textures in the natural environment implies that many image discontinuities must exist without corresponding changes in range. In this paper, we extend previous work in two ways: we use as our starting point a very high quality data set of coregistered color and range values collected specifically for this purpose, and we evaluate the statistics of perceptually relevant chromatic information in addition to luminance, range, and binocular disparity information. The most fundamental finding is that the probabilities of finding range changes do in fact depend in a useful and systematic way on color and luminance changes; larger range changes are associated with larger image changes. Second, we are able to parametrically model the prior marginal and conditional distributions of luminance, color, range, and (computed) binocular disparity.
Finally, we provide a proof of principle that this information is useful by showing that our distribution models improve the performance of a Bayesian stereo algorithm on an independent set of input images. To summarize, we show that there is useful information about range in very low-level luminance and color information. To a system sensitive to this statistical information, it amounts to an additional (and only recently appreciated) depth cue, and one that is trivial to compute from the image data. We are confident that this information is robust, in that we go to great effort and expense to collect very high quality raw data. Finally, we demonstrate the practical utility of these findings by using them to improve the performance of a Bayesian stereo algorithm.
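The asymmetry between the two conditional probabilities discussed in this abstract follows from Bayes' rule. As a brief illustration, with symbols that are assumptions of this note rather than the paper's notation, let R denote a range discontinuity at a location and I an image (luminance or color) discontinuity at the same location:

```latex
\[
  P(R \mid I) \;=\; \frac{P(I \mid R)\, P(R)}{P(I)}
\]
```

Even if P(I | R) is close to 1, shadows and textures make image discontinuities far more common than range discontinuities, so P(I) can be much larger than P(R) and P(R | I) can remain well below 1. This is why the abstract treats P(R | I) as the quantity that has to be measured from coregistered data rather than inferred from intuition.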
IEEE Transactions on Image Processing, 2000
IEEE Transactions on Image Processing, 2000
Compressive sensing (CS) makes it possible to more naturally create compact representations of data with respect to a desired data rate. Through wavelet decomposition, smooth and piecewise smooth signals can be represented as sparse and compressible coefficients. These coefficients can then be effectively compressed via CS. Since a wavelet transform divides image information into layered blockwise wavelet coefficients over spatial and frequency domains, visual improvement can be attained by an appropriate perceptually weighted CS scheme. We introduce such a method in this paper and compare it with conventional CS. The resulting visual CS model is shown to deliver improved visual reconstructions.
IEEE Transactions on Image Processing, 1998
IEEE Transactions on Image Processing, 1999