Aggelos Katsaggelos | Northwestern University
Papers by Aggelos Katsaggelos
Proceedings of First Signal Processing Society Workshop on Multimedia Signal Processing, 1997
In the future, multimedia technology will be able to provide video frame rates equal to or better than 30 frames per second (FPS). Until that time, the hearing-impaired community will be using band-limited communication systems over unshielded twisted-pair copper wiring. As a result, multimedia communication systems will use a coder/decoder (CODEC) to compress the video and audio signals for transmission. For these systems to be usable by the hearing-impaired community, the algorithms within the CODEC have to be designed to account for the perceptual boundaries of the hearing impaired. In this paper we investigate the perceptual boundaries of speechreading and multimedia technology, which are the constraints that affect speechreading performance. We analyze and draw conclusions on the relationship between viseme groupings, accuracy of viseme recognition, and presentation rate. These results are critical in the design of multimedia systems for the hearing impaired.
Applications of Digital Image Processing XXXI, 2008
Emerging communications trends point to streaming video as a new form of content delivery. These systems are implemented over wired systems, such as cable or Ethernet, and over wireless networks, cell phones, and portable game systems. These communications systems require sophisticated methods of compression and error-resilience encoding to enable communications across band-limited and noisy delivery channels. Additionally, the transmitted video data must be of high enough quality to ensure a ...
IEEE International Conference on Acoustics Speech and Signal Processing, 2002
A framework for recovering high-resolution information from a sequence of sub-sampled and compressed observations is presented. Compression schemes that describe a video sequence through a combination of motion vectors and transform coefficients are the focus (e.g., the MPEG and ITU families of standards), and we consider the influence of both the motion vectors and transform coefficients within the reconstruction algorithm. A Bayesian approach is utilized to incorporate the information, and results show a discernible improvement in resolution, as compared to standard interpolation methods.
The International Series in Engineering and Computer Science
The problem of recovering a high-resolution frame from a sequence of low-resolution and compressed images is considered. The presence of the compression system complicates the recovery problem, as the operation reduces the amount of frequency aliasing in the low-resolution frames and introduces a non-linear noise process. Increasing the resolution of the decoded frames can still be addressed in a recovery framework, but the method must also include knowledge of the underlying compression system. Furthermore, improving the spatial resolution of the decoded sequence is no longer the only goal of the recovery algorithm. Instead, the technique is also required to attenuate compression artifacts.
2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007
This paper addresses the problem of joint encoder optimization and channel coding for real-time video transmission over wireless channels. An efficient solution is proposed to optimally select macroblock modes and quantizers as well as channel coding rates. The proposed optimization algorithm fully considers error resilience, forward error correction, and error concealment. Experimental results demonstrate the effectiveness of the proposed approach.
Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269), 1998
Proceedings of International Conference on Image Processing, 1997
In this paper, a model-based multi-view image generation system for video conferencing is presented. The system assumes that a 3-D model of the person in front of the camera is available. It extracts texture from a sequence of images of a speaking person and maps it to the static 3-D model during the video conference session. Since only the incrementally updated texture information is transmitted during the whole session, the bandwidth requirement is very small. Based on the experimental results, one can conclude that the proposed system is very promising for practical applications.
Visual Communications and Image Processing '96, 1996
In this paper, we present a fast and optimal method for the lossy encoding of object boundaries which are given as 8-connect chain codes. We approximate the boundary by a polygon and consider the problem of finding the polygon which can be encoded with the smallest number of bits for a given maximum distortion. To this end, we derive a fast and optimal scheme which is based on a shortest path algorithm for a weighted directed acyclic graph. We further investigate the dual problem of finding the polygonal approximation which leads to the smallest maximum distortion for a given bit rate. We present an iterative scheme which employs the above-mentioned shortest path algorithm and prove that it converges to the optimal solution. We then extend the proposed algorithm to the encoding of multiple object boundaries and introduce a vertex encoding scheme which is a combination of an 8-connect chain code and a run-length code. We present results of the proposed algorithm using objects from the "Miss America" sequence.
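The core reduction in the abstract above, minimum-bit polygon approximation as a shortest path in a weighted directed acyclic graph, can be sketched in a few lines. In this illustrative version, nodes are candidate polygon vertices in boundary order, an edge (i, j) exists when the segment from i to j satisfies the distortion bound, and its weight is the bit cost of encoding that segment; the admissibility test and bit costs are placeholders, not the paper's actual chain/run-length coder.

```python
def dag_shortest_path(n, edges):
    """Minimum-cost path from node 0 to node n-1 in a DAG.
    edges: dict mapping (i, j) with i < j to a bit cost."""
    INF = float("inf")
    cost = [INF] * n
    prev = [None] * n
    cost[0] = 0.0
    for j in range(1, n):              # index order is a topological order
        for i in range(j):
            w = edges.get((i, j))
            if w is not None and cost[i] + w < cost[j]:
                cost[j] = cost[i] + w
                prev[j] = i
    # Backtrack the optimal vertex sequence.
    path, k = [], n - 1
    while k is not None:
        path.append(k)
        k = prev[k]
    return cost[n - 1], path[::-1]

# Toy example: 4 boundary points, made-up segment bit costs.
best_bits, vertices = dag_shortest_path(
    4, {(0, 1): 3, (1, 3): 3, (0, 2): 5, (2, 3): 2, (0, 3): 10})
```

With these toy weights, the path 0-1-3 (6 bits) beats both 0-2-3 (7 bits) and the direct edge 0-3 (10 bits), so the selected polygon keeps vertex 1.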
Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429), 2003
In object-based video, the encoding of the video data is decoupled into the encoding of shape, motion, and texture information, which enables certain functionalities like content-based interactivity and scalability. However, the problem of how to jointly encode these separate signals to reach the best coding efficiency has never been solved thoroughly. In this paper, we present an operational rate-distortion optimal bit allocation scheme that provides a solution to this problem. Our approach is based on Lagrangian relaxation and dynamic programming. Experimental results indicate that the proposed optimal encoding approach has considerable gains over an ad hoc method without optimization. Furthermore, the proposed algorithm is much more efficient than exhaustive search.
2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), 2003
MPEG-4 is the first multimedia standard that supports the decoupling of a video object into object shape and object texture information, which consequently brings up the optimal encoding problem for object-based video. In this paper, we present an operational rate-distortion optimal bit allocation scheme between shape and texture for MPEG-4 encoding. Our approach is based on the Lagrange multiplier method, while the adoption of dynamic programming techniques makes it far more efficient than the exhaustive search algorithm. Our work will not only benefit the further study of joint shape and texture encoding, but also make possible the deeper study of optimal joint source-channel coding of object-based video.
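The Lagrange multiplier method underlying the two bit allocation papers above can be sketched as follows: for a fixed multiplier lam, each coding unit (e.g., shape or texture of a block) independently picks the (rate, distortion) option minimizing D + lam*R, and bisecting lam finds the convex-hull operating point that meets a rate budget. The option tables here are made-up numbers, not outputs of an MPEG-4 encoder.

```python
def allocate(options_per_unit, lam):
    """options_per_unit: list of lists of (rate, distortion) pairs.
    Each unit independently minimizes D + lam * R.
    Returns (total_rate, total_distortion, chosen indices)."""
    total_r = total_d = 0.0
    picks = []
    for options in options_per_unit:
        idx = min(range(len(options)),
                  key=lambda k: options[k][1] + lam * options[k][0])
        r, d = options[idx]
        total_r += r
        total_d += d
        picks.append(idx)
    return total_r, total_d, picks

def meet_budget(options_per_unit, budget, lo=0.0, hi=1e3, iters=50):
    """Bisect lam until the total rate meets the budget
    (yields the convex-hull solution; assumes hi is feasible)."""
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        r, _, _ = allocate(options_per_unit, lam)
        if r > budget:
            lo = lam   # too many bits: penalize rate more heavily
        else:
            hi = lam
    return allocate(options_per_unit, hi)

# Two units (say, shape and texture), each with fake quantizer options.
units = [[(10, 100), (5, 180), (2, 300)], [(8, 90), (4, 160)]]
rate, dist, picks = meet_budget(units, budget=10)
```

For this toy table the bisection settles on the middle shape option and the coarser texture option (9 total bits), the cheapest distortion among the combinations that fit the 10-bit budget on the convex hull.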
IEEE GLOBECOM 2008 - 2008 IEEE Global Telecommunications Conference, 2008
Video transport over multi-hop wireless networks has received significant research interest recently. The majority of the research efforts in this field have been conducted taking the approach of cross-layer optimization. However, video content and user-perceived quality have been largely ignored in existing work. In this paper, we integrate video content analysis into video transport over wireless mesh networks (WMN). A content-aware quality-driven cross-layer optimization framework is proposed to achieve the best end-to-end user ...
2010 17th IEEE International Conference on Electronics, Circuits and Systems, 2010
The compression of video can reduce the accuracy of automated tracking algorithms. This is problematic for centralized applications such as transportation surveillance systems, where remotely captured and compressed video is transmitted to a central location for tracking. In typical systems, the majority of communications bandwidth is spent on representing events such as capture noise or local changes to lighting. We propose a pre- and post-processing algorithm that identifies and removes such events of low tracking interest, significantly reducing the bitrate required to transmit remotely captured video while maintaining comparable tracking accuracy. Using the H.264/AVC video coding standard and a commonly used state-of-the-art tracker, we show that our algorithm allows for up to 90% bitrate savings while maintaining comparable tracking accuracy.
2011 18th IEEE International Conference on Image Processing, 2011
We propose a tracking-aware system that removes video components of low tracking interest and optimizes the quantization during compression of frequency coefficients, particularly those that most influence trackers, significantly reducing bitrate while maintaining comparable tracking accuracy. We utilize tracking accuracy as our compression criterion in lieu of mean-squared-error metrics. The process of optimizing quantization tables suitable for automated tracking can be executed online or offline. The online implementation initializes the encoding procedure for a specific scene, but introduces delay. On the other hand, the offline procedure produces globally optimum quantization tables, where the optimization occurs over a collection of video sequences. Our proposed system is designed with low processing power and memory requirements in mind, and as such can be deployed on remote nodes. Using H.264/AVC video coding and a commonly used state-of-the-art tracker, we show that, while maintaining comparable tracking accuracy, our system allows for over 50% bitrate savings on top of existing savings from previous work.
2011 18th IEEE International Conference on Image Processing, 2011
The compression of video and subsequent partial loss of the compressed bitstream can dramatically reduce the accuracy of automated tracking algorithms. This is problematic for centralized applications such as transportation surveillance systems, where remotely captured and compressed video is transmitted over lossy wireless links to a central location for tracking. We propose a low-complexity method for protecting compressed video against channel loss such that the tracking accuracy of decoded and concealed video is ...
We present a new image restoration method based on modelling the coefficients of an overcomplete wavelet response to natural images with a mixture of two Gaussian distributions, having non-zero and zero mean respectively, and reflecting the assumption that this response is close to being sparse. Including the observation model, the resulting procedure iterates between image reconstruction from the hard-thresholding of the response to the current estimate and a fast blur compensation step. Results indicate that our method compares favorably with current wavelet-based restoration methods.
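The restoration loop described above alternates hard-thresholding of transform coefficients with a blur-compensation step. The thresholding operator itself, which enforces the near-sparsity assumption by zeroing small coefficients, is simple to state; the wavelet transform and blur model are omitted here, and the threshold value in the example is arbitrary.

```python
def hard_threshold(coeffs, t):
    """Keep coefficients with magnitude >= t; zero out the rest.
    This is the sparsity-enforcing step; coeffs would be the
    overcomplete wavelet response in the method above."""
    return [c if abs(c) >= t else 0.0 for c in coeffs]

# Small coefficients (likely from the zero-mean mixture component)
# are suppressed; large ones survive unchanged.
kept = hard_threshold([0.5, -2.0, 1.0, 0.1], t=1.0)
```

Unlike soft thresholding, the surviving coefficients are not shrunk, which matches the "hard-thresholding" step named in the abstract.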
Lecture Notes in Computer Science, 2003
This paper deals with the problem of reconstructing a high-resolution image from an incomplete set of undersampled, blurred, and noisy images shifted with subpixel displacements. We derive mathematical expressions for the calculation of the maximum a posteriori estimate of the high-resolution image and the estimation of the parameters involved in the model. We also examine the role played by the prior model when this incomplete set of low-resolution images is used. The performance of the method is tested experimentally.
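A 1-D toy version of the MAP formulation above can be sketched as minimizing sum_k ||y_k - A_k x||^2 + lam ||L x||^2, where each A_k shifts, blurs, and downsamples the high-resolution signal and L is a smoothness prior, solved here by plain gradient descent. The circular 3-tap blur, integer shifts, and first-difference prior are simplified stand-ins for the paper's model, and the parameter values are arbitrary.

```python
import numpy as np

def blur3(x):
    """Circular 3-tap average blur (symmetric, hence self-adjoint)."""
    return (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0

def degrade(x, shift):
    """One low-resolution observation: shift, blur, downsample by 2."""
    return blur3(np.roll(x, shift))[::2]

def map_estimate(ys, shifts, n, lam=0.001, step=0.1, iters=1000):
    """Gradient descent on the MAP objective with a circular
    first-difference smoothness prior L."""
    x = np.zeros(n)
    for _ in range(iters):
        # Prior gradient: 2 * lam * L^T L x for first differences.
        grad = 2.0 * lam * (2 * x - np.roll(x, 1) - np.roll(x, -1))
        for y, s in zip(ys, shifts):
            r = degrade(x, s) - y
            up = np.zeros(n)
            up[::2] = r                            # adjoint of downsampling
            grad += 2.0 * np.roll(blur3(up), -s)   # adjoints of blur, shift
        x -= step * grad
    return x

# Two sub-pixel-shifted observations of a smooth 16-sample signal.
x_true = np.sin(np.linspace(0, 2 * np.pi, 16, endpoint=False))
ys = [degrade(x_true, s) for s in (0, 1)]
x_est = map_estimate(ys, (0, 1), 16)
```

Each observation alone is an 8-sample aliased view; combining the two shifted views through the data term is what recovers high-resolution content.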
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
Block matching has been used for motion estimation and motion compensation in MPEG standards for years. While it has an acceptable performance in describing motion between frames, it requires quite a few bits to represent the motion vectors. In certain circumstances, the use of whole-frame affine motion models would perform equally well or even better than block matching in terms of motion accuracy, while it results in the coding of only 6 parameters. In this paper, we modify an MPEG-4 codec by adding: (1) 6 affine ...
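The 6-parameter affine motion model mentioned above maps each pixel (x, y) of the reference frame to (a1*x + a2*y + a3, a4*x + a5*y + a6), so a whole-frame motion field costs six parameters instead of one motion vector per block. A minimal sketch, with arbitrary example parameter values:

```python
def affine_warp(points, params):
    """Apply the 6-parameter affine motion model to a list of
    (x, y) coordinates: x' = a1*x + a2*y + a3, y' = a4*x + a5*y + a6."""
    a1, a2, a3, a4, a5, a6 = params
    return [(a1 * x + a2 * y + a3, a4 * x + a5 * y + a6)
            for x, y in points]

# Identity rotation/scaling with a pure translation of (2, 3):
moved = affine_warp([(0, 0), (1, 0), (0, 1)], (1, 0, 2, 0, 1, 3))
```

Setting a1 = a5 = 1 and a2 = a4 = 0 reduces the model to a global translation; nonzero off-diagonal terms additionally capture rotation, scaling, and shear with the same six numbers.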
2008 15th IEEE International Conference on Image Processing, 2008
In this paper we propose a novel algorithm for super resolution based on a total variation prior and variational distribution approximations. We formulate the problem using a hierarchical Bayesian model where the reconstructed high-resolution image and the model parameters are estimated simultaneously from the low-resolution observations. The algorithm resulting from this formulation utilizes variational inference and provides approximations to the posterior distributions of the latent variables. Due to the simultaneous parameter estimation, the algorithm is fully automated, so parameter tuning is not required. Experimental results show that the proposed approach outperforms some of the state-of-the-art super resolution algorithms.
2007 IEEE International Conference on Image Processing, 2007
IEEE Transactions on Image Processing, 1995
This paper considers the concept of robust estimation in regularized image restoration. Robust functionals are employed for the representation of both the noise and the signal statistics. Such functionals allow the efficient suppression of a wide variety of noise processes and permit the reconstruction of sharper edges than their quadratic counterparts. A new class of robust entropic functionals is introduced, which operates only on the high-frequency content of the signal and reflects sharp deviations in the signal distribution. This class of functionals can also incorporate prior structural information regarding the original image, in a way similar to the maximum information principle. The convergence properties of robust iterative algorithms are studied for continuously and noncontinuously differentiable functionals. The definition of the robust approach is completed by introducing a method for the optimal selection of the regularization parameter. This method utilizes the structure of robust estimators that lack analytic specification. The properties of robust algorithms are demonstrated through restoration examples in different noise environments.