Jincheng Huang - Academia.edu
Papers by Jincheng Huang
1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451), 1999
Along with the advance in multimedia and internet technology, a huge amount of data, including digital video and audio, is generated daily. Tools for efficient indexing and retrieval are indispensable. With multi-modal information present in the data, effective integration is necessary and is still a challenging problem. In this paper, we present four different methods for integrating audio and visual information for video classification based on a hidden Markov model. Our results have shown significant improvement over using a single modality.
Proceedings of 1997 IEEE International Symposium on Circuits and Systems. Circuits and Systems in the Information Age ISCAS '97, 1997
Content-based video segmentation and classification is a key to the success of future multimedia databases. Research in this area in the past several years has focused on the use of speech recognition and image analysis techniques. As a complementary effort to prior research, we have focused on the use of motion and audio characteristics. Fundamental to both segmentation and classification tasks is the characterization of a given video segment by certain features. In this paper, we describe several audio and motion features that have been found to be effective in distinguishing motion and audio characteristics of different types of scenes.
Proceedings of First Signal Processing Society Workshop on Multimedia Signal Processing
Understanding the scene content of a video sequence is very important for content-based indexing and retrieval of multimedia databases. Research in this area in the past several years has focused on the use of speech recognition and image analysis techniques. As a complementary effort to the prior work, we have focused on using the associated audio information (mainly the nonspeech portion) for video scene analysis. As an example, we consider the problem of discriminating five types of TV programs, namely commercials, basketball games, football games, news reports, and weather forecasts. A set of low-level audio features is proposed for characterizing the semantic content of short audio clips. The linear separability of different classes under the proposed feature space is examined using a clustering analysis. The effective features are identified by evaluating the intracluster and intercluster scattering matrices of the feature space. Using these features, a neural net classifier was successful in separating the above five types of TV programs. By evaluating the changes between the feature vectors of adjacent clips, we can also identify scene breaks in an audio sequence quite accurately. These results demonstrate the capability of the proposed audio features for characterizing the semantic content of an audio sequence. where s_n(i) is the i-th sample in the n-th frame of the audio signal and N is the frame length.
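The abstract above mentions frame-level audio features defined over the N samples s_n(i) of frame n, and scene-break detection from changes between the feature vectors of adjacent clips. A minimal sketch of both ideas follows. It assumes root-mean-square volume as the frame feature and a Euclidean distance with a fixed threshold as the change measure; the abstract states neither, and the function names are hypothetical.

```python
import math


def frame_volume(samples, frame_len):
    """RMS volume of each non-overlapping frame of `frame_len` samples.

    Assumed feature: sqrt((1/N) * sum_i s_n(i)^2). The abstract does not
    give the formula, so this is only an illustrative choice.
    """
    volumes = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        volumes.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return volumes


def detect_breaks(clip_features, threshold):
    """Indices i where clip i differs sharply from clip i - 1.

    Assumed change measure: Euclidean distance between adjacent clip
    feature vectors, compared against a hand-picked threshold.
    """
    breaks = []
    for i in range(1, len(clip_features)):
        if math.dist(clip_features[i - 1], clip_features[i]) > threshold:
            breaks.append(i)
    return breaks
```

For instance, detect_breaks([[0, 0], [0, 0.1], [5, 5], [5, 5.1]], 1.0) flags only the jump at index 2. In practice the threshold would have to be tuned on labeled data.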
… and Expo, 2000. ICME 2000. 2000 …, 2000
Video classification and segmentation are fundamental steps for efficient accessing, retrieval and browsing of large amounts of video data. We have developed a scene classification scheme using a hidden Markov model (HMM) based classifier. By utilizing the temporal behaviors of different scene classes, the HMM classifier can effectively classify video segments into one of the pre-defined scene classes. In this paper, we describe two approaches for joint video classification and segmentation based on an HMM, which works ...
Journal of Electronic Imaging, Jul 1, 1998
An emerging trend in the banking industry is to digitize checks for storage and transmission. An immediate requirement for efficient storage and transmission is check image compression. General-purpose compression algorithms such as JPEG and wavelet-based methods produce annoying ringing or blocking artifacts at high compression ratios. In this paper, a layered approach to check image compression is proposed, with which a check image is represented in several layers. The first layer describes the foreground map; the ...
Image Processing, IEEE Transactions on, 1999
A feature correction two-stage vector quantization (FC2VQ) algorithm was previously developed to compress gray-scale photo identification (ID) pictures. This algorithm is extended to color images in this work. Three options are compared, which apply the FC2VQ algorithm in RGB, YCbCr, and Karhunen-Loeve transform (KLT) color spaces, respectively. The RGB-FC2VQ algorithm is found to yield better image quality than KLT-FC2VQ or YCbCr-FC2VQ at similar bit rates. With the RGB-FC2VQ algorithm, a 128×128 ...
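To make the color-space comparison concrete, the following sketch converts one RGB pixel to YCbCr. It assumes the full-range BT.601 coefficients used by JFIF; the abstract does not say which YCbCr variant the paper used, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: JFIF-style BT.601 coefficients. The paper's exact
    YCbCr definition is not given in the abstract.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

Under these coefficients any gray pixel (r = g = b) maps to Cb = Cr = 128, so the chroma channels carry only the color difference.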
Multimedia Signal Processing, 1998 …, Dec 7, 1998
This paper describes a technique for classifying TV broadcast video using a hidden Markov model (HMM). Here we consider the problem of discriminating five types of TV programs, namely commercials, basketball games, football games, news reports, and weather forecasts. Eight frame-based audio features are used to characterize the low-level audio properties, and fourteen clip-based audio features are extracted based on these frame-based features to characterize the high-level audio properties. For each type of these five ...
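A common way to classify a sequence with HMMs is to train one model per class and pick the class whose model gives the observed sequence the highest likelihood. The sketch below illustrates that idea with the forward algorithm in the log domain. It assumes discrete observation symbols (for example, quantized feature vectors) and already-trained parameters; the abstract is truncated before it describes the paper's actual model, so all names and the model layout here are hypothetical.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a symbol sequence under a discrete HMM.

    start[s]    : initial probability of state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return the class name whose HMM scores the sequence highest.

    `models` maps a class name to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

With one (start, trans, emit) tuple per program type, classify(obs, models) returns the label of the best-scoring model. Continuous features would need Gaussian or mixture emissions in place of the emit table.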