Vector Quantization Research Papers - Academia.edu

[Table 2 of J. Z. C. Lai, Image and Vision Computing 15 (1997) 867-871: execution time of encoding a real image (Tiffany) for three algorithms. Surviving column headers: number of codewords; available encoding algorithm (seconds); EAWFC (seconds); EAWFS (seconds); speedup. The data rows did not survive extraction.]


A simple yet effective approach to speaker recognition.
In this project we used the MFCC approach to build an accurate coefficient-extraction processor that extracts features from the voice samples stored in the database. At the next stage we again employed MFCC to extract coefficients from the input voice and used vector quantization to match them against the user database. For vector quantization we used the Linde-Buzo-Gray (LBG) algorithm.
Code is available on demand. Send me a private message if you want it.
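To make the matching step concrete, below is a minimal sketch of LBG codebook training, assuming the MFCC features have already been extracted into an (N, D) NumPy array and that the target codebook size is a power of two; the function name `lbg_codebook` and the stopping rule are illustrative, not the project's actual code.

```python
import numpy as np

def lbg_codebook(features, size, eps=1e-3):
    """Train a VQ codebook with the Linde-Buzo-Gray splitting procedure.
    features: (N, D) array of MFCC vectors; size: power-of-two codebook size."""
    codebook = features.mean(axis=0, keepdims=True)  # start from the global centroid
    while codebook.shape[0] < size:
        # split every codeword into a perturbed pair, doubling the codebook
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        prev_dist = np.inf
        while True:  # Lloyd iterations until distortion stops improving
            d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
            nearest = d.argmin(axis=1)          # nearest codeword per vector
            dist = d.min(axis=1).mean()         # mean quantization distortion
            for k in range(codebook.shape[0]):  # move codewords to cell centroids
                members = features[nearest == k]
                if len(members) > 0:
                    codebook[k] = members.mean(axis=0)
            if prev_dist - dist < eps * dist:
                break
            prev_dist = dist
    return codebook
```

At recognition time, each enrolled speaker would be represented by one such codebook, and a test utterance would be matched to the codebook yielding the lowest average quantization distortion.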


Compression is the art of representing information in a compact form rather than in its original, uncompressed form. In other words, using data compression, the size of a particular file can be reduced. This is very useful when processing, storing or transferring a huge file, which needs lots of resources. If the algorithms used to compress the data work properly, there should be a significant difference in size between the original file and the compressed file. When data compression is used in a data transmission application, speed is the primary goal. The speed of the transmission depends on the number of bits sent, the time required for the encoder to generate the coded message, and the time required for the decoder to recover the original ensemble. In a data storage application, the degree of compression is the primary concern. Compression can be classified as either lossy or lossless.
Image compression is a key technology in the transmission and storage of digital images because of the vast data associated with them. This research suggests an effective approach for image compression using the Stationary Wavelet Transform (SWT) and Linde-Buzo-Gray (LBG) vector quantization to compress input images in four phases: preprocessing, image transformation, zigzag scan, and lossy/lossless compression. The preprocessing phase takes images as input, resizes them to 8 × 8 blocks in accordance with the measured rate of different sizes, and then converts them from RGB to grayscale. The image transformation phase receives the resized grayscale images and produces transformed images using the SWT. The zigzag scan phase takes the transformed image as a 2D matrix and produces a 1D vector. Finally, the lossy/lossless compression phase takes the 1D vector and applies LBG vector quantization as the lossy compression technique, together with lossless compression techniques such as Huffman coding and arithmetic coding. Our approach achieves a higher compression ratio in less time than the compared compression approaches and is useful for internet image compression.
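As an illustration of the zigzag scan phase, here is a minimal sketch, assuming the transformed coefficients sit in a 2-D NumPy array; the function name is hypothetical.

```python
import numpy as np

def zigzag(block):
    """Flatten a 2-D coefficient block into a 1-D vector in zigzag order,
    walking the anti-diagonals and alternating direction on each one."""
    h, w = block.shape
    order = sorted(((i, j) for i in range(h) for j in range(w)),
                   key=lambda p: (p[0] + p[1],                           # anti-diagonal index
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))  # alternate direction
    return np.array([block[i, j] for i, j in order])
```

The resulting 1-D vector is what the LBG quantizer and the Huffman or arithmetic coder then consume.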


This paper presents the architecture and VHDL design of a Two-Dimensional Discrete Cosine Transform (2D-DCT) with quantization and zigzag arrangement. This architecture is used as the core and path in JPEG image compression hardware. The 2D-DCT calculation is made using the 2D-DCT separability property, such that the whole architecture is divided into two 1D-DCT calculations connected by a transpose buffer. The architecture for the quantization and zigzag process is also described in this paper. The quantization process is done using a division operation. The design is targeted at a Xilinx Spartan-3E XC3S500E FPGA. The 2D-DCT architecture uses 1891 slices, 51 I/O pins, and 8 multipliers of the FPGA and reaches an operating frequency of 101.35 MHz. One input block with 8 × 8 elements of 8 bits each is processed in 6604 ns, and the pipeline latency is 140 clock cycles.
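The separability property used here means the 2-D DCT factors into two 1-D DCT passes with a transpose in between, which is exactly the role of the hardware transpose buffer. A minimal NumPy sketch of that factorization follows (an illustration only, not the VHDL design):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal 1-D DCT-II basis as an n x n matrix."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= 1 / np.sqrt(2)          # DC row scaling
    return C * np.sqrt(2 / n)

def dct2(block):
    """2-D DCT of a square block via separability: one 1-D transform per
    dimension, i.e. C @ X @ C.T, with the transpose taken in between."""
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T
```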


Multi-user wireless systems with multiple antennas can drastically increase the capacity while maintaining the quality-of-service requirements. The best performance of these systems is obtained in the presence of instantaneous channel knowledge. Since uplink-downlink channel reciprocity does not hold in frequency division duplex and broadband time division duplex systems, efficient channel quantization becomes important. This thesis focuses on different quantization techniques in a linearly precoded multi-user wireless system.

Our work provides three major contributions. First, we come up with an end-to-end transceiver design, incorporating precoder, receive combining and feedback policy, that works well at low feedback overhead. Second, we provide optimal bit allocation across the gain and shape of a complex vector to reduce the quantization error and investigate its effect in the multi-user wireless system. Third, we design an adaptive differential quantizer that reduces feedback overhead by utilizing temporal correlation of the channels in a time-varying scenario.
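As a rough illustration of the second contribution, here is a sketch of gain-shape quantization of a complex channel vector; the gain levels and shape codebook are hypothetical inputs, and how many feedback bits to spend on each of them is precisely the allocation question studied in the thesis.

```python
import numpy as np

def gain_shape_quantize(h, gain_levels, shape_codebook):
    """Quantize a nonzero complex vector h by splitting it into a real
    gain ||h|| and a unit-norm shape h/||h||, quantized separately.
    gain_levels: (2**Bg,) scalar reconstruction levels.
    shape_codebook: (2**Bs, D) unit-norm complex codewords."""
    gain = np.linalg.norm(h)
    shape = h / gain
    g_idx = int(np.argmin(np.abs(gain_levels - gain)))  # nearest gain level
    corr = np.abs(shape_codebook.conj() @ shape)        # |<codeword, shape>|
    s_idx = int(np.argmax(corr))                        # best-aligned codeword
    h_hat = gain_levels[g_idx] * shape_codebook[s_idx]  # reconstructed vector
    return g_idx, s_idx, h_hat
```

Only the two indices (Bg + Bs bits) need to be fed back, since transmitter and receiver share the codebooks.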


The pyramid image structure can be naturally adapted for progressive image transmission over low-speed channels and hierarchical image retrieval in computerized image archiving. An efficient pyramid image coding system using quadrature mirror filters to form the image pyramids is proposed in this paper. Characteristics of the image pyramids are presented. Since the Laplacian pyramids of most natural images contain sparse and spatially concentrated data, a combined run-length coding for zero-valued elements and entropy coding for elements larger than a certain threshold is employed. The textural features in the Laplacian pyramids suggest that coding techniques exploiting spatial correlation may be advantageous; therefore, vector quantization is chosen to code the Laplacian pyramids. Simulation results show that simple vector quantization accomplishes significant bit-rate reduction over scalar quantization. The proposed system also shows good-quality reproduction at bit rates below 1 bit/pixel.
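For orientation, here is a minimal sketch of the Laplacian-pyramid construction the coder operates on, assuming image dimensions divisible by 2**levels and using a crude 2x2 block average in place of the quadrature mirror filters used in the paper.

```python
import numpy as np

def downsample(img):
    """Crude 2x reduction: average each 2x2 block (a stand-in for the
    QMF low-pass filtering used in the paper)."""
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def laplacian_pyramid(img, levels):
    """Each Laplacian level is the difference between an image and the
    re-expanded version of its low-pass reduction; these sparse
    difference images are what gets vector quantized."""
    pyramid = []
    for _ in range(levels):
        low = downsample(img)
        up = np.kron(low, np.ones((2, 2)))  # nearest-neighbour expansion
        pyramid.append(img - up)            # sparse, spatially concentrated
        img = low
    pyramid.append(img)                     # coarsest low-pass residue
    return pyramid
```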


The color satellite image compression technique by vector quantization can be improved either by acting directly on the step of constructing the dictionary or by acting on the quantization step of the input vectors. In this paper, an improvement of the second step is proposed. The k-nearest neighbor algorithm is applied on each color axis separately, and the three classifications, considered as three independent sources of information, are combined in the framework of evidence theory. The best code vector is then selected. After the image is quantized, Huffman compression is applied for encoding and decoding.
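A simplified sketch of the per-axis search follows, assuming a flat (K, 3) RGB codebook. Note that the paper combines the three axis-wise classifications with evidence theory; the shortlist-and-refine rule below is only a crude stand-in for that fusion step.

```python
import numpy as np

def select_codevector(pixel, codebook, k=3):
    """Shortlist the k nearest codewords on each colour axis separately,
    then pick, among the union of the three shortlists, the codeword
    closest to the full RGB vector."""
    candidates = set()
    for axis in range(3):                            # R, G and B treated as
        d = np.abs(codebook[:, axis] - pixel[axis])  # independent sources
        candidates.update(np.argsort(d)[:k].tolist())
    candidates = np.array(sorted(candidates))
    full = np.linalg.norm(codebook[candidates] - pixel, axis=1)
    return candidates[int(np.argmin(full))]
```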


Emotion detection is a new research area in health informatics and forensic technology. Despite some challenges, voice-based emotion recognition is gaining popularity, since in situations where a facial image is not available, the voice is the only way to detect the emotional or psychiatric condition of a person. However, the voice signal is highly dynamic even within a short time frame, so the voice of the same person can differ over a very subtle period of time. Therefore, this research rests on two key criteria: first, it is clear that the training data needs to be partitioned according to the emotional stage of each individual speaker; second, rather than using the entire voice signal, short-time significant frames can be used, which are enough to identify the emotional condition of the speaker. In this research, Cepstral Coefficients (CC) have been used as the voice feature, and a fixed-value k-means clustering method has been used for feature classification. The value of k depends on the number of emotional states under evaluation; consequently, it does not depend on the size of the experimental dataset. In this experiment, three emotional conditions (happy, angry and sad) have been detected from eight female and seven male voice signals. This methodology has increased the emotion-detection accuracy rate significantly compared to some recent works and has also reduced the CPU time for cluster formation and matching.
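A minimal sketch of the fixed-k clustering step, assuming the cepstral-coefficient frames sit in an (N, D) array and using SciPy's generic k-means in place of the authors' implementation; mapping the resulting clusters onto emotion labels would still require labeled calibration data, and the function names are illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

K = 3  # fixed: one cluster per emotional state (happy, angry, sad),
       # independent of the dataset size, as described above

def train_emotion_clusters(cc_frames):
    """cc_frames: (N, D) cepstral-coefficient vectors from the
    significant short-time frames of the training voices."""
    centroids, _ = kmeans(cc_frames.astype(float), K)
    return centroids

def detect_emotion(test_frames, centroids):
    """Assign each test frame to its nearest centroid and return the
    majority cluster as the detected emotional state."""
    labels, _ = vq(test_frames.astype(float), centroids)
    return int(np.bincount(labels, minlength=K).argmax())
```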


This paper studies an application of turbo codes to compressed image/video transmission and presents an approach to improving error control performance through joint channel and source decoding (JCSD). The proposed approach to JCSD includes error-free source information feedback, error-detected source information feedback, and the use of channel soft values (CSV) for source signal postprocessing. These feedback schemes are based on a modification of the extrinsic information passed between the constituent maximum a posteriori probability (MAP) decoders in a turbo decoder. The modification is made according to the source information obtained from the source signal processor. The CSVs are considered as reliability information on the hard decisions and are further used for error recovery in the reconstructed signals. Applications of this joint decoding technique to different visual source coding schemes, such as spatial vector quantization, JPEG coding, and MPEG coding, are examined. Experimental results show that up to 0.6 dB of channel SNR reduction can be achieved by the joint decoder without increasing computational cost for various channel coding rates.
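To illustrate how channel soft values serve as reliability information, here is a small sketch, assuming LLRs with the convention that positive values favour bit 0; the threshold is an arbitrary illustrative choice, not the paper's.

```python
import numpy as np

def flag_unreliable_bits(llrs, threshold=1.0):
    """The sign of each log-likelihood ratio gives the hard bit decision
    (positive -> 0, negative -> 1); its magnitude gives the confidence.
    Low-confidence positions are flagged so a source postprocessor can
    treat the affected coefficients or blocks as suspect."""
    hard = (llrs < 0).astype(int)           # hard decisions
    unreliable = np.abs(llrs) < threshold   # low-reliability positions
    return hard, unreliable
```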


We propose a new feature, namely, the pitch-synchronous discrete cosine transform (PS-DCT), for the task of speaker identification. These features are obtained directly from the voiced segments of the speech signal, without any preemphasis or windowing. The feature vectors are vector quantized to create one separate codebook for each speaker during training. The performance of the PS-DCT features is shown to be good, and hence they can be used to supplement other features for the speaker identification task. Speaker identification is also performed using Mel-frequency cepstral coefficient (MFCC) features, which are combined with the proposed features to improve performance. For this pilot study, 30 speakers (14 female and 16 male) were picked randomly from the TIMIT database. On this data, both the proposed features and MFCC give identification accuracies of 90% and 96.7% for codebook sizes of 16 and 32, respectively, and the combined features achieve 100% performance. Apart from the speaker identification task, this work also shows the capability of the DCT to capture discriminative information from the speech signal with minimal pre-processing.
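A minimal sketch of the codebook-per-speaker scheme, assuming per-speaker training features in a dict of (N, D) arrays and using generic k-means in place of whatever codebook design the authors used; function names are illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

def train_codebooks(features_per_speaker, size=32):
    """One VQ codebook per speaker, trained on that speaker's
    PS-DCT or MFCC feature vectors."""
    return {spk: kmeans(f.astype(float), size)[0]
            for spk, f in features_per_speaker.items()}

def identify(test_features, codebooks):
    """Quantize the test vectors against every speaker's codebook and
    return the speaker with the lowest average distortion."""
    def distortion(cb):
        _, dists = vq(test_features.astype(float), cb)
        return dists.mean()
    return min(codebooks, key=lambda spk: distortion(codebooks[spk]))
```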


This paper presents a novel vector quantizer codebook design using the Fuzzy Possibilistic C-Means (FPCM) clustering technique for image compression using wavelet packets. The idea is to achieve a higher compression ratio by clustering the wavelet coefficients of each Wavelet Packet Tree (WPT) band. The methodology is to apply the WPT to the whole image. The sub-blocks are decomposed into a two-level Wavelet Packet Tree, where the coefficients of the LL band (approximation and details) and the approximations of the LH, HL and HH bands are clustered using FPCM. The centroids of each cluster are arranged in the form of a codebook and indexed. The index values are coded and then transmitted. The image is reconstructed using the inverse WPT after rearranging and decoding. The results show that, on psycho-visual fidelity criteria (both subjective and objective measures), the proposed FPCM technique outperforms the other existing methods.
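For orientation, a sketch of fuzzy clustering applied to a coefficient matrix follows. It implements standard fuzzy c-means, not the paper's FPCM, which additionally maintains possibilistic typicality values alongside the memberships computed here.

```python
import numpy as np

def fcm_codebook(coeffs, n_clusters, m=2.0, iters=100, tol=1e-5):
    """Fuzzy c-means on wavelet-packet coefficients. coeffs: (N, D).
    Returns the cluster centroids, arranged as the VQ codebook."""
    rng = np.random.default_rng(0)
    centers = coeffs[rng.choice(len(coeffs), n_clusters, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(coeffs[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against zero distances
        # membership of point k in cluster i: 1 / sum_j (d_ki / d_kj)^(2/(m-1))
        u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1))).sum(axis=2)
        w = u.T ** m
        new = (w @ coeffs) / w.sum(axis=1, keepdims=True)
        done = np.linalg.norm(new - centers) < tol
        centers = new
        if done:
            break
    return centers
```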