Efficient methods for high quality low bit rate wideband speech coding

Techniques for high-quality ACELP coding of wideband speech

2001

We present in this paper new methods for achieving high-quality wideband speech at low rates using the ACELP algorithm. Several innovations are introduced to optimize the quality and minimize the complexity of the coder. A multi-rate wideband speech encoding algorithm based on these techniques was recently selected by 3GPP as the standard for AMR-WB, and is currently one of the candidates for the ITU-T wideband speech coder standard at around 16 kbit/s. This standard was jointly developed by VoiceAge and Nokia.

Techniques for improving the performance of CELP type speech coders

1991

Techniques for improving the performance of CELP (code excited linear prediction) type speech coders while maintaining reasonable computational complexity are explored. A harmonic noise weighting function which enhances the perceptual quality of the processed speech is introduced. The combination of harmonic noise weighting and subsample resolution pitch significantly improves the coder performance for voiced speech. A 6.9 kb/s VSELP speech coder which incorporates subsample resolution pitch and harmonic noise weighting is described. Complexity reduction techniques are discussed which allow the coder to be implemented using a single fixed-point digital signal processor.

Perceptually based and embedded wideband CELP coding of speech

1999

This paper presents a novel multi-band CELP coder with the following characteristics: wideband coding (6.5 kHz), variable bit rate (VBR) coding (10-24 kbps), low delay (10 ms), embeddability, and perceptually based dynamic bit allocation. The excitation signal of the linear prediction filter is the vector sum of eight off-line pre-filtered bandpass excitation vectors. The eight excitation codebooks are tree structured, providing embeddability and variable bit rate. The dynamic allocation of the bitstream among the different bands is based on the perceptual importance of each band. The multi-band and perceptual structure of the coding scheme results in graceful degradation with decreasing bit rates both in quiet and in the presence of background noise.

The adaptive multirate wideband speech codec (AMR-WB)

IEEE Transactions on Speech and Audio Processing, 2002

This paper describes the Adaptive Multirate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services. The AMR-WB speech codec algorithm was selected in December 2000 and the corresponding specifications were approved in March 2001. The AMR-WB codec was also selected by the International Telecommunication Union-Telecommunication Sector (ITU-T) in July 2001 in the standardization activity for wideband speech coding around 16 kb/s and was approved in January 2002 as Recommendation G.722.2.

Implementation of Low Complexity CELP Coder and Performance Evaluation in terms of Speech Quality

International Journal of Computer Applications, 2012

The critical constraints in wireless communication, particularly in mobile communication, are bandwidth, storage memory, and power. Speech transmission in wireless networks relies on reducing the redundant information present in the signal in a way that preserves the quality and intelligibility of speech. To remove this redundancy and transmit speech with acceptable quality, speech compression algorithms are deployed. For this reason, speech coding is and will remain the most important research issue. This paper addresses the implementation of a CELP coder that has low computational complexity, delivers acceptable speech quality, and preserves intelligibility. The coder is assessed in terms of quality for different kinds of speakers using PESQ, PSNR, frequency-weighted SNRseg, and SNRseg.
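Of the metrics listed, SNRseg (segmental SNR) is simple enough to sketch directly. The following is a minimal illustration, not the paper's evaluation code: the function name `segmental_snr`, the 160-sample frame length, and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration, and frames with zero reference energy are skipped.

```python
import math


def segmental_snr(reference, degraded, frame_len=160,
                  floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNR values in dB, each clamped to [floor_db, ceil_db].

    frame_len, floor_db and ceil_db are illustrative defaults, not values
    taken from the paper.
    """
    if len(reference) != len(degraded):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    # Only full frames are evaluated; a trailing partial frame is ignored.
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        deg = degraded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, deg))
        if signal_energy == 0.0:
            continue  # skip silent frames, where SNR is undefined
        if noise_energy == 0.0:
            snr_db = ceil_db  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / noise_energy)
        frame_snrs.append(min(max(snr_db, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("no non-silent full frames to evaluate")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging per-frame values in dB, rather than computing one SNR over the whole signal, keeps loud segments from dominating the score, which is why segmental measures are often preferred for speech.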

A 2.4-kbps variable-bit-rate ADP-CELP speech coder

Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 2000

This paper presents a variable bit rate ADP-CELP (Adaptive Density Pulse Code Excited Linear Prediction) coder that selects one of four kinds of coding structure in each frame based on short-time speech characteristics. To improve speech quality and reduce the average bit rate, we have developed a speech/non-speech classification method using spectrum envelope variation, which is robust to background noise. In addition, we propose an efficient pitch lag coding technique. The technique interpolates consecutive frame pitch lags and quantizes a vector of relative pitch lags consisting of the variation between an estimated pitch lag and a target pitch lag in multiple subframes. The average bit rate of the proposed coder was approximately 2.4 kbps for speech sources with an activity factor of 60%. Our subjective testing indicates the quality of the proposed coder exceeds that of the Japanese digital cellular standard with a rate of 3.45 kbps.
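The pitch lag coding idea in this abstract can be illustrated with a small sketch. The abstract does not say how the lags are interpolated or how the relative-lag vector is quantized, so linear interpolation is an assumption here, the quantizer is left out, and the function names `interpolate_lags` and `relative_lags` are hypothetical.

```python
def interpolate_lags(prev_lag, curr_lag, num_subframes):
    """Estimate one pitch lag per subframe.

    Assumption: linear interpolation from the previous frame's lag to the
    current frame's lag; the paper's actual interpolation rule is not given
    in the abstract.
    """
    step = (curr_lag - prev_lag) / num_subframes
    return [prev_lag + step * (k + 1) for k in range(num_subframes)]


def relative_lags(target_lags, estimated_lags):
    """Per-subframe difference between the target and the estimated lag.

    This difference vector is what would be passed to a vector quantizer,
    which is not sketched here.
    """
    return [t - e for t, e in zip(target_lags, estimated_lags)]


estimated = interpolate_lags(40, 48, 4)          # [42.0, 44.0, 46.0, 48.0]
residual = relative_lags([43, 44, 45, 48], estimated)  # [1.0, 0.0, -1.0, 0.0]
```

Because the pitch usually evolves smoothly, the relative lags stay small, so they can be coded with fewer bits than the absolute lags of every subframe.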

ADVANCES IN SOURCE-CONTROLLED VARIABLE BIT RATE WIDEBAND SPEECH CODING

2000

This paper presents novel techniques for source-controlled variable rate wideband speech coding. These techniques have been used in the variable-rate multimode wideband (VMR-WB) speech codec recently selected by 3GPP2 for wideband (WB) speech telephony, streaming, and multimedia messaging services in the cdma2000 third generation wireless system. The coding algorithm contains several innovations that enable very good performance at average bit rates as low as 4.0 kbit/s in typical conversational operating conditions. These innovations include an efficient noise suppression algorithm, a signal classification and rate selection algorithm that enables high-quality operation at low average bit rates, efficient post-processing techniques tailored for wideband signals, and novel frame erasure concealment techniques including supplementary information for reconstruction of lost onsets and improving decoder convergence. Further, the coder utilizes efficient coding types optimized for different classes of speech signal, including a generic coding type based on AMR-WB for transients and onsets, a voiced coding type optimized for stable voiced signals that uses a novel signal modification procedure resulting in good wideband quality at 6.2 kbit/s, unvoiced coding types optimized for unvoiced segments, and efficient comfort noise generation coding. The article describes in detail some of the codec's novel features.

Low‐Bit‐Rate Speech Coding

2003

This article is focused on speech coding methods for achieving communication quality speech at bit rates of 4 kbit/s and lower. The speech coding techniques are based on an all-pole model of the vocal tract which may be implemented in the time domain with appropriately selected excitation functions or else may be fit to a spectral analysis of the speech signal. Three main types of coders are described below. Code-excited linear prediction (CELP) coders select their excitation from waveform codebooks using analysis-by-synthesis closed-loop techniques, which need to be supplemented by speech classification and open-loop parametric techniques for keeping up with quality at lower rates. The prototypical sinusoidal coder (SC) has a bank of oscillators for signal synthesis, driven by a model of the magnitude spectrum. However, phase regeneration is important in enhancing speech reconstruction at low rates. Waveform interpolation (WI) coders afford a wider time-frequency footprint for the representation of the excitation, showing a good potential for achieving toll quality at bit rates below 4 kbit/s.
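The analysis-by-synthesis search that CELP coders use can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: the filter starts from zero state, there is no perceptual weighting or adaptive codebook, the search is exhaustive, and the names `synthesize` and `search_codebook` are hypothetical. The LPC convention assumed is A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis filter 1/A(z), starting from zero state.

    Assumed convention: A(z) = 1 + sum_k lpc[k-1] * z^-k, so
    s[n] = e[n] - sum_k lpc[k-1] * s[n-k].
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) of the codevector whose synthesized,
    optimally scaled output is closest to the target in squared error."""
    best = None
    for index, codevector in enumerate(codebook):
        synth = synthesize(codevector, lpc)
        energy = sum(y * y for y in synth)
        if energy == 0.0:
            continue  # an all-zero output cannot match any target
        corr = sum(t * y for t, y in zip(target, synth))
        gain = corr / energy  # least-squares optimal gain for this entry
        error = sum((t - gain * y) ** 2 for t, y in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best
```

The loop is "closed" in the sense that each candidate excitation is passed through the synthesis filter and compared with the target before a choice is made, rather than being selected from the signal's parameters alone.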

Design and description of CS-ACELP: a toll quality 8 kb/s speech coder

IEEE Transactions on Speech and Audio Processing, 1998

This paper describes the 8 kb/s speech coding algorithm G.729, which has been recently standardized by ITU-T. The algorithm is based on a conjugate-structure algebraic CELP (CS-ACELP) coding technique and uses 10 ms speech frames. The codec delivers toll-quality speech (equivalent to 32 kb/s ADPCM) for most operating conditions. This paper describes the coder structure in detail and discusses the reasons behind certain design choices. A 16-bit fixed-point version has been developed as part of Recommendation G.729, and a summary of the subjective test results based on a real-time implementation of this version is presented.
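The bit rate and frame length quoted above fix the bit budget per frame, which a short calculation makes concrete. The helper name `bits_per_frame` is hypothetical, and the sketch assumes a constant bit rate with a whole number of bits per frame.

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Number of bits available per frame at a constant bit rate."""
    total = bit_rate_bps * frame_ms
    if total % 1000 != 0:
        raise ValueError("bit rate and frame length give a fractional bit count")
    return total // 1000


g729_bits = bits_per_frame(8000, 10)    # 80 bits per 10 ms frame
adpcm_bits = bits_per_frame(32000, 10)  # 320 bits over the same 10 ms
```

At 8 kb/s, each 10 ms frame therefore has 80 bits, a quarter of what 32 kb/s ADPCM spends over the same interval.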