
TS 126 090 - V3.1.0 - Universal Mobile Telecommunications System (UMTS); Mandatory Speech Codec speech processing functions AMR speech codec; Transcoding functions (3G TS 26.090 version 3.1.0 Release 1999) PDF

62 Pages·0.5 MB·English


ETSI TS 126 090 V3.1.0 (2000-01)
Technical Specification

Universal Mobile Telecommunications System (UMTS); Mandatory Speech Codec speech processing functions AMR speech codec; Transcoding functions (3G TS 26.090 version 3.1.0 Release 1999)

Reference
DTS/TSGS-0426090U

Keywords
UMTS

ETSI
Postal address: F-06921 Sophia Antipolis Cedex - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis, Valbonne - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Internet
[email protected]
Individual copies of this ETSI deliverable can be downloaded from http://www.etsi.org
If you find errors in the present document, send your comment to: [email protected]

Important notice
This ETSI deliverable may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat.

Copyright Notification
No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2000. All rights reserved.

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://www.etsi.org/ipr).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword
This Technical Specification (TS) has been produced by the ETSI 3rd Generation Partnership Project (3GPP).
The present document may refer to technical specifications or reports using their 3GPP identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables. The mapping of document identities is as follows:
For 3GPP documents:
3G TS | TR nn.nnn "<title>" (with or without the prefix 3G)
is equivalent to
ETSI TS | TR 1nn nnn "[Digital cellular telecommunications system (Phase 2+) (GSM);] Universal Mobile Telecommunications System; <title>"
For GSM document identities of type "GSM xx.yy", e.g. GSM 01.04, the corresponding ETSI document identity may be found in the Cross Reference List on www.etsi.org/key

TS 26.090 : December 1999

Contents
Foreword
1 Scope
2 Normative references
3 Definitions, symbols and abbreviations
3.1 Definitions
3.2 Symbols
3.3 Abbreviations
4 Outline description
4.1 Functional description of audio parts
4.2 Preparation of speech samples
4.2.1 PCM format conversion
4.3 Principles of the adaptive multi-rate speech encoder
4.4 Principles of the adaptive multi-rate speech decoder
4.5 Sequence and subjective importance of encoded parameters
5 Functional description of the encoder
5.1 Pre-processing (all modes)
5.2 Linear prediction analysis and quantization
    12.2 kbit/s mode
    10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
5.2.1 Windowing and auto-correlation computation
    12.2 kbit/s mode
    10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
5.2.2 Levinson-Durbin algorithm (all modes)
5.2.3 LP to LSP conversion (all modes)
5.2.4 LSP to LP conversion (all modes)
5.2.5 Quantization of the LSP coefficients
    12.2 kbit/s mode
    10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
5.2.6 Interpolation of the LSPs
    12.2 kbit/s mode
    10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
5.2.7 Monitoring resonance in the LPC spectrum (all modes)
5.3 Open-loop pitch analysis
    12.2 kbit/s mode
    10.2 kbit/s mode
    7.95, 7.40, 6.70, 5.90 kbit/s modes
    5.15, 4.75 kbit/s modes
5.4 Impulse response computation (all modes)
5.5 Target signal computation (all modes)
5.6 Adaptive codebook
5.6.1 Adaptive codebook search
    12.2 kbit/s mode
    7.95 kbit/s mode
    10.2, 7.40 kbit/s mode
    6.70, 5.90 kbit/s modes
    5.15, 4.75 kbit/s modes
5.6.2 Adaptive codebook gain control (all modes)
5.7 Algebraic codebook
5.7.1 Algebraic codebook structure
    12.2 kbit/s mode
    10.2 kbit/s mode
    7.95, 7.40 kbit/s modes
    6.70 kbit/s mode
    5.90 kbit/s mode
    5.15, 4.75 kbit/s modes
5.7.2 Algebraic codebook search
    12.2 kbit/s mode
    10.2 kbit/s mode
    7.95, 7.40 kbit/s modes
    6.70 kbit/s mode
    5.90 kbit/s mode
    5.15, 4.75 kbit/s modes
5.8 Quantization of the adaptive and fixed codebook gains
5.8.1 Adaptive codebook gain limitation in quantization
5.8.2 Quantization of codebook gains
    Prediction of the fixed codebook gain (all modes)
    12.2 kbit/s mode
    10.2 kbit/s mode
    7.95 kbit/s mode
    7.40 kbit/s mode
    6.70 kbit/s mode
    5.90, 5.15 kbit/s modes
    4.75 kbit/s mode
5.8.3 Update past quantized adaptive codebook gain buffer (all modes)
5.9 Memory update (all modes)
    4.75 kbit/s mode
6 Functional description of the decoder
6.1 Decoding and speech synthesis
6.2 Post-processing
6.2.1 Adaptive post-filtering (all modes)
    12.2, 10.2 kbit/s modes
    7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbit/s modes
6.2.2 High-pass filtering and up-scaling (all modes)
7 Detailed bit allocation of the adaptive multi-rate codec
8 Homing sequences
8.1 Functional description
8.2 Definitions
8.3 Encoder homing
8.4 Decoder homing
9 Bibliography
Annex A: Change history
History

Foreword
This Technical Specification has been produced by the 3GPP.
The present document describes the detailed mapping of the narrowband telephony speech service employing the Adaptive Multi-Rate (AMR) speech coder within the 3GPP system.
The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of this TS, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:
Version x.y.z
where:
x the first digit:
    1 presented to TSG for information;
    2 presented to TSG for approval;
    3 indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the specification.

1 Scope
This Telecommunication Standard (TS) describes the detailed mapping from input blocks of 160 speech samples in 13-bit uniform PCM format to encoded blocks of 95, 103, 118, 134, 148, 159, 204, and 244 bits and from encoded blocks of 95, 103, 118, 134, 148, 159, 204, and 244 bits to output blocks of 160 reconstructed speech samples.
The sampling rate is 8 000 samples/s leading to a bit rate for the encoded bit stream of 4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2 or 12.2 kbit/s. The coding scheme for the multi-rate coding modes is the so-called Algebraic Code Excited Linear Prediction Coder, hereafter referred to as ACELP. The multi-rate ACELP coder is referred to as MR-ACELP.
In the case of discrepancy between the requirements described in this TS and the fixed point computational description (ANSI-C code) of these requirements contained in [4], the description in [4] will prevail. The ANSI-C code is not described in this TS, see [4] for a description of the ANSI-C code.
The transcoding procedure specified in this TS is mandatory for systems using the AMR speech codec.

2 Normative references
This TS incorporates by dated and undated reference, provisions from other publications. These normative references are cited in the appropriate places in the text and the publications are listed hereafter. For dated references, subsequent amendments to or revisions of any of these publications apply to this TS only when incorporated in it by amendment or revision. For undated references, the latest edition of the publication referred to applies.
[1] GSM 03.50: "Digital cellular telecommunications system (Phase 2); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system".
[2] TS 26.101: "AMR Speech Codec; Frame structure".
[3] TS 26.094: "AMR Speech Codec; Voice Activity Detection (VAD)".
[4] TS 26.073: "AMR Speech Codec; ANSI-C code".
[5] TS 26.074: "AMR Speech Codec; Test sequences".
[6] ITU-T Recommendation G.711 (1988): "Coding of analogue signals by pulse code modulation; Pulse code modulation (PCM) of voice frequencies".
[7] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s adaptive differential pulse code modulation (ADPCM)".
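
The bit rates quoted in the Scope clause follow from the encoded block sizes: one block covers 160 samples at 8 000 samples/s, i.e. 20 ms, so 50 blocks are produced per second. The following Python sketch only illustrates that arithmetic; the constant and function names are illustrative and are not taken from this TS or from the ANSI-C code in [4].

```python
# Illustrative arithmetic only; names are not from the specification.
SAMPLE_RATE_HZ = 8000        # sampling rate stated in the Scope clause
SAMPLES_PER_BLOCK = 160      # input block size stated in the Scope clause
BITS_PER_BLOCK = [95, 103, 118, 134, 148, 159, 204, 244]


def block_duration_ms():
    """Duration of one 160-sample block in milliseconds."""
    return 1000 * SAMPLES_PER_BLOCK / SAMPLE_RATE_HZ


def bit_rate_kbps(bits_per_block):
    """Bit rate in kbit/s for a given encoded block size."""
    blocks_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_BLOCK
    return bits_per_block * blocks_per_second / 1000


rates = [bit_rate_kbps(bits) for bits in BITS_PER_BLOCK]

if __name__ == "__main__":
    print(f"block duration: {block_duration_ms()} ms")
    for bits, rate in zip(BITS_PER_BLOCK, rates):
        print(f"{bits:3d} bits per block -> {rate:.2f} kbit/s")
```

Run as a script, this lists one bit rate per block size, matching the eight rates named in the Scope clause.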
3 Definitions, symbols and abbreviations

3.1 Definitions
For the purposes of this TS, the following definitions apply:

adaptive codebook: The adaptive codebook contains excitation vectors that are adapted for every subframe. The adaptive codebook is derived from the long-term filter state. The lag value can be viewed as an index into the adaptive codebook.

adaptive postfilter: This filter is applied to the output of the short-term synthesis filter to enhance the perceptual quality of the reconstructed speech. In the adaptive multi-rate codec, the adaptive postfilter is a cascade of two filters: a formant postfilter and a tilt compensation filter.

algebraic codebook: A fixed codebook where algebraic code is used to populate the excitation vectors (innovation vectors). The excitation contains a small number of nonzero pulses with predefined interlaced sets of positions.

anti-sparseness processing: An adaptive post-processing procedure applied to the fixed codebook vector in order to reduce perceptual artifacts from a sparse fixed codebook vector.

closed-loop pitch analysis: This is the adaptive codebook search, i.e., a process of estimating the pitch (lag) value from the weighted input speech and the long term filter state. In the closed-loop search, the lag is searched using an error minimization loop (analysis-by-synthesis). In the adaptive multi-rate codec, closed-loop pitch search is performed for every subframe.

direct form coefficients: One of the formats for storing the short term filter parameters. In the adaptive multi-rate codec, all filters which are used to modify speech samples use direct form coefficients.

fixed codebook: The fixed codebook contains excitation vectors for speech synthesis filters. The contents of the codebook are non-adaptive (i.e., fixed). In the adaptive multi-rate codec, the fixed codebook is implemented using an algebraic codebook.

fractional lags: A set of lag values having sub-sample resolution. In the adaptive multi-rate codec a sub-sample resolution of 1/6th or 1/3rd of a sample is used.

frame: A time interval equal to 20 ms (160 samples at an 8 kHz sampling rate).

integer lags: A set of lag values having whole sample resolution.

interpolating filter: An FIR filter used to produce an estimate of subsample resolution samples, given an input sampled with integer sample resolution.

inverse filter: This filter removes the short term correlation from the speech signal. The filter models an inverse frequency response of the vocal tract.

lag: The long term filter delay. This is typically the true pitch period, or its multiple or sub-multiple.

Line Spectral Frequencies: (see Line Spectral Pair)

Line Spectral Pair: Transformation of LPC parameters. Line Spectral Pairs are obtained by decomposing the inverse filter transfer function A(z) to a set of two transfer functions, one having even symmetry and the other having odd symmetry. The Line Spectral Pairs (also called Line Spectral Frequencies) are the roots of these polynomials on the z-unit circle.

LP analysis window: For each frame, the short term filter coefficients are computed using the high pass filtered speech samples within the analysis window. In the adaptive multi-rate codec, the length of the analysis window is always 240 samples. For each frame, two asymmetric windows are used to generate two sets of LP coefficients in the 12.2 kbit/s mode. For the other modes, only a single asymmetric window is used to generate a single set of LP coefficients. In the 12.2 kbit/s mode, no samples of the future frames are used (no lookahead). The other modes use a 5 ms lookahead.

LP coefficients: Linear Prediction (LP) coefficients (also referred to as Linear Predictive Coding (LPC) coefficients) is a generic descriptive term for the short term filter coefficients.

mode: When used alone, refers to the source codec mode, i.e., to one of the source codecs employed in the AMR codec.

open-loop pitch search: A process of estimating the near optimal lag directly from the weighted speech input. This is done to simplify the pitch analysis and confine the closed-loop pitch search to a small number of lags around the open-loop estimated lags. In the adaptive multi-rate codec, an open-loop pitch search is performed in every other subframe.

residual: The output signal resulting from an inverse filtering operation.

short term synthesis filter: This filter introduces, into the excitation signal, short term correlation which models the impulse response of the vocal tract.

perceptual weighting filter: This filter is employed in the analysis-by-synthesis search of the codebooks. The filter exploits the noise masking properties of the formants (vocal tract resonances) by weighting the error less in regions near the formant frequencies and more in regions away from them.

subframe: A time interval equal to 5 ms (40 samples at 8 kHz sampling rate).

vector quantization: A method of grouping several parameters into a vector and quantizing them simultaneously.

zero input response: The output of a filter due to past inputs, i.e. due to the present state of the filter, given that an input of zeros is applied.

zero state response: The output of a filter due to the present input, given that no past inputs have been applied, i.e., given that the state information in the filter is all zeroes.
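
The zero input response and zero state response defined above add up to the full output of a linear filter. The Python sketch below illustrates this for a small all-pole filter 1/A(z), assuming the convention A(z) = 1 + a_1 z^-1 + ... + a_m z^-m. The function name, coefficients, state and input values are all made up for illustration and are not taken from this TS.

```python
def all_pole_filter(a, x, past_outputs):
    """Filter x through 1/A(z), assuming A(z) = 1 + sum_k a[k-1] * z^-k.

    past_outputs holds y[-1], y[-2], ..., y[-m] (most recent first) and
    represents the filter state left behind by past inputs.
    """
    memory = list(past_outputs)
    y = []
    for sample in x:
        out = sample - sum(ak * yk for ak, yk in zip(a, memory))
        y.append(out)
        memory = [out] + memory[:-1]   # shift the newest output into the state
    return y


# Illustrative (hypothetical) values, not taken from the specification.
a = [-0.9, 0.2]                 # a_1, a_2 of a stable second-order filter
state = [1.0, 0.5]              # y[-1], y[-2] left over from past inputs
x = [1.0, 0.0, -0.5, 0.25]      # present input

# Zero input response: present state, input of zeros.
zir = all_pole_filter(a, [0.0] * len(x), state)
# Zero state response: present input, state of all zeroes.
zsr = all_pole_filter(a, x, [0.0] * len(a))
# Full response: present input applied on top of the present state.
full = all_pole_filter(a, x, state)

superposed = [zi + zs for zi, zs in zip(zir, zsr)]

if __name__ == "__main__":
    for f, s in zip(full, superposed):
        print(f"{f:+.6f}  {s:+.6f}")
```

The two printed columns agree up to rounding. This superposition is what lets a codebook search handle the contribution of the filter state separately from the contribution of each candidate excitation.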
3.2 Symbols
For the purposes of this TS, the following symbols apply:

$A(z)$: The inverse filter with unquantized coefficients
$\hat{A}(z)$: The inverse filter with quantized coefficients
$H(z) = \frac{1}{\hat{A}(z)}$: The speech synthesis filter with quantized coefficients
$a_i$: The unquantized linear prediction parameters (direct form coefficients)
$\hat{a}_i$: The quantified linear prediction parameters
$m$: The order of the LP model
$\frac{1}{B(z)}$: The long-term synthesis filter
$W(z)$: The perceptual weighting filter (unquantized coefficients)
$\gamma_1, \gamma_2$: The perceptual weighting factors
$F_E(z)$: Adaptive pre-filter
$T$: The integer pitch lag nearest to the closed-loop fractional pitch lag of the subframe
$\beta$: The adaptive pre-filter coefficient (the quantified pitch gain)
$H_f(z) = \frac{\hat{A}(z/\gamma_n)}{\hat{A}(z/\gamma_d)}$: The formant postfilter
$\gamma_n$: Control coefficient for the amount of the formant post-filtering
$\gamma_d$: Control coefficient for the amount of the formant post-filtering
$H_t(z)$: Tilt compensation filter
$\gamma_t$: Control coefficient for the amount of the tilt compensation filtering
$\mu = \gamma_t k_1'$: A tilt factor, with $k_1'$ being the first reflection coefficient
$h_f(n)$: The truncated impulse response of the formant postfilter
$L_h$: The length of $h_f(n)$
$r_h(i)$: The auto-correlations of $h_f(n)$
$\hat{A}(z/\gamma_n)$: The inverse filter (numerator) part of the formant postfilter
$1/\hat{A}(z/\gamma_d)$: The synthesis filter (denominator) part of the formant postfilter
$\hat{r}(n)$: The residual signal of the inverse filter $\hat{A}(z/\gamma_n)$
$h_t(n)$: Impulse response of the tilt compensation filter
$\beta_{sc}(n)$: The AGC-controlled gain scaling factor of the adaptive postfilter
$\alpha$: The AGC factor of the adaptive postfilter
$H_{h1}(z)$: Pre-processing high-pass filter
$w_I(n), w_{II}(n)$: LP analysis windows
$L_1^{(I)}$: Length of the first part of the LP analysis window $w_I(n)$
$L_2^{(I)}$: Length of the second part of the LP analysis window $w_I(n)$
$L_1^{(II)}$: Length of the first part of the LP analysis window $w_{II}(n)$
$L_2^{(II)}$: Length of the second part of the LP analysis window $w_{II}(n)$
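
Several symbols above have the form A(z/γ), i.e. a filter polynomial evaluated at z/γ. Assuming the convention A(z) = 1 + a_1 z^-1 + ... + a_m z^-m, substituting z/γ for z multiplies the i-th coefficient by γ^i. The Python sketch below shows only that substitution; the function name, the coefficients and the two γ values are hypothetical and are not the values used by the codec.

```python
def weighted_coefficients(a, gamma):
    """Coefficients of A(z/gamma), given those of A(z).

    a is [1.0, a_1, ..., a_m], assuming A(z) = 1 + sum_i a_i * z^-i.
    Substituting z/gamma for z scales the coefficient of z^-i by gamma**i.
    """
    return [coef * gamma ** i for i, coef in enumerate(a)]


# Hypothetical second-order example; not values from the specification.
a_hat = [1.0, -1.2, 0.5]
gamma_n = 0.5   # illustrative numerator control coefficient
gamma_d = 0.8   # illustrative denominator control coefficient

# Numerator and denominator polynomials of a formant postfilter of the
# form A^(z/gamma_n) / A^(z/gamma_d), as listed in the symbol table.
numerator = weighted_coefficients(a_hat, gamma_n)
denominator = weighted_coefficients(a_hat, gamma_d)

if __name__ == "__main__":
    print("numerator  :", numerator)
    print("denominator:", denominator)
```

Because γ^0 = 1, the leading coefficient is unchanged, and for 0 < γ < 1 the higher-order coefficients shrink geometrically.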
