ETSI ETS 300 395-2 ed.2 (1998-02)
Terrestrial Trunked Radio (TETRA); Speech codec for full-rate traffic channel; Part 2: TETRA codec
RE/TETRA-05032
Prizemni snopovni radio (TETRA) - Govorni kodek za kanal s polno hitrostjo - 2. del: Kodek TETRA
General Information
Standards Content (Sample)
EUROPEAN TELECOMMUNICATION STANDARD
ETS 300 395-2
February 1998
Second Edition
Source: TETRA
Reference: RE/TETRA-05032
ICS: 33.020
Key words: TETRA, codec
National designation: SIST ETS 300 395-2:1999

Terrestrial Trunked Radio (TETRA);
Speech codec for full-rate traffic channel;
Part 2: TETRA codec

ETSI
European Telecommunications Standards Institute
ETSI Secretariat
Postal address: F-06921 Sophia Antipolis CEDEX - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
X.400: c=fr, a=atlas, p=etsi, s=secretariat - Internet: secretariat@etsi.fr
Tel.: +33 4 92 94 42 00 - Fax: +33 4 93 65 47 16
Copyright Notification: No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1998. All rights reserved.
Whilst every care has been taken in the preparation and publication of this document, errors in content, typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to "ETSI Editing and Committee Support Dept." at the address shown on the title page.
Contents
Foreword
1 Scope
2 Normative references
3 Abbreviations
4 Full rate codec
4.1 Structure of the codec
4.2 Functional description of the codec
4.2.1 Pre- and post-processing
4.2.2 Encoder
4.2.2.1 Short-term prediction
4.2.2.2 LP to LSP and LSP to LP conversion
4.2.2.3 Quantization and interpolation of LP parameters
4.2.2.4 Long-term prediction analysis
4.2.2.5 Algebraic codebook: structure and search
4.2.2.6 Quantization of the gains
4.2.2.7 Detailed bit allocation
4.2.3 Decoder
4.2.3.1 Decoding process
4.2.3.1.1 Decoding of LP filter parameters
4.2.3.1.2 Decoding of the adaptive codebook vector
4.2.3.1.3 Decoding of the innovation vector
4.2.3.1.4 Decoding of the adaptive and innovative codebook gains
4.2.3.1.5 Computation of the reconstructed speech
4.2.3.2 Error concealment
5 Channel coding for speech
5.1 General
5.2 Interfaces in the error control structure
5.3 Notations
5.4 Definition of sensitivity classes and error control codes
5.4.1 Sensitivity classes
5.4.2 CRC codes
5.4.3 16-state RCPC codes
5.4.3.1 Encoding by the 16-state mother code of rate 1/3
5.4.3.2 Puncturing of the mother code
5.5 Error control scheme for normal speech traffic channel
5.5.1 CRC code
5.5.2 RCPC codes
5.5.2.1 Puncturing scheme of the RCPC code of rate 8/12 (equal to 2/3)
5.5.2.2 Puncturing scheme of the RCPC code of rate 8/18
5.5.3 Matrix Interleaving
5.6 Error control scheme for speech traffic channel with frame stealing activated
5.6.1 CRC code
5.6.2 RCPC codes
5.6.2.1 Puncturing scheme of the RCPC code of rate 8/17
5.6.3 Interleaving
6 Channel decoding for speech
6.1 General
6.2 Error control structure
7 Codec performance
8 Bit exact description of the TETRA codec
Annex A (informative): Implementation of speech channel decoding
A.1 Algorithmic description of speech channel decoding
A.1.1 Definition of error control codes
A.1.1.1 16-state RCPC codes
A.1.1.1.1 Obtaining the mother code from punctured code
A.1.1.1.2 Viterbi decoding of the 16-state mother code of the rate 1/3
A.1.1.2 CRC codes
A.1.1.3 Type-4 bits
A.1.2 Error control scheme for normal speech traffic channel
A.1.2.1 Matrix Interleaving
A.1.2.2 RCPC codes
A.1.2.2.1 Puncturing scheme of the RCPC code of rate 8/12 (equal to 2/3)
A.1.2.2.2 Puncturing scheme of the RCPC code of rate 8/18
A.1.2.3 CRC code
A.1.2.4 Speech parameters
A.1.3 Error control scheme for speech traffic channel with frame stealing activated
A.1.3.1 Interleaving
A.1.3.2 RCPC codes
A.1.3.2.1 Puncturing scheme of the RCPC code of rate 8/17
A.1.3.3 CRC code
A.1.3.4 Speech parameters
A.2 C Code for speech channel decoding
Annex B (informative): Indexes
B.1 Index of C code routines
B.2 Index of files
Annex C (informative): Bibliography
Annex D (informative): Codec performance
D.1 General
D.2 Quality
D.2.1 Subjective speech quality
D.2.1.1 Description of characterization tests
D.2.1.2 Absolute speech quality
D.2.1.3 Effect of input level
D.2.1.4 Effect of input frequency characteristic
D.2.1.5 Effect of transmission errors
D.2.1.6 Effect of tandeming
D.2.1.7 Effect of acoustic background noise
D.2.1.8 Effect of vocal effort
D.2.1.9 Effect of frame stealing
D.2.1.10 Speaker and language dependency
D.2.2 Comparison with analogue FM
D.2.2.1 Analogue and digital systems results
D.2.2.2 All conditions
D.2.2.3 Input level
D.2.2.4 Error patterns
D.2.2.5 Background noise
D.2.3 Additional tests
D.2.3.1 Types of signals
D.2.3.2 Codec behaviour
D.3 Performance of the channel coding/decoding for speech
D.3.1 Classes of simulation environment conditions
D.3.2 Classes of equipment
D.3.3 Classes of bits
D.3.4 Channel conditions
D.3.5 Results for normal case
D.4 Complexity
D.4.1 Complexity analysis
D.4.1.1 Measurement methodology
D.4.1.2 TETRA basic operators
D.4.1.3 Worst case path for speech encoder
D.4.1.4 Worst case path for speech decoder
D.4.1.5 Condensed complexity values for encoder and decoder
D.4.2 DSP independence
D.4.2.1 Program control structure
D.4.2.2 Basic operator implementation
D.4.2.3 Additional operator implementation
D.5 Delay
Annex E (informative): Results of the TETRA codec characterization listening and complexity tests
E.1 Characterization listening test
E.1.1 Experimental conditions
E.1.2 Tables of results
E.2 TETRA codec complexity study
E.2.1 Computational analysis results
E.2.1.1 TETRA speech encoder
E.2.1.2 TETRA speech decoder
E.2.1.3 TETRA channel encoder and decoder
E.2.2 Memory requirements analysis results
E.2.2.1 TETRA speech encoder
E.2.2.2 TETRA speech decoder
E.2.2.3 TETRA speech channel encoder
E.2.2.4 TETRA speech channel decoder
Annex F (informative): Description of attached computer files
F.1 Directory C-WORD
F.2 Directory C-CODE
History
Foreword
This European Telecommunication Standard (ETS) has been produced by the Terrestrial Trunked Radio (TETRA) Project of the European Telecommunications Standards Institute (ETSI).

The sole purpose of the copyright statement below is to protect the documentation of the standard itself and not the technology which is described therein.

ETSI states that the technology described herein is in the sole ownership of THOMSON-CSF (subject to the right of its associated partner) and the disclosure of the standard documentation cannot be construed as granting any right whatsoever by licence or otherwise on the said technology.

THOMSON-CSF has undertaken to grant licences for the technology described in the standard, on fair, reasonable and non-discriminatory terms and conditions to the users of the present standard in accordance with the ETSI IPR Policy. Such licences are subject to a licence agreement to be agreed upon and entered into with THOMSON-CSF.

This ETS consists of four parts as follows:
Part 1: "General description of speech functions";
Part 2: "TETRA codec";
Part 3: "Specific operating features";
Part 4: "Codec conformance testing".

Clause 4 provides a complete description of the full rate speech source encoder and decoder, whilst clause 5 describes the speech channel encoder, and clause 6 the speech channel decoder.

Clause 7 describes the codec performance.

Finally, clause 8 introduces the bit exact description of the codec. This description is given as ANSI C code, fixed point, bit exact. The whole C code corresponding to the TETRA codec is given in computer files attached to this ETS, which are an integral part of this ETS.

In addition to these clauses, six informative annexes are provided.

Annex A describes a possible implementation of the speech channel decoding function.
Annex B provides comprehensive indexes of all the routines and files included in the C code associated with this ETS.
Annex C lists informative references relevant to the speech codec.
Annex D describes the actual quality, performance and complexity aspects of the codec.
Annex E reports detailed results from codec characterization listening and complexity tests.
Annex F contains instructions for the use of the attached electronic files.

Transposition dates
Date of adoption of this ETS: 23 January 1998
Date of latest announcement of this ETS (doa): 31 May 1998
Date of latest publication of new National Standard or endorsement of this ETS (dop/e): 30 November 1998
Date of withdrawal of any conflicting National Standard (dow): 30 November 1998
1 Scope
This European Telecommunication Standard (ETS) contains the full specification of the speech codec for use in the Terrestrial Trunked Radio (TETRA) system.

2 Normative references
This ETS incorporates, by dated and undated reference, provisions from other publications. These normative references are cited at the appropriate places in the text and the publications are listed hereafter. For dated references, subsequent amendments to or revisions of any of these publications apply to this ETS only when incorporated in it by amendment or revision. For undated references the latest edition of the publication referred to applies.

[1] ETS 300 392-2: "Radio Equipment and Systems (RES); Trans-European Trunked Radio (TETRA) system; Voice plus Data; Part 2: Air Interface".
[2] CCITT Recommendation P.48 (1988): "Specifications for an Intermediate Reference System".

3 Abbreviations
For the purposes of this ETS, the following abbreviations apply:
ACELP   Algebraic CELP
ANSI    American National Standards Institute
BER     Bit Error Ratio
BFI     Bad Frame Indicator
BS      Base Station
CELP    Code-Excited Linear Predictive
CRC     Cyclic Redundancy Code
DSP     Digital Signal Processor
DTMF    Dual Tone Multiple Frequency
EQ      EQualizer test
EP      Error Pattern
FIR     Finite Impulse Response
HT      Hilly Terrain
IRS     Intermediate Reference System
LP      Linear Prediction
LPC     Linear Predictive Coding
LSF     Line Spectral Frequency
LSP     Line Spectral Pair
MER     Message Error Rate
MNRU    Multiplicative Noise Reference Unit
MOS     Mean Opinion Score
MS      Mobile Station
MSE     Mean Square Error
PDF     Probability Density Function
PUEM    Probability of Undetected Erroneous Message
RCPC    Rate-Compatible Punctured Convolutional
RF      Radio Frequency
TDM     Time Division Multiplex
TU      Typical Urban
VQ      Vector Quantization
4 Full rate codec
4.1 Structure of the codec
The TETRA speech codec is based on the Code-Excited Linear Predictive (CELP) coding model. In this model, a block of N speech samples is synthesized by filtering an appropriate innovation sequence from a codebook, scaled by a gain factor $g_c$, through two time-varying filters. A simplified high level block diagram of this synthesis process, as implemented in the TETRA codec, is shown in figure 1.

[Figure 1: High level block diagram of the TETRA speech synthesizer. Blocks: demultiplexer of the digital input (LPC info, pitch delay T, algebraic codebook index k, gains), algebraic codebook, adaptive codebook (past excitation), gain prediction and VQ ($g_p$, $g_c$), long-term synthesis filter, short-term synthesis filter, output speech.]

The first filter is a long-term prediction filter (pitch filter) aiming at modelling the pseudo-periodicity in the speech signal and the second is a short-term prediction filter modelling the speech spectral envelope.

The long-term, or pitch, synthesis filter is given by:

$$\frac{1}{B(z)} = \frac{1}{1 - g_p z^{-T}} \tag{1}$$

where $T$ is the pitch delay and $g_p$ is the pitch gain. The pitch synthesis filter is implemented as an adaptive codebook, where for delays less than the sub-frame length the past excitation is repeated.

The short-term synthesis filter is given by:

$$H(z) = \frac{1}{A(z)} = \frac{1}{1 + \sum_{i=1}^{p} a_i z^{-i}} \tag{2}$$

where $a_i$, $i = 1, \ldots, p$, are the Linear Prediction (LP) parameters and $p$ is the predictor order. In the TETRA codec $p$ shall be 10.
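The two synthesis filters of equations (1) and (2) can be illustrated with a short floating-point sketch. It is not the bit-exact fixed-point C code attached to this ETS: the function names are hypothetical, the filter memories are assumed to start at zero, and the pitch delay is assumed to be an integer.

```python
def pitch_synthesis(excitation, T, gp):
    """Long-term synthesis 1/B(z) = 1/(1 - gp z^-T): u(n) = c(n) + gp * u(n - T).

    Assumes an integer delay T and zero past excitation (illustrative only).
    """
    out = []
    for n, c in enumerate(excitation):
        past = out[n - T] if n >= T else 0.0
        out.append(c + gp * past)
    return out


def lp_synthesis(excitation, a):
    """Short-term synthesis 1/A(z) with A(z) = 1 + sum_i a_i z^-i.

    s(n) = u(n) - sum_{i=1..p} a_i * s(n - i); `a` holds a_1..a_p.
    """
    out = []
    for n, u in enumerate(excitation):
        acc = u
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                acc -= ai * out[n - i]
        out.append(acc)
    return out
```

Feeding a unit impulse through `pitch_synthesis` reproduces the impulse every T samples, scaled by successive powers of the pitch gain, which is the pseudo-periodicity the long-term filter is meant to model.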
The TETRA encoder uses an analysis-by-synthesis technique to determine the pitch and excitation codebook parameters. The simplified block diagram of the TETRA encoder is shown in figure 2.

[Figure 2: High level block diagram of the TETRA speech encoder. Blocks: LPC analysis, quantization and interpolation; open loop pitch analysis ($T_0$); adaptive codebook (past excitation, pitch delay T, gain $g_p$); algebraic codebook (index k, gain $g_c$); gain VQ; short-term synthesis filter; perceptual weighting; MSE search; multiplexer producing the digital output.]

In this analysis-by-synthesis technique, the synthetic speech is computed for all candidate innovation sequences, retaining the particular sequence that produces the output closest to the original signal according to a perceptually weighted distortion measure. The perceptual weighting filter de-emphasizes the error at the formant regions of the speech spectrum and is given by:

$$W(z) = \frac{A(z)}{A(z/\gamma)} \tag{3}$$

where $A(z)$ is the LP inverse filter (as in equation (2)) and $0 < \gamma \le 1$. The value $\gamma = 0{,}85$ shall be used. Both the weighting filter, $W(z)$, and the formant synthesis filter, $H(z)$, shall use the quantized LP parameters.

In the Algebraic CELP (ACELP) technique, special innovation codebooks having an algebraic structure are used. This algebraic structure has several advantages in terms of storage, search complexity, and robustness. The TETRA codec shall use a specific dynamic algebraic excitation codebook whereby the fixed excitation vectors are shaped by a dynamic shaping matrix (see annex C {1}). The shaping matrix is a function of the LP model $A(z)$, and its main role is to shape the excitation vectors in the frequency domain so that their energies are concentrated in the important frequency bands. The shaping matrix used is a Toeplitz lower triangular matrix constructed from the impulse response of the filter:

$$F(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)} \tag{4}$$
where $A(z)$ is the LP inverse filter. The values $\gamma_1 = 0{,}75$ and $\gamma_2 = 0{,}85$ shall be used.

In the TETRA codec, 30 ms speech frames shall be used. It is required that the short-term prediction parameters (or LP parameters) are computed and transmitted every speech frame. The speech frame shall be divided into 4 sub-frames of 7,5 ms (60 samples). The pitch and algebraic codebook parameters also have to be transmitted every sub-frame.

Table 1 gives the bit allocation for the TETRA codec. 137 bits shall be produced for each frame of 30 ms, resulting in a bit rate of 4 567 bit/s.

Table 1: Bit allocation for the TETRA codec

Parameter        1st subframe  2nd subframe  3rd subframe  4th subframe  Total per frame
LP filter                                                                 26
Pitch delay      8             5             5             5              23
Algebraic code   16            16            16            16             64
VQ of 2 gains    6             6             6             6              24
Total                                                                     137

More details about the sequence of bits within the speech frame of 137 bits per 30 ms, with reference to the speech parameters, can be found in subclause 4.2.2.7, table 3.

4.2 Functional description of the codec
4.2.1 Pre- and post-processing
Before starting the encoding process, the speech signal shall be pre-processed using the offset compensation filter:

$$H_p(z) = \frac{1}{2} \left( \frac{1 - z^{-1}}{1 - \alpha z^{-1}} \right) \tag{5}$$

where $\alpha$ = 32 735/32 768. In the time domain, this filter corresponds to:

$$s'(n) = s(n)/2 - s(n-1)/2 + \alpha\, s'(n-1) \tag{6}$$

where $s(n)$ is the input signal and $s'(n)$ is the pre-processed signal. The purpose of this pre-processing is firstly to remove the dc from the signal (offset compensation), and secondly, to scale down the input signal in order to avoid saturation of the synthesis filtering.

At the decoder, the post-processing consists of scaling up the reconstructed signal (multiplication by 2 with saturation control).

4.2.2 Encoder
Figure 3 presents a detailed block diagram of the TETRA encoder illustrating the major parts of the codec as well as signal flow. On this figure, names appearing at the bottom of the various building blocks correspond to the C code routines associated with this ETS.
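As an illustration of the pre-processing of subclause 4.2.1, the difference equation (6) can be sketched in floating point as follows. The function name and the zero initial filter state are assumptions made for this sketch; the normative behaviour is that of the fixed-point C code attached to this ETS.

```python
ALPHA = 32735 / 32768  # pole of the offset compensation filter, equation (5)


def pre_process(samples, alpha=ALPHA):
    """Apply s'(n) = s(n)/2 - s(n-1)/2 + alpha * s'(n-1).

    The filter state is assumed to be zero before the first sample.
    """
    out = []
    prev_in = 0.0   # s(n-1)
    prev_out = 0.0  # s'(n-1)
    for x in samples:
        y = x / 2 - prev_in / 2 + alpha * prev_out
        out.append(y)
        prev_in, prev_out = x, y
    return out
```

With a constant (pure dc) input the output starts at half the input level and then decays geometrically by the factor $\alpha$ per sample, which shows both the division by 2 and the offset removal.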
[Figure 3: Signal flow at the encoder. Stages shown: pre-processing (offset compensation and division by 2); LPC analysis (windowing and autocorrelation, Levinson-Durbin, A(z) to LSP conversion, LSP quantization, interpolation for the 4 sub-frames, LSP to A(z) conversion); open-loop pitch search (computation of the weighted speech for the 4 sub-frames, open-loop pitch $T_0$); adaptive codebook search (target computation, best delay and gain, adaptive codebook contribution); innovative codebook search (target computation, best innovation and gain); quantization of the gains in the energy domain; computation of the excitation, error computation and update of the filter memories for the next sub-frame. Transmitted indices: LSP index, pitch index, code index, gains index. C routines named in the figure: Pre_Process, Autocorr, Lag_Window, Levin_32, Az_Lsp, Clsp_334, Int_Lpc4, Lsp_Az, Pond_Ai, Residu, Syn_Filt, Pitch_Ol_Dec, Pitch_Fr, Pred_Lt, G_Pitch, D4i60_16, G_Code, Ener_Qua.]

4.2.2.1 Short-term prediction
Short-term prediction (LP or LPC analysis) shall be performed every 30 ms. The auto-correlation approach shall be used with an asymmetric analysis window. The LP analysis window consists of two halves of Hamming windows with different lengths. This window is given by:

$$w(n) = \begin{cases} 0{,}54 - 0{,}46 \cos\left( \dfrac{\pi n}{L_1 - 1} \right), & n = 0, \ldots, L_1 - 1 \\[2ex] 0{,}54 + 0{,}46 \cos\left( \dfrac{\pi (n - L_1)}{L_2 - 1} \right), & n = L_1, \ldots, L_1 + L_2 - 1 \end{cases} \tag{7}$$

A 32 ms analysis window (corresponding to 256 samples with the sampling frequency of 8 kHz) shall be used with values $L_1 = 216$ and $L_2 = 40$. The window shall be positioned such that 40 samples are taken from the future frame (look-ahead of 40 samples).
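The asymmetric window of equation (7) can be generated with a few lines of code. This is a floating-point sketch with a hypothetical function name; the attached C code may instead use a tabulated fixed-point window.

```python
import math

L1, L2 = 216, 40  # lengths of the two Hamming halves (256 samples in total)


def lp_analysis_window(l1=L1, l2=L2):
    """Return the LP analysis window w(n), n = 0..l1+l2-1, of equation (7)."""
    rising = [0.54 - 0.46 * math.cos(math.pi * n / (l1 - 1))
              for n in range(l1)]
    falling = [0.54 + 0.46 * math.cos(math.pi * (n - l1) / (l2 - 1))
               for n in range(l1, l1 + l2)]
    return rising + falling
```

The long rising half climbs from 0,08 to 1 over 216 samples and the short falling half drops back to 0,08 over the 40 look-ahead samples, so the window weights the most recent speech most heavily.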
The auto-correlations of the windowed speech $s'(n)$, $n = 0, \ldots, 255$, are computed by:

$$r(k) = \sum_{n=k}^{255} s'(n)\, s'(n-k), \quad k = 0, \ldots, 10 \tag{8}$$

and a 60 Hz bandwidth expansion has to be used by lag windowing the auto-correlation using the window (see annex C {2}):

$$w_{lag}(i) = \exp\left[ -\frac{1}{2} \left( \frac{2 \pi f_0 i}{f_s} \right)^2 \right], \quad i = 1, \ldots, 10 \tag{9}$$

where $f_0$ = 60 Hz is the bandwidth expansion and $f_s$ = 8 000 Hz is the sampling frequency. Further, $r(0)$ is multiplied by 1,00005, which is equivalent to adding a noise floor at -43 dB. In the TETRA coder, this is alternatively performed by dividing the lag window as in equation (9) by 1,00005, resulting in $w'_{lag}(0) = 1$ and:

$$w'_{lag}(i) = w_{lag}(i) / 1{,}00005, \quad i = 1, \ldots, 10 \tag{10}$$

The modified auto-correlations:

$$r'(k) = r(k)\, w'_{lag}(k), \quad k = 0, \ldots, 10 \tag{11}$$

are used to obtain the LP filter coefficients $a_k$, $k = 1, \ldots, 10$, by solving the set of equations:

$$\sum_{k=1}^{10} a_k\, r'(|i-k|) = -r'(i), \quad i = 1, \ldots, 10 \tag{12}$$

The set of equations in (12) shall be solved using the Levinson-Durbin algorithm (see annex C {3}).

4.2.2.2 LP to LSP and LSP to LP conversion
The LP filter coefficients of $A(z)$ ($a_k$, $k = 1, \ldots, 10$) shall be converted to the Line Spectral Pair (LSP) representation (see annex C {4}) for quantization and interpolation purposes. For a 10th order LP filter, the LSPs are defined as the roots of the sum and difference polynomials:

$$F'_1(z) = A(z) + z^{-11} A(z^{-1}) \tag{13}$$

and

$$F'_2(z) = A(z) - z^{-11} A(z^{-1}) \tag{14}$$

respectively. It can be proven that all roots of these polynomials are on the unit circle and that they alternate with each other (see annex C {5}). $F'_1(z)$ has a root $z = -1$ ($\omega = \pi$) and $F'_2(z)$ has a root $z = 1$ ($\omega = 0$).
To eliminate these two roots, new polynomials are defined:

$$F_1(z) = F'_1(z) / (1 + z^{-1}) \tag{15}$$

and

$$F_2(z) = F'_2(z) / (1 - z^{-1}) \tag{16}$$

Each polynomial has 5 conjugate roots on the unit circle ($e^{\pm j \omega_i}$); therefore, the polynomials can be written as:

$$F_1(z) = \prod_{i=1,3,\ldots,9} \left( 1 - 2 q_i z^{-1} + z^{-2} \right) \tag{17}$$

and

$$F_2(z) = \prod_{i=2,4,\ldots,10} \left( 1 - 2 q_i z^{-1} + z^{-2} \right) \tag{18}$$

where $q_i = \cos(\omega_i)$, with $\omega_i$ being the Line Spectral Frequencies (LSFs). They satisfy the ordering property $0 < \omega_1 < \omega_2 < \ldots < \omega_{10} < \pi$.
...
The following recursive relation shall be used to compute $f_1(i)$:

for i = 1 to 5
    $f_1(i) = -2 q_{2i-1} f_1(i-1) + 2 f_1(i-2)$
    for j = i-1 down to 1
        $f_1(j) = f_1(j) - 2 q_{2i-1} f_1(j-1) + f_1(j-2)$

with initial values $f_1(0) = 1$ and $f_1(-1) = 0$. The coefficients $f_2(i)$ are computed similarly by replacing $q_{2i-1}$ by $q_{2i}$. Once the coefficients $f_1(i)$ and $f_2(i)$ are found, $F_1(z)$ and $F_2(z)$ are multiplied by $1 + z^{-1}$ and $1 - z^{-1}$, respectively, to obtain $F'_1(z)$ and $F'_2(z)$; that is:

$$f'_1(i) = f_1(i) + f_1(i-1), \qquad f'_2(i) = f_2(i) - f_2(i-1), \quad i = 1, \ldots, 5$$

Finally the LP coefficients are found by:

$$a_i = 0{,}5 f'_1(i) + 0{,}5 f'_2(i), \quad i = 1, \ldots, 5$$
$$a_i = 0{,}5 f'_1(11-i) - 0{,}5 f'_2(11-i), \quad i = 6, \ldots, 10$$

This is directly derived from the relation $A(z) = (F'_1(z) + F'_2(z))/2$, and considering the fact that $F'_1(z)$ and $F'_2(z)$ are symmetrical and anti-symmetrical polynomials, respectively.

4.2.2.3 Quantization and interpolation of LP parameters
The computed LP parameters have to be converted to LSPs and quantized with 26 bits using split-VQ.

NOTE: Both the quantization and interpolation are performed on the LSPs in the cosine domain; that is:

$$q_i = \cos\left( \frac{2 \pi f_i}{f_s} \right), \quad i = 1, \ldots, 10 \tag{21}$$

where $f_i$ are the line spectral frequencies in Hz and $f_s$ is the sampling frequency.

The LSP vector $\mathbf{q}$ shall be split into three sub-vectors of length 3, 3, and 4. The first sub-vector $\{q_1, q_2, q_3\}$ shall be quantized with 8 bits while the sub-vectors $\{q_4, q_5, q_6\}$ and $\{q_7, q_8, q_9, q_{10}\}$ shall each be quantized with 9 bits. The search is performed using Mean Square Error (MSE) minimization in the q domain with no LSP weighting.

The quantized LP parameters are used for the fourth sub-frame, whereas the first three sub-frames use a linear interpolation of the parameters of the present and previous frames. The interpolation is performed on the LSPs in the q domain. Let $\mathbf{q}_n$ be the quantized LSP vector at the present frame and $\mathbf{q}_{n-1}$ the quantized LSP vector at the past frame. The interpolated LSP vectors at each of the 4 sub-frames are given by:

$$\begin{aligned} \mathbf{q}_1 &= 0{,}75\, \mathbf{q}_{n-1} + 0{,}25\, \mathbf{q}_n \\ \mathbf{q}_2 &= 0{,}5\, \mathbf{q}_{n-1} + 0{,}5\, \mathbf{q}_n \\ \mathbf{q}_3 &= 0{,}25\, \mathbf{q}_{n-1} + 0{,}75\, \mathbf{q}_n \\ \mathbf{q}_4 &= \mathbf{q}_n \end{aligned} \tag{22}$$

The initial values of the past quantized LSP vector are given in Q15 by $\mathbf{q}_{-1}$ = {30 000, 26 000, 21 000, 15 000, 8 000, 0, -8 000, -15 000, -21 000, -26 000}. (Divide by $2^{15}$ to obtain the values in the range [-1,1].) The interpolated LSP vectors shall be used to compute a different LP filter at each sub-frame.
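The interpolation of equation (22), followed by the LSP to LP recursion of subclause 4.2.2.2, can be sketched as below. This is a floating-point illustration only; the function names are hypothetical and do not reproduce the fixed-point routines Int_Lpc4 and Lsp_Az of the attached C code.

```python
# Initial past LSP vector in Q15, as listed in subclause 4.2.2.3.
Q_INIT_Q15 = [30000, 26000, 21000, 15000, 8000,
              0, -8000, -15000, -21000, -26000]

# (weight of previous frame, weight of present frame) for sub-frames 1 to 4.
WEIGHTS = [(0.75, 0.25), (0.5, 0.5), (0.25, 0.75), (0.0, 1.0)]


def interpolate_lsp(q_prev, q_curr):
    """Return the four interpolated LSP vectors of equation (22)."""
    return [[wp * p + wc * c for p, c in zip(q_prev, q_curr)]
            for wp, wc in WEIGHTS]


def _half_poly(q_half):
    """Coefficients f(0..5) of the product of (1 - 2 q z^-1 + z^-2) over 5 LSPs."""
    f = {-1: 0.0, 0: 1.0}  # initial values f(-1) = 0, f(0) = 1
    for i, q in enumerate(q_half, start=1):
        f[i] = -2.0 * q * f[i - 1] + 2.0 * f[i - 2]
        for j in range(i - 1, 0, -1):
            f[j] = f[j] - 2.0 * q * f[j - 1] + f[j - 2]
    return [f[i] for i in range(6)]


def lsp_to_lp(q):
    """Convert 10 LSPs (cosine domain, q_1..q_10) to LP coefficients a_1..a_10."""
    f1 = _half_poly(q[0::2])  # q1, q3, ..., q9
    f2 = _half_poly(q[1::2])  # q2, q4, ..., q10
    fp1 = [f1[i] + f1[i - 1] for i in range(1, 6)]  # f'1(1..5)
    fp2 = [f2[i] - f2[i - 1] for i in range(1, 6)]  # f'2(1..5)
    a = [0.5 * (fp1[i] + fp2[i]) for i in range(5)]            # a1..a5
    a += [0.5 * (fp1[4 - i] - fp2[4 - i]) for i in range(5)]   # a6..a10
    return a
```

A typical use is `[lsp_to_lp(q) for q in interpolate_lsp(q_prev, q_curr)]`, which yields one LP filter per sub-frame. As a sanity check, LSFs spaced uniformly at $i\pi/11$ correspond to a flat spectrum, for which all ten LP coefficients vanish.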
4.2.2.4 Long-term prediction analysis
The aim of the long-term prediction analysis, or adaptive codebook search, is to find the best pitch parameters, which are the delay and gain values for the pitch filter. The pitch filter shall be implemented using the so-called adaptive codebook approach whereby the excitation is repeated for delays less than the sub-frame length (60). In this implementation the excitation is extended by the LP residual in the search stage to simplify the closed-loop search. In the first sub-frame, a fractional pitch delay is used with resolutions: 1/3 in the range $[19\tfrac{1}{3},\ 84\tfrac{2}{3}]$ and integers only in the range [85 - 143]. For the other sub-frames, a pitch resolution of 1/3 is always used in the range $[T_1 - 5\tfrac{2}{3},\ T_1 + 4\tfrac{2}{3}]$, where $T_1$ is the nearest integer to the fractional pitch lag of the first sub-frame.

To simplify the pitch analysis procedure, a two stage approach shall be used, comprising first an open loop pitch search followed by a closed loop search.

The open loop pitch has to be computed once every speech frame (30 ms) using a weighted speech signal $s_w(n)$. A pole-zero type weighting procedure shall be used to get $s_w(n)$. This procedure shall be performed with the help of a shaping filter $A(z/0{,}95)/A(z/0{,}60)$ for which the un-quantized LP parameters shall be used.

The open loop pitch search shall then be performed as follows. In a first step, 3 maxima of the correlation:

$$C_k = \sum_{j=0}^{120} s_w(2j)\, s_w(2j - k) \tag{23}$$

are found in the three ranges, [20 - 39], [40 - 79] and [80 - 142], respectively. The retained maxima $C_{k_i}$, $i = 1, \ldots, 3$, are normalized by dividing by $\sqrt{\sum_n s_w^2(n - k_i)}$, $i = 1, \ldots, 3$, respectively. The normalized maxima and corresponding delays are denoted by $(R_i, k_i)$, $i = 1, \ldots, 3$. The winner among the three normalized correlations is selected by favouring the delays in the lower ranges. That is, $k_i$ is selected if $R_i > 0{,}85\, R_{i+1}$. This procedure of dividing the delay range into 3 sections and favouring the lower sections is used to avoid choosing pitch multiples.

NOTE 1: The past weighted speech samples are initialized to zero.

Having found the open-loop pitch $T_{op}$, a closed-loop pitch analysis has to be performed around the open-loop pitch delay on a sub-frame basis. In the first sub-frame the range $T_{op} \pm 2$ bounded by [20 - 143] is searched. For the other sub-frames, closed-loop pitch analysis is performed around the pitch selected in the first sub-frame. As mentioned earlier, a pitch resolution of 1/3 is always used for the other sub-frames in the range $[T_1 - 5\tfrac{2}{3},\ T_1 + 4\tfrac{2}{3}]$, where $T_1$ is the integer part of the first sub-frame pitch lag. The pitch delay shall be encoded with 8 bits in the first sub-frame while the relative delays of the other sub-frames shall be encoded with 5 bits per sub-frame.

The closed loop pitch search shall be performed by minimizing the mean-square weighted error between the original and synthesized speech. This is achieved by maximizing the term:

$$\tau_k = \frac{\sum_{n=0}^{59} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{59} y_k(n)\, y_k(n)}} \tag{24}$$
where $x(n)$ is the target for the adaptive codebook search given by the weighted input speech after subtracting the zero-input response of the weighted synthesis filter $H(z)W(z)$ and $y_k(n)$ is the past filtered excitation at delay $k$ (the past excitation is initialized to zero).

NOTE 2: The search range is limited around the open-loop pitch as explained earlier.

For delays $k < 60$ the excitation signal $u(n)$ is extended by the LP residual signal. Once the optimum integer pitch delay is determined, the fractions -2/3, -1/3, 1/3, and 2/3 around that integer are tested.

NOTE 3: For the first sub-frame, the fractions are tested only if the integer pitch lag is less than 85.

The fractional pitch search is performed by interpolating the normalized correlation in equation (24) and searching for its maximum. Once the non-integer pitch is determined, the adaptive codebook vector $v(n)$ is computed by interpolating the past excitation signal $u(n)$. The interpolation shall be performed using two FIR filters (Hamming windowed sinc functions); one for interpolating the term in equation (24) with the sinc truncated at ±12 (8 multiplications per fraction) and the other for interpolating the past excitation with the sinc truncated at ±48 (32 multiplications per sample). The pitch gain is then found by:

$$g_p = \frac{\sum_{n=0}^{59} x(n)\, y(n)}{\sum_{n=0}^{59} y(n)\, y(n)}, \quad \text{bounded by } 0 \le g_p \le 1{,}2 \tag{25}$$

where $y(n) = v(n) * h(n)$ is the filtered adaptive codebook vector (zero-state response of $H(z)W(z)$ to $v(n)$).

NOTE 4: Only positive pitch gains are allowed since by maximizing the term in equation (24) the negative correlations are eliminated.

4.2.2.5 Algebraic codebook: structure and search
A 16-bit algebraic codebook shall be used in the innovative codebook search, the aim of which is to find the best innovation and gain parameters. The innovation vector contains, at most, four non-zero pulses. The 4 pulses can assume the amplitudes and positions given in the following table:

Table 2

Codebook parameters        Positions of the pulses                                Codebook bit allocation
Pulse amplitude: +1,4142   0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26,     5
                           28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52,
                           54, 56, 58
Pulse amplitude: -1        2, 10, 18, 26, 34, 42, 50, 58                          3
Pulse amplitude: +1        4, 12, 20, 28, 36, 44, 52, (60)                        3
Pulse amplitude: -1        6, 14, 22, 30, 38, 46, 54, (62)                        3
Global sign flag                                                                  1
Shift flag                                                                        1

The pulses shall have fixed amplitudes of +1,4142, -1, +1 and -1, respectively. The first pulse position shall be encoded with 5 bits while the positions of the other pulses shall be encoded with 3 bits. The positions of all pulses can be simultaneously shifted by one, to occupy odd positions. One bit shall be used to encode this shift and a global sign bit shall be used to invert all pulses simultaneously, giving a total of 16 bits.
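The structure of table 2 can be illustrated by building an (unshaped) algebraic code vector from its six fields. This is a sketch under stated assumptions: the function name is hypothetical, the fields are passed as already-unpacked position indices rather than as the 16-bit word (whose bit packing is not described here), and pulses that fall on the same sample are assumed to add. The dynamic shaping by $F(z)$ is not included.

```python
SUBFRAME = 60
AMPLITUDES = [1.4142, -1.0, 1.0, -1.0]
POSITIONS = [
    list(range(0, 60, 2)),  # pulse 0: 30 even positions, 5 bits
    list(range(2, 59, 8)),  # pulse 1: 2, 10, ..., 58, 3 bits
    list(range(4, 61, 8)),  # pulse 2: 4, 12, ..., 52, (60), 3 bits
    list(range(6, 63, 8)),  # pulse 3: 6, 14, ..., 54, (62), 3 bits
]
FIELD_BITS = [5, 3, 3, 3, 1, 1]  # four positions, global sign flag, shift flag


def build_code_vector(indices, shift=False, negate=False):
    """Return the 60-sample code vector for four position indices.

    shift moves every pulse by one sample (odd positions); negate inverts
    all pulses. A pulse positioned outside the sub-frame is absent.
    """
    code = [0.0] * SUBFRAME
    sign = -1.0 if negate else 1.0
    for table, amp, idx in zip(POSITIONS, AMPLITUDES, indices):
        pos = table[idx] + (1 if shift else 0)
        if pos < SUBFRAME:
            code[pos] += sign * amp
    return code
```

For example, `build_code_vector((0, 0, 0, 0))` places the four pulses at samples 0, 2, 4 and 6, and the field widths in `FIELD_BITS` add up to the 16 bits of the codebook index.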
NOTE 1: From table 2, it is possible to position the last two pulses outside the sub-frame, which indicates that these pulses are not present.

The codebook is searched by minimizing the mean squared error between the weighted input speech and the weighted synthesis speech. The target signal used in the closed-loop pitch search is updated by subtracting the adaptive codebook contribution. That is, the target for the innovation is computed using:

$$x_2(n) = x(n) - g_p\, y(n), \quad n = 0, \ldots, 59 \tag{26}$$

where $y(n) = v(n) * h(n)$ is the filtered adaptive codebook vector, with $h(n)$ being the impulse response of the weighted synthesis filter $H(z)W(z) = 1/A(z/\gamma)$.

As described in subclause 4.1, the algebraic codebook is dynamically shaped to enhance the important frequency regions. The shaping matrix used is a lower triangular convolution matrix consisting of the impulse response of the filter $F(z)$ in equation (4). Thus the shaping can be performed as a filtering process. To maintain the simplicity of the algebraic codebook search, the filter $F(z)$ is combined with the weighted synthesis filter $H(z)W(z)$ and the impulse response $h'(n)$ of the combined filter is computed (see annex C {1}). If $\mathbf{c}_k$ is the algebraic codeword at index $k$, then the algebraic codebook is searched by maximizing the term:

$$\tau_k = \frac{C_k^2}{\varepsilon_k} = \frac{(\mathbf{d}^t \mathbf{c}_k)^2}{\mathbf{c}_k^t \mathbf{\Phi} \mathbf{c}_k} \tag{27}$$

where $\mathbf{H}$ is a lower triangular Toeplitz convolution matrix with diagonal $h'(0)$ and lower diagonals $h'(1), \ldots, h'(59)$, $\mathbf{d} = \mathbf{H}^t \mathbf{x}_2$ is the backward filtered target vector and $\mathbf{\Phi} = \mathbf{H}^t \mathbf{H}$.

The algebraic structure of the codebook allows for very fast search procedures since the innovation vector $\mathbf{c}_k$ contains only 4 non-zero pulses. The search shall be performed in 4 nested loops, one for each pulse position, where in each loop the contribution of a new pulse is added. The correlation in equation (27) is given by:

$$C = a\, d(m_0) - d(m_1) + d(m_2) - d(m_3) \tag{28}$$

and the energy is given by:

$$\begin{aligned} \varepsilon = {} & a^2 \phi(m_0, m_0) \\ & + \phi(m_1, m_1) - 2a\, \phi(m_0, m_1) \\ & + \phi(m_2, m_2) + 2a\, \phi(m_0, m_2) - 2 \phi(m_1, m_2) \\ & + \phi(m_3, m_3) - 2a\, \phi(m_0, m_3) + 2 \phi(m_1, m_3) - 2 \phi(m_2, m_3) \end{aligned} \tag{29}$$

where $m_i$ is the position of the $i$th pulse and $a$ = 1,4142.

NOTE 2: The codebook gain is given by:

$$g_c = \frac{C}{\varepsilon} \tag{30}$$

A focused search approach shall be used to further simplify the search procedure.
In this approach pre-computed thresholds are tested before entering the last two loops and the loops are entered only if these thresholds are exceeded. The maximum number of times the loops can be entered is fixed so that a low percentage of the codebook is searched.

The two thresholds are computed based on the correlation $C$. The maximum absolute correlation due to the contribution of the first two pulses, $max_2$, and that due to the contribution of the first three pulses, $max_3$, are found prior to the codebook search.

The third loop is entered only if the absolute correlation (due to two pulses) exceeds $k_2\, max_2$, and similarly, the fourth loop is entered only if the absolute correlation (due to three pulses) exceeds $k_3\, max_3$, where $0 \le k_2, k_3 < 1$ ... first threshold) continue with 3rd pulse; For the po
...
SLOVENIAN STANDARD
1 July 1999
Prizemni snopovni radio (TETRA) - Govorni kodek za kanal s polno hitrostjo - 2. del: Kodek TETRA
Terrestrial Trunked Radio (TETRA); Speech codec for full-rate traffic channel; Part 2: TETRA codec
This Slovenian standard is identical to: ETS 300 395-2 Edition 2
ICS: 33.070.10 Terrestrial Trunked Radio (TETRA)
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
EUROPEAN TELECOMMUNICATION STANDARD
ETS 300 395-2
February 1998
Second Edition
Source: TETRA Reference: RE/TETRA-05032
ICS: 33.020
Key words: TETRA, codec
Terrestrial Trunked Radio (TETRA);
Speech codec for full-rate traffic channel;
Part 2: TETRA codec
ETSI
European Telecommunications Standards Institute
ETSI Secretariat
Postal address: F-06921 Sophia Antipolis CEDEX - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
X.400: c=fr, a=atlas, p=etsi, s=secretariat - Internet: secretariat@etsi.fr
Tel.: +33 4 92 94 42 00 - Fax: +33 4 93 65 47 16
Copyright Notification: No part may be reproduced except as authorized by written permission. The copyright and the
foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1998. All rights reserved.
Page 2
ETS 300 395-2: February 1998
Whilst every care has been taken in the preparation and publication of this document, errors in content,
typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to
"ETSI Editing and Committee Support Dept." at the address shown on the title page.
Contents
Foreword
1 Scope
2 Normative references
3 Abbreviations
4 Full rate codec
4.1 Structure of the codec
4.2 Functional description of the codec
4.2.1 Pre- and post-processing
4.2.2 Encoder
4.2.2.1 Short-term prediction
4.2.2.2 LP to LSP and LSP to LP conversion
4.2.2.3 Quantization and interpolation of LP parameters
4.2.2.4 Long-term prediction analysis
4.2.2.5 Algebraic codebook: structure and search
4.2.2.6 Quantization of the gains
4.2.2.7 Detailed bit allocation
4.2.3 Decoder
4.2.3.1 Decoding process
4.2.3.1.1 Decoding of LP filter parameters
4.2.3.1.2 Decoding of the adaptive codebook vector
4.2.3.1.3 Decoding of the innovation vector
4.2.3.1.4 Decoding of the adaptive and innovative codebook gains
4.2.3.1.5 Computation of the reconstructed speech
4.2.3.2 Error concealment
5 Channel coding for speech
5.1 General
5.2 Interfaces in the error control structure
5.3 Notations
5.4 Definition of sensitivity classes and error control codes
5.4.1 Sensitivity classes
5.4.2 CRC codes
5.4.3 16-state RCPC codes
5.4.3.1 Encoding by the 16-state mother code of rate 1/3
5.4.3.2 Puncturing of the mother code
5.5 Error control scheme for normal speech traffic channel
5.5.1 CRC code
5.5.2 RCPC codes
5.5.2.1 Puncturing scheme of the RCPC code of rate 8/12 (equal to 2/3)
5.5.2.2 Puncturing scheme of the RCPC code of rate 8/18
5.5.3 Matrix Interleaving
5.6 Error control scheme for speech traffic channel with frame stealing activated
5.6.1 CRC code
5.6.2 RCPC codes
5.6.2.1 Puncturing scheme of the RCPC code of rate 8/17
5.6.3 Interleaving
6 Channel decoding for speech
6.1 General
6.2 Error control structure
7 Codec performance
8 Bit exact description of the TETRA codec
Annex A (informative): Implementation of speech channel decoding
A.1 Algorithmic description of speech channel decoding
A.1.1 Definition of error control codes
A.1.1.1 16-state RCPC codes
A.1.1.1.1 Obtaining the mother code from punctured code
A.1.1.1.2 Viterbi decoding of the 16-state mother code of the rate 1/3
A.1.1.2 CRC codes
A.1.1.3 Type-4 bits
A.1.2 Error control scheme for normal speech traffic channel
A.1.2.1 Matrix Interleaving
A.1.2.2 RCPC codes
A.1.2.2.1 Puncturing scheme of the RCPC code of rate 8/12 (equal to 2/3)
A.1.2.2.2 Puncturing scheme of the RCPC code of rate 8/18
A.1.2.3 CRC code
A.1.2.4 Speech parameters
A.1.3 Error control scheme for speech traffic channel with frame stealing activated
A.1.3.1 Interleaving
A.1.3.2 RCPC codes
A.1.3.2.1 Puncturing scheme of the RCPC code of rate 8/17
A.1.3.3 CRC code
A.1.3.4 Speech parameters
A.2 C Code for speech channel decoding
Annex B (informative): Indexes
B.1 Index of C code routines
B.2 Index of files
Annex C (informative): Bibliography
Annex D (informative): Codec performance
D.1 General
D.2 Quality
D.2.1 Subjective speech quality
D.2.1.1 Description of characterization tests
D.2.1.2 Absolute speech quality
D.2.1.3 Effect of input level
D.2.1.4 Effect of input frequency characteristic
D.2.1.5 Effect of transmission errors
D.2.1.6 Effect of tandeming
D.2.1.7 Effect of acoustic background noise
D.2.1.8 Effect of vocal effort
D.2.1.9 Effect of frame stealing
D.2.1.10 Speaker and language dependency
D.2.2 Comparison with analogue FM
D.2.2.1 Analogue and digital systems results
D.2.2.2 All conditions
D.2.2.3 Input level
D.2.2.4 Error patterns
D.2.2.5 Background noise
D.2.3 Additional tests
D.2.3.1 Types of signals
D.2.3.2 Codec behaviour
D.3 Performance of the channel coding/decoding for speech
D.3.1 Classes of simulation environment conditions
D.3.2 Classes of equipment
D.3.3 Classes of bits
D.3.4 Channel conditions
D.3.5 Results for normal case
D.4 Complexity
D.4.1 Complexity analysis
D.4.1.1 Measurement methodology
D.4.1.2 TETRA basic operators
D.4.1.3 Worst case path for speech encoder
D.4.1.4 Worst case path for speech decoder
D.4.1.5 Condensed complexity values for encoder and decoder
D.4.2 DSP independence
D.4.2.1 Program control structure
D.4.2.2 Basic operator implementation
D.4.2.3 Additional operator implementation
D.5 Delay
Annex E (informative): Results of the TETRA codec characterization listening and complexity tests
E.1 Characterization listening test
E.1.1 Experimental conditions
E.1.2 Tables of results
E.2 TETRA codec complexity study
E.2.1 Computational analysis results
E.2.1.1 TETRA speech encoder
E.2.1.2 TETRA speech decoder
E.2.1.3 TETRA channel encoder and decoder
E.2.2 Memory requirements analysis results
E.2.2.1 TETRA speech encoder
E.2.2.2 TETRA speech decoder
E.2.2.3 TETRA speech channel encoder
E.2.2.4 TETRA speech channel decoder
Annex F (informative): Description of attached computer files
F.1 Directory C-WORD
F.2 Directory C-CODE
History
Foreword
This European Telecommunication Standard (ETS) has been produced by the Terrestrial Trunked Radio
(TETRA) Project of the European Telecommunications Standards Institute (ETSI).
The sole purpose of the copyright statement below is to protect the documentation of the standard itself
and not the technology which is described therein.
ETSI states that the technology described herein is in the sole ownership of THOMSON-CSF (subject to
the right of its associated partner) and the disclosure of the standard documentation cannot be construed
as granting any right whatsoever by licence or otherwise on the said technology.
THOMSON-CSF has undertaken to grant licences for the technology described in the standard, on fair,
reasonable and non-discriminatory terms and conditions to the users of the present standard in
accordance with the ETSI IPR Policy. Such licences are subject to a licence agreement to be agreed upon
and entered into with THOMSON-CSF.
This ETS consists of four parts as follows:
Part 1: "General description of speech functions";
Part 2: "TETRA codec";
Part 3: "Specific operating features";
Part 4: "Codec conformance testing".
Clause 4 provides a complete description of the full rate speech source encoder and decoder, whilst
clause 5 describes the speech channel encoder, and clause 6 the speech channel decoder.
Clause 7 describes the codec performance.
Finally, clause 8 introduces the bit exact description of the codec. This description is given as
bit exact, fixed point ANSI C code. The whole C code corresponding to the TETRA codec is given in
computer files attached to this ETS, and is an integral part of this ETS.
In addition to these clauses, six informative annexes are provided.
Annex A describes a possible implementation of the speech channel decoding function.
Annex B provides comprehensive indexes of all the routines and files included in the C code associated
with this ETS.
Annex C lists informative references relevant to the speech codec.
Annex D describes the actual quality, performance and complexity aspects of the codec.
Annex E reports detailed results from codec characterization listening and complexity tests.
Annex F contains instructions for the use of the attached electronic files.
Transposition dates
Date of adoption of this ETS: 23 January 1998
Date of latest announcement of this ETS (doa): 31 May 1998
Date of latest publication of new National Standard or endorsement of this ETS (dop/e): 30 November 1998
Date of withdrawal of any conflicting National Standard (dow): 30 November 1998
1 Scope
This European Telecommunication Standard (ETS) contains the full specification of the speech codec for
use in the Terrestrial Trunked Radio (TETRA) system.
2 Normative references
This ETS incorporates by dated and undated reference, provisions from other publications.
These normative references are cited at the appropriate places in the text and the publications are listed
hereafter. For dated references, subsequent amendments to or revisions of any of these publications
apply to this ETS only when incorporated in it by amendment or revision. For undated references the latest
edition of the publication referred to applies.
[1] ETS 300 392-2: "Radio Equipment and Systems (RES); Trans-European
Trunked Radio (TETRA) system; Voice plus Data; Part 2: Air Interface".
[2] CCITT Recommendation P.48 (1988): "Specifications for an Intermediate
Reference System".
3 Abbreviations
For the purposes of this ETS, the following abbreviations apply:
ACELP Algebraic CELP
ANSI American National Standards Institute
BER Bit Error Ratio
BFI Bad Frame Indicator
BS Base Station
CELP Code-Excited Linear Predictive
CRC Cyclic Redundancy Code
DSP Digital Signal Processor
DTMF Dual Tone Multiple Frequency
EQ EQualizer test
EP Error Pattern
FIR Finite Impulse Response
HT Hilly Terrain
IRS Intermediate Reference System
LP Linear Prediction
LPC Linear Predictive Coding
LSF Line Spectral Frequency
LSP Line Spectral Pair
MER Message Error Rate
MNRU Multiplicative Noise Reference Unit
MOS Mean Opinion Score
MS Mobile Station
MSE Mean Square Error
PDF Probability Density Function
PUEM Probability of Undetected Erroneous Message
RCPC Rate-Compatible Punctured Convolutional
RF Radio Frequency
TDM Time Division Multiplex
TU Typical Urban
VQ Vector Quantization
4 Full rate codec
4.1 Structure of the codec
The TETRA speech codec is based on the Code-Excited Linear Predictive (CELP) coding model. In this
model, a block of N speech samples is synthesized by filtering an appropriate innovation sequence from a
codebook, scaled by a gain factor $g_c$, through two time varying filters. A simplified high level block
diagram of this synthesis process, as implemented in the TETRA codec, is shown in figure 1.
[Figure 1 (block diagram, not reproduced). Block labels: demultiplex (digital input); algebraic codebook index; pitch delay; gains; gain prediction and VQ; LPC info; adaptive codebook (past excitation, T, $g_p$); algebraic codebook (k, $g_c$); long-term synthesis filter; short-term synthesis filter; output speech.]
Figure 1: High level block diagram of the TETRA speech synthesizer
The first filter is a long-term prediction filter (pitch filter) aiming at modelling the pseudo-periodicity in the
speech signal and the second is a short-term prediction filter modelling the speech spectral envelope.
The long-term, or pitch, synthesis filter is given by:

$$B(z) = \frac{1}{1 - g_p z^{-T}} \qquad (1)$$

where $T$ is the pitch delay and $g_p$ is the pitch gain. The pitch synthesis filter is implemented as an adaptive
codebook, where for delays less than the sub-frame length the past excitation is repeated.
The short-term synthesis filter is given by:

$$H(z) = \frac{1}{A(z)} = \frac{1}{1 + \sum_{i=1}^{p} a_i z^{-i}} \qquad (2)$$

where $a_i$, $i = 1, \ldots, p$, are the Linear Prediction (LP) parameters and $p$ is the predictor order. In the
TETRA codec $p$ shall be 10.
The TETRA encoder uses an analysis-by-synthesis technique to determine the pitch and excitation
codebook parameters. The simplified block diagram of the TETRA encoder is shown in figure 2.
[Figure 2 (block diagram, not reproduced). Block labels: input speech; LPC analysis; quantization and interpolation (unquantized LPC info, LPC info); open loop pitch analysis (T); perceptual weighting; adaptive codebook (past excitation, T, $g_p$); algebraic codebook (k, $g_c$); short-term synthesis filter; perceptual weighting; MSE search; gain VQ (gains); multiplex of pitch delay (T), codebook index (k), gains and LPC info; digital output.]
Figure 2: High level block diagram of the TETRA speech encoder
In this analysis-by-synthesis technique, the synthetic speech is computed for all candidate innovation
sequences, retaining the particular sequence that produces the output closest to the original signal
according to a perceptually weighted distortion measure. The perceptual weighting filter de-emphasizes
the error at the formant regions of the speech spectrum and is given by:

$$W(z) = \frac{A(z)}{A(z/\gamma)} \qquad (3)$$

where $A(z)$ is the LP inverse filter (as in equation (2)) and $0 < \gamma \le 1$. The value $\gamma$ = 0,85 shall be used.
Both the weighting filter, $W(z)$, and formant synthesis filter, $H(z)$, shall use the quantized LP
parameters.
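As an illustration of the notation $A(z/\gamma)$ used in the weighting filter: substituting $z/\gamma$ for $z$ in $A(z) = 1 + \sum a_i z^{-i}$ multiplies each coefficient $a_i$ by $\gamma^i$. The sketch below is not taken from the C code attached to this ETS; the function name `weight_lp` and the floating-point arithmetic are assumptions made for illustration only.

```python
def weight_lp(a, gamma):
    """Return the coefficients of A(z/gamma) given those of A(z).

    `a` is [1, a_1, ..., a_p]; coefficient a_i is scaled by gamma**i.
    Hypothetical helper, floating point rather than the fixed-point
    arithmetic of the bit exact description.
    """
    return [coeff * gamma ** i for i, coeff in enumerate(a)]


# W(z) = A(z) / A(z/gamma): numerator uses `a`, denominator uses weight_lp(a, 0.85).
```

With this helper, the numerator of $W(z)$ uses the coefficients of $A(z)$ unchanged and the denominator uses `weight_lp(a, 0.85)`.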
In the Algebraic CELP (ACELP) technique, special innovation codebooks having an algebraic structure
are used. This algebraic structure has several advantages in terms of storage, search complexity, and
robustness. The TETRA codec shall use a specific dynamic algebraic excitation codebook whereby the
fixed excitation vectors are shaped by a dynamic shaping matrix (see annex C {1}). The shaping matrix is
a function of the LP model $A(z)$, and its main role is to shape the excitation vectors in the frequency
domain so that their energies are concentrated in the important frequency bands. The shaping matrix
used is a Toeplitz lower triangular matrix constructed from the impulse response of the filter:

$$F(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)} \qquad (4)$$

where $A(z)$ is the LP inverse filter. The values $\gamma_1$ = 0,75 and $\gamma_2$ = 0,85 shall be used.
In the TETRA codec, 30 ms speech frames shall be used. It is required that the short-term prediction
parameters (or LP parameters) are computed and transmitted every speech frame. The speech frame
shall be divided into 4 sub-frames of 7,5 ms (60 samples). The pitch and algebraic codebook parameters
have also to be transmitted every sub-frame.
Table 1 gives the bit allocation for the TETRA codec. 137 bits shall be produced for each frame of 30 ms
resulting in a bit rate of 4 567 bit/s.
Table 1: Bit allocation for the TETRA codec

Parameter        1st subframe   2nd subframe   3rd subframe   4th subframe   Total per frame
LP filter                                                                    26
Pitch delay      8              5              5              5              23
Algebraic code   16             16             16             16             64
VQ of 2 gains    6              6              6              6              24
Total                                                                        137
More details about the sequence of bits within the speech frame of 137 bits per 30 ms, with reference to
the speech parameters, can be found in subclause 4.2.2.7, table 3.
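
The arithmetic behind the 137 bits per frame and the resulting bit rate can be checked with a short sketch. The values are those of table 1; the dictionary and function names are illustrative assumptions, not identifiers from the C code attached to this ETS.

```python
# Bits per 30 ms frame, as listed in table 1.
BITS_PER_FRAME = {
    "LP filter": 26,
    "Pitch delay": 8 + 5 + 5 + 5,   # 8 bits in the 1st sub-frame, 5 in the others
    "Algebraic code": 4 * 16,
    "VQ of 2 gains": 4 * 6,
}
FRAME_DURATION_S = 0.030


def frame_bits():
    """Total number of bits produced per speech frame."""
    return sum(BITS_PER_FRAME.values())


def bit_rate():
    """Bit rate in bit/s implied by the frame size and frame duration."""
    return frame_bits() / FRAME_DURATION_S
```

The quotient 137 / 0,030 is about 4 566,7 bit/s, which rounds to the 4 567 bit/s quoted above.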
4.2 Functional description of the codec
4.2.1 Pre- and post-processing
Before starting the encoding process, the speech signal shall be pre-processed using the offset
compensation filter:

$$H_p(z) = \frac{1}{2} \left( \frac{1 - z^{-1}}{1 - \alpha z^{-1}} \right) \qquad (5)$$

where $\alpha$ = 32 735/32 768. In the time domain, this filter corresponds to:

$$s'(n) = s(n)/2 - s(n-1)/2 + \alpha s'(n-1) \qquad (6)$$

where $s(n)$ is the input signal and $s'(n)$ is the pre-processed signal. The purpose of this pre-processing
is firstly to remove the dc from the signal (offset compensation), and secondly, to scale down the input
signal in order to avoid saturation of the synthesis filtering.
At the decoder, the post-processing consists of scaling up the reconstructed signal (multiplication by 2
with saturation control).
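
A minimal floating-point sketch of the pre- and post-processing described above follows. It is not the `Pre_Process` routine of the attached C code: the function names, the zero initial filter state and the 16-bit saturation limits are assumptions made for illustration.

```python
ALPHA = 32735 / 32768


def pre_process(samples, alpha=ALPHA):
    """Offset compensation and division by 2:
    s'(n) = s(n)/2 - s(n-1)/2 + alpha * s'(n-1), with zero initial state (assumed)."""
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = x / 2 - prev_in / 2 + alpha * prev_out
        out.append(y)
        prev_in = x
        prev_out = y
    return out


def post_process(samples, lo=-32768, hi=32767):
    """Multiplication by 2 with saturation; the 16-bit limits are an assumption."""
    return [max(lo, min(hi, 2 * x)) for x in samples]
```

Feeding a constant (dc) input through `pre_process` gives an output that decays towards zero, which is the offset compensation effect described in the text.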
4.2.2 Encoder
Figure 3 presents a detailed block diagram of the TETRA encoder illustrating the major parts of the codec
as well as signal flow. On this figure, names appearing at the bottom of the various building blocks
correspond to the C code routines associated with this ETS.
[Figure 3 (block diagram, not reproduced). Recoverable stage labels and associated C routine names:
Pre-processing (per frame): offset compensation and division by 2 of the input speech s(n), giving s'(n) (Pre_Process).
LPC analysis (per frame): windowing and autocorrelation, Levinson-Durbin, A(z) to LSP, LSP quantization, interpolation for the 4 subframes (Autocorr, Lag_Window, Levin_32, Az_Lsp, Clsp_334, Int_Lpc4, Lsp_Az); output: LSP index.
Open-loop pitch search (per frame): interpolate LSP to A(z) for the 4 subframes, compute weighted speech (4 subframes), find open-loop pitch (Int_Lpc4, Lsp_Az, Pond_Ai, Residu, Syn_Filt, Pitch_Ol_Dec).
Adaptive codebook search (per subframe): compute target x(n) for adaptive codebook, find best delay and gain, compute adaptive codebook contribution (Syn_Filt, Pitch_Fr, Pred_Lt, G_Pitch); output: pitch index.
Innovative codebook search (per subframe): compute target xn2(n) for innovation, find best innovation and gain (D4i60_16, G_Code); output: code index.
Gains quantization in energy domain (Ener_Qua); output: gains index.
Compute error (per subframe): compute excitation, update filter memories for next subframe (Syn_Filt).]
Figure 3: Signal flow at the encoder
4.2.2.1 Short-term prediction
Short-term prediction (LP or LPC analysis) shall be performed every 30 ms. The auto-correlation
approach shall be used with an asymmetric analysis window. The LP analysis window consists of two
halves of Hamming windows with different lengths. This window is given by:

$$w(n) = \begin{cases} 0,54 - 0,46 \cos\left(\dfrac{\pi n}{L_1 - 1}\right), & n = 0, \ldots, L_1 - 1 \\ 0,54 + 0,46 \cos\left(\dfrac{\pi (n - L_1)}{L_2 - 1}\right), & n = L_1, \ldots, L_1 + L_2 - 1 \end{cases} \qquad (7)$$

A 32 ms analysis window (corresponding to 256 samples with the sampling frequency of 8 kHz) shall be
used with values $L_1$ = 216 and $L_2$ = 40. The window shall be positioned such that 40 samples are taken
from the future frame (look-ahead of 40 samples).
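
The asymmetric window can be sketched as two half Hamming windows joined at sample 216: a rising half of length 216 followed by a falling half of length 40. This is a floating-point illustration only; the function name `lp_window` is an assumption and the attached C code uses fixed-point tables instead.

```python
import math


def lp_window(l1=216, l2=40):
    """Asymmetric LP analysis window: rising half Hamming of length l1
    followed by a falling half Hamming of length l2."""
    rising = [0.54 - 0.46 * math.cos(math.pi * n / (l1 - 1)) for n in range(l1)]
    falling = [0.54 + 0.46 * math.cos(math.pi * (n - l1) / (l2 - 1))
               for n in range(l1, l1 + l2)]
    return rising + falling
```

The window starts and ends at 0,08 and reaches 1 at samples 215 and 216, so the two halves join without a discontinuity.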
The auto-correlation of the windowed speech $s'(n)$, $n = 0, \ldots, 255$, is computed by:

$$r(k) = \sum_{n=k}^{255} s'(n)\, s'(n-k), \quad k = 0, \ldots, 10 \qquad (8)$$

and a 60 Hz bandwidth expansion has to be used by lag windowing the auto-correlation using the window
(see annex C {2}):

$$w_{lag}(i) = \exp\left[ -\frac{1}{2} \left( \frac{2\pi f_0 i}{f_s} \right)^2 \right], \quad i = 1, \ldots, 10 \qquad (9)$$

where $f_0$ = 60 Hz is the bandwidth expansion and $f_s$ = 8 000 Hz is the sampling frequency. Further, $r(0)$
is multiplied by 1,00005 which is equivalent to adding a noise floor at -43 dB. In the TETRA coder, this is
alternatively performed by dividing the lag window as in equation (9) by 1,00005, resulting in $w'_{lag}(0) = 1$
and:

$$w'_{lag}(i) = w_{lag}(i) / 1,00005, \quad i = 1, \ldots, 10 \qquad (10)$$

The modified auto-correlations:

$$r'(k) = r(k)\, w'_{lag}(k), \quad k = 0, \ldots, 10 \qquad (11)$$

are used to obtain the LP filter coefficients $a_k$, $k = 1, \ldots, 10$, by solving the set of equations:

$$\sum_{k=1}^{10} a_k\, r'(|i-k|) = -r'(i), \quad i = 1, \ldots, 10 \qquad (12)$$

The set of equations in (12) shall be solved using the Levinson-Durbin algorithm (see annex C {3}).
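
The chain from auto-correlation through lag windowing to the Levinson-Durbin recursion can be sketched in floating point as follows. This is not the `Autocorr`, `Lag_Window` or `Levin_32` code attached to this ETS; the function names and signatures are assumptions, and the sign convention follows $A(z) = 1 + \sum a_i z^{-i}$ from equation (2).

```python
import math


def autocorr(x, order=10):
    """r(k) = sum over n >= k of x(n) * x(n-k), for k = 0..order."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def lag_window(order=10, f0=60.0, fs=8000.0, noise=1.00005):
    """Lag window with w'(0) = 1 and w'(i) = exp(-0.5*(2*pi*f0*i/fs)**2) / noise."""
    window = [1.0]
    for i in range(1, order + 1):
        window.append(math.exp(-0.5 * (2 * math.pi * f0 * i / fs) ** 2) / noise)
    return window


def levinson_durbin(r, order=10):
    """Solve sum_k a_k r(|i-k|) = -r(i); returns ([1, a_1..a_order], residual energy)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err
```

A typical use would be `r = autocorr(windowed)`, then `r_mod = [rk * wk for rk, wk in zip(r, lag_window())]`, then `a, err = levinson_durbin(r_mod)`.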
4.2.2.2 LP to LSP and LSP to LP conversion
The LP filter coefficients of $A(z)$ ($a_k$, $k = 1, \ldots, 10$) shall be converted to the Line Spectral Pair (LSP)
representation (see annex C {4}) for quantization and interpolation purposes. For a 10th order LP filter, the
LSPs are defined as the roots of the sum and difference polynomials:

$$F_1'(z) = A(z) + z^{-11} A(z^{-1}) \qquad (13)$$

and

$$F_2'(z) = A(z) - z^{-11} A(z^{-1}) \qquad (14)$$

respectively. It can be proven that all roots of these polynomials are on the unit circle and they alternate
each other (see annex C {5}). $F_1'(z)$ has a root $z = -1$ ($\omega = \pi$) and $F_2'(z)$ has a root $z = 1$ ($\omega = 0$).
To eliminate these two roots, new polynomials are defined:

$$F_1(z) = F_1'(z) / (1 + z^{-1}) \qquad (15)$$

and

$$F_2(z) = F_2'(z) / (1 - z^{-1}) \qquad (16)$$

Each polynomial has 5 conjugate roots on the unit circle ($e^{\pm j\omega_i}$), therefore, the polynomials can be
written as:

$$F_1(z) = \prod_{i=1,3,\ldots,9} (1 - 2 q_i z^{-1} + z^{-2}) \qquad (17)$$

and

$$F_2(z) = \prod_{i=2,4,\ldots,10} (1 - 2 q_i z^{-1} + z^{-2}) \qquad (18)$$

where $q_i = \cos(\omega_i)$, with $\omega_i$ being the Line Spectral Frequencies (LSFs). They satisfy the ordering
property $0 < \omega_1 < \omega_2 < \ldots < \omega_{10} < \pi$.
The first five coefficients of each of the symmetric polynomials $F_1(z)$ and $F_2(z)$ are found by the
recursive relations (for i = 0 to 4):

$$f_1(i+1) = a_{i+1} + a_{p-i} - f_1(i)$$
$$f_2(i+1) = a_{i+1} - a_{p-i} + f_2(i) \qquad (19)$$
The LSPs are found by evaluating the polynomials $F_1(z)$ and $F_2(z)$ at 60 points equally spaced between
0 and $\pi$ and checking for sign changes. A sign change signifies the existence of a root and the sign
change interval is then divided 4 times to better track the root. The Chebyshev polynomials have to be
used to evaluate $F_1(z)$ and $F_2(z)$ (see annex C {6}). This method is very computationally efficient since
it bypasses the cosine computations as the roots are found directly in the cosine domain $\{q_i\}$. In the
TETRA codec implementation, quantization and interpolation of the LSPs are performed in the cosine
domain, thus no trigonometric computations are needed to convert to the frequency domain.
The polynomials $F_1(z)$ or $F_2(z)$ are given by:

$$F(z) = 2 e^{-j5\omega} \left( T_5(x) + f(1) T_4(x) + f(2) T_3(x) + f(3) T_2(x) + f(4) T_1(x) + f(5)/2 \right) \qquad (20)$$

where $T_m(x) = \cos(m\omega)$ is the mth order Chebyshev polynomial, and $f(i)$, $i = 1, \ldots, 5$, are the
coefficients of either $F_1(z)$ or $F_2(z)$, computed using the equations in (19). The details of the Chebyshev
polynomial evaluation method are given in annex C {6}. If this numerical process is not able to find
enough roots, the previous computed set of LSPs is used.
Once the LSPs are quantized and interpolated, they are converted back to the LP coefficient domain
$\{A(z)\}$. The conversion to the LP domain is done as follows. The coefficients of $F_1(z)$ and $F_2(z)$ are
found by expanding equations (17) and (18) knowing the quantized and interpolated LSPs $q_i$, $i = 1, \ldots, 10$.
The following recursive relation shall be used to compute $f_1(i)$:

for i = 1 to 5
    $f_1(i) = -2 q_{2i-1} f_1(i-1) + 2 f_1(i-2)$
    for j = i - 1 down to 1
        $f_1(j) = f_1(j) - 2 q_{2i-1} f_1(j-1) + f_1(j-2)$

with initial values $f_1(0) = 1$ and $f_1(-1) = 0$. The coefficients $f_2(i)$ are computed similarly by replacing
$q_{2i-1}$ by $q_{2i}$. Once the coefficients $f_1(i)$ and $f_2(i)$ are found, $F_1(z)$ and $F_2(z)$ are multiplied by
$1 + z^{-1}$ and $1 - z^{-1}$, respectively, to obtain $F_1'(z)$ and $F_2'(z)$; that is $f_1'(i) = f_1(i) + f_1(i-1)$ and
$f_2'(i) = f_2(i) - f_2(i-1)$, $i = 1, \ldots, 5$. Finally the LP coefficients are found by
$a_i = 0,5 f_1'(i) + 0,5 f_2'(i)$, $i = 1, \ldots, 5$ and $a_i = 0,5 f_1'(11-i) - 0,5 f_2'(11-i)$, $i = 6, \ldots, 10$. This is directly
derived from the relation $A(z) = (F_1'(z) + F_2'(z)) / 2$, and considering the fact that $F_1'(z)$ and $F_2'(z)$ are
symmetrical and anti-symmetrical polynomials, respectively.
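
The LSP to LP conversion can be sketched in floating point as follows. This is an illustration, not the `Lsp_Az` routine of the attached C code: all function names are assumptions, and the upper five coefficients are obtained from the symmetry of $F_1'(z)$ and the anti-symmetry of $F_2'(z)$, that is $a_{11-i} = 0,5 f_1'(i) - 0,5 f_2'(i)$. The helper `sum_polynomial` evaluates $A(z) + z^{-11} A(z^{-1})$ on the unit circle so that the result can be checked against equation (13).

```python
import cmath
import math


def expand_lsp_product(qs):
    """Coefficients f(0)..f(5) of the product of (1 - 2*q*z^-1 + z^-2)
    over the five cosine-domain LSPs in `qs`."""
    f = [1.0] + [0.0] * 5
    for i in range(1, 6):
        q = qs[i - 1]
        f_im2 = f[i - 2] if i >= 2 else 0.0        # f(-1) = 0
        f[i] = -2.0 * q * f[i - 1] + 2.0 * f_im2
        for j in range(i - 1, 0, -1):
            f_jm2 = f[j - 2] if j >= 2 else 0.0
            f[j] = f[j] - 2.0 * q * f[j - 1] + f_jm2
    return f


def lsp_to_lp(q):
    """Convert 10 cosine-domain LSPs q_1..q_10 into [1, a_1, ..., a_10]."""
    f1 = expand_lsp_product(q[0::2])   # q_1, q_3, ..., q_9
    f2 = expand_lsp_product(q[1::2])   # q_2, q_4, ..., q_10
    a = [1.0] + [0.0] * 10
    for i in range(1, 6):
        f1p = f1[i] + f1[i - 1]        # multiply F1(z) by (1 + z^-1)
        f2p = f2[i] - f2[i - 1]        # multiply F2(z) by (1 - z^-1)
        a[i] = 0.5 * f1p + 0.5 * f2p
        a[11 - i] = 0.5 * f1p - 0.5 * f2p
    return a


def sum_polynomial(a, omega):
    """Evaluate A(z) + z^-11 * A(1/z) at z = exp(j*omega)."""
    z = cmath.exp(1j * omega)
    a_z = sum(c * z ** (-k) for k, c in enumerate(a))
    a_inv = sum(c * z ** k for k, c in enumerate(a))
    return a_z + z ** (-11) * a_inv
```

For any ordered set of line spectral frequencies, the sum polynomial built from the returned coefficients vanishes at the odd-indexed frequencies, as expected from equations (13), (15) and (17).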
4.2.2.3 Quantization and interpolation of LP parameters
The computed LP parameters have to be converted to LSPs and quantized with 26 bits using split-VQ.
NOTE: Both the quantization and interpolation are performed on the LSPs in the cosine
domain; that is:

$$q_i = \cos(2\pi f_i / f_s), \quad i = 1, \ldots, 10 \qquad (21)$$

where $f_i$ are the line spectral frequencies in Hz and $f_s$ is the sampling frequency.
The LSP vector $q$ shall be split into three sub-vectors of length 3, 3, and 4. The first sub-vector
$\{q_1, q_2, q_3\}$ shall be quantized with 8 bits while the sub-vectors $\{q_4, q_5, q_6\}$ and $\{q_7, q_8, q_9, q_{10}\}$ shall
each be quantized with 9 bits. The search is performed using Mean Square Error (MSE) minimization in
the $q$ domain with no LSP weighting.
The quantized LP parameters are used for the fourth sub-frame, whereas the first three sub-frames use a
linear interpolation of the parameters of the present and previous frames. The interpolation is performed
on the LSPs in the $q$ domain. Let $\hat{q}_n$ be the quantized LSP vector at the present frame and $\hat{q}_{n-1}$ the
quantized LSP vector at the past frame. The interpolated LSP vectors at each of the 4 sub-frames are
given by:

$$q_1 = 0,75\, \hat{q}_{n-1} + 0,25\, \hat{q}_n$$
$$q_2 = 0,50\, \hat{q}_{n-1} + 0,50\, \hat{q}_n$$
$$q_3 = 0,25\, \hat{q}_{n-1} + 0,75\, \hat{q}_n$$
$$q_4 = \hat{q}_n \qquad (22)$$

The initial values of the past quantized LSP vector are given in Q15 by $\hat{q}_{-1}$ = {30 000, 26 000, 21 000,
15 000, 8 000, 0, -8 000, -15 000, -21 000, -26 000}. (Divide by $2^{15}$ to obtain the values in the
range [-1,1]). The interpolated LSP vectors shall be used to compute a different LP filter at each
sub-frame.
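
The sub-frame interpolation can be sketched as a weighted average of the past and present quantized LSP vectors, with weights 0,75/0,25, 0,50/0,50, 0,25/0,75 and 0/1 for sub-frames 1 to 4. The names below are illustrative assumptions rather than identifiers from the attached C code, and the arithmetic is floating point rather than Q15.

```python
# (weight of past frame, weight of present frame) for sub-frames 1 to 4.
SUBFRAME_WEIGHTS = [(0.75, 0.25), (0.50, 0.50), (0.25, 0.75), (0.00, 1.00)]

# Initial past LSP vector, converted from Q15 by dividing by 2**15.
Q_INIT = [v / 32768 for v in
          (30000, 26000, 21000, 15000, 8000, 0, -8000, -15000, -21000, -26000)]


def interpolate_lsp(q_past, q_present):
    """Return the four interpolated LSP vectors, one per sub-frame."""
    return [[wp * p + wc * c for p, c in zip(q_past, q_present)]
            for wp, wc in SUBFRAME_WEIGHTS]
```

For the first frame, `interpolate_lsp(Q_INIT, q_hat)` would be called with the first quantized vector `q_hat`; each of the four returned vectors is then converted to an LP filter for its sub-frame.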
4.2.2.4 Long-term prediction analysis
The aim of the long term prediction analysis or adaptive codebook search is to find the best pitch
parameters, which are the delay and gain values for the pitch filter. The pitch filter shall be implemented
using the so-called adaptive codebook approach whereby the excitation is repeated for delays less than
the sub-frame length (60). In this implementation the excitation is extended by the LP residual in the
search stage to simplify the closed-loop search. In the first sub-frame, a fractional pitch delay is used with
resolutions: 1/3 in the range $[19\tfrac{1}{3}, 84\tfrac{2}{3}]$ and integers only in the range [85 - 143]. For the other
sub-frames, a pitch resolution of 1/3 is always used in the range $[T_1 - 5\tfrac{2}{3}, T_1 + 4\tfrac{2}{3}]$, where $T_1$ is the
nearest integer to the fractional pitch lag of the first sub-frame.
To simplify the pitch analysis procedure, a two stage approach shall be used, comprising first an open
loop pitch search followed by a closed loop search.
The open loop pitch has to be computed once every speech frame (30 ms) using a weighted speech
signal $s_w(n)$. A pole-zero type weighting procedure shall be used to get $s_w(n)$. This procedure shall be
performed with the help of a shaping filter $A(z/0,95) / A(z/0,60)$ for which the un-quantized LP
parameters shall be used.
The open loop pitch search shall then be performed as follows. In a first step, 3 maxima of the correlation:

$$C_k = \sum_{j=0} s_w(2j)\, s_w(2j - k) \qquad (23)$$

are found in the three ranges, [20 - 39], [40 - 79] and [80 - 142], respectively. The retained maxima
$C_{k_i}$, $i = 1, \ldots, 3$, are normalized by dividing by $\sqrt{\sum_n s_w^2(n - k_i)}$, $i = 1, \ldots, 3$, respectively. The normalized
maxima and corresponding delays are denoted by $(R_i, k_i)$, $i = 1, \ldots, 3$. The winner among the three
normalized correlations is selected by favouring the delays in the lower ranges. That is, $k_i$ is selected if
$R_i > 0,85\, R_{i+1}$. This procedure of dividing the delay range into 3 sections and favouring the lower
sections is used to avoid choosing pitch multiples.
NOTE 1: The past weighted speech samples are initialized to zero.
Having found the open-loop pitch $T_{op}$, a closed-loop pitch analysis has to be performed around the
open-loop pitch delay on a sub-frame basis. In the first sub-frame the range $T_{op} \pm 2$ bounded
by [20 - 143] is searched. For the other sub-frames, closed-loop pitch analysis is performed around the
pitch selected in the first sub-frame. As mentioned earlier, a pitch resolution of 1/3 is always used for the
other sub-frames in the range $[T_1 - 5\tfrac{2}{3}, T_1 + 4\tfrac{2}{3}]$, where $T_1$ is the integer part of the first sub-frame
pitch lag. The pitch delay shall be encoded with 8 bits in the first sub-frame while the relative delays of the
other sub-frames shall be encoded with 5 bits per sub-frame.
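
The 8-bit and 5-bit allocations can be checked by enumerating the candidate delays. The sketch below assumes the first sub-frame uses steps of 1/3 from 19 1/3 to 84 2/3 plus the integers 85 to 143, and that the other sub-frames use steps of 1/3 from $T_1$ - 5 2/3 to $T_1$ + 4 2/3; the function names are illustrative and the actual index mapping of the attached C code is not reproduced here.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)


def first_subframe_delays():
    """Candidate delays for the first sub-frame (fractional, then integer only)."""
    delays = []
    delay = Fraction(19) + THIRD
    while delay <= Fraction(84) + 2 * THIRD:
        delays.append(delay)
        delay += THIRD
    delays.extend(Fraction(t) for t in range(85, 144))
    return delays


def relative_delays(t1):
    """Candidate delays for sub-frames 2 to 4 around the integer delay t1."""
    delays = []
    delay = Fraction(t1) - 5 - 2 * THIRD
    while delay <= Fraction(t1) + 4 + 2 * THIRD:
        delays.append(delay)
        delay += THIRD
    return delays
```

Under these assumptions the first sub-frame has 197 fractional plus 59 integer candidates, 256 in total, which fits 8 bits, and the other sub-frames have 32 candidates each, which fits 5 bits.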
The closed loop pitch search shall be performed by minimizing the mean-square weighted error between
the original and synthesized speech. This is achieved by maximizing the term:

$$\tau_k = \frac{\sum_{n=0}^{59} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{59} y_k(n)\, y_k(n)}} \qquad (24)$$
where $x(n)$ is the target for the adaptive codebook search given by the weighted input speech after
subtracting the zero-input response of the weighted synthesis filter $H(z)W(z)$ and $y_k(n)$ is the past
filtered excitation at delay $k$ (the past excitation is initialized to zero).
NOTE 2: The search range is limited around the open-loop pitch as explained earlier.
For delays $k < 60$ the excitation signal $u(n)$ is extended by the LP residual signal. Once the optimum
integer pitch delay is determined, the fractions $-\tfrac{2}{3}$, $-\tfrac{1}{3}$, $\tfrac{1}{3}$ and $\tfrac{2}{3}$ around that integer are tested.
NOTE 3: For the first sub-frame, the fractions are tested only if the integer pitch lag is less than
85.
The fractional pitch search is performed by interpolating the normalized correlation in equation (24) and
searching for its maximum. Once the non-integer pitch is determined, the adaptive codebook vector $v(n)$
is computed by interpolating the past excitation signal $u(n)$. The interpolation shall be performed using
two FIR filters (Hamming windowed sinc functions); one for interpolating the term in equation (24) with the
sinc truncated at ±12 (8 multiplications per fraction) and the other for interpolating the past excitation with
the sinc truncated at ±48 (32 multiplications per sample). The pitch gain is then found by:

$$g_p = \frac{\sum_{n=0}^{59} x(n)\, y(n)}{\sum_{n=0}^{59} y(n)\, y(n)}, \quad \text{bounded by } 0 \le g_p \le 1,2 \qquad (25)$$

where $y(n) = v(n) * h(n)$ is the filtered adaptive codebook vector (zero-state response of $H(z)W(z)$ to
$v(n)$).
NOTE 4: Only positive pitch gains are allowed since by maximizing the term in equation (24) the
negative correlations are eliminated.
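
The pitch gain computation with its bounds can be sketched as below. This is a floating-point illustration and not the `G_Pitch` routine of the attached C code; the function name and the handling of a zero-energy vector are assumptions.

```python
def pitch_gain(x, y, g_max=1.2):
    """Pitch gain: correlation of target x with filtered adaptive codebook
    vector y, divided by the energy of y, clamped to [0, g_max]."""
    num = sum(xn * yn for xn, yn in zip(x, y))
    den = sum(yn * yn for yn in y)
    if den == 0.0:
        return 0.0          # assumption: no contribution if y has no energy
    return min(max(num / den, 0.0), g_max)
```

The clamp at zero reflects NOTE 4 (only positive gains are allowed) and the clamp at 1,2 reflects the upper bound stated with the gain equation.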
4.2.2.5 Algebraic codebook: structure and search
A 16-bit algebraic codebook shall be used in the innovative codebook search, the aim of which is to find
the best innovation and gain parameters. The innovation vector contains, at most, four non-zero pulses.
The 4 pulses can assume the amplitudes and positions given in the following table:

Table 2

Codebook parameters         Positions of the pulses                                        Codebook bit allocation
Pulse amplitude: +1,4142    0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,     5
                            32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58
Pulse amplitude: -1         2, 10, 18, 26, 34, 42, 50, 58                                  3
Pulse amplitude: +1         4, 12, 20, 28, 36, 44, 52, (60)                                3
Pulse amplitude: -1         6, 14, 22, 30, 38, 46, 54, (62)                                3
Global sign flag                                                                           1
Shift flag                                                                                 1

The pulses shall have fixed amplitudes of +1,4142, -1, +1 and -1, respectively. The first pulse position
shall be encoded with 5 bits while the positions of the other pulses shall be encoded with 3 bits. The
positions of all pulses can be simultaneously shifted by one, to occupy odd positions. One bit shall be
used to encode this shift and a global sign bit shall be used to invert all pulses simultaneously, giving a
total of 16 bits.
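
How a 60-sample innovation vector follows from the entries of table 2 can be sketched as below. The mapping from position indices to positions (index times 2 for the first pulse, and 2, 4 or 6 plus index times 8 for the others) is an assumption that merely reproduces the position lists of the table; it is not the bit mapping of the `D4i60_16` routine. Pulses whose position falls at or beyond sample 60, such as the bracketed positions 60 and 62, are treated as absent.

```python
AMPLITUDES = (1.4142, -1.0, 1.0, -1.0)
SUBFRAME_LEN = 60


def build_codeword(i0, i1, i2, i3, shift=0, sign=0):
    """Build the innovation vector from four position indices and two flags.

    i0 in 0..29 selects the first pulse position (0, 2, ..., 58);
    i1, i2, i3 in 0..7 select positions on the tracks starting at 2, 4 and 6.
    shift=1 moves all pulses by one sample; sign=1 inverts all pulses.
    """
    positions = (2 * i0, 2 + 8 * i1, 4 + 8 * i2, 6 + 8 * i3)
    code = [0.0] * SUBFRAME_LEN
    for amplitude, pos in zip(AMPLITUDES, positions):
        pos += shift
        if pos >= SUBFRAME_LEN:
            continue                    # pulse outside the sub-frame: not present
        code[pos] += -amplitude if sign else amplitude
    return code
```

Pulses that land on the same sample are summed here, since the first track shares positions with the others; whether the search ever selects such a combination is not stated in the text above.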
NOTE 1: From table 2, it is possible to position the last two pulses outside the sub-frame which
indicates that these pulses are not present.
The codebook is searched by minimizing the mean squared error between the weighted input speech and
the weighted synthesis speech. The target signal used in the closed-loop pitch search is updated by
subtracting the adaptive codebook contribution. That is, the target for the innovation is computed using:

$$x_2(n) = x(n) - g_p y(n), \quad n = 0, \ldots, 59 \qquad (26)$$

where $y(n) = v(n) * h(n)$ is the filtered adaptive codebook vector, with $h(n)$ being the impulse response
of the weighted synthesis filter $H(z)W(z) = 1/A(z/\gamma)$.
As described in subclause 4.1 the algebraic codebook is dynamically shaped to enhance the important
frequency regions. The used shaping matrix is a lower triangular convolution matrix consisting of the
impulse response of the filter $F(z)$ in equation (4). Thus the shaping can be performed as a filtering
process. To maintain the simplicity
...









