Digital cellular telecommunications system (Phase 2+) (GSM); Voice Activity Detection (VAD) for Adaptive Multi-Rate (AMR) speech traffic channels; General description (GSM 06.94 version 7.1.1 Release 1998)

Adaptive Multi-Rate (AMR) Voice Activity Detection (VAD)

Digital cellular telecommunications system (Phase 2+) – Voice activity detector (VAD) for adaptive multi-rate (AMR) speech traffic channels – General description (GSM 06.94, version 7.1.1, Release 1998)

General Information

Status
Published
Publication Date
30-Nov-2003
Current Stage
6060 - National Implementation/Publication (Adopted Project)
Start Date
01-Dec-2003
Due Date
01-Dec-2003
Completion Date
01-Dec-2003
Mandate
Standard
SIST EN 301 708 V7.1.1:2003
English language
28 pages

Standards Content (Sample)


2003-01. Slovenski inštitut za standardizacijo. Reproduction of this standard, in whole or in part, is not permitted.

SLOVENSKI STANDARD
SIST EN 301 708 V7.1.1:2003
01-December-2003

Digital cellular telecommunications system (Phase 2+) (GSM); Voice Activity Detection (VAD) for Adaptive Multi-Rate (AMR) speech traffic channels; General description (GSM 06.94 version 7.1.1 Release 1998)

ICS: 33.070.50 Global System for Mobile Communication (GSM)
This Slovenian standard is identical to: EN 301 708 Version 7.1.1
Language: en

ETSI EN 301 708 V7.1.1 (1999-12)
European Standard (Telecommunications series)

Digital cellular telecommunications system (Phase 2+); Voice Activity Detector (VAD) for Adaptive Multi-Rate (AMR) speech traffic channels; General description (GSM 06.94 version 7.1.1 Release 1998)

GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS

Reference: DEN/SMG-110694Q7
Keywords: Digital cellular telecommunications system, Global System for Mobile communications (GSM)

ETSI
Postal address: F-06921 Sophia Antipolis Cedex - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis, Valbonne - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88
Internet: secretariat@etsi.fr
Individual copies of this ETSI deliverable can be downloaded from http://www.etsi.org
If you find errors in the present document, send your comment to: editor@etsi.fr

Important notice
This ETSI deliverable may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat.

Copyright Notification
No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1999. All rights reserved.

Contents

Intellectual Property Rights
Foreword
1 Scope
2 References
3 Technical Description of VAD Option 1
3.1 Definitions, symbols and abbreviations
3.1.1 Definitions
3.1.2 Symbols
3.1.2.1 Variables
3.1.2.2 Constants
3.1.2.3 Functions
3.1.3 Abbreviations
3.2 General
3.3 Functional description
3.3.1 Filter bank and computation of sub-band levels
3.3.2 Pitch detection
3.3.3 Tone detection
3.3.4 Correlated Complex Signal Analysis (and detection)
3.3.5 VAD decision
3.3.5.1 Hangover addition
3.3.5.2 Background noise estimation
4 Technical Description of VAD Option 2
4.1 Definitions, symbols and abbreviations
4.1.1 Definitions
4.1.2 Symbols
4.1.2.1 Variables
4.1.2.2 Constants
4.1.2.3 Functions
4.1.3 Abbreviations
4.2 General
4.3 Functional description
4.3.1 Frequency Domain Conversion
4.3.2 Channel Energy Estimator
4.3.3 Channel SNR Estimator
4.3.4 Voice Metric Calculation
4.3.5 Frame SNR and Long-Term Peak SNR Calculation
4.3.6 Negative SNR Sensitivity Bias
4.3.7 VAD Decision
4.3.8 Spectral Deviation Estimator
4.3.9 Sinewave Detection
4.3.10 Background Noise Update Decision
4.3.11 Background Noise Estimate Update
5 Computational details
Annex A (informative): Document change history
History

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://www.etsi.org/ipr).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This European Standard (Telecommunications series) has been produced by Special Mobile Group (SMG).

The present document specifies two options (Option 1 and Option 2) for the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX) for Adaptive Multi Rate (AMR) speech traffic channels within the digital cellular telecommunications system. Implementors of mobile station and infrastructure equipment conforming to the AMR specifications can choose which of the two VAD options to implement. There are no interoperability factors associated with this choice.

The contents of the present document is subject to continuing work within SMG and may change following formal SMG approval. Should SMG modify the contents of the present document it will be re-released with an identifying change of release date and an increase in version number as follows:

Version 7.x.y

where:
7 indicates Release 1998 of GSM Phase 2+
x the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
y the third digit is incremented when editorial only changes have been incorporated in the specification.

National transposition dates
Date of adoption of this EN: 3 December 1999
Date of latest announcement of this EN (doa): 31 March 2000
Date of latest publication of new National Standard or endorsement of this EN (dop/e): 30 September 2000
Date of withdrawal of any conflicting National Standard (dow): 30 September 2000

1 Scope

The present document specifies two alternatives for the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX) as described in [3]. Implementors of mobile station and infrastructure equipment conforming to the AMR specifications can choose which of the two VAD options to implement. There are no interoperability factors associated with this choice.

The requirements are mandatory on any VAD to be used either in GSM Mobile Stations (MSs) or Base Station Systems (BSSs) that utilize the AMR speech traffic channel.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
• References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
• For a specific reference, subsequent revisions do not apply.
• For a non-specific reference, the latest version applies.
• A non-specific reference to an ETS shall also be taken to refer to later versions published as an EN with the same number.
• For this Release 1998 document, references to GSM documents are for Release 1998 versions (version 7.x.y).

[1] GSM 06.73: "Digital cellular telecommunications system (Phase 2+); ANSI-C code for the Adaptive Multi Rate (AMR) speech codec".
[2] GSM 06.90: "Digital cellular telecommunications system (Phase 2+); Adaptive Multi Rate (AMR) speech transcoding".
[3] GSM 06.93: "Digital cellular telecommunications system (Phase 2+); Discontinuous transmission (DTX) for Adaptive Multi Rate (AMR) speech traffic channels".
[4] ITU, The International Telecommunications Union, Blue Book, Vol. III, Telephone Transmission Quality, IXth Plenary Assembly, Melbourne, 14-25 November, 1988, Recommendation G.711, Pulse code modulation (PCM) of voice frequencies.

3 Technical Description of VAD Option 1

3.1 Definitions, symbols and abbreviations

3.1.1 Definitions

For the purposes of the present document, the following terms and definitions apply:

frame: Time interval of 20 ms corresponding to the time segmentation of the speech transcoder.

3.1.2 Symbols

For the purposes of the present document, the following symbols apply.

3.1.2.1 Variables

bckr_est[n]           background noise estimate
burst_count           counts length of a speech burst, used by VAD hangover addition
hang_count            hangover counter, used by VAD hangover addition
complex_hang_count    hangover counter, used by CAD hangover addition
complex_hang_timer    hangover initiator, used for Complex Activity Estimation
lagcount              pitch detection counter
level[n]              signal level
new_speech            pointer of the speech encoder, points to a buffer containing last received samples of a speech frame [2]
noise_level           average level of the background noise estimate
oldlagcount           lagcount of the previous frame
pitch                 flag indicating presence of a periodic signal
complex_warning       flag indicating the presence of a complex signal
best_corr_hp          normalized and limited value from maximum HP filtered correlation vector
corr_hp               filtered best_corr_hp values
pow_sum               power of the input frame
s(i)                  samples of the input frame
rsnr_sum              measure between input frame and noise estimate
stat_count            stationarity counter
stat_rat              measure indicating stationarity
T_op[n]               open-loop lags [2]
t0                    autocorrelation maxima calculated by the open-loop pitch analysis [2]
t1                    signal power related to the autocorrelation maxima t0 [2]
tone                  flag indicating the presence of a tone
vad_thr               VAD threshold
VAD_flag              boolean VAD flag
vadreg                intermediate VAD decision
complex_low           intermediate complex signal decisions
complex_high          intermediate complex signal decisions

3.1.2.2 Constants

ALPHA_UP1             constant for updating noise estimate (see subclause 5.4.2)
ALPHA_DOWN1           constant for updating noise estimate (see subclause 5.4.2)
ALPHA_UP2             constant for updating noise estimate (see subclause 5.4.2)
ALPHA_DOWN2           constant for updating noise estimate (see subclause 5.4.2)
ALPHA3                constant for updating noise estimate (see subclause 5.4.2)
ALPHA4                constant for updating average signal level (see subclause 5.4.2)
ALPHA5                constant for updating average signal level (see subclause 5.4.2)
BURST_LEN_HIGH_NOISE  constant for controlling VAD hangover addition (see subclause 5.4.1)
BURST_LEN_LOW_NOISE   constant for controlling VAD hangover addition (see subclause 5.4.1)
COEFF3                coefficient for the filter bank (see subclause 5.1)
COEFF5_1              coefficient for the filter bank (see subclause 5.1)
COEFF5_2              coefficient for the filter bank (see subclause 5.1)
HANG_LEN_HIGH_NOISE   constant for controlling VAD hangover addition (see subclause 5.4.1)
HANG_LEN_LOW_NOISE    constant for controlling VAD hangover addition (see subclause 5.4.1)
HANG_NOISE_THR        constant for controlling VAD hangover addition (see subclause 5.4.1)
L_FRAME               size of a speech frame, 160
L_NEXT                length for the lookahead of the speech encoder, 40
LTHRESH               threshold for pitch detection (see subclause 5.2)
NOISE_MAX             maximum value for noise estimate (see subclause 5.4.2)
NOISE_MIN             minimum value for noise estimate (see subclause 5.4.2)
NTHRESH               threshold for pitch detection (see subclause 5.2)
POW_PITCH_THR         threshold for pitch detection (see subclause 5.4)
POW_COMPLEX_THR       threshold for complex detection (see subclause 5.4)
STAT_COUNT            threshold for stationary detection (see subclause 5.4.2)
CAD_MIN_STAT_COUNT    minimum threshold after complex warning
STAT_THR              threshold for stationary detection (see subclause 5.4.2)

STAT_THR_LEVEL          threshold for stationary detection (see subclause 5.4.2)
TONE_THR                threshold for tone detection (see subclause 5.3)
VAD_P1                  constant of computation for VAD threshold (see subclause 5.4.2)
VAD_POW_LOW             constant for controlling VAD hangover addition (see subclause 5.4.1)
VAD_SLOPE               constant of computation for VAD threshold (see subclause 5.4)
VAD_THR_HIGH            constant of computation for VAD threshold (see subclause 5.4)
CVAD_THRESH_ADAPT_HIGH  constant for updating complex_high
CVAD_THRESH_ADAPT_LOW   constant for updating complex_low
CVAD_THRESH_HANG        constant for updating complex_hang_timer
CVAD_HANG_LIMIT         constant for initiating complex_hang_count
CVAD_HANG_LENGTH        constant for resetting complex_hang_count

3.1.2.3 Functions

+      addition
-      subtraction
*      multiplication
/      division
|x|    absolute value of x
AND    Boolean AND
OR     Boolean OR

$$\sum_{n=a}^{b} x(n) = x(a) + x(a+1) + \dots + x(b-1) + x(b)$$

$$\mathrm{MIN}(x,y) = \begin{cases} x, & x \le y \\ y, & y < x \end{cases}$$

$$\mathrm{MAX}(x,y) = \begin{cases} x, & x \ge y \\ y, & y > x \end{cases}$$

3.1.3 Abbreviations

ANSI   American National Standards Institute
DTX    Discontinuous Transmission
VAD    Voice Activity Detector
CAD    Complex Activity Detection
CNG    Comfort Noise Generation

3.2 General

The function of the VAD algorithm is to indicate whether each 20 ms frame contains signals that should be transmitted, i.e. speech, music or information tones. The output of the VAD algorithm is a Boolean flag (VAD_flag) indicating presence of such signals.

3.3 Functional description

The block diagram of the VAD algorithm is depicted in figure 3.1. The VAD algorithm uses parameters of the speech encoder to compute the Boolean VAD flag (VAD_flag). Samples of the input frame (s(i)) are divided into sub-bands and the level of the signal in each band (level[n]) is calculated. Input for the pitch detection function are open-loop lags (T_op[n]), which are calculated by open-loop pitch analysis of the speech encoder. The pitch detection function computes a flag (pitch) which indicates presence of pitch. The tone detection function calculates a flag (tone), which indicates presence of an information tone. Tones are detected based on the pitch gain of the open-loop pitch analysis. The pitch gain is estimated using autocorrelation values (t0 and t1) received from the pitch analysis. The complex signal detection function calculates a flag (complex_warning), which indicates presence of a correlated complex signal such as music. Correlated complex signals are detected based on analysis of the correlation vector available in the open-loop pitch analysis. The VAD decision function estimates background noise levels. The intermediate VAD decision is calculated based on the comparison of the background noise estimate and levels of the input frame (level[n]). Finally, the VAD flag is calculated by adding hangover to the intermediate VAD decision.

[Figure 3.1: Simplified block diagram of the VAD algorithm: Option 1. Blocks: filter bank and computation of sub-band levels, pitch detection, tone detection, complex signal analysis, VAD decision. Signals: s(i), T_op[n], t0, t1, OL-LTP correlation vector, level[n], pitch, tone, complex_warning, complex_timer, VAD_flag.]

3.3.1 Filter bank and computation of sub-band levels

The input signal is divided into frequency bands using a 9-band filter bank (figure 3.2). Cut-off frequencies for the filter bank are shown in table 3.1.

Table 3.1: Cut-off frequencies for the filter bank

Band number   Frequencies
1             0 - 250 Hz
2             250 - 500 Hz
3             500 - 750 Hz
4             750 - 1000 Hz
5             1000 - 1500 Hz
6             1500 - 2000 Hz
7             2000 - 2500 Hz
8             2500 - 3000 Hz
9             3000 - 4000 Hz

Input for the filter bank is the speech frame pointed to by the new_speech pointer of the speech encoder [1]. Input values for the filter bank are scaled down by one bit. This ensures safe scaling, i.e. saturation cannot occur during calculation of the filter bank.

[Figure 3.2: Filter bank. A tree of three 5th order filter blocks and five 3rd order filter blocks with outputs 0-250 Hz, 250-500 Hz, 500-750 Hz, 750-1000 Hz, 1-1.5 kHz, 1.5-2 kHz, 2-2.5 kHz, 2.5-3 kHz and 3-4 kHz.]

The filter bank consists of 5th and 3rd order filter blocks. Each filter block divides the input into high-pass and low-pass parts and decimates the sampling frequency by 2. The 5th order filter block is calculated as follows:

$$x_{lp}(i) = 0.5 \left( A_1(x(i-1)) + A_2(x(i)) \right) \tag{3.1a}$$

$$x_{hp}(i) = 0.5 \left( A_1(x(i-1)) - A_2(x(i)) \right) \tag{3.1b}$$

where
x(i)       input signal for a filter block
x_lp(i)    low-pass component

x_hp(i)    high-pass component

The 3rd order filter block is calculated as follows:

$$x_{lp}(i) = 0.5 \left( x(i) + A_3(x(i-1)) \right) \tag{3.2a}$$

$$x_{hp}(i) = 0.5 \left( x(i) - A_3(x(i-1)) \right) \tag{3.2b}$$

The filters A_1(), A_2(), and A_3() are first order direct form all-pass filters, whose transfer function is given by:

$$A(z) = \frac{C + z^{-1}}{1 + C \cdot z^{-1}} \tag{3.3}$$

where C is the filter coefficient.

Coefficients for the all-pass filters A_1(), A_2(), and A_3() are COEFF5_1, COEFF5_2, and COEFF3, respectively.

Signal level is calculated at the output of the filter bank at each frequency band as follows:

$$level(n) = \sum_{i=START_n}^{END_n} |x_n(i)| \tag{3.4}$$

where:
n          index for the frequency band
x_n(i)     sample i at the output of the filter bank at frequency band n

$$START_n = \begin{cases} -2, & n \le 4 \\ -4, & 5 \le n \le 8 \\ -8, & n = 9 \end{cases}$$

$$END_n = \begin{cases} 9, & n \le 4 \\ 19, & 5 \le n \le 8 \\ 39, & n = 9 \end{cases}$$

Negative indices of x_n(i) refer to the previous frame.

3.3.2 Pitch detection

The purpose of the pitch detection function is to detect vowel sounds and other periodic signals. The pitch detection is based on comparison of open-loop lags (T_op[n]), which are calculated by the speech encoder [2]. If the difference of consecutive open-loop lags (T_op[n]) is smaller than a threshold, lagcount is incremented. If the sum of the lagcounts of two consecutive frames is high enough, the pitch flag is set. For 5.15 and 4.75 kbit/s rates, only one open-loop lag is calculated, and therefore only the first lag comparison is made every frame. The pitch flag is calculated as follows:

    lagcount = 0
    if (|T_op[-1] - T_op[0]| < LTHRESH)
        lagcount = lagcount + 1
    if (|T_op[0] - T_op[1]| < LTHRESH)
        lagcount = lagcount + 1
    if (lagcount + oldlagcount >= NTHRESH)
        pitch = 1
    else
        pitch = 0
    oldlagcount = lagcount

T_op[-1] refers to the open-loop lag of the previous frame.

3.3.3 Tone detection

Tone detection is used to detect information tones, since the pitch detection function cannot always detect these signals. Other signals which contain a very strong periodic component are also detected, because it may sound annoying if these signals are replaced by comfort noise. If the open-loop pitch gain is higher than the constant TONE_THR, a tone is detected and the tone flag is set. The pitch gain can be tested by comparing variables t0 and t1 as follows:

    if (t0 > TONE_THR * t1)
        tone = 1

The speech encoder calculates the pitch in three delay ranges, except for mode 10.2 kbit/s, where only one range is used. The above comparison is made once for each delay range and the tone flag should be set if the condition is true in at least one range. Otherwise, the tone flag should be set to zero.

The variables t0 and t1 are calculated by the open-loop pitch analysis of the speech encoder [2]. The variable t0 is autocorrelation maxi
...
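
As an informal illustration of subclauses 3.3.2 and 3.3.3, the pitch and tone flags might be computed along the following lines. This is a minimal sketch and not part of the standard. The numeric values of LTHRESH, NTHRESH and TONE_THR are placeholders, since this excerpt names the constants without giving their values. The names PitchState, detect_pitch and detect_tone are hypothetical, and the use of ">=" for the "high enough" test is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Placeholder values (assumptions): the excerpt names these constants
# but does not state their values.
LTHRESH = 4       # maximum lag difference counted as "small"
NTHRESH = 4       # minimum combined lagcount of two frames for pitch = 1
TONE_THR = 0.65   # pitch-gain threshold for tone detection


@dataclass
class PitchState:
    """Hypothetical container for values carried between frames."""
    prev_lag: int = 0     # T_op[-1], last open-loop lag of the previous frame
    oldlagcount: int = 0  # lagcount of the previous frame


def detect_pitch(state: PitchState, lags: Sequence[int]) -> int:
    """Return the pitch flag for one frame.

    `lags` holds the open-loop lags of the frame: two values normally,
    one value for the 5.15 and 4.75 kbit/s rates.
    """
    lagcount = 0
    prev = state.prev_lag
    for lag in lags:
        # Count consecutive lags that differ by less than the threshold.
        if abs(prev - lag) < LTHRESH:
            lagcount += 1
        prev = lag
    # Assumed comparison: "high enough" is taken to mean >= NTHRESH.
    pitch = 1 if lagcount + state.oldlagcount >= NTHRESH else 0
    state.oldlagcount = lagcount
    state.prev_lag = prev
    return pitch


def detect_tone(ranges: Sequence[Tuple[float, float]]) -> int:
    """Return the tone flag given one (t0, t1) pair per delay range."""
    return 1 if any(t0 > TONE_THR * t1 for t0, t1 in ranges) else 0
```

With these placeholder thresholds, two consecutive frames of slowly varying lags are needed before the pitch flag is raised, whereas a single delay range satisfying t0 > TONE_THR * t1 is enough to raise the tone flag.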
