SIST ETS 300 730 E1:2003
Digital cellular telecommunications system; Voice Activity Detector (VAD) for Enhanced Full Rate (EFR) speech traffic channels (GSM 06.82)
This European Telecommunication Standard (ETS) specifies the Voice Activity Detector (VAD) to be used in Discontinuous Transmission (DTX) as described in GSM 06.81. It also specifies the test methods used to verify that a VAD complies with this ETS. The requirements are mandatory for any VAD used in GSM Mobile Stations (MSs) or Base Station Systems (BSSs) that utilise the enhanced full-rate speech traffic channel.
Digitalni celični telekomunikacijski sistem – Detektor govornih dejavnosti (VAD) v prometnih kanalih izboljšanega govora s polno hitrostjo (EFR) (GSM 06.82)
SLOVENIAN STANDARD
1 December 2003
Digitalni celični telekomunikacijski sistem – Detektor govornih dejavnosti (VAD) v
prometnih kanalih izboljšanega govora s polno hitrostjo (EFR) (GSM 06.82)
Digital cellular telecommunications system; Voice Activity Detector (VAD) for Enhanced
Full Rate (EFR) speech traffic channels (GSM 06.82)
This Slovenian standard is identical to: ETS 300 730 Edition 1
ICS:
33.070.50 Global System for Mobile Communication (GSM)
2003-01. Slovenian Institute for Standardization. Reproduction of all or part of this standard is not permitted.
EUROPEAN TELECOMMUNICATION STANDARD
ETS 300 730
March 1997
Source: ETSI TC-SMG
Reference: DE/SMG-020682
ICS: 33.020
Key words: EFR, VAD, digital cellular telecommunications system, Global System for Mobile communications
(GSM), speech
GLOBAL SYSTEM FOR
MOBILE COMMUNICATIONS
Digital cellular telecommunications system;
Voice Activity Detector (VAD) for Enhanced
Full Rate (EFR) speech traffic channels
(GSM 06.82)
ETSI
European Telecommunications Standards Institute
ETSI Secretariat
Postal address: F-06921 Sophia Antipolis CEDEX - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
X.400: c=fr, a=atlas, p=etsi, s=secretariat - Internet: secretariat@etsi.fr
Tel.: +33 4 92 94 42 00 - Fax: +33 4 93 65 47 16
Copyright Notification: No part may be reproduced except as authorized by written permission. The copyright and the
foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1997. All rights reserved.
ETS 300 730 (GSM 06.82 version 5.0.3): March 1997
Whilst every care has been taken in the preparation and publication of this document, errors in content,
typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to
"ETSI Editing and Committee Support Dept." at the address shown on the title page.
Contents
Foreword
1 Scope
2 Normative references
3 Definitions, symbols and abbreviations
3.1 Definitions
3.2 Symbols
3.2.1 Variables
3.2.2 Constants
3.2.3 Functions
3.3 Abbreviations
4 General
5 Functional description
5.1 Overview and principles of operation
5.2 Algorithm description
5.2.1 Adaptive filtering and energy computation
5.2.2 ACF averaging
5.2.3 Predictor values computation
5.2.4 Spectral comparison
5.2.5 Information tone detection
5.2.6 Threshold adaptation
5.2.7 VAD decision
5.2.8 VAD hangover addition
5.2.9 Periodicity detection
6 Computational description overview
6.1 VAD modules
6.2 Pseudo-floating point arithmetic
Annex A (informative): Simplified block filtering operation
Annex B (informative): Pole frequency calculation
History
Foreword
This European Telecommunication Standard (ETS) has been produced by the Special Mobile Group
(SMG) Technical Committee of the European Telecommunications Standards Institute (ETSI).
This ETS specifies the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX)
for Enhanced Full Rate (EFR) speech traffic channels within the digital cellular telecommunications
system.
This ETS corresponds to GSM technical specification GSM 06.82, version 5.0.3.
Transposition dates
Date of adoption: 28 February 1997
Date of latest announcement of this ETS (doa): 30 June 1997
Date of latest publication of new National Standard or endorsement of this ETS (dop/e): 31 December 1997
Date of withdrawal of any conflicting National Standard (dow): 31 December 1997
1 Scope
This European Telecommunication Standard (ETS) specifies the Voice Activity Detector (VAD) to be used
in Discontinuous Transmission (DTX), as described in GSM 06.81 (ETS 300 729) [5], "Discontinuous
transmission (DTX) for Enhanced Full Rate (EFR) speech traffic channels".
The requirements are mandatory for any VAD used in GSM Mobile Stations (MSs) or Base
Station Systems (BSSs) that utilize the enhanced full-rate speech traffic channel.
2 Normative references
This ETS incorporates by dated and undated reference, provisions from other publications. These
normative references are cited at the appropriate places in the text and the publications are listed
hereafter. For dated references, subsequent amendments to or revisions of any of these publications
apply to this ETS only when incorporated in it by amendment or revision. For undated references, the
latest edition of the publication referred to applies.
[1] GSM 01.04 (ETR 100): "Digital cellular telecommunications system (Phase 2);
Abbreviations and acronyms".
[2] GSM 06.53 (ETS 300 724): "Digital cellular telecommunications system; ANSI-C
code for the GSM Enhanced Full Rate (EFR) speech codec".
[3] GSM 06.54 (ETS 300 725): "Digital cellular telecommunications system
(Phase 2); Test vectors for the GSM Enhanced Full Rate (EFR) speech codec".
[4] GSM 06.60 (ETS 300 726): "Digital cellular telecommunications system;
Enhanced Full Rate (EFR) speech transcoding".
[5] GSM 06.81 (ETS 300 729): "Digital cellular telecommunications system;
Discontinuous transmission (DTX) for Enhanced Full Rate (EFR) speech traffic
channels".
3 Definitions, symbols and abbreviations
3.1 Definitions
For the purposes of this ETS, the following definitions apply:
noise: The signal component resulting from acoustic environmental noise.
mobile environment: Any environment in which mobile stations may be used.
3.2 Symbols
For the purposes of this ETS, the following symbols apply.
3.2.1 Variables
aav1 filter predictor values, see subclause 5.2.3
acf the ACF vector which is calculated in the speech encoder
(GSM 06.60 (ETS 300 726) [4])
adaptcount secondary hangover counter, see subclause 5.2.6
av0 averaged ACF vector, see subclause 5.2.2
av1 a previous value of av0, see subclause 5.2.2
burstcount speech burst length counter, see subclause 5.2.8
den denominator of left hand side of equation 8 in annex B, see subclause 5.2.5
difference difference between consecutive values of dm, see subclause 5.2.4
dm spectral distortion measure, see subclause 5.2.4
hangcount primary hangover counter, see subclause 5.2.8
lagcount number of subframes in current frame meeting periodicity criterion, see
subclause 5.2.9
lastdm previous value of dm, see subclause 5.2.4
lags the open loop long term predictor lags for the two halves of the speech encoder
frame (GSM 06.60 (ETS 300 726) [4])
num numerator of left hand side of equation 8 in annex B, see subclause 5.2.5
oldlagcount previous value of lagcount, see subclause 5.2.9
prederr fourth order short term prediction error, see subclause 5.2.5
ptch Boolean flag indicating the presence of a periodic signal component, see
subclause 5.2.9
pvad energy in the current filtered signal frame, see subclause 5.2.1
rav1 autocorrelation vector obtained from av1, see subclause 5.2.3
rc the first four unquantized reflection coefficients calculated in the speech encoder
(GSM 06.60 (ETS 300 726) [4])
rvad autocorrelation vector of the adaptive filter predictor values, see subclause 5.2.6
smallag difference between consecutive lag values, see subclause 5.2.9
stat Boolean flag indicating that the frequency spectrum of the input signal is
stationary, see subclause 5.2.4
thvad adaptive primary VAD threshold, see subclause 5.2.6
tone Boolean flag indicating the presence of an information tone, see subclause 5.2.5
vadflag Boolean VAD decision with hangover included, see subclause 5.2.8
veryoldlagcount previous value of oldlagcount, see subclause 5.2.9
vvad Boolean VAD decision before hangover, see subclause 5.2.7
3.2.2 Constants
adp number of frames of hangover for secondary VAD, see subclause 5.2.6
burstconst minimum length of speech burst to which hangover is added, see
subclause 5.2.8
dec determines rate of decrease in adaptive threshold, see subclause 5.2.6
fac determines steady state adaptive threshold, see subclause 5.2.6
frames number of frames over which av0 and av1 are calculated, see subclause 5.2.2
freqth threshold for pole frequency decision, see subclause 5.2.5
hangconst number of frames of hangover for primary VAD, see subclause 5.2.8
inc determines rate of increase in adaptive threshold, see subclause 5.2.6
lthresh lag difference threshold for periodicity decision, see subclause 5.2.9
margin determines upper limit for adaptive threshold, see subclause 5.2.6
nthresh frame count threshold for periodicity decision, see subclause 5.2.9
plev lower limit for adaptive threshold, see subclause 5.2.6
predth threshold for short term prediction error, see subclause 5.2.5
pth energy threshold, see subclause 5.2.6
thresh decision threshold for evaluation of stat flag, see subclause 5.2.4
3.2.3 Functions
+ addition
- subtraction
* multiplication
/ division
| x | absolute value of x
AND Boolean AND
OR Boolean OR
MULT(x(i)), i=a..b the product of the series x(i) for i=a to b, $\prod_{i=a}^{b} x(i)$
SUM(x(i)), i=a..b the sum of the series x(i) for i=a to b, $\sum_{i=a}^{b} x(i)$
3.3 Abbreviations
ACF Autocorrelation function
ANSI American National Standards Institute
DTX Discontinuous Transmission
LTP Long Term Predictor
TX Transmission
VAD Voice Activity Detector
For abbreviations not given in this subclause, see GSM 01.04 (ETR 100) [1].
4 General
The function of the VAD is to indicate whether each 20 ms frame produced by the speech encoder
contains speech or not. The output is a Boolean flag (vadflag) which is used by the Transmit (TX) DTX
handler defined in GSM 06.81 (ETS 300 729) [5].
This ETS is organized as follows:
Clause 5 describes the principles of operation of the VAD. Clause 6 provides an overview of the
computational description of the VAD. The computational details necessary for the fixed point
implementation of the VAD algorithm are given in the form of an ANSI C program contained in GSM 06.53
(ETS 300 724) [2].
The verification of the VAD is based on the use of digital test sequences which are described in
GSM 06.54 (ETS 300 725) [3].
5 Functional description
The purpose of this clause is to give the reader an understanding of the principles of operation of the
VAD, whereas GSM 06.53 (ETS 300 724) [2] contains the fixed point computational description of the
VAD. In the case of discrepancy between the two descriptions, the description in GSM 06.53
(ETS 300 724) [2] will prevail.
5.1 Overview and principles of operation
The function of the VAD is to distinguish between noise with speech present and noise without speech
present. This is achieved by comparing the energy of a filtered version of the input signal with a threshold.
The presence of speech is indicated whenever the threshold is exceeded.
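The comparison described above can be sketched in a few lines of Python. This is only an illustration of the principle, not the fixed point procedure of GSM 06.53 (ETS 300 724) [2]. The function names frame_energy and vad_decision, the sample values and the threshold value are assumptions made for the example; pvad, thvad and vvad follow the variable names of subclause 3.2.1. The frame is assumed to have been filtered already.

```python
def frame_energy(filtered_frame):
    """Return the energy (sum of squared samples) of one filtered frame."""
    return sum(s * s for s in filtered_frame)


def vad_decision(pvad, thvad):
    """Return True (speech indicated) when the frame energy exceeds the threshold."""
    return pvad > thvad


# Illustrative values only: a quiet frame and a louder frame of 160 samples
# (one 20 ms frame at an assumed 8 kHz sampling rate).
quiet = [1, -1] * 80
loud = [100, -100] * 80
thvad = 1000  # hypothetical threshold, not a value taken from this ETS

vvad_quiet = vad_decision(frame_energy(quiet), thvad)  # energy 160, below threshold
vvad_loud = vad_decision(frame_energy(loud), thvad)    # energy 1 600 000, above threshold
```

With a fixed threshold such as this one, any change in background noise level would shift the decision, which is why the threshold is made adaptive.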
The detection of speech in a mobile environment is difficult due to the low speech/noise ratios which are
encountered, particularly in moving vehicles. To increase the probability of detecting speech the input
signal is adaptively filtered (see subclause 5.2.1) to reduce its noise content before the voice activity
decision is made (see subclause 5.2.7).
The frequency spectrum and level of the noise may vary within a given environment as well as between
different environments. It is therefore necessary to adapt the input filter coefficients and energy threshold
at regular intervals as described in subclause 5.2.6.
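The two ideas in this subclause, filtering the input before measuring its energy and adapting the threshold to the noise, can be illustrated with a rough Python sketch. It does not reproduce the procedures of subclauses 5.2.1 and 5.2.6: a plain FIR filter with caller-supplied coefficients stands in for the adaptive filter, and exponential smoothing stands in for the threshold adaptation. The names fir_filter and update_threshold and the parameters alpha and scale are hypothetical.

```python
def fir_filter(frame, coeffs):
    """Apply an FIR filter; samples before the start of the frame are taken as zero."""
    out = []
    for n in range(len(frame)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * frame[n - k]
        out.append(acc)
    return out


def update_threshold(thvad, noise_energy, alpha=0.9, scale=3.0):
    """Move the threshold towards scale * noise_energy by exponential smoothing.

    alpha and scale are hypothetical tuning values, not constants of this ETS.
    """
    return alpha * thvad + (1.0 - alpha) * scale * noise_energy


# A first-difference filter removes a constant (DC) component from the frame,
# so the energy measured after filtering is much lower than before.
filtered = fir_filter([5, 5, 5, 5], [1, -1])
energy_after = sum(s * s for s in filtered)

# During a frame judged to contain noise only, the threshold tracks the noise energy.
new_thvad = update_threshold(100.0, 10.0, alpha=0.5, scale=2.0)
```

In this sketch the caller decides when a frame counts as noise only; in the VAD that decision is governed by the conditions of subclause 5.2.6.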
5.2 Algorithm description
The block diagram of the VAD algorithm is shown in figure 1. The individual blocks are described in the
following subclauses. The variables shown in the block diagram are described in table 1.
Table 1: Description of variables in figure 1
Var Description
acf The ACF vector which is calculated in the speech encoder
(GSM 06.60 (ETS 300 726) [4]).
av0 Averaged ACF vector.
av1 A previous value of av0.
lags The open loop long term predictor lags for the two halves of the speech encoder frame
(GSM 06.60 (ETS 300 726) [4]).
ptch Boolean flag indicating the presence of a periodic signal component.
pvad Energy in the current filtered signal frame.
rav1 Autocorrelation vector obtained from av1.
rc The first four reflection coefficients calculated in the speech encoder
(GSM 06.60 (ETS 300 726) [4]).
rvad Autocorrelation vector of the adaptive filter predictor values.
stat Boolean flag indicating that the frequency spectrum of the input signal is stationary.
thvad Adaptive primary VAD threshold.
tone Boolean flag indicating the presence of an information tone.
vadflag Boolean VAD decision with hangover included.
vvad Boolean VAD decision before hangover.
[Figure 1 is a block diagram that is not reproduced here. Blocks: ACF averaging; Predictor values computation; Spectral comparison; Tone detection; Periodicity detection; Threshold adaptation; Adaptive filtering and energy computation; VAD decision; VAD hangover addition. Signals: acf, av0, av1, rav1, rc, lags, stat, tone, ptch, rvad, thvad, pvad, vvad, vadflag (see table 1).]
Figure 1: Functional block diagram of the VAD
...







