European digital cellular telecommunications system; Half rate speech; Part 6: Voice Activity Detector (VAD) for half rate speech traffic channels (GSM 06.42)

DE/SMG-020642

Evropski digitalni celični telekomunikacijski sistem – Govor s polovično hitrostjo – 6. del: Detektor govornih dejavnosti (VAD) pri prometnih kanalih z govorom s polovično hitrostjo (GSM 06.42)

General Information

Status: Published
Publication Date: 18-Dec-1995
Current Stage: 12 - Completion
Due Date: 01-Dec-1995
Completion Date: 19-Dec-1995
Standard: ETS 300 581-6 E1:2003 (English language, 23 pages)

Standards Content (Sample)


SLOVENSKI STANDARD
01-december-2003
Evropski digitalni celični telekomunikacijski sistem – Govor s polovično hitrostjo –
6. del: Detektor govornih dejavnosti (VAD) pri prometnih kanalih z govorom s
polovično hitrostjo (GSM 06.42)
European digital cellular telecommunications system; Half rate speech; Part 6: Voice
Activity Detector (VAD) for half rate speech traffic channels (GSM 06.42)
This Slovenian standard is identical to: ETS 300 581-6 Edition 1
ICS:
33.070.50 Global System for Mobile Communication (GSM)
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

EUROPEAN TELECOMMUNICATION STANDARD
ETS 300 581-6
November 1995
Source: ETSI TC-GSM
Reference: DE/SMG-020642
ICS: 33.060.50
Key words: European digital cellular telecommunications system, Global System for Mobile communications (GSM)
European digital cellular telecommunications system;
Half rate speech
Part 6: Voice Activity Detector (VAD) for half rate
speech traffic channels
(GSM 06.42)
ETSI
European Telecommunications Standards Institute
ETSI Secretariat
Postal address: F-06921 Sophia Antipolis CEDEX - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
X.400: c=fr, a=atlas, p=etsi, s=secretariat - Internet: secretariat@etsi.fr
Tel.: +33 92 94 42 00 - Fax: +33 93 65 47 16
Copyright Notification: No part may be reproduced except as authorized by written permission. The copyright and the
foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1995. All rights reserved.

Page 2
ETS 300 581-6: November 1995 (06.42 version 4.1.1)
Whilst every care has been taken in the preparation and publication of this document, errors in content,
typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to
"ETSI Editing and Committee Support Dept." at the address shown on the title page.

Contents
Foreword
1 Scope
2 Normative references
3 Definitions, symbols and abbreviations
3.1 Definitions
3.2 Symbols
3.2.1 Variables
3.2.2 Constants
3.2.3 Functions
3.3 Abbreviations
4 General
5 Functional description
5.1 Overview and principles of operation
5.2 Algorithm description
5.2.1 Adaptive filtering and energy computation
5.2.2 ACF averaging
5.2.3 Predictor values computation
5.2.4 Spectral comparison
5.2.5 Information tone detection
5.2.6 Threshold adaptation
5.2.7 VAD decision
5.2.8 VAD hangover addition
5.2.9 Periodicity detection
6 Computational description overview
6.1 VAD modules
6.2 Pseudo-floating point arithmetic
Annex A (informative): VAD performance
Annex B (informative): Simplified block filtering operation
Annex C (informative): Pole frequency calculation
History

Foreword
This European Telecommunication Standard (ETS) has been produced by the Special Mobile Group
(SMG) Technical Committee of the European Telecommunications Standards Institute (ETSI).
This ETS specifies the half rate speech traffic channels for the European digital cellular
telecommunications system. This ETS corresponds to GSM technical specification, GSM 06.42, version
4.1.1 and is part 6 of a multi-part ETS covering the half rate speech traffic channels as described below:
GSM 06.02 ETS 300 581-1: "European digital cellular telecommunications system;
Half rate speech Part 1: Half rate speech processing functions".
GSM 06.20 ETS 300 581-2: "European digital cellular telecommunications system;
Half rate speech Part 2: Half rate speech transcoding".
GSM 06.21 ETS 300 581-3: "European digital cellular telecommunications system;
Half rate speech Part 3: Substitution and muting of lost frames for half rate
speech traffic channels".
GSM 06.22 ETS 300 581-4: "European digital cellular telecommunications system;
Half rate speech Part 4: Comfort noise aspects for half rate speech traffic
channels".
GSM 06.41 ETS 300 581-5: "European digital cellular telecommunications system;
Half rate speech Part 5: Discontinuous Transmission (DTX) for half rate speech
traffic channels".
GSM 06.42 ETS 300 581-6: "European digital cellular telecommunications system
(Phase 2); Half rate speech Part 6: Voice Activity Detection (VAD) for half
rate speech traffic channels".
GSM 06.06 ETS 300 581-7: "European digital cellular telecommunications system;
Half rate speech Part 7: ANSI-C code for the GSM half rate speech codec".
GSM 06.07 ETS 300 581-8: "European digital cellular telecommunications system;
Half rate speech Part 8: Test vectors for the GSM half rate speech codec".
NOTE: TC-SMG has produced documents which give the technical specifications for the
implementation of the European digital cellular telecommunications system.
Historically, these documents have been identified as GSM Technical Specifications
(GSM-TS). These TSs may have subsequently become Interim European
Telecommunication Standards (I-ETSs), (Phase 1), or European Telecommunication
Standards (ETSs), (Phase 2), whilst others may become ETSI Technical Reports
(ETRs).
Transposition dates
Date of adoption of this ETS: 27 October 1995
Date of latest announcement of this ETS (doa): 28 February 1996
Date of latest publication of new National Standard
or endorsement of this ETS (dop/e): 31 August 1996
Date of withdrawal of any conflicting National Standard (dow): 31 August 1996

1 Scope
This European Telecommunication Standard (ETS) specifies the Voice Activity Detector (VAD) to be used
in the Discontinuous Transmission (DTX) as described in GSM 06.41 (ETS 300 581-5) [4]. It also
specifies the test methods to be used to verify that a VAD implementation complies with this ETS.
The requirements are mandatory for any VAD to be used either in GSM Mobile Stations (MSs) or Base
Station Systems (BSSs) that utilise the half-rate GSM speech traffic channel.
2 Normative references
This ETS incorporates by dated and undated reference, provisions from other publications. These
normative references are cited at the appropriate places in the text and the publications are listed
hereafter. For dated references, subsequent amendments to or revisions of any of these publications
apply to this ETS only when incorporated in it by amendment or revision. For undated references, the
latest edition of the publication referred to applies.
[1] GSM 01.04 (ETR 100): "European digital cellular telecommunications system;
Abbreviations and acronyms".
[2] GSM 06.20 (ETS 300 581-2): "European digital cellular telecommunications
system; Half rate speech Part 2: Half rate speech transcoding".
[3] GSM 06.22 (ETS 300 581-4): "European digital cellular telecommunications
system; Half rate speech Part 4: Comfort noise aspects for half rate speech
traffic channels".
[4] GSM 06.41 (ETS 300 581-5): "European digital cellular telecommunications
system; Half rate speech Part 5: Discontinuous transmission (DTX) for half rate
speech traffic channels".
[5] GSM 06.06 (ETS 300 581-7): "European digital cellular telecommunications
system; Half rate speech Part 7: ANSI C code for the GSM half rate speech
codec".
[6] GSM 06.07 (ETS 300 581-8): "European digital cellular telecommunications
system; Half rate speech Part 8: Test sequences for the GSM half rate speech
codec".
3 Definitions, symbols and abbreviations
3.1 Definitions
For the purpose of this ETS, the following definitions apply.
noise: The signal component resulting from acoustic environmental noise.
mobile environment: Any environment in which mobile stations may be used.
3.2 Symbols
For the purpose of this ETS, the following symbols apply.
3.2.1 Variables
aav1 filter predictor values, see subclause 5.2.3
acf the ACF vector which is calculated in the speech encoder
(GSM 06.20 (ETS 300 581-2) [2])
adaptcount secondary hangover counter, see subclause 5.2.6
av0 averaged ACF vector, see subclause 5.2.2
av1 a previous value of av0, see subclause 5.2.2
burstcount speech burst length counter, see subclause 5.2.7
den denominator of left hand side of equation 8 in annex C, see subclause 5.2.5
difference difference between consecutive values of dm, see subclause 5.2.4
dm spectral distortion measure, see subclause 5.2.4
hangcount primary hangover counter, see subclause 5.2.7
lagcount number of subframes in current frame meeting periodicity criterion, see
subclause 5.2.9
lastdm previous value of dm, see subclause 5.2.4
lags the open loop long term predictor lags for the four speech encoder subframes
(GSM 06.20 (ETS 300 581-2) [2])
num numerator of left hand side of equation 8 in annex C, see subclause 5.2.5
oldlagcount previous value of lagcount, see subclause 5.2.9
prederr fourth order short term prediction error, see subclause 5.2.5
ptch Boolean flag indicating the presence of a periodic signal component, see
subclause 5.2.9
pvad energy in the current filtered signal frame, see subclause 5.2.1
rav1 autocorrelation vector obtained from av1, see subclause 5.2.3
rc the first four unquantized reflection coefficients calculated in the speech encoder
(GSM 06.20 (ETS 300 581-2) [2])
rvad autocorrelation vector of the adaptive filter predictor values, see subclause 5.2.6
smallag difference between consecutive lag values, see subclause 5.2.9
stat Boolean flag indicating that the frequency spectrum of the input signal is
stationary, see subclause 5.2.4
thvad adaptive primary VAD threshold, see subclause 5.2.6
tone Boolean flag indicating the presence of an information tone, see subclause 5.2.5
vadflag Boolean VAD decision with hangover included, see subclause 5.2.8
veryoldlagcount previous value of oldlagcount, see subclause 5.2.9
vvad Boolean VAD decision before hangover, see subclause 5.2.7
3.2.2 Constants
adp number of frames of hangover for secondary VAD, see subclause 5.2.6
burstconst minimum length of speech burst to which hangover is added, see
subclause 5.2.8
dec determines rate of decrease in adaptive threshold, see subclause 5.2.6
fac determines steady state adaptive threshold, see subclause 5.2.6
frames number of frames over which av0 and av1 are calculated, see subclause 5.2.2
freqth threshold for pole frequency decision, see subclause 5.2.5
hangconst number of frames of hangover for primary VAD, see subclause 5.2.8
inc determines rate of increase in adaptive threshold, see subclause 5.2.6
lthresh lag difference threshold for periodicity decision, see subclause 5.2.9
margin determines upper limit for adaptive threshold, see subclause 5.2.6
nthresh frame count threshold for periodicity decision, see subclause 5.2.9
plev lower limit for adaptive threshold, see subclause 5.2.6
predth threshold for short term prediction error, see subclause 5.2.5
pth energy threshold, see subclause 5.2.6
thresh decision threshold for evaluation of stat flag, see subclause 5.2.4
3.2.3 Functions
+ addition
- subtraction
* multiplication
/ division
| x | absolute value of x
AND Boolean AND
OR Boolean OR
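MULT(x(i)), lower limit i=a, upper limit b: the product of the series x(i) for i=a to b, i.e. $\prod_{i=a}^{b} x(i)$
SUM(x(i)), lower limit i=a, upper limit b: the sum of the series x(i) for i=a to b, i.e. $\sum_{i=a}^{b} x(i)$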
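As an informal illustration of the SUM and MULT notation, the two operators can be sketched in Python as follows. The function names mirror the notation, but the calling convention (a callable x plus inclusive limits a and b) is an assumption made for this sketch and is not taken from this ETS or from the ANSI C code in GSM 06.06.

```python
import math
from typing import Callable


def SUM(x: Callable[[int], float], a: int, b: int) -> float:
    """Sum of the series x(i) for i = a to b, both limits inclusive."""
    return sum(x(i) for i in range(a, b + 1))


def MULT(x: Callable[[int], float], a: int, b: int) -> float:
    """Product of the series x(i) for i = a to b, both limits inclusive."""
    return math.prod(x(i) for i in range(a, b + 1))


if __name__ == "__main__":
    # Illustrative series x(i) = i, with i running from 1 to 4.
    print(SUM(lambda i: i, 1, 4))
    print(MULT(lambda i: i, 1, 4))
```

Both limits are treated as inclusive, matching the "for i=a to b" wording of the definitions.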
3.3 Abbreviations
ACF Autocorrelation Function
AFLAT Autocorrelation Fixed point LAttice Technique
ANSI American National Standards Institute
DTX Discontinuous Transmission
LTP Long Term Predictor
TX Transmission
VAD Voice Activity Detector
For abbreviations not given in this subclause, see GSM 01.04 (ETR 100) [1].

4 General
The function of the VAD is to indicate whether each 20 ms frame produced by the speech encoder
contains speech or not. The output is a Boolean flag (vadflag) which is used by the Transmit (TX) DTX
handler defined in GSM 06.41 (ETS 300 581-5) [4].
This ETS is organised as follows:
Clause 5 describes the principles of operation of the VAD. Clause 6 provides an overview of the
computational description of the VAD. The computational details necessary for the fixed point
implementation of the VAD algorithm are given in the form of an American National Standards Institute
(ANSI) C program contained in GSM 06.06 (ETS 300 581-7) [5].
The verification of the VAD is based on the use of digital test sequences which are described in
GSM 06.07 (ETS 300 581-8) [6].
The performance of the VAD algorithm is characterised by the amount of audible speech clipping it
introduces and the percentage activity it indicates. The characteristics for the VAD defined in this ETS
have been established by extensive testing under a wide range of operating conditions. The results are
summarised in annex A.
5 Functional description
The purpose of this clause is to give the reader an understanding of the principles of operation of the
VAD, whereas GSM 06.06 (ETS 300 581-7) [5] contains the fixed point computational description of the
VAD. In the case of discrepancy between the two descriptions, the description in
GSM 06.06 (ETS 300 581-7) [5] will prevail.
5.1 Overview and principles of operation
The function of the VAD is to distinguish between noise with speech present and noise without speech
present. This is achieved by comparing the energy of a filtered version of the input signal with a threshold.
The presence of speech is indicated whenever the threshold is exceeded.
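The principle described above, comparing the energy of a filtered frame with a threshold, might be sketched as follows. This is a floating-point, time-domain illustration only. The helper names (filtered_frame_energy, speech_indicated), the FIR filter form and all numeric values are assumptions made for this sketch. The normative fixed-point computation is the ANSI C code in GSM 06.06 (ETS 300 581-7) [5], and this sketch does not reproduce it.

```python
from typing import Sequence


def filtered_frame_energy(frame: Sequence[float], coeffs: Sequence[float]) -> float:
    """Energy of one frame after FIR filtering (hypothetical helper).

    Samples before the start of the frame are treated as zero.
    """
    energy = 0.0
    for n in range(len(frame)):
        y = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                y += c * frame[n - k]
        energy += y * y
    return energy


def speech_indicated(frame: Sequence[float], coeffs: Sequence[float],
                     threshold: float) -> bool:
    """True when the filtered-frame energy exceeds the threshold."""
    return filtered_frame_energy(frame, coeffs) > threshold


if __name__ == "__main__":
    # Illustrative values only: an alternating "noise-like" frame and a
    # two-tap filter that happens to suppress it.
    frame = [1.0, -1.0, 1.0, -1.0]
    print(speech_indicated(frame, [1.0], 3.0))       # unfiltered energy
    print(speech_indicated(frame, [1.0, 1.0], 3.0))  # filtered energy
```

In this toy example the two-tap filter removes most of the alternating component, so the same frame no longer exceeds the threshold once filtered. This mirrors the idea of reducing the noise content before the decision is made.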
The detection of speech in mobile environments is difficult due to the low speech/noise ratios which are
encountered, particularly in moving vehicles. To increase the probability of detecting speech, the input
signal is adaptively filtered (see subclause 5.2.1) to reduce its noise content before the voice activity
decision is made (see subclause 5.2.7).
The frequency spectrum and level of the noise may vary within a given environment as well as between
different environments. It is therefore necessary to adapt the input filter coefficients and energy threshold
at regular intervals as described in subclause 5.2.6.
5.2 Algorithm description
The block diagram of the VAD algorithm is shown in figure 1. The individual blocks are described in the
following subclauses. The global variables shown in the block diagram are described in table 1.

Table 1: Description of variables in figure 1
Var Description
acf The ACF vector which is calculated in the speech encoder
(GSM 06.20 (ETS 300 581-2) [2]).
av0 Averaged ACF vector.
av1 A previous value of av0.
lags The open loop long term predictor lags for the four speech encoder subframes
(GSM 06.20 (ETS 300 581-2) [2]).
ptch Boolean flag indicating the presence of a periodic signal component.
pvad Energy in the current filtered signal frame.
rav1 Autocorrelation vector obtained from av1.
rc The first four unquantized reflection coefficients calculated in the speech encoder
(GSM 06.20 (ETS 300 581-2) [2]).
rvad Autocorrelation vector of the adaptive filter predictor values.
stat Boolean flag indicating that the frequency spectrum of the input signal is stationary.
thvad Adaptive primary VAD threshold.
tone Boolean flag indicating the presence of an information tone.
vadflag Boolean VAD decision with hangover included.
v
...
