Speech and multimedia Transmission Quality (STQ) - Speech quality performance in the presence of background noise - Part 1: Background noise simulation technique and background noise database

The quality of background noise transmission is an important factor, which significantly contributes to the perceived overall quality of speech. Existing and even more the new generation of terminals, networks and system configurations including broadband services can be greatly improved with a proper design of terminals and systems in the presence of background noise. The present document:
- describes a noise simulation environment using realistic background noise scenarios for laboratory use;
- contains a database including the relevant background noise samples for subjective and objective evaluation.

The present document provides information about the recording techniques needed for background noise recordings and discusses the advantages and drawbacks of existing methods. The present document describes the requirements for laboratory conditions. The loudspeaker setup and the loudspeaker calibration and equalization procedure are described. The simulation environment specified can be used for the evaluation and optimization of terminals and of complex configurations including terminals, networks and other configurations. The main application areas should be: office, home and car environment.

The setup and database as described in the present document are applicable for:
- Objective performance evaluation of terminals in different (simulated) background noise environments.
- Speech processing evaluation by using the pre-processed speech signal in the presence of background noise, recorded by a terminal.
- Subjective evaluation of terminals by performing conversational tests, specific double talk tests or talking and listening tests in the presence of background noise.
- Subjective evaluation in third party listening tests by recording the speech samples of terminals in the presence of background noise.

General Information

Status
Published
Publication Date
09-Jun-2009
Current Stage
6060 - National Implementation/Publication (Adopted Project)
Start Date
11-May-2009
Due Date
16-Jul-2009
Completion Date
10-Jun-2009

Standard
ETSI EG 202 396-1 V1.2.3 (2009-03) - Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database
English language
58 pages
Standard
ETSI EG 202 396-1 V1.2.3 (2009-01) - Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database
English language
58 pages
Guide
SIST-V ETSI/EG 202 396-1 V1.2.3:2009 - COLOUR PAGES - GRAPHS
English language
58 pages
Standards Content (sample)

ETSI EG 202 396-1 V1.2.3 (2009-03)
ETSI Guide
Speech and multimedia Transmission Quality (STQ);
Speech quality performance
in the presence of background noise;
Part 1: Background noise simulation technique
and background noise database
Reference
REG/STQ-00139
Keywords
performance, quality, speech
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org

The present document may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services: http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members.

3GPP is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.

LTE™ is a Trade Mark of ETSI currently being registered for the benefit of its Members and of the 3GPP Organizational Partners.

GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.

Contents

Intellectual Property Rights
Foreword
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Overview of existing methods for realistic sound reproduction
4.1 Introduction
4.2 Surround Sound Techniques
4.3 IOSONO
4.4 Eidophonie
4.5 Four-loudspeaker arrangement for playback of binaurally recorded signals
4.6 NTT Background-Noise Database
4.7 General conclusions
5 Recording arrangement
5.1 Binaural equalization
5.2 The equalization procedure
6 Loudspeaker Setup for Background Noise Simulation
6.1 Test Room Requirements
6.2 Loudspeaker Positioning
6.3 Equalization and Calibration
6.4 Accuracy of the reproduction arrangement
6.4.1 Comparison between original sound field and simulated sound field
6.4.2 Displacement of the test arrangement in the simulated sound field
6.4.3 Transmission of background noise: Comparison of terminal performance in the original sound field and the simulated sound field
7 Background Noise Simulation in cars
7.1 General setup
7.2 Recording arrangement
7.2.1 Recording setup with the terminal's microphone
7.2.2 Recording setup with a pair of cardioid microphones
7.3 Equalization and Calibration with the terminal's microphone
7.4 Equalization and Calibration with a pair of cardioid microphones
7.5 Accuracy of the reproduction arrangement
7.5.1 Comparison between original sound field and simulated sound field
7.5.2 Transmission of background noise: Comparison of terminal performance in the original sound field and the simulated sound field
8 Background Noise Database
8.1 Binaural signals
8.2 Stereophonic signals
Annex A: Comparison of Tests in Sending Direction and D-Values Conducted in Different Rooms
A.1 Test Setup
A.2 Results of the Tests
A.2.1 Sending Frequency Response Characteristics and SLR
A.2.2 D-Value with Pink Noise
A.2.3 D-Value with Cafeteria Noise
A.3 Conclusions
Annex B: Graphs
History

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://webapp.etsi.org/IPR/home.asp).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.
Foreword

This ETSI Guide (EG) has been produced by ETSI Technical Committee Speech and multimedia Transmission Quality (STQ).

The present document is part 1 of a multi-part deliverable covering Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise, as identified below:

Part 1: "Background noise simulation technique and background noise database";
Part 2: "Background noise transmission - Network simulation - Subjective test database and results";
Part 3: "Background noise transmission - Objective test methods".
Introduction

Background noise is present in most conversations today and may significantly impair the speech communication performance of terminal and network equipment. Such equipment therefore needs to be tested and optimized using realistic background noises. Furthermore, the tests require reproducible conditions, which can be guaranteed only under laboratory-type conditions.

The present document addresses this issue by describing a methodology for recording and playback of background noises under well-defined and calibratable conditions in a laboratory-type environment. Furthermore, a database with real background noises is included.
1 Scope

The quality of background noise transmission is an important factor, which significantly contributes to the perceived overall quality of speech. Existing and even more the new generation of terminals, networks and system configurations including broadband services can be greatly improved with a proper design of terminals and systems in the presence of background noise. The present document:

• describes a noise simulation environment using realistic background noise scenarios for laboratory use;
• contains a database including the relevant background noise samples for subjective and objective evaluation.

The present document provides information about the recording techniques needed for background noise recordings and discusses the advantages and drawbacks of existing methods. The present document describes the requirements for laboratory conditions. The loudspeaker setup and the loudspeaker calibration and equalization procedure are described. The simulation environment specified can be used for the evaluation and optimization of terminals and of complex configurations including terminals, networks and other configurations. The main application areas should be: office, home and car environment.

The setup and database as described in the present document are applicable for:

• Objective performance evaluation of terminals in different (simulated) background noise environments.
• Speech processing evaluation by using the pre-processed speech signal in the presence of background noise, recorded by a terminal.
• Subjective evaluation of terminals by performing conversational tests, specific double talk tests or talking and listening tests in the presence of background noise.
• Subjective evaluation in third party listening tests by recording the speech samples of terminals in the presence of background noise.
2 References

References are either specific (identified by date of publication and/or edition number or version number) or non-specific.

• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the purposes of the referring document;
- for informative references.

Referenced documents which are not found to be publicly available in the expected location might be found at http://docbox.etsi.org/Reference.

NOTE: While any hyperlinks included in this clause were valid at the time of publication ETSI cannot guarantee their long term validity.
2.1 Normative references

The following referenced documents are indispensable for the application of the present document. For dated references, only the edition cited applies. For non-specific references, the latest edition of the referenced document (including any amendments) applies.

Not applicable.
2.2 Informative references

The following referenced documents are not essential to the use of the present document but they assist the user with regard to a particular subject area. For non-specific references, the latest version of the referenced document (including any amendments) applies.

[i.1] Surround Sound Past, Present, and Future: "A history of multichannel audio from mag stripe to Dolby Digital", Joseph Hull - Dolby Laboratories Inc.
[i.2] AES preprint 3332 (1992): "Improved Possibilities of Binaural Recording and Playback Techniques", K. Genuit, H.W. Gierlich, U. Künzli.
NOTE: See at http://www.aes.org/e-lib/browse.cfm?elib=6801.
[i.3] AES preprint 3732 (1993): "A System for the Reproduction Technique for Playback of Binaural Recordings", N. Xiang, K. Genuit, H.W. Gierlich.
NOTE: See at http://www.aes.org/e-lib/browse.cfm?elib=6501.
[i.4] NTTAT Database: "Ambient Noise Database CD-ROM".
NOTE: See at http://www.ntt-at.com/products_e/noise-DB/index.html.
[i.5] ISO 11904-1: "Acoustics - Determination of sound immission from sound sources placed close to the ear - Part 1: Technique using a microphone in a real ear (MIRE technique)".
[i.6] Spatial Hearing: "The psychophysics of human sound localization", J. Blauert.
[i.7] ITU-T Recommendation P.57: "Artificial ears".
[i.8] ITU-T Recommendation P.58: "Head and torso simulator for telephonometry".
[i.9] ITU-T Recommendation P.340: "Transmission characteristics and speech quality parameters of hands-free terminals".
[i.10] ITU-T Recommendation P.64: "Determination of sensitivity/frequency characteristics of local telephone systems".
[i.11] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".
[i.12] Genuit, K.: "A Description of the Human Outer Ear Transfer Function by Elements of Communication Theory (No. B6-8)".
NOTE: Proceedings of the 12th International Congress on Acoustics. Toronto published on behalf of the Technical Program Committee by the Executive Committee of the 12th International Congress on Acoustics.
[i.13] IEC 60050-722: "International Electrotechnical Vocabulary - Chapter 722: Telephony".
[i.14] "Wellenfeldsynthese - Eine neue Dimension der 3D-Audiowiedergabe"; Fernseh- und Kino-Technik, Nr. 11/2002, pp. 735-738.
[i.15] "The Iosono Sound Difference".
NOTE: http://www.iosono-sound.de
[i.16] "Ein neues Verfahren der raumbezogenen Stereophonie mit verbesserter Übertragung der Rauminformation"; P. Scherer, Rundfunktechnische Mitteilungen, 1977, pp. 196-204.
[i.17] ETSI EG 202 396-1 (V.1.1.2): "Speech Processing, Transmission and Quality Aspects (STQ); Speech quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database".
[i.18] ETSI TS 151 010-1: "Digital cellular telecommunications system (Phase 2+); Mobile Station (MS) conformance specification; Part 1: Conformance specification (3GPP TS 51.010-1)".

3 Definitions and abbreviations
3.1 Definitions

For the purposes of the present document, the following terms and definitions apply:

crosstalk: appearance of undesired energy in a channel, owing to the presence of a signal in another channel, caused by, for example, induction, conduction or non-linearity
NOTE: See IEC 60050-722 [i.13].
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
CD Compact Disc
FFT Fast Fourier Transform
FIR Finite Impulse Response
HATS Head And Torso Simulator
IIR Infinite Impulse Response
MIRE Microphone In Real Ear
NTT Nippon Telegraph and Telephone Corporation
SLR Send Loudness Rating
VHF Very High Frequency
4 Overview of existing methods for realistic sound reproduction
4.1 Introduction

In general, the existing methods for close-to-original sound recording and reproduction aim at different applications:

• Techniques intending to reproduce the actual sound field.
• Techniques providing hearing adequate (ear related) signals in the human ear canal.
• Techniques generating artificial acoustical environments.

Within this clause the different methods are briefly described and their applicability for close-to-original sound-field reproduction is discussed. A variety of methods have been studied; the following is a summary of the most important ones relevant to the present document. The different methods were analyzed on the basis of the following requirements:
• The background noise recording technique should be:
- easy to use;
- easy to calibrate;
- capable of wideband recording;
- available at reasonable costs;

- mostly compatible to existing standards and procedures used in telecommunications testing;

- applicable to different environments (at least office, home and car).
• The background noise simulation arrangement should:
- be easy to set up;
- not require any specific acoustical treatment for the simulation requirement;
- provide a mostly realistic background noise simulation for all typical background noises encountered in telecommunication applications;
- be easy to calibrate;
- be mostly insensitive to the positioning of (test) objects in the simulated sound field;
- be applicable to all typical terminals used in telecommunication;
- be available at reasonable costs.
4.2 Surround Sound Techniques

The basics of surround techniques are found in cinema applications. The virtual image provided by stereophonic presentation of sounds seemed not to be sufficient for the large screen display in cinema. In the 1950s, 4-channel and 6-channel soundtracks recorded on magnetic stripes associated with the films were developed, and 4-channel and 6-channel loudspeaker systems were installed in cinemas to reproduce the multichannel sounds. The newer techniques were mostly developed and marketed by Dolby® [i.1]: Dolby Surround, Dolby Surround Pro Logic, Dolby Digital and Dolby Digital Surround are examples of the techniques introduced more recently. The most common configuration is the "5.1-configuration", used in cinemas and in home applications as well. The reproduction system consists of a left and a right channel, a centre speaker, two surround channels (left and right, arranged behind the listener) and a low frequency channel for low frequency effects.

The aim of all surround systems is to create an artificial acoustical image in the recording studio rather than to record a real acoustical scenario and provide true-to-original playback possibilities.

On the recording side, special surround encoders are used which encode the 5-channel signal from a special mixing console into the 5.1 digital data stream. The playback system consists of a special decoder which separates the 5 channels again and distributes them to the 5.1 loudspeaker playback system. The systems are mono and stereo compatible and can handle the older 4-channel surround techniques by means of a specific decoder.

Applications:

Typical applications for surround systems are cinemas and home theatres. The source material is produced by professional recording studios using multi-channel mixing consoles and specific 5.1 decoding techniques. In almost all cases virtual environments are created which support the visual image by an appropriate acoustical image.

Conclusion:

Surround techniques are designed for creating acoustical images rather than for close-to-original recording and reproduction. Although the spatial impression provided by surround techniques is sometimes remarkable, the acoustical image created is always artificial. Because easy-to-use recording techniques allowing a spatial recording of a sound field are lacking, surround sound techniques are not suitable for the creation of a background noise database with realistic background noises or for calibrated background noise simulation in a lab.
4.3 IOSONO

The IOSONO® sound system (see [i.14] and [i.15]) is based on Wave-Field Synthesis. It employs Huygens' principle of wave theory. Applied to acoustics, this principle means that it is possible to reproduce any form of wave front with an array of loudspeakers, so that virtual sound sources can be placed anywhere within a listening area. For practical use it is necessary to position loudspeakers all round the playback room. In order to generate realistic sound fields, the input signal for each loudspeaker has to be calculated separately. For this purpose each single sound source (e.g. voices) has to be recorded individually. If the recordings are done in a room, the characteristics (like reverberation) of the recording room also have to be recorded separately. All resulting sound tracks are then mixed and manipulated during the post-editing process and the reproduction.

The natural and realistic spatial sound reproduction is then achieved in a wide area of the playback room. Common 5.1 stereo systems achieve a "realistic" sound reproduction only in a small area of the reproduction room.
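
As a rough illustration of why each loudspeaker needs its own input signal, the following Python sketch computes the delay and gain with which a single virtual point source could be fed to every loudspeaker of an array. It is a simplified delay-and-attenuate model in the spirit of Huygens' principle, not the IOSONO driving function. The array geometry, the function name `point_source_feeds` and the speed-of-sound value are assumptions made for this example only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def point_source_feeds(speaker_xy, source_xy):
    """Return (delays_s, gains) for each loudspeaker of an array.

    Simplified model: every loudspeaker re-emits the source signal,
    delayed by the travel time from the virtual source and attenuated
    with 1/r. Real wave-field synthesis uses more elaborate driving
    functions; this only shows that the feeds differ per loudspeaker.
    """
    speaker_xy = np.asarray(speaker_xy, dtype=float)
    source_xy = np.asarray(source_xy, dtype=float)
    distances = np.linalg.norm(speaker_xy - source_xy, axis=1)
    delays = distances / SPEED_OF_SOUND
    gains = 1.0 / np.maximum(distances, 1e-3)  # guard against r = 0
    return delays, gains / gains.max()  # loudest feed normalised to 1


# Hypothetical array: 9 loudspeakers, 0.25 m apart, on the line y = 0.
speakers = np.column_stack([np.linspace(-1.0, 1.0, 9), np.zeros(9)])
# Hypothetical virtual source 2 m behind the centre of the array.
delays, gains = point_source_feeds(speakers, source_xy=(0.0, -2.0))
```

With one such set of delays and gains per virtual source, the number of signals to be computed grows with both the number of sources and the number of loudspeakers.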

Applications:

Typical applications are sound systems for home use, cinemas and other entertainment events. The IOSONO sound system is also able to play back recordings made in common stereo or 5.1 stereo techniques.

Conclusion:

The drawbacks of this method are the components needed: a sophisticated recording system, a powerful computing unit for real-time mixing of the large number of recorded sound tracks and the number of loudspeakers that have to be installed in the listening room. In a common size cinema, for example, about 200 loudspeakers are needed.

The advantage is that with the IOSONO sound system a very realistic sound reproduction is possible, but it requires an enormous effort, which is too high for daily use in laboratories.
4.4 Eidophonie

This method (see [i.16]) was developed for realistic sound reproduction using the VHF transmission technique. The main principle is to separate the base signal from the part of the signal which contains the information about the direction of sound incidence.

For recording, a 1st-order gradient microphone with a cardioid directivity is used. During the recordings its directivity rotates at 38 kHz in the recording plane. This "turning microphone" provides an amplitude-modulated signal at its electrical output. The resulting side bands are outside the transmitted frequency range, but they contain the information on the direction of sound incidence. Using the VHF transmission techniques, this phase information can be transmitted within the 2nd audio-frequency channel.

The sound reproduction is made by a spatial demodulation: a switch is positioned before each loudspeaker and each switches synchronously with the turning directivity. So a low-pass filtered short-term section of the signal containing the information on the direction of sound incidence is played back on each loudspeaker. The loudspeakers are positioned all around the playback room.
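
The modulation effect described above can be illustrated numerically. The Python sketch below models an ideal cardioid whose main axis rotates at 38 kHz and feeds it a single 1 kHz tone from one direction. The output then contains the base tone plus two side bands at 37 kHz and 39 kHz, and the phase of a side band carries the direction of incidence. The idealised cardioid pattern, the single test tone, the 60 degree direction and the 192 kHz sampling rate are assumptions of this example, not values from the present document.

```python
import numpy as np

FS = 192_000     # sampling rate in Hz (assumed, high enough for 39 kHz)
F_ROT = 38_000   # rotation frequency of the directivity in Hz
N = 1920         # 10 ms of signal, gives 100 Hz FFT resolution


def rotating_cardioid_output(source, source_angle_rad, fs=FS, f_rot=F_ROT):
    """Output of an ideal cardioid whose main axis rotates at f_rot."""
    t = np.arange(len(source)) / fs
    main_axis = 2.0 * np.pi * f_rot * t
    # Cardioid sensitivity towards the source for the current axis angle.
    sensitivity = 0.5 * (1.0 + np.cos(source_angle_rad - main_axis))
    return sensitivity * source


t = np.arange(N) / FS
tone = np.cos(2.0 * np.pi * 1000.0 * t)   # 1 kHz test tone
angle = np.deg2rad(60.0)                  # hypothetical direction of incidence
out = rotating_cardioid_output(tone, angle)

spectrum = np.fft.rfft(out)
amplitude = 2.0 * np.abs(spectrum) / N
freqs = np.fft.rfftfreq(N, d=1.0 / FS)

# Frequencies with significant energy: base tone and the two side bands.
peaks = freqs[amplitude > 0.1]
# The upper side band has phase -angle, so the direction can be read back.
upper = np.argmin(np.abs(freqs - 39_000.0))
recovered_angle = -np.angle(spectrum[upper])
```

In this idealised case the base signal and the directional information occupy separate frequency ranges, which is what allows them to be transmitted separately.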
Applications:

Eidophonie was developed to provide a realistic sound environment using a signal received from a VHF broadcast station. With this technique the common stereo sound reproduction should be improved. Nevertheless Eidophonie is also compatible with common mono and stereo recordings.
Conclusion:

A benefit of this system is that three loudspeakers are sufficient to produce a realistic sound field. Using more loudspeakers (e.g. 16), the spatial sound reproduction becomes more and more independent of the listening position. Moreover, the independence of the transmitted sound from the acoustics of the reproduction room increases with the number of loudspeakers used. But there are significant limitations of the method: the microphone directivity is frequency dependent and not ideal, so interference between the different channels is created. A second problem is the loudspeaker directivity, which does not fit the microphone directivity. This problem could be reduced if the number of channels were increased. This, however, is not possible due to the limited directivity of the microphone arrangement used.

Localization of sound sources is hardly possible due to the interference effects of the microphone signals and the loudspeakers. A close-to-original reproduction depends on the number and distribution of sound sources present. For most of the sound source combinations this goal cannot be achieved.

In general, the coding technique needed to record the sound field by a "turning microphone" is complicated and not available commercially. A further drawback of this method is the complicated decoding technique needed on the reproduction side, which is also not commercially available.
4.5 Four-loudspeaker arrangement for playback of binaurally recorded signals

This reproduction procedure was originally investigated for reproducing binaural signals recorded using artificial head technology. It improves the impressions of direction and distance. Four loudspeakers are typically positioned in a square formation around a central point (the listening point), equidistant from it (e.g. 2 m). The binaural recordings are played back as follows: the two left-hand loudspeakers both receive the free-field equalized artificial head signal of the left-hand channel only. The right-hand side is arranged similarly. For equalization, the transfer function from the two left-hand loudspeakers is measured at the artificial head's left ear channel. With this result IIR and FIR fil

...

Final draft ETSI EG 202 396-1 V1.2.3 (2009-01)
ETSI Guide
Speech and multimedia Transmission Quality (STQ);
Speech quality performance
in the presence of background noise;
Part 1: Background noise simulation technique
and background noise database
Reference
REG/STQ-00139
Keywords
performance, quality, speech
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org

The present document may be made available in more than one electronic version or in print. In any case of existing or

perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).

In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive

within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status.

Information on the current status of this and other ETSI documents is available at

http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services:

http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members.

3GPP is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.

LTE™ is a Trade Mark of ETSI currently being registered
for the benefit of its Members and of the 3GPP Organizational Partners.

GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.

Contents

Intellectual Property Rights
Foreword
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Overview of existing methods for realistic sound reproduction
4.1 Introduction
4.2 Surround Sound Techniques
4.3 IOSONO
4.4 Eidophonie
4.5 Four-loudspeaker arrangement for playback of binaurally recorded signals
4.6 NTT Background-Noise Database
4.7 General conclusions
5 Recording arrangement
5.1 Binaural equalization
5.2 The equalization procedure
6 Loudspeaker Setup for Background Noise Simulation
6.1 Test Room Requirements
6.2 Loudspeaker Positioning
6.3 Equalization and Calibration
6.4 Accuracy of the reproduction arrangement
6.4.1 Comparison between original sound field and simulated sound field
6.4.2 Displacement of the test arrangement in the simulated sound field
6.4.3 Transmission of background noise: Comparison of terminal performance in the original sound field and the simulated sound field
7 Background Noise Simulation in cars
7.1 General setup
7.2 Recording arrangement
7.2.1 Recording setup with the terminal's microphone
7.2.2 Recording setup with a pair of cardioid microphones
7.3 Equalization and Calibration with the terminal's microphone
7.4 Equalization and Calibration with a pair of cardioid microphones
7.5 Accuracy of the reproduction arrangement
7.5.1 Comparison between original sound field and simulated sound field
7.5.2 Transmission of background noise: Comparison of terminal performance in the original sound field and the simulated sound field
8 Background Noise Database
8.1 Binaural signals
8.2 Stereophonic signals
Annex A: Comparison of Tests in Sending Direction and D-Values Conducted in Different Rooms
A.1 Test Setup
A.2 Results of the Tests
A.2.1 Sending Frequency Response Characteristics and SLR
A.2.2 D-Value with Pink Noise
A.2.3 D-Value with Cafeteria Noise
A.3 Conclusions
Annex B: Graphs
History
Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information

pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found

in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in

respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web

server (http://webapp.etsi.org/IPR/home.asp).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee

can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web

server) which are, or may be, or may become, essential to the present document.
Foreword

This ETSI Guide (EG) has been produced by ETSI Technical Committee Speech and multimedia Transmission Quality

(STQ), and is now submitted for the ETSI standards Membership Approval Procedure.

The present document is part 1 of a multi-part deliverable covering Speech and multimedia Transmission Quality

(STQ); Speech quality performance in the presence of background noise, as identified below:

Part 1: "Background noise simulation technique and background noise database";

Part 2: "Background noise transmission - Network simulation - Subjective test database and results";

Part 3: "Background noise transmission - Objective test methods".
Introduction

Background noise is present in most conversations today and may significantly impair the speech communication performance of terminal and network equipment. Such equipment therefore needs to be tested and optimized using realistic background noises. Furthermore, the tests require reproducible conditions, which can be guaranteed only in a lab-type environment.

The present document addresses this issue by describing a methodology for recording and playback of background noises under well-defined and calibratable conditions in a lab-type environment. Furthermore, a database of real background noises is included.
1 Scope

The quality of background noise transmission is an important factor that contributes significantly to the perceived overall quality of speech. Existing terminals, networks and system configurations, and even more so the new generation including broadband services, can be greatly improved by designing terminals and systems properly for use in the presence of background noise. The present document:

• describes a noise simulation environment using realistic background noise scenarios for laboratory use;

• contains a database including the relevant background noise samples for subjective and objective evaluation.

The present document provides information about the recording techniques needed for background noise recordings and discusses the advantages and drawbacks of existing methods. It describes the requirements for laboratory conditions as well as the loudspeaker setup and the loudspeaker calibration and equalization procedure. The simulation environment specified can be used for the evaluation and optimization of terminals and of complex configurations including terminals, networks and other configurations. The main application areas are expected to be office, home and car environments.
The setup and database as described in the present document are applicable for:

• Objective performance evaluation of terminals in different (simulated) background noise environments.

• Speech processing evaluation by using the pre-processed speech signal in the presence of background noise,

recorded by a terminal.

• Subjective evaluation of terminals by performing conversational tests, specific double talk tests or talking and

listening tests in the presence of background noise.

• Subjective evaluation in third party listening tests by recording the speech samples of terminals in the presence

of background noise.
2 References

References are either specific (identified by date of publication and/or edition number or version number) or

non-specific.
• For a specific reference, subsequent revisions do not apply.

• Non-specific reference may be made only to a complete document or a part thereof and only in the following

cases:

- if it is accepted that it will be possible to use all future changes of the referenced document for the

purposes of the referring document;
- for informative references.

Referenced documents which are not found to be publicly available in the expected location might be found at

http://docbox.etsi.org/Reference.

NOTE: While any hyperlinks included in this clause were valid at the time of publication ETSI cannot guarantee

their long term validity.
2.1 Normative references

The following referenced documents are indispensable for the application of the present document. For dated

references, only the edition cited applies. For non-specific references, the latest edition of the referenced document

(including any amendments) applies.
Not applicable.
2.2 Informative references

The following referenced documents are not essential to the use of the present document but they assist the user with

regard to a particular subject area. For non-specific references, the latest version of the referenced document (including

any amendments) applies.

[i.1] Surround Sound Past, Present, and Future: "A history of multichannel audio from mag stripe to Dolby Digital", Joseph Hull - Dolby Laboratories Inc.

[i.2] AES preprint 3332 (1992): "Improved Possibilities of Binaural Recording and Playback Techniques", K. Genuit, H.W. Gierlich, U. Künzli.
NOTE: See at http://www.aes.org/e-lib/browse.cfm?elib=6801.

[i.3] AES preprint 3732 (1993): "A System for the Reproduction Technique for Playback of Binaural Recordings", N. Xiang, K. Genuit, H.W. Gierlich.
NOTE: See at http://www.aes.org/e-lib/browse.cfm?elib=6501.

[i.4] NTTAT Database: "Ambient Noise Database CD-ROM".
NOTE: See at http://www.ntt-at.com/products_e/noise-DB/index.html.

[i.5] ISO 11904-1: "Acoustics - Determination of sound immission from sound sources placed close to the ear - Part 1: Technique using a microphone in a real ear (MIRE technique)".

[i.6] Spatial Hearing: "The psychophysics of human sound localization", J. Blauert.

[i.7] ITU-T Recommendation P.57: "Artificial ears".

[i.8] ITU-T Recommendation P.58: "Head and torso simulator for telephonometry".

[i.9] ITU-T Recommendation P.340: "Transmission characteristics and speech quality parameters of hands-free terminals".

[i.10] ITU-T Recommendation P.64: "Determination of sensitivity/frequency characteristics of local telephone systems".

[i.11] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".

[i.12] Genuit, K.: "A Description of the Human Outer Ear Transfer Function by Elements of Communication Theory (No. B6-8)".
NOTE: Proceedings of the 12th International Congress on Acoustics. Toronto published on behalf of the Technical Program Committee by the Executive Committee of the 12th International Congress on Acoustics.

[i.13] IEC 60050-722: "International Electrotechnical Vocabulary - Chapter 722: Telephony".

[i.14] "Wellenfeldsynthese - Eine neue Dimension der 3D-Audiowiedergabe"; Fernseh- und Kino-Technik, Nr. 11/2002, pp. 735-738.

[i.15] "The Iosono Sound Difference".
NOTE: http://www.iosono-sound.de

[i.16] "Ein neues Verfahren der raumbezogenen Stereophonie mit verbesserter Übertragung der Rauminformation"; P. Scherer, Rundfunktechnische Mitteilungen, 1977, pp. 196-204.

[i.17] ETSI EG 202 396-1 (V.1.1.2): "Speech Processing, Transmission and Quality Aspects (STQ); Speech quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database".

[i.18] ETSI TS 151 010-1: "Digital cellular telecommunications system (Phase 2+); Mobile Station (MS) conformance specification; Part 1: Conformance specification (3GPP TS 51.010-1)".

3 Definitions and abbreviations
3.1 Definitions

For the purposes of the present document, the following terms and definitions apply:

crosstalk: appearance of undesired energy in a channel, owing to the presence of a signal in another channel, caused by, for example, induction, conduction or non-linearity
NOTE: See IEC 60050-722 [i.13].
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
CD Compact Disc
FFT Fast Fourier Transform
FIR Finite Impulse Response
HATS Head And Torso Simulator
IIR Infinite Impulse Response
MIRE Microphone In Real Ear
NTT Nippon Telegraph and Telephone Corporation
SLR Send Loudness Rating
VHF Very High Frequency
4 Overview of existing methods for realistic sound reproduction
4.1 Introduction

In general, the existing methods for close to original sound recording and reproduction are aimed at different applications:

• Techniques intending to reproduce the actual sound field.

• Techniques providing hearing adequate (ear related) signals in the human ear canal.

• Techniques generating artificial acoustical environments.

Within this clause the different methods are briefly described and their applicability for close to original sound field reproduction is discussed. A variety of methods have been studied; the following is a summary of the most important ones relevant to the present document. The different methods were analyzed on the basis of the following requirements:
• The background noise recording technique should be:
- easy to use;
- easy to calibrate;
- capable of wideband recording;
- available at reasonable cost;
- mostly compatible with existing standards and procedures used in telecommunications testing;
- applicable to different environments (at least office, home and car).
• The background noise simulation arrangement should:
- be easy to set up;
- not require any specific acoustical treatment for the simulation requirement;
- provide a mostly realistic background noise simulation for all typical background noises encountered in telecommunication applications;
- be easy to calibrate;
- be mostly insensitive to the positioning of (test) objects in the simulated sound field;
- be applicable to all typical terminals used in telecommunication;
- be available at reasonable cost.
4.2 Surround Sound Techniques

Surround techniques originate in cinema applications, where the virtual image provided by stereophonic presentation of sounds was not considered sufficient for the large screen. In the 1950s, 4-channel and 6-channel soundtracks recorded on magnetic stripes on the film were developed, and 4-channel and 6-channel loudspeaker systems were installed in cinemas to reproduce the multichannel sound. The newer techniques were mostly developed and marketed by Dolby® [i.1]: Dolby Surround, Dolby Surround Pro Logic, Dolby Digital and Dolby Digital Surround are examples of the techniques introduced more recently. The most common configuration is the "5.1 configuration", which is used in cinemas and in home applications as well. The reproduction system consists of a left and a right channel, a centre speaker, two surround channels (left and right, arranged behind the listener) and a low frequency channel for low frequency effects.

The aim of all surround systems is to create an artificial acoustical image in the recording studio rather than to record a real acoustical scenario and provide true to original playback.

On the recording side, special surround encoders are used to encode the 5-channel signal from a special mixing console into the 5.1 digital data stream. The playback system contains a special decoder that separates the 5 channels again and distributes them to the 5.1 loudspeaker playback system. The systems are mono and stereo compatible and can handle the older 4-channel surround techniques by means of a specific decoder.

Applications:

Typical applications for surround systems are cinemas and home theatres. The source material is produced by professional recording studios using multi-channel mixing consoles and specific 5.1 encoding techniques. In almost all cases, virtual environments are created that support the visual image with an appropriate acoustical image.

Conclusion:

Surround techniques are designed for creating acoustical images rather than for close to original recording and reproduction. Although the spatial impression provided by surround techniques is sometimes remarkable, the acoustical image created is always artificial. Because there are no easy to use recording techniques that allow a spatial recording of a sound field, surround sound techniques are not suitable for creating a background noise database with realistic background noises or for calibrated background noise simulation in a lab.
4.3 IOSONO

The IOSONO® sound system (see [i.14] and [i.15]) is based on wave-field synthesis, which employs Huygens' principle of wave theory. Applied to acoustics, this principle means that any form of wave front can be reproduced with an array of loudspeakers, so that virtual sound sources can be placed anywhere within a listening area. For practical use, loudspeakers have to be positioned all around the playback room. In order to generate realistic sound fields, the input signal for each loudspeaker has to be calculated separately. For this purpose, each single sound source (e.g. a voice) has to be recorded individually. If the recordings are made in a room, the characteristics of the recording room (such as reverberation) also have to be recorded separately. All resulting sound tracks are then mixed and manipulated during post-editing and reproduction.
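The per-loudspeaker calculation mentioned above can be illustrated with a deliberately simplified sketch. It is not the IOSONO algorithm, whose details are not given in the present document; it only shows the underlying idea that each loudspeaker of an array receives the source signal with its own delay and attenuation, derived from its distance to a virtual point source. The array geometry, the 1/r amplitude law, the speed of sound of 343 m/s and all function names are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def driving_parameters(source_xy, speaker_positions):
    """Return one (delay in seconds, gain) pair per loudspeaker.

    Simplified point-source model: the delay is distance / c and the gain
    falls off as 1 / distance. Real wave-field synthesis driving functions
    contain additional filtering and weighting that are omitted here.
    """
    params = []
    for sx, sy in speaker_positions:
        distance = math.hypot(sx - source_xy[0], sy - source_xy[1])
        distance = max(distance, 1e-3)  # avoid division by zero
        params.append((distance / SPEED_OF_SOUND, 1.0 / distance))
    return params


def render(signal, sample_rate, params):
    """Apply each (delay, gain) pair to a mono signal.

    Returns one list of samples per loudspeaker; delays are rounded to
    whole samples.
    """
    outputs = []
    for delay, gain in params:
        shift = round(delay * sample_rate)
        outputs.append([0.0] * shift + [gain * x for x in signal])
    return outputs


if __name__ == "__main__":
    # Hypothetical line array: 5 loudspeakers spaced 0.5 m apart on the x axis.
    speakers = [(0.5 * i - 1.0, 0.0) for i in range(5)]
    # Virtual source 2 m behind the centre of the array.
    params = driving_parameters((0.0, -2.0), speakers)
    for (x, _), (delay, gain) in zip(speakers, params):
        print(f"x = {x:+.1f} m: delay = {delay * 1000:.2f} ms, gain = {gain:.3f}")
```

Even in this reduced form, every recorded source needs its own set of delays and gains for every loudspeaker, which indicates why the computing effort grows quickly with the number of sources and loudspeakers.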

Natural and realistic spatial sound reproduction is then achieved over a wide area of the playback room. Common 5.1 stereo systems achieve a "realistic" sound reproduction only in a small area of the reproduction room.

Applications:

Typical applications are sound systems for home use, cinemas and other entertainment events. The IOSONO sound

system is also able to play back recordings made in common stereo or 5.1 stereo techniques.

Conclusion:

The drawbacks of this method are the components needed: a sophisticated recording system, a powerful computing unit for real-time mixing of the large number of recorded sound tracks, and the number of loudspeakers that have to be installed in the listening room. In a cinema of common size, for example, about 200 loudspeakers are needed.

The advantage is that the IOSONO sound system makes a very realistic sound reproduction possible, but the effort required is too high for daily use in laboratories.
4.4 Eidophonie

This method (see [i.16]) was developed for realistic sound reproduction using the VHF transmission technique. The main principle is to separate the base signal from the part of the signal that contains the information about the direction of sound incidence.

For recording, a 1st order gradient microphone with a cardioid directivity is used. During recording, its directivity rotates at 38 kHz in the recording plane. This "turning microphone" provides an amplitude-modulated signal at its electrical output. The resulting side bands lie outside the transmitted frequency range, but they contain the information on the direction of sound incidence. Using VHF transmission techniques, this phase information can be transmitted within the 2nd audio-frequency channel.

The sound is reproduced by spatial demodulation: a switch is placed before each loudspeaker, and each switch operates synchronously with the turning directivity. Each loudspeaker therefore plays back a low-pass filtered short-term section of the signal that contains the information on the direction of sound incidence. The loudspeakers are positioned all around the playback room.
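The recording and demodulation principle can be made concrete with a small idealised model. The sketch below assumes a single sound source, an ideal cardioid 0.5 · (1 + cos(angle)) and ideal switches that take one sample per rotation; the low-pass filtering mentioned above is omitted. Only the 38 kHz rotation rate comes from the text; the function names and the test signal are hypothetical.

```python
import math

ROTATION_HZ = 38_000.0  # rotation rate of the directivity, as stated in the text


def cardioid_gain(source_angle, look_angle):
    """Ideal first-order cardioid: 1 towards the look direction, 0 to the rear."""
    return 0.5 * (1.0 + math.cos(source_angle - look_angle))


def turning_microphone(signal, source_angle, t):
    """Microphone output at time t for one source at source_angle (radians).

    signal is a function of time. The look direction rotates at ROTATION_HZ,
    so the source signal is amplitude-modulated, and the phase of the
    modulation encodes the direction of sound incidence.
    """
    look_angle = 2.0 * math.pi * ROTATION_HZ * t
    return signal(t) * cardioid_gain(source_angle, look_angle)


def loudspeaker_samples(signal, source_angle, speaker_angle, n_rotations):
    """Spatial demodulation: sample once per rotation, at the instant the
    directivity points towards the loudspeaker at speaker_angle."""
    samples = []
    for k in range(n_rotations):
        t = (k + speaker_angle / (2.0 * math.pi)) / ROTATION_HZ
        samples.append(turning_microphone(signal, source_angle, t))
    return samples


if __name__ == "__main__":
    def tone(t):
        return math.sin(2.0 * math.pi * 1000.0 * t)  # 1 kHz test tone

    source = math.radians(90.0)  # hypothetical source direction
    for degrees in (0, 90, 180, 270):
        samples = loudspeaker_samples(tone, source, math.radians(degrees), 380)
        level = math.sqrt(sum(x * x for x in samples) / len(samples))
        print(f"loudspeaker at {degrees:3d} deg: RMS level {level:.3f}")
```

In this model the loudspeaker in the direction of the source receives the full signal, the opposite loudspeaker receives nothing, and loudspeakers in between receive the signal weighted by the cardioid pattern. A real microphone deviates from the ideal cardioid in a frequency-dependent way, so this clean separation between channels is not obtained in practice.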
Applications:

Eidophonie was developed to provide a realistic sound environment using a signal received from a VHF broadcast station. The technique was intended to improve on common stereo sound reproduction. Nevertheless, Eidophonie is also compatible with common mono and stereo recordings.
Conclusion:

A benefit of this system is that three loudspeakers are sufficient to produce a realistic sound field. With more loudspeakers (e.g. 16), the spatial sound reproduction becomes increasingly independent of the listening position. Moreover, the independence of the transmitted sound from the acoustics of the reproduction room increases with the number of loudspeakers used. However, the method has significant limitations. The microphone directivity is frequency dependent and not ideal, which creates interference between the different channels. A second problem is the loudspeaker directivity, which does not match the microphone directivity. This problem could be reduced if the number of channels were increased, but that is not possible due to the limited directivity of the microphone arrangement used.

Localization of sound sources is hardly possible due to the interference effects of the microphone signals and the loudspeakers. Whether a close to original reproduction is achieved depends on the number and distribution of sound sources present. For most sound source combinations this goal cannot be achieved.

In general, the coding technique needed to record the sound field with a "turning microphone" is complicated and not commercially available. A further drawback of this method is the complicated decoding technique needed on the reproduction side, which is also not commercially available.
4.5 Four-loudspeaker arrangement for playback of binaurally recorded signals

This reproduction procedure was originally investigated for reproducing binaural signals recorded using artificial head technology. It improves the impressions of direction and distance. Four loudspeakers are typically positioned in a square formation around a central point (the listening point), equidistant from it (e.g. 2 m). The binaural recordings are played back as follows: the two left-hand loudspeakers both receive the free-field equalized artificial head signal of the left-hand channel onl
...

SLOVENIAN STANDARD
SIST-V ETSI/EG 202 396-1 V1.2.3:2009
01-July-2009
Kakovost prenosa govora in večpredstavnih vsebin (STQ) - Zmogljivost kakovosti govora v prisotnosti šuma ozadja - 1. del: Simulacijska tehnika šuma ozadja in podatkovna zbirka šumov ozadja

Speech and multimedia Transmission Quality (STQ) - Speech quality performance in the presence of background noise - Part 1: Background noise simulation technique and background noise database
This Slovenian standard is identical to: EG 202 396-1 Version 1.2.3
ICS:
33.040.35 Telephone networks
SIST-V ETSI/EG 202 396-1 V1.2.3:2009 en

2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

ETSI EG 202 396-1 V1.2.3 (2009-03)
ETSI Guide
Speech and multimedia Transmission Quality (STQ);
Speech quality performance
in the presence of background noise;
Part 1: Background noise simulation technique
and background noise database
Reference
REG/STQ-00139
Keywords
performance, quality, speech
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org

The present document may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services: http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members.

3GPP is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.

LTE™ is a Trade Mark of ETSI currently being registered for the benefit of its Members and of the 3GPP Organizational Partners.

GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.

Contents
Intellectual Property Rights
Foreword
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Overview of existing methods for realistic sound reproduction
4.1 Introduction
4.2 Surround Sound Techniques
4.3 IOSONO
4.4 Eidophonie
4.5 Four-loudspeaker arrangement for playback of binaurally recorded signals
4.6 NTT Background-Noise Database
4.7 General conclusions
5 Recording arrangement
5.1 Binaural equalization
5.2 The equalization procedure
6 Loudspeaker Setup for Background Noise Simulation
6.1 Test Room Requirements
6.2 Loudspeaker Positioning
6.3 Equalization and Calibration
6.4 Accuracy of the reproduction arrangement
6.4.1 Comparison between original sound field and simulated sound field
6.4.2 Displacement of the test arrangement in the simulated sound field
6.4.3 Transmission of background noise: Comparison of terminal performance in the original sound field and the simulated sound field
7 Background Noise Simulation in cars
7.1 General setup
7.2 Recording arrangement
7.2.1 Recording setup with the terminal's microphone
7.2.2 Recording setup with a pair of cardioid microphones
7.3 Equalization and Calibration with the terminal's microphone
7.4 Equalization and Calibration with a pair of cardioid microphones
7.5 Accuracy of the reproduction arrangement
7.5.1 Comparison between original sound field and simulated sound field
7.5.2 Transmission of background noise: Comparison of terminal performance in the original sound field and the simulated sound field
8 Background Noise Database
8.1 Binaural signals
8.2 Stereophonic signals
Annex A: Comparison of Tests in Sending Direction and D-Values Conducted in Different Rooms
A.1 Test Setup
A.2 Results of the Tests
A.2.1 Sending Frequency Response Characteristics and SLR
A.2.2 D-Value with Pink Noise
A.2.3 D-Value with Cafeteria Noise
A.3 Conclusions
Annex B: Graphs
History

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://webapp.etsi.org/IPR/home.asp).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.
Foreword

This ETSI Guide (EG) has been produced by ETSI Technical Committee Speech and multimedia Transmission Quality (STQ).

The present document is part 1 of a multi-part deliverable covering Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise, as identified below:

Part 1: "Background noise simulation technique and background noise database";
Part 2: "Background noise transmission - Network simulation - Subjective test database and results";
Part 3: "Background noise transmission - Objective test methods".
Introduction

Background noise is present in most conversations today and may significantly affect the speech communication performance of terminal and network equipment. Testing and optimization of such equipment using realistic background noises is therefore necessary. Furthermore, the tests require reproducible conditions, which can be guaranteed only under laboratory-type conditions.

The present document addresses this issue by describing a methodology for recording and playback of background noises under well-defined conditions that can be calibrated in a laboratory-type environment. Furthermore, a database with real background noises is included.
1 Scope

The quality of background noise transmission is an important factor, which significantly contributes to the perceived overall quality of speech. Existing terminals, networks and system configurations, and even more so the new generation including broadband services, can be greatly improved by a proper design of terminals and systems for operation in the presence of background noise. The present document:

• describes a noise simulation environment using realistic background noise scenarios for laboratory use;
• contains a database including the relevant background noise samples for subjective and objective evaluation.

The present document provides information about the recording techniques needed for background noise recordings and discusses the advantages and drawbacks of existing methods. The present document describes the requirements for laboratory conditions. The loudspeaker setup and the loudspeaker calibration and equalization procedure are described. The simulation environment specified can be used for the evaluation and optimization of terminals and of complex configurations including terminals, networks and other configurations. The main application areas should be: office, home and car environment.

The setup and database as described in the present document are applicable for:

• Objective performance evaluation of terminals in different (simulated) background noise environments.
• Speech processing evaluation by using the pre-processed speech signal in the presence of background noise, recorded by a terminal.
• Subjective evaluation of terminals by performing conversational tests, specific double talk tests or talking and listening tests in the presence of background noise.
• Subjective evaluation in third party listening tests by recording the speech samples of terminals in the presence of background noise.
2 References

References are either specific (identified by date of publication and/or edition number or version number) or non-specific.

• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the purposes of the referring document;
- for informative references.

Referenced documents which are not found to be publicly available in the expected location might be found at http://docbox.etsi.org/Reference.

NOTE: While any hyperlinks included in this clause were valid at the time of publication ETSI cannot guarantee their long term validity.
2.1 Normative references

The following referenced documents are indispensable for the application of the present document. For dated references, only the edition cited applies. For non-specific references, the latest edition of the referenced document (including any amendments) applies.

Not applicable.
2.2 Informative references

The following referenced documents are not essential to the use of the present document but they assist the user with regard to a particular subject area. For non-specific references, the latest version of the referenced document (including any amendments) applies.

[i.1] Surround Sound Past, Present, and Future: "A history of multichannel audio from mag stripe to Dolby Digital", Joseph Hull - Dolby Laboratories Inc.
[i.2] AES preprint 3332 (1992): "Improved Possibilities of Binaural Recording and Playback Techniques", K. Genuit, H.W. Gierlich, U. Künzli.
NOTE: See at http://www.aes.org/e-lib/browse.cfm?elib=6801.
[i.3] AES preprint 3732 (1993): "A System for the Reproduction Technique for Playback of Binaural Recordings", N. Xiang, K. Genuit, H.W. Gierlich.
NOTE: See at http://www.aes.org/e-lib/browse.cfm?elib=6501.
[i.4] NTTAT Database: "Ambient Noise Database CD-ROM".
NOTE: See at http://www.ntt-at.com/products_e/noise-DB/index.html.
[i.5] ISO 11904-1: "Acoustics - Determination of sound immission from sound sources placed close to the ear - Part 1: Technique using a microphone in a real ear (MIRE technique)".
[i.6] Spatial Hearing: "The psychophysics of human sound localization", J. Blauert.
[i.7] ITU-T Recommendation P.57: "Artificial ears".
[i.8] ITU-T Recommendation P.58: "Head and torso simulator for telephonometry".
[i.9] ITU-T Recommendation P.340: "Transmission characteristics and speech quality parameters of hands-free terminals".
[i.10] ITU-T Recommendation P.64: "Determination of sensitivity/frequency characteristics of local telephone systems".
[i.11] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".
[i.12] Genuit, K.: "A Description of the Human Outer Ear Transfer Function by Elements of Communication Theory (No. B6-8)".
NOTE: Proceedings of the 12th International Congress on Acoustics. Toronto published on behalf of the Technical Program Committee by the Executive Committee of the 12th International Congress on Acoustics.
[i.13] IEC 60050-722: "International Electrotechnical Vocabulary - Chapter 722: Telephony".
[i.14] "Wellenfeldsynthese - Eine neue Dimension der 3D-Audiowiedergabe"; Fernseh- und Kino-Technik, Nr. 11/2002, pp. 735-738.
[i.15] "The Iosono Sound Difference".
NOTE: http://www.iosono-sound.de
[i.16] "Ein neues Verfahren der raumbezogenen Stereophonie mit verbesserter Übertragung der Rauminformation"; P. Scherer, Rundfunktechnische Mitteilungen, 1977, pp. 196-204.
[i.17] ETSI EG 202 396-1 (V.1.1.2): "Speech Processing, Transmission and Quality Aspects (STQ); Speech quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database".
[i.18] ETSI TS 151 010-1: "Digital cellular telecommunications system (Phase 2+); Mobile Station (MS) conformance specification; Part 1: Conformance specification (3GPP TS 51.010-1)".

3 Definitions and abbreviations
3.1 Definitions

For the purposes of the present document, the following terms and definitions apply:

crosstalk: appearance of undesired energy in a channel, owing to the presence of a signal in another channel, caused by, for example, induction, conduction or non-linearity
NOTE: See IEC 60050-722 [i.13].
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
CD Compact Disc
FFT Fast Fourier Transform
FIR Finite Impulse Response
HATS Head And Torso Simulator
IIR Infinite Impulse Response
MIRE Microphone In Real Ear
NTT Nippon Telegraph and Telephone Corporation
SLR Send Loudness Rating
VHF Very High Frequency
4 Overview of existing methods for realistic sound reproduction
4.1 Introduction

In general, the existing methods for close-to-original sound recording and reproduction are aimed at different applications:

• Techniques intending to reproduce the actual sound field.
• Techniques providing hearing-adequate (ear-related) signals in the human ear canal.
• Techniques generating artificial acoustical environments.

Within this clause the different methods are briefly described and their applicability for close-to-original sound-field reproduction is discussed. A variety of methods have been studied; the following clauses summarize the most important ones relevant to the present document. The different methods were analyzed on the basis of the following requirements:
• The background noise recording technique should be:
- easy to use;
- easy to calibrate;
- capable of wideband recording;
- available at reasonable cost;
- mostly compatible with existing standards and procedures used in telecommunications testing;
- applicable to different environments (at least office, home and car).
• The background noise simulation arrangement should:
- be easy to set up;
- not require any specific acoustical treatment for the simulation;
- provide a mostly realistic background noise simulation for all typical background noises encountered in telecommunication applications;
- be easy to calibrate;
- be mostly insensitive to the positioning of (test) objects in the simulated sound field;
- be applicable to all typical terminals used in telecommunication;
- be available at reasonable cost.
4.2 Surround Sound Techniques

The basics of surround techniques are found in cinema applications. The virtual image provided by stereophonic presentation of sounds did not seem sufficient for the large screen display in cinemas. In the 1950s, 4-channel and 6-channel soundtracks recorded on magnetic stripes attached to the films were developed, and 4-channel and 6-channel loudspeaker systems were installed in cinemas to reproduce the multichannel sounds. The newer techniques were mostly developed and marketed by Dolby® [i.1]: Dolby Surround, Dolby Surround Pro Logic, Dolby Digital and Dolby Digital Surround are examples of the techniques introduced more recently. The most common configuration is the "5.1-configuration", used in cinemas and in home applications as well. The reproduction system consists of a left and a right channel, a centre speaker, two surround channels (left and right, arranged behind the listener) and a low frequency channel for low frequency effects.

The aim of all surround systems is to create an artificial acoustical image in the recording studio rather than to record a real acoustical scenario and provide true-to-original playback.

On the recording side, special surround encoders are used which encode the 5-channel signal from a special mixing console into the 5.1 digital data stream. The playback system contains a special decoder which separates the 5 channels again and distributes them to the 5.1 loudspeaker playback system. The systems are mono and stereo compatible and can handle the older 4-channel surround techniques by means of a specific decoder.

Applications:

Typical applications for surround systems are cinemas and home theatres. The source material is produced by professional recording studios using multi-channel mixing consoles and specific 5.1 decoding techniques. In almost all cases virtual environments are created which support the visual image with an appropriate acoustical image.

Conclusion:

Surround techniques are designed for creating acoustical images rather than for close-to-original recording and reproduction. Although the spatial impression provided by surround techniques is sometimes remarkable, the acoustical image created is always artificial. Because easy-to-use recording techniques allowing a spatial recording of a sound field are lacking, surround sound techniques are not suitable for the creation of a background noise database with realistic background noises or for calibrated background noise simulation in a laboratory.
4.3 IOSONO

The IOSONO® sound system (see [i.14] and [i.15]) is based on Wave-Field Synthesis. It employs Huygens' principle of wave theory. Applied to acoustics, this principle means that any form of wave front can be reproduced with an array of loudspeakers, so that virtual sound sources can be placed anywhere within a listening area. For practical use it is necessary to position loudspeakers all around the playback room. In order to generate realistic sound fields, the input signal for each loudspeaker has to be calculated separately. For this purpose each single sound source (e.g. voices) has to be recorded individually. If the recordings are made in a room, the characteristics of the recording room (such as reverberation) also have to be recorded separately. All resulting sound tracks are then mixed and manipulated during the post-editing process and the reproduction.
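
As an illustration of why the input signal of each loudspeaker has to be calculated separately, the following minimal sketch computes a delay and a gain per loudspeaker for one virtual point source. It is a simplified delay-and-attenuate model and not the IOSONO algorithm, which the present document does not specify. The function name, the array geometry, the 1/distance gain law and the speed of sound of 343 m/s are assumptions made for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def driving_parameters(source, loudspeakers, c=SPEED_OF_SOUND):
    """Return a (delay_s, gain) pair per loudspeaker for a virtual point source.

    Simplified model (assumption): each loudspeaker replays the source
    signal delayed by the propagation time from the virtual source and
    attenuated with 1/distance. Real wave-field synthesis driving
    functions contain additional filtering and weighting terms.
    """
    params = []
    for x, y in loudspeakers:
        r = math.hypot(x - source[0], y - source[1])
        r = max(r, 1e-3)  # guard against a source placed on a loudspeaker
        params.append((r / c, 1.0 / r))
    return params


# Hypothetical linear array: 5 loudspeakers spaced 0.5 m apart on the x axis.
array = [(0.5 * i - 1.0, 0.0) for i in range(5)]
# Virtual source 2 m behind the centre of the array.
result = driving_parameters((0.0, -2.0), array)
for (x, _), (delay, gain) in zip(array, result):
    print(f"x = {x:+.1f} m  delay = {delay * 1e3:.3f} ms  gain = {gain:.3f}")
```

In this sketch the centre loudspeaker receives the shortest delay and the outer loudspeakers progressively longer ones, which approximates the curved wave front of the virtual source. Each additional source adds one such set of parameters per loudspeaker.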

A natural and realistic spatial sound reproduction is then achieved in a wide area of the playback room. Common 5.1 stereo systems achieve a "realistic" sound reproduction only in a small area of the reproduction room.

Applications:

Typical applications are sound systems for home use, cinemas and other entertainment events. The IOSONO sound system is also able to play back recordings made in common stereo or 5.1 stereo techniques.

Conclusion:

The drawbacks of this method are the components needed: a sophisticated recording system, a powerful computing unit for real-time mixing of the large number of recorded sound tracks, and the number of loudspeakers that have to be installed in the listening room. In a cinema of common size, for example, about 200 loudspeakers are needed.

The advantage is that a very realistic sound reproduction is possible with the IOSONO sound system, but the effort required is too high for daily use in laboratories.
4.4 Eidophonie

This method (see [i.16]) was developed for realistic sound reproduction using the VHF transmission technique. The main principle is to separate the base signal from the part of the signal which contains the information about the direction of sound incidence.

For recording, a first-order gradient microphone with a cardioid directivity is used. During the recording its directivity rotates at 38 kHz in the recording plane. This "turning microphone" provides an amplitude-modulated signal at its electrical output. The resulting side bands lie outside the transmitted frequency range, but they contain the information about the direction of sound incidence. Using VHF transmission techniques, this phase information can be transmitted within the second audio-frequency channel.
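
The relation between the rotating directivity and the direction information can be illustrated with a short numerical sketch. It assumes an ideal cardioid, 0.5 * (1 + cos(angle)), a single plane wave, and an audio amplitude that is constant and positive during one rotation period of about 26 microseconds. Under these assumptions the phase of the 38 kHz component of the microphone output equals the negative incidence angle. The function names and the number of samples per rotation are hypothetical and are not taken from the referenced method.

```python
import cmath
import math

F_ROT = 38_000.0  # rotation frequency of the directivity in Hz (from the text)


def cardioid(angle):
    """Ideal first-order cardioid: 1 on axis, 0 towards the rear."""
    return 0.5 * (1.0 + math.cos(angle))


def rotating_mic_output(amplitude, incidence, n_samples=64):
    """Microphone output over one rotation period for one plane wave.

    Assumption: the audio amplitude is constant during one rotation,
    so only the rotating directivity varies.
    """
    samples = []
    for n in range(n_samples):
        t = n / (n_samples * F_ROT)
        look_direction = 2.0 * math.pi * F_ROT * t
        samples.append(amplitude * cardioid(look_direction - incidence))
    return samples


def estimate_incidence(samples):
    """Recover the incidence angle from the phase of the 38 kHz component."""
    n_samples = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * n / n_samples)
              for n, x in enumerate(samples))
    return (-cmath.phase(acc)) % (2.0 * math.pi)


true_angle = math.radians(120.0)
estimated = estimate_incidence(rotating_mic_output(0.8, true_angle))
print(f"estimated incidence: {math.degrees(estimated):.1f} degrees")
```

The sketch covers the ideal case only. A real microphone has a frequency-dependent, non-ideal directivity, so the recovered phase would be less clean than shown here.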

The sound reproduction is performed by spatial demodulation: a switch is positioned before each loudspeaker, and each switch operates synchronously with the turning directivity. A low-pass filtered short-term section of the signal, containing the information about the direction of sound incidence, is thus played back on each loudspeaker. The loudspeakers are positioned all around the playback room.
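
The effect of the synchronous switching can be sketched as follows. The sketch assumes an ideal cardioid and assumes that each switch closes at the instant when the rotating directivity points at its loudspeaker, so that each loudspeaker receives the source weighted by the cardioid value for its own direction. The low-pass filtering and the finite switching time are ignored, and the function name and the ring of 8 loudspeakers are hypothetical.

```python
import math


def loudspeaker_feeds(amplitude, incidence, speaker_angles):
    """Feed level per loudspeaker after synchronous switching.

    Assumption: loudspeaker k samples the microphone output when the
    rotating cardioid points at angle_k, giving the weight
    0.5 * (1 + cos(angle_k - incidence)).
    """
    return [amplitude * 0.5 * (1.0 + math.cos(a - incidence))
            for a in speaker_angles]


# Hypothetical ring of 8 loudspeakers, evenly spaced around the listener.
angles = [2.0 * math.pi * k / 8 for k in range(8)]
feeds = loudspeaker_feeds(1.0, math.radians(90.0), angles)
for a, g in zip(angles, feeds):
    print(f"loudspeaker at {math.degrees(a):5.1f} degrees: level {g:.3f}")
```

With these assumptions the loudspeaker in the direction of incidence receives the full level and the opposite one receives none, while the two neighbouring loudspeakers still receive about 0.85 of the level. The broad cardioid pattern therefore spreads one source over several channels.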
Applications:

Eidophonie was developed to provide a realistic sound environment using a signal received from a VHF broadcast station. The technique was intended to improve on common stereo sound reproduction. Nevertheless, Eidophonie is also compatible with common mono and stereo recordings.
Conclusion:

A benefit of this system is that three loudspeakers are sufficient to produce a realistic sound field. With more loudspeakers (e.g. 16) the spatial sound reproduction becomes increasingly independent of the listening position. Moreover, the independence of the transmitted sound from the acoustics of the reproduction room increases with the number of loudspeakers used. But the method has significant limitations: the microphone directivity is frequency dependent and not ideal, so interference between the different channels is created. A second problem is the loudspeaker directivity, which does not match the microphone directivity. This problem could be reduced if the number of channels were increased. This, however, is not possible due to the limited directivity of the microphone arrangement used.
Localization of sound sources is hardly poss
...
