ETSI TS 103 106 V1.6.1 (2021-07)
Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise: Background noise transmission for mobile terminals-objective test methods
RTS/STQ-298
TECHNICAL SPECIFICATION
Speech and multimedia Transmission Quality (STQ);
Speech quality performance in
the presence of background noise:
Background noise transmission for
mobile terminals-objective test methods
Reference
RTS/STQ-298
Keywords
noise, quality, speech, testing, transmission
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - APE 7112B
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° w061004871
Important notice
The present document can be downloaded from:
http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the prevailing version of an ETSI
deliverable is the one made publicly available in PDF format at www.etsi.org/deliver.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx
Notice of disclaimer & limitation of liability
The information provided in the present deliverable is directed solely to professionals who have the appropriate degree of
experience to understand and interpret its content in accordance with generally accepted engineering or
other professional standard and applicable regulations.
No recommendation as to products and services or vendors is made or should be implied.
No representation or warranty is made that this deliverable is technically accurate or sufficient or conforms to any law
and/or governmental rule and/or regulation and further, no representation or warranty is made of merchantability or fitness
for any particular purpose or against infringement of intellectual property rights.
In no event shall ETSI be held liable for loss of profits or any other incidental or consequential damages.
Any software contained in this deliverable is provided "AS IS" with no warranties, express or implied, including but not
limited to, the warranties of merchantability, fitness for a particular purpose and non-infringement of intellectual property
rights and ETSI shall not be held liable in any event for any damages whatsoever (including, without limitation, damages
for loss of profits, business interruption, loss of information, or any other pecuniary loss) arising out of or related to the use
of or inability to use the software.
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and
microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.
© ETSI 2021.
All rights reserved.
Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definition of terms, symbols and abbreviations
3.1 Terms
3.2 Symbols
3.3 Abbreviations
4 Introduction
5 Underlying speech databases and preparations
6 Modifications to the model described in ETSI EG 202 396-3
6.1 Prefiltering in Narrowband Mode (NB)
6.2 Void
6.3 Speech level adjustment in wideband
6.4 Modified neural network for S-MOS
6.5 Retraining of parameter regression for N-MOS and G-MOS
7 Comparison of objective and subjective results after the training process
7.0 General
7.1 Results in wideband mode
7.1.0 General
7.1.1 Results for database "Audience - Test 3"
7.1.2 Results for database "Audience - Test 3L" (excluded during retraining)
7.1.3 Results for database "Audience - Test 4"
7.1.4 Results for database "Audience - Test 4L"
7.1.5 Results for database "Nokia - Test 1"
7.1.6 Results for database "Nokia - Test 2" (excluded during retraining)
7.1.7 Results for database "Orange"
7.1.8 Results for database "Qualcomm - Test 3"
7.1.9 Results for database "Qualcomm - Test 4"
7.2 Results in narrowband mode
7.2.0 General
7.2.1 Results for database "Audience - Test 1"
7.2.2 Results for database "Audience - Test 1L"
7.2.3 Results for database "Audience - Test 2"
7.2.4 Results for database "Audience - Test 2L"
7.2.5 Results for database "Qualcomm - Test 1"
7.2.6 Results for database "Qualcomm - Test 2"
8 Validation results
8.0 Preamble
8.1 Audience validation data
8.1.1 Description of tests
8.1.2 Description of validation results
8.1.2.0 General explanation
8.1.2.1 Experiment 5: Narrowband
8.1.2.2 Experiment 6: Narrowband
8.1.2.3 Experiment 7: Wideband
8.1.2.4 Experiment 8: Wideband
8.2 Orange validation data
8.2.1 Description of tests
8.2.2 Description of validation results
8.3 Qualcomm validation data
8.3.1 Description of tests
8.3.2 Description of validation results
8.4 Validation data for additional use cases
8.4.1 Tests 1 and 2: Description
8.4.2 Tests 1 and 2: Results
8.4.3 Tests 3, 4, 5 and 6: Description
8.4.4 Tests 3 and 4: Results, Narrowband
8.4.5 Tests 5 and 6: Results, Wideband
9 Application of the retrained model
Annex A (normative): Summary of Retraining Databases
Annex B (normative): Test vectors for model verification
B.0 Test vectors
B.1 Audience test vectors
B.2 Orange test vectors
Annex C (normative): Speech material to be used for objective testing
Annex D (informative): Subjective testing framework used for the present document
D.1 Introduction
D.2 Subjective test plan
D.2.1 Traceability
D.2.2 Speech database requirements
D.2.3 Reference Conditions
D.2.4 Test Conditions
D.2.5 Pre-processing of reference conditions
D.2.6 Post-processing of test conditions
D.2.7 Calibration and equalization of headphones for presentation
D.2.8 Requirements on the listening laboratory
D.2.9 Experimental design
D.2.10 Training session
D.3 Set-up for acquisition of test conditions
D.3.1 Terminal positioning and HATS calibration
D.3.2 Background Noise reproduction
D.3.3 Noise and speech playback synchronization
D.3.4 Convergence sequence
D.3.5 Example of noise and speech playback sequence including convergence period
D.3.6 Recordings at the network simulator electrical reference point
D.3.7 Recordings at the MRP and terminal's primary microphone location
D.4 Processing test plan block diagram
History
Intellectual Property Rights
Essential patents
IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The declarations
pertaining to these essential IPRs, if any, are publicly available for ETSI members and non-members, and can be
found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to
ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the
ETSI Web server (https://ipr.etsi.org/).
Pursuant to the ETSI Directives including the ETSI IPR Policy, no investigation regarding the essentiality of IPRs,
including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not
referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become,
essential to the present document.
Trademarks
The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners.
ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no
right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does
not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.
DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its
Members. 3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP
Organizational Partners. oneM2M™ logo is a trademark of ETSI registered for the benefit of its Members and of the
oneM2M Partners. GSM® and the GSM logo are trademarks registered and owned by the GSM Association.
Foreword
This Technical Specification (TS) has been produced by ETSI Technical Committee Speech and multimedia
Transmission Quality (STQ).
The present document is to be used in conjunction with the ETSI ES 202 396-1 [i.2] and ETSI EG 202 396-3 [i.4]:
ETSI ES 202 396-1: "Background noise simulation technique and background noise database";
ETSI EG 202 396-3: "Background noise transmission - Objective test methods".
The present document is based on the objective test method described in ETSI EG 202 396-3 [i.4] and contains
modifications of the model required in order to provide a good prediction of the uplink speech quality in the presence of
background noise of modern mobile terminals.
Modal verbs terminology
In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and
"cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of
provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.
1 Scope
The present document describes testing methodologies which can be used to objectively evaluate the performance of
narrowband and wideband mobile terminals for speech communication in the presence of background noise.
Background noise is a problem in almost all situations and conditions and needs to be taken into account in both
terminals and networks. The present document provides information about the testing methods applicable to objectively
evaluate the speech quality of mobile terminals with AMR and AMR-WB codecs in the presence of background noise.
The present document includes:
• The method which is applicable to objectively determine the different parameters influencing the speech
quality in the presence of background noise taking into account:
- the speech quality;
- the background noise transmission quality;
- the overall quality.
• The description of the adaptation of the test method described in ETSI ES 202 396-1 [i.2].
• The model results in comparison with the underlying subjective tests used for the retraining of the objective
model.
• The model validation results:
- Additional validation results are provided for cases which include some conditions outside the scope of
ETSI ES 202 396-1 [i.2]. These include music as background noise, and user holding a handset in other
than nominal position, as defined in Recommendation ITU-T P.64 [i.24]. In addition, validation results
are provided for Chinese language.
The present document is to be used in conjunction with:
- ETSI ES 202 396-1 [i.2] which describes a recording and reproduction setup for realistic simulation of
background noise scenarios in lab-type environments for the performance evaluation of terminals and
communication systems.
- ETSI EG 202 396-3 [i.4] which describes the basic objective model underlying the model described in the
present document.
- American English speech sentences as enclosed in the present document.
The prediction of performance for narrowband and wideband mobile terminals, which feature artificial intelligence-
based speech enhancement techniques, is out of scope. Devices employing such techniques were not available for
training and validation of the present model.
2 References
2.1 Normative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
Referenced documents which are not found to be publicly available in the expected location might be found at
https://docbox.etsi.org/Reference/.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are necessary for the application of the present document.
Not applicable.
2.2 Informative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are not necessary for the application of the present document but they assist the
user with regard to a particular subject area.
[i.1] 3GPP S4-120542: "Common subjective testing framework for training of P.835 test predictors".
[i.2] ETSI ES 202 396-1: "Speech and multimedia Transmission Quality (STQ); Speech quality
performance in the presence of background noise; Part 1: Background noise simulation technique
and background noise database".
[i.3] Void.
[i.4] ETSI EG 202 396-3: "Speech and multimedia Transmission Quality (STQ); Speech Quality
performance in the presence of background noise Part 3: Background noise transmission -
Objective test methods".
[i.5] ETSI TS 126 073: "Digital cellular telecommunications system (Phase 2+) (GSM); Universal
Mobile Telecommunications System (UMTS); LTE; ANSI-C code for the Adaptive Multi Rate
(AMR) speech codec (3GPP TS 26.073)".
[i.6] Recommendation ITU-T P.835: "Subjective test methodology for evaluating speech
communication systems that include noise suppression algorithm".
[i.7] Recommendation ITU-T G.722.2: "Wideband coding of speech at around 16 kbit/s using Adaptive
Multi-Rate Wideband (AMR-WB)".
[i.8] Recommendation ITU-T P.56: "Objective measurement of active speech level".
[i.9] Recommendation ITU-T P.1401: "Methods, metrics and procedures for statistical evaluation,
qualifying and comparison of objective quality prediction models".
[i.10] Void.
[i.11] Recommendation ITU-T G.191: "Software tools for speech and audio coding standardization".
[i.12] Void.
[i.13] Recommendation ITU-T P.501: "Test Signals for Use in Telephonometry".
[i.14] Recommendation ITU-T P.58: "Head and Torso simulator for telephonometry".
[i.15] Recommendation ITU-T P.57: "Artificial ears".
[i.16] ETSI TS 126 131: "Universal Mobile Telecommunications System (UMTS); LTE; Terminal
acoustic characteristics for telephony; Requirements (3GPP TS 26.131)".
[i.17] Recommendation ITU-T P.800: "Methods for subjective determination of transmission quality".
[i.18] ETSI TS 126 132: "Universal Mobile Telecommunications System (UMTS); LTE; Speech and
video telephony terminal acoustic test specification (3GPP TS 26.132)".
[i.19] Void.
[i.20] Recommendation ITU-T TD 477 (GEN/12): "Handbook of subjective test practical procedures"
(temporary document) - Geneva, 18-27 January 2011.
[i.21] AH-11-029: "Better Reference System for the P.835 SIG Rating Scale", Q7/12 Rapporteur's
meeting, 20-21 June 2011, Geneva, Switzerland.
[i.22] 3GPP, Tdoc S4(12)0621, Ext-ATS Permanent document (EATS-3): "Common subjective testing
framework for validation of P.835 test predictors".
[i.23] Recommendation ITU-T P.50: "Artificial voices".
[i.24] Recommendation ITU-T P.64: "Determination of sensitivity/frequency characteristics of local
telephone systems".
3 Definition of terms, symbols and abbreviations
3.1 Terms
Void.
3.2 Symbols
Void.
3.3 Abbreviations
For the purposes of the present document, the following abbreviations apply:
78KBP 7,8 kHz Band-Pass
NOTE: According to Recommendation ITU-T G.191 [i.11].
AMR Adaptive MultiRate
AMR-NB Adaptive MultiRate Codec - Narrow Band
AMR-WB Adaptive MultiRate Wideband Speech Codec
BAK Background Noise Component
dB SPL Sound Pressure Level re 20 µPa in dB
DRP Drum Reference Point
DTX Discontinuous Transmission
EATS Enhanced Acoustic Test Specification
EXP Experiment
FB Fullband
G-MOS Global MOS
NOTE: MOS related to the overall sample.
HATS Head And Torso Simulator
HHHF Hand-Held Hands-Free
IRS Intermediate Reference System
ITU International Telecommunication Union
ITU-T Telecommunication standardization sector of ITU
MOS Mean Opinion Score
MRP Mouth Reference Point
MSIN Mobile Station Input Filter
NB Narrowband
N-MOS Noise MOS
NOTE: MOS related to the noise transmission only.
NS Noise Suppression
NTT Nippon Telegraph and Telephone
OVRL Overall (speech + noise) Component
PRO Professional
RCV ReCeiVe
RMSE Root Mean Square Error
RMSE* epsilon insensitive Root Mean Square Error
SIG SIGnal component
S-MOS Speech MOS
NOTE: MOS related to the speech signal only.
SND Sending Direction
SNR Signal to Noise Ratio
SPL Sound Pressure Level
WB Wideband
WCDMA Wideband Code Division Multiple Access
4 Introduction
The present document describes the modifications of the ETSI EG 202 396-3 [i.4] model which were necessary to adapt
to the training databases provided by the 3GPP contributors listed in annex A. The core model itself remains mainly
unmodified except for the points given in the clauses below. Modifications affect the narrowband and wideband modes in
different ways.
The adapted objective method described in the present document is intended to be used for all types of modern mobile
terminals using different bitrates of AMR [i.5] and AMR-WB [i.7] coding.
5 Underlying speech databases and preparations
The basis for each mode of the objective model (wideband/narrowband) as described in ETSI EG 202 396-3 [i.4] is a set of
listening tests conducted according to Recommendation ITU-T P.835 [i.6]. From the beginning of the development,
these listening test databases were designed to be a training set for predicting Recommendation ITU-T P.835 [i.6]
scores. They included a large number of conditions (> 170) and a wide range of speech and noise quality. Besides real
terminals, terminal simulations and transmission impairments were also included. However, the data and processing
included were based on the technologies current at the time when the standard and its updates were created.
The underlying databases for the retraining as described in the present document were created using real state-of-the-art
mobile devices and thus the quality ranges yielded may not be normally distributed over all MOS scales. The context
between the databases can also differ (e.g. pure handset recordings vs. mixed handset/hands-free databases).
Furthermore new reference conditions extensively discussed in different standards groups and described in [i.1] were
included in the tests.
Table 1: Set of reference conditions
File  SIG                SNR       Noise Type
i01   Source (filtered)  No Noise  -
i02   Source (filtered)  0 dB      Fullsize_Car1_130Kmh_binaural
i03   Source (filtered)  12 dB     Fullsize_Car1_130Kmh_binaural
i04   Source (filtered)  24 dB     Fullsize_Car1_130Kmh_binaural
i05   Source (filtered)  36 dB     Fullsize_Car1_130Kmh_binaural
i06   NS Level 1         No Noise  -
i07   NS Level 2         No Noise  -
i08   NS Level 3         No Noise  -
i09   NS Level 4         No Noise  -
i10   NS Level 3         24 dB     Fullsize_Car1_130Kmh_binaural
i11   NS Level 2         12 dB     Fullsize_Car1_130Kmh_binaural
i12   NS Level 1         0 dB      Fullsize_Car1_130Kmh_binaural
NOTE: In case clipping is generated for condition i02 and/or i12, a different signal
scaling is recommended (e.g. 73 dB SPL refers to -36 instead of -26 dBov).
Each training database was provided together with 12 reference conditions, mainly created according to the annex of
[i.1]. Table 1 shows one possible arrangement. Although it was observed that not all reference sets included exactly the
same speech material, background noise, SNR ranges and speech distortion configuration, this data indicates
which range of speech and noise degradations can be expected in the databases.
To bring the different databases at least approximately onto a common base for the retraining of the model, the
12 x 3 values of the reference conditions (averaged over all samples) were used to linearly transform
the subjective MOS data. In a first step, the reference conditions of all databases included in the retraining process were
combined by weighted averaging into an average reference condition set. The weight per database depends on the number
of samples it provides for the training.
For each database, a mapping between the reference conditions and the average reference condition set is calculated. To
also capture inter-relations between speech, noise and global ratings, a matrix transformation instead of a per-scale
regression was chosen. To compensate for biases, a constant column was added to the reference set. Then a transformation
Tj is calculated for each database j with reference set Rj which minimizes the distance to the average reference set A:
(5.1)
The transformation matrix Tj (size 4 x 3) can easily be determined to:
(5.2)
If the three scales (S-MOS/N-MOS/G-MOS) are independent from each other for any database, the matrix
transformation Tj equals a linear per-scale transformation. Before the retraining of the model, the transformation is
applied to the whole test data on a per-sample basis:
(5.3)
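The bodies of equations (5.1) to (5.3) are not reproduced in this text. One reading that is consistent with the description above is an ordinary least-squares fit. This is a sketch under stated assumptions, not the normative form: the distance is taken to be the Frobenius norm, the constant column is placed last in the 12 x 4 matrix Rj, and the primed symbols for the transformed scores are introduced here for illustration only.

```latex
% Sketch only; assumptions: Frobenius-norm distance, constant column last.
% R_j : 12 x 4 reference set of database j (S-MOS, N-MOS, G-MOS, constant 1)
% A   : 12 x 3 average reference set
% T_j : 4 x 3 transformation matrix
\begin{align*}
T_j &= \arg\min_{T} \, \lVert R_j T - A \rVert_F^2 \\
T_j &= \left( R_j^{\top} R_j \right)^{-1} R_j^{\top} A \\
\begin{pmatrix} \mathrm{SMOS}' & \mathrm{NMOS}' & \mathrm{GMOS}' \end{pmatrix}
    &= \begin{pmatrix} \mathrm{SMOS} & \mathrm{NMOS} & \mathrm{GMOS} & 1 \end{pmatrix} T_j
\end{align*}
```

Under this reading, the last row of Tj carries the per-scale offsets, and the off-diagonal entries of its upper 3 x 3 block carry the inter-relations between the scales. If those off-diagonal entries are zero, the transformation reduces to a linear per-scale mapping.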
6 Modifications to the model described in ETSI EG 202 396-3
6.1 Prefiltering in Narrowband Mode (NB)
In the narrowband mode described in ETSI EG 202 396-3 [i.4], the listening test audio files included a far-end handset
simulation, realized with an IRS RCV filter. In the requirements described in ETSI EG 202 396-3 [i.4], such a listening
filter was neither described nor used in the databases, for either narrowband or wideband.
The narrowband mode internally filters the unprocessed and clean reference with IRS SND and IRS RCV to simulate a
transmission over high-quality listening devices and network. The principle of IRS seems to be outdated, since modern
state-of-the-art mobiles do not have this frequency characteristic. This applies all the more to the newly created NB
databases, where the devices used have almost flat frequency responses in sending direction.
Thus the filtering of the two reference signals with IRS SND and RCV was replaced by filtering with the MSIN [i.11]
filter, which is mainly a band-pass. In addition, no listening filter was applied to the processed signals.
6.2 Void
6.3 Speech level adjustment in wideband
The current ETSI EG 202 396-3 [i.4] implementation assumes 79 dB SPL/-15 dB Pa active speech level due to the
underlying listening test databases of the wideband mode.
For the objective model as described in the present document, the level of the recordings of the training
databases was adjusted such that the active speech level over the analysed sequence is normalized to
73 dB SPL/-21 dB Pa (for the listening test and the algorithm) as described in ETSI EG 202 396-3 [i.4].
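The level pairs quoted in this clause are related by the fixed offset between dB SPL (re 20 µPa) and dB re 1 Pa. A minimal sketch of that conversion follows; the function and constant names are illustrative and not part of the specification.

```python
import math

P_REF_SPL = 20e-6  # reference pressure for dB SPL, in Pa

# Offset between the two scales: 20*log10(1 Pa / 20 µPa), about 93.98 dB.
# The level pairs quoted in the text correspond to this offset rounded to 94 dB.
OFFSET_DB = 20 * math.log10(1.0 / P_REF_SPL)


def spl_to_dbpa(level_db_spl: float) -> float:
    """Convert a level in dB SPL to dB re 1 Pa."""
    return level_db_spl - OFFSET_DB


def dbpa_to_spl(level_dbpa: float) -> float:
    """Convert a level in dB re 1 Pa to dB SPL."""
    return level_dbpa + OFFSET_DB


print(round(spl_to_dbpa(79.0)))  # -15
print(round(spl_to_dbpa(73.0)))  # -21
```

The 6 dB difference between the two speech levels is the same on either scale, since the conversion is a constant shift.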
6.4 Modified neural network for S-MOS
The model described in ETSI EG 202 396-3 [i.4] calculates several parameters out of the psycho-acoustically motivated
inner representation for the estimation of S-MOS (and N-MOS as well). The parameters are shown in tables 2 and 3.
A detailed description of the calculation for the parameters can be found in ETSI EG 202 396-3 [i.4].
Table 2: Extracted parameters for N-MOS
P0  N_{BGN,P}            P3  σ(ΔRA_{BGN,P-U})
P1  σ(RA_{BGN,U})        P4  μ(RA_{BGN,U})
P2  σ(RA_{BGN,P})        P5  μ(RA_{BGN,P})
Table 3: Extracted parameters for S-MOS
P1  ΔSNR                 P4  μ(ΔRA_{Sp,P-U})
P2  μ(RA_{Sp,P})         P5  σ(ΔRA_{Sp,P-C})
P3  μ(ΔRA_{Sp,P-C})      P6  σ(ΔRA_{Sp,P-U})
The calculation of the objective S-MOS in clause 6.5.2 of ETSI EG 202 396-3 [i.4] is performed with the 6 parameters
of table 3 in conjunction with a neural network.
Figure 1: Void
Several vectors and matrices are necessary to implement the neural network with regard to the underlying listening test
database. Because the training databases of the present document differ from the ones described in
ETSI EG 202 396-3 [i.4], the updated vectors and matrices are provided in the following equations.
For the normalization of the input parameters, different average and standard deviation vectors M_in and S_in are
necessary for narrowband and wideband mode. For wideband, the vectors are provided in equations (6.1) and (6.2).
$$M_{in,WB} = \begin{pmatrix} 0{,}0 & 12{,}7309 & 4{,}2076 & -1{,}2456 & 0{,}8834 & 12{,}25222 & 7{,}0541 \end{pmatrix} \tag{6.1}$$
$$S_{in,WB} = \begin{pmatrix} 1{,}0 & 11{,}8503 & 1{,}2824 & -0{,}3124 & 0{,}2511 & 6{,}7091 & 5{,}\ldots \end{pmatrix} \tag{6.2}$$
For narrowband, the corresponding input normalization vectors are given by equations (6.3) and (6.4).
$$M_{in,NB} = \begin{pmatrix} 0{,}0 & 13{,}7519 & 2{,}0884 & -0{,}3124 & 0{,}2511 & 6{,}7091 & 5{,}2951 \end{pmatrix} \tag{6.3}$$
$$S_{in,NB} = \begin{pmatrix} 1{,}0 & 11{,}4341 & 0{,}4047 & 0{,}3877 & 0{,}3309 & 3{,}1189 & 2{,}5976 \end{pmatrix} \tag{6.4}$$
As described in ETSI EG 202 396-3 [i.4], the output of the hidden layer is calculated with a matrix multiplication of the
normalized input vector and H. H describes all weights from each input parameter to each neuron in the hidden layer.
These weights are the results of the training with the back-propagation algorithm. In consequence, H is different for
each bandwidth mode. For the present document, the updated matrices are provided in equations (6.5) for wideband and
(6.6) for narrowband.
(6.5)
(6.6)
The five transformed output values of the hidden layer are then passed to the output layer. Here the output of the neural
network is calculated with another matrix multiplication with the matrix O, which weights the outputs of the hidden
layer to an output score S-MOS_{objective,raw}. This output layer matrix O is also given for wideband and narrowband mode
independently. For the present document, the updated vectors are provided in equations (6.7) for wideband and (6.8) for
narrowband.
$$O_{WB} = \begin{pmatrix} 0{,}1777 & -0{,}2835 & -0{,}3147 & 0{,}1837 & -0{,}3237 \end{pmatrix} \tag{6.7}$$
$$O_{NB} = \begin{pmatrix} 0{,}3832 & -0{,}5250 & -0{,}1878 & -0{,}2674 & -0{,}1548 \end{pmatrix} \tag{6.8}$$
With these modifications described above, instrumental assessment of S-MOS is completely defined for the context of
the present document.
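As an illustration of the input normalization step in this clause, the sketch below applies mean and standard deviation vectors to a narrowband parameter vector, using the values printed in equations (6.3) and (6.4). The element-wise form (x - mean) / std, the constant bias input of 1 (chosen to match the leading 0,0 and 1,0 entries) and the function name are assumptions made for this example; the normative procedure is the one in ETSI EG 202 396-3 [i.4].

```python
# Narrowband input normalization vectors, copied from equations (6.3) and (6.4)
# (decimal commas written as decimal points).
M_IN_NB = [0.0, 13.7519, 2.0884, -0.3124, 0.2511, 6.7091, 5.2951]
S_IN_NB = [1.0, 11.4341, 0.4047, 0.3877, 0.3309, 3.1189, 2.5976]


def normalize_nb(params):
    """Normalize the six S-MOS parameters P1..P6 of table 3.

    Assumption: a constant 1 is prepended as bias input and every element
    is normalized as (x - mean) / std.
    """
    if len(params) != 6:
        raise ValueError("expected the six parameters P1..P6")
    x = [1.0] + list(params)
    return [(xi - m) / s for xi, m, s in zip(x, M_IN_NB, S_IN_NB)]


# Hypothetical input: a parameter vector equal to the mean vector maps to
# zeros, while the bias entry stays at 1.
print(normalize_nb(M_IN_NB[1:]))
```

With a mean of 0 and a standard deviation of 1 in the first position, the bias entry passes through unchanged, which is why it can share the same normalization formula as the six parameters.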
6.5 Retraining of parameter regression for N-MOS and G-MOS
The objective N-MOS is the result of a linear quadratic regression algorithm applied to the six parameters of table 2
according to equation (6.9):
$$\mathrm{NMOS} = c_0 + \sum_{j=1}^{2} \sum_{i=1}^{6} c_{ji} \cdot P_i^{\,j} \tag{6.9}$$
The overall or global quality G-MOS is calculated by using the previously calculated N-MOS and S-MOS as input
parameters for a linear quadratic regression according to equation (6.10):
$$\mathrm{GMOS} = c_0 + \sum_{j=1}^{2} c_{Sj} \cdot \mathrm{SMOS}^{j} + \sum_{j=1}^{2} c_{Nj} \cdot \mathrm{NMOS}^{j} \tag{6.10}$$
The calculation steps for N-MOS and G-MOS are not modified, only the coefficients for the linear regressions
according to equations (6.9) and (6.10) are adapted to the new training material. The new coefficients are given in
tables 4 to 7.
Table 4: N-MOS coefficients for narrowband; Parameters P_i according to table 2
             Bias    P1       P2       P3       P4       P5       P6
Order j = 1  2,2231  -0,0395  -0,0359  0,2825   0,0023   -0,3959  -2,6965
Order j = 2  -       -        0,0021   -0,0239  -0,0003  0,0542   0,8684
Table 5: N-MOS coefficients for wideband; Parameters P_i according to table 2
             Bias    P1       P2       P3       P4       P5       P6
Order j = 1  1,4279  -0,0484  0,0994   0,2189   -0,0732  -0,3346  -1,3108
Order j = 2  -       -        -0,0018  -0,0079  0,0011   0,0891   0,2566
Table 6: G-MOS coefficients for narrowband
             Bias     S-MOS   N-MOS
Order j = 1  -0,4879  0,2647  0,8274
Order j = 2  -        0,0696  -0,0737
Table 7: G-MOS coefficients for wideband
             Bias     S-MOS   N-MOS
Order j = 1  -0,2141  0,2735  0,4542
Order j = 2  -        0,0708  -0,0065
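The regressions of equations (6.9) and (6.10) with the coefficients of tables 4 to 7 can be sketched as follows. This is an illustration, not a reference implementation. It assumes that a "-" entry in the tables means the term is absent (coefficient 0) and that the parameters are passed in the column order P1 to P6 of tables 4 and 5. The function names and the input values in the usage lines are hypothetical.

```python
# Coefficients copied from tables 4 to 7 (decimal commas written as points).
# Assumption: "-" in the tables is treated as a zero coefficient.
NMOS_COEFFS = {
    "NB": {"bias": 2.2231,
           1: [-0.0395, -0.0359, 0.2825, 0.0023, -0.3959, -2.6965],
           2: [0.0, 0.0021, -0.0239, -0.0003, 0.0542, 0.8684]},
    "WB": {"bias": 1.4279,
           1: [-0.0484, 0.0994, 0.2189, -0.0732, -0.3346, -1.3108],
           2: [0.0, -0.0018, -0.0079, 0.0011, 0.0891, 0.2566]},
}
GMOS_COEFFS = {
    # mode: (bias, (c_S1, c_S2), (c_N1, c_N2))
    "NB": (-0.4879, (0.2647, 0.0696), (0.8274, -0.0737)),
    "WB": (-0.2141, (0.2735, 0.0708), (0.4542, -0.0065)),
}


def n_mos(params, mode):
    """Equation (6.9): c0 + sum over j = 1..2 and i = 1..6 of c_ji * P_i**j."""
    if len(params) != 6:
        raise ValueError("expected six parameters")
    c = NMOS_COEFFS[mode]
    return c["bias"] + sum(
        c[j][i] * params[i] ** j for j in (1, 2) for i in range(6)
    )


def g_mos(s_mos, n_mos_value, mode):
    """Equation (6.10): quadratic regression on S-MOS and N-MOS."""
    bias, c_s, c_n = GMOS_COEFFS[mode]
    return bias + sum(
        c_s[j - 1] * s_mos ** j + c_n[j - 1] * n_mos_value ** j for j in (1, 2)
    )


# Hypothetical inputs, for illustration only.
print(round(g_mos(3.0, 3.0, "NB"), 4))  # 2.7515
print(round(g_mos(3.0, 3.0, "WB"), 4))  # 2.5477
```

Keeping the coefficients in per-mode tables mirrors the text: the calculation steps are shared between narrowband and wideband, and only the coefficients change.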
7 Comparison of objective and subjective results after the training process
7.0 General
The comparison between the results of the subjective tests and the objective prediction of the conditions used in the
training process is given in this clause. The metrics used in the statistical evaluation process are derived from
Recommendation ITU-T P.1401 [i.9]. In addition to the RMSE and RMSE* values, the different metrics and scatterplots are
given in this clause.
A summary of the databases and the conditions used for retraining is given in annex A.
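The two error metrics reported in this clause can be sketched as below. This follows the commonly cited formulation of Recommendation ITU-T P.1401 [i.9], in which RMSE* counts only the part of each error that exceeds the 95 % confidence interval of the subjective score. The `dof` argument (the degrees of freedom subtracted from the number of conditions, assumed here to be 1 when no mapping is applied) and all input values are assumptions for illustration and should be checked against [i.9].

```python
import math


def rmse(subjective, objective, dof=1):
    """Root mean square error with N - dof in the denominator."""
    n = len(subjective)
    if n != len(objective) or n <= dof:
        raise ValueError("inputs must have equal length greater than dof")
    squared = sum((s - o) ** 2 for s, o in zip(subjective, objective))
    return math.sqrt(squared / (n - dof))


def rmse_star(subjective, objective, ci95, dof=1):
    """Epsilon-insensitive RMSE.

    Errors that lie within the 95 % confidence interval of the subjective
    score are counted as zero; only the excess contributes.
    """
    n = len(subjective)
    if not (n == len(objective) == len(ci95)) or n <= dof:
        raise ValueError("inputs must have equal length greater than dof")
    squared = sum(
        max(0.0, abs(s - o) - c) ** 2
        for s, o, c in zip(subjective, objective, ci95)
    )
    return math.sqrt(squared / (n - dof))


# Hypothetical per-condition scores, not taken from any database.
mos_subjective = [3.0, 4.0, 2.0]
mos_objective = [3.5, 4.0, 1.0]
confidence = [0.2, 0.2, 0.2]
print(rmse(mos_subjective, mos_objective))
print(rmse_star(mos_subjective, mos_objective, confidence))
```

Because RMSE* discounts the uncertainty of the subjective scores, it is never larger than RMSE for the same data, which matches the pattern in the result tables of this clause.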
7.1 Results in wideband mode
7.1.0 General
For the wideband retraining procedure, two databases were excluded from the training for several reasons. Removal
of these databases significantly increases the performance. Further analysis is required to determine why these
databases seem to be "incompatible" with the remaining training set.
Overall, 7 databases with 387 conditions and 5 544 samples were used.
7.1.1 Results for database "Audience - Test 3"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,25  0,28  0,24
RMSE    1st ord. mapping  0,25  0,23  0,22
RMSE    3rd ord. mapping  0,24  0,21  0,19
RMSE*   no mapping        0,17  0,18  0,15
RMSE*   1st ord. mapping  0,16  0,13  0,12
RMSE*   3rd ord. mapping  0,15  0,11  0,10
7.1.2 Results for database "Audience - Test 3L" (excluded during retraining)
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,56  0,67  0,53
RMSE    1st ord. mapping  0,44  0,40  0,36
RMSE    3rd ord. mapping  0,42  0,39  0,34
RMSE*   no mapping        0,45  0,56  0,43
RMSE*   1st ord. mapping  0,34  0,30  0,26
RMSE*   3rd ord. mapping  0,31  0,28  0,25
7.1.3 Results for database "Audience - Test 4"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,22  0,18  0,21
RMSE    1st ord. mapping  0,21  0,18  0,20
RMSE    3rd ord. mapping  0,21  0,17  0,18
RMSE*   no mapping        0,14  0,11  0,14
RMSE*   1st ord. mapping  0,12  0,10  0,12
RMSE*   3rd ord. mapping  0,12  0,08  0,10
7.1.4 Results for database "Audience - Test 4L"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,34  0,21  0,27
RMSE    1st ord. mapping  0,28  0,21  0,22
RMSE    3rd ord. mapping  0,26  0,18  0,20
RMSE*   no mapping        0,23  0,11  0,17
RMSE*   1st ord. mapping  0,17  0,11  0,14
RMSE*   3rd ord. mapping  0,15  0,08  0,12
7.1.5 Results for database "Nokia - Test 1"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,16  0,19  0,17
RMSE    1st ord. mapping  0,16  0,19  0,18
RMSE    3rd ord. mapping  0,16  0,20  0,18
RMSE*   no mapping        0,06  0,08  0,09
RMSE*   1st ord. mapping  0,07  0,08  0,09
RMSE*   3rd ord. mapping  0,07  0,09  0,09
7.1.6 Results for database "Nokia - Test 2" (excluded during retraining)
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,33  0,47  0,36
RMSE    1st ord. mapping  0,33  0,48  0,36
RMSE    3rd ord. mapping  0,34  0,49  0,37
RMSE*   no mapping        0,23  0,37  0,26
RMSE*   1st ord. mapping  0,23  0,38  0,26
RMSE*   3rd ord. mapping  0,24  0,38  0,26
7.1.7 Results for database "Orange"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,28  0,27  0,26
RMSE    1st ord. mapping  0,20  0,24  0,16
RMSE    3rd ord. mapping  0,20  0,23  0,15
RMSE*   no mapping        0,22  0,21  0,20
RMSE*   1st ord. mapping  0,13  0,19  0,10
RMSE*   3rd ord. mapping  0,13  0,18  0,10
7.1.8 Results for database "Qualcomm - Test 3"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,21  0,17  0,27
RMSE    1st ord. mapping  0,21  0,17  0,28
RMSE    3rd ord. mapping  0,21  0,16  0,19
RMSE*   no mapping        0,11  0,08  0,16
RMSE*   1st ord. mapping  0,11  0,08  0,18
RMSE*   3rd ord. mapping  0,11  0,07  0,10
7.1.9 Results for database "Qualcomm - Test 4"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,32  0,17  0,24
RMSE    1st ord. mapping  0,31  0,28  0,24
RMSE    3rd ord. mapping  0,28  0,17  0,21
RMSE*   no mapping        0,22  0,09  0,16
RMSE*   1st ord. mapping  0,21  0,18  0,14
RMSE*   3rd ord. mapping  0,17  0,08  0,12
7.2 Results in narrowband mode
7.2.0 General
For the narrowband retraining procedure, no database was excluded.
Overall, 6 databases with 288 conditions and 3 840 samples were used.
7.2.1 Results for database "Audience - Test 1"
Metric  Mapping           SIG   BAK   OVRL
RMSE    no mapping        0,25  0,21  0,21
RMSE    1st ord. mapping  0,18  0,21  0,19
RMSE    3rd ord. mapping  0,18  0,21  0,17
RMSE*   no mapping        0,15  0,12  0,12
RMSE*   1st ord. mapping  0,08  0,11  0,10
RMSE*   3rd ord. mapping  0,08  0,11  0,07
...







