ISO/IEC 23003-7:2022
Information technology — MPEG audio technologies — Part 7: Unified speech and audio coding conformance testing
This document specifies conformance criteria for both bitstreams and decoders compliant with the MPEG-D Unified speech and audio coding standard as defined in ISO/IEC 23003‑3. This is done to assist implementers and to ensure interoperability.
INTERNATIONAL STANDARD ISO/IEC 23003-7
First edition 2022-04
Information technology — MPEG audio technologies — Part 7: Unified speech and audio coding conformance testing
Reference number ISO/IEC 23003-7:2022(E)
© ISO/IEC 2022
COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2022 – All rights reserved
Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Conformance testing
4.1 General
4.2 USAC conformance testing
4.2.1 Profiles
4.2.2 Conformance tools and test procedure
4.3 USAC bitstreams
4.3.1 General
4.3.2 USAC configuration
4.3.3 Framework
4.3.4 Frequency domain coding (FD mode)
4.3.5 Linear predictive domain coding (LPD mode)
4.3.6 Common core coding tools
4.3.7 Enhanced spectral band replication (eSBR)
4.3.8 eSBR – Predictive vector coding (PVC)
4.3.9 eSBR – Inter temporal envelope shaping (inter-TES)
4.3.10 MPEG Surround 2-1-2
4.3.11 Configuration Extensions
4.3.12 AudioPreRoll
4.3.13 DRC
4.3.14 Restrictions depending on profiles and levels
4.4 USAC Decoders
4.4.1 General
4.4.2 FD core mode tests
4.4.3 LPD core mode tests
4.4.4 Combined core coding tests
4.4.5 eSBR Tests
4.4.6 MPEG Surround 212 Tests
4.4.7 Bitstream Extensions
4.5 Decoder settings
4.5.1 General
4.5.2 Target loudness [Lou-]
4.5.3 DRC effect type request [Eff-]
4.6 Decoding of MPEG-4 file format parameters to support exact time alignment in file-to-file coding
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance
are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria
needed for the different types of document should be noted. This document was drafted in
accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 23003 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
INTERNATIONAL STANDARD ISO/IEC 23003-7:2022(E)
Information technology — MPEG audio technologies —
Part 7:
Unified speech and audio coding conformance testing
1 Scope
This document specifies conformance criteria for both bitstreams and decoders compliant with the
MPEG-D Unified speech and audio coding standard as defined in ISO/IEC 23003-3. This is done to assist
implementers and to ensure interoperability.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 23003-3, Information technology — MPEG audio technologies — Part 3: Unified speech and audio
coding
3 Terms and definitions
No terms and definitions are listed in this document.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
4 Conformance testing
4.1 General
This clause specifies conformance criteria for both bitstreams and decoders compliant with the USAC
standard as defined in this document. This is done to assist implementers and to ensure interoperability.
4.2 USAC conformance testing
4.2.1 Profiles
Profiles are defined in ISO/IEC 23003-3:2020, Subclause 4.5. Some conformance criteria apply to USAC
in general, while others are specific to certain profiles and their respective levels. Conformance shall be
tested for the level of the profile with which a given bitstream or decoder claims to comply.
In addition to the conformance requirements described in this clause, a decoder that claims to
comply with the Extended HE AAC Profile shall fulfil conformance for the HE AAC v2 profile according
to ISO/IEC 14496-26.
4.2.2 Conformance tools and test procedure
4.2.2.1 General
To test USAC compliant audio decoders, this document provides a number of conformance test
sequences. Supplied sequences cover all profiles as defined in ISO/IEC 23003-3:2020, Subclause 4.5. For
a given test sequence, testing can be performed by comparing the output of a decoder under test with
a reference waveform. For some test sequences, the decoder requires additional input parameters,
so-called decoder settings, which are defined in 4.5. In cases where the decoder under test is followed by
additional operations (e.g. quantizing a signal to a 16-bit output signal), the conformance point is prior
to such additional operations, i.e. it is permitted to use the actual decoder output (e.g. with more than
16 bits) for conformance testing.
Measurements are carried out relative to full scale where the output signals of the decoders are
normalized to be in the range between −1,0 and +1,0.
In ISO/IEC 14496-26, a set of test methods is defined to test the output of the decoder under test against
the reference output. RMS/LSB Measurement, Segmental SNR and PNS conformance criteria are used
for the comparison. A particular test method for a certain test sequence is specified in 4.5.
For elements producing output that cannot be tested with the methods described in ISO/IEC 14496-26,
specific conformance testing procedures are described in 4.5.
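As an informative illustration of the comparison described above, the following Python sketch normalizes integer PCM samples to full scale and computes the RMS and the maximum absolute difference between a decoder output and a reference waveform. The function names and the use of plain Python lists are assumptions made for this example. The normative test methods and their thresholds are those of ISO/IEC 14496-26 and are not reproduced here.

```python
import math


def normalize_pcm(samples, bits):
    """Scale signed integer PCM samples so that full scale maps to +/-1.0."""
    full_scale = float(2 ** (bits - 1))
    return [s / full_scale for s in samples]


def rms_difference(decoded, reference):
    """RMS of the sample-wise difference between two equally long signals."""
    if len(decoded) != len(reference):
        raise ValueError("signals must have the same length")
    if not decoded:
        raise ValueError("signals must not be empty")
    squared = sum((d - r) ** 2 for d, r in zip(decoded, reference))
    return math.sqrt(squared / len(decoded))


def max_abs_difference(decoded, reference):
    """Largest absolute sample-wise difference between the two signals."""
    return max(abs(d - r) for d, r in zip(decoded, reference))
```

Applying such measures to the decoder output before any word-length reduction keeps the comparison at the conformance point.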
4.2.2.2 Conformance data
All test sequences and a worksheet (“ISO_IEC_23003-7_Conformance_Tables.xlsx”) that lists all test
sequences for each module are accessible at https://standards.iso.org/iso-iec/23003/-7/ed-1/en.
NOTE All conformance test sequences for this document are accessible using this link.
For all conformance test sequences, the file names are composed of several parts, which convey
information about:
— which module of the decoder is tested;
— which channelConfigurationIndex is employed;
— which test conditions apply to the test sequence;
— which coreSbrFrameLengthIndex applies to the test sequence;
— which sampling frequency is signalled in the test sequence.
The file naming convention given in Table 1 shall be used.
Table 1 — File name conventions
Module                                              File            File name
Frequency domain coding (FD mode), 4.3.4            compressed mp4  Fd__c__.mp4
                                                    reference wav   Fd__c__.wav
Linear predictive domain coding (LPD mode), 4.3.5   compressed mp4  Lpd__c__.mp4
                                                    reference wav   Lpd__c__.wav
Combined core coding tools, 4.3.6                   compressed mp4  Cct__c__.mp4
                                                    reference wav   Cct__c__.wav
Enhanced spectral band replication (eSBR), 4.3.7    compressed mp4  eSbr__c__.mp4
                                                    reference wav   eSbr__c__.wav
MPEG Surround 2-1-2, 4.3.10                         compressed mp4  Mps__c_fr_Sc__.mp4
                                                    reference wav   Mps__c_fr_Sc__.wav
Bitstream Extensions                                compressed mp4  Ext__c__.mp4
                                                    reference wav   Ext__c____.wav
channelConfigurationIndex as described in ISO/IEC 23003-3:2020, Table 73.
Setup string. May consist of a concatenation of one or more abbreviations as listed in Table 2. If no setup string is specified, the basic test conditions apply. If no testCase is added, only one single underline character shall occur at that position.
coreSbrFrameLengthIndex as described in .
usacSamplingFrequencyIndex as described in ISO/IEC 23003-3:2020, Table 75. If the sampling rate is specified explicitly and signalled by means of the escape value index, the sampling rate value in Hz is placed in the file name instead of the index value, e.g. “Lpd_1_c1_Bpf_6000.mp4” for a sampling frequency of 6000 Hz.
bsFreqRes as described in ISO/IEC 23003-1:2007, Table 39.
stereoConfigIndex as described in ISO/IEC 23003-3:2020, Table 77.
Setup string. May consist of a concatenation of one or more abbreviations as listed in Table 3. If no decoderSetting is added, no underline character shall occur after .
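As an informative illustration, the example name “Lpd_1_c1_Bpf_6000.mp4” can be split into its parts as sketched below in Python. The field order used in the pattern is inferred from that single example and is an assumption, as are the function name and the group names. Names of the Mps and Ext modules carry additional fields and are not covered by this sketch.

```python
import re

# Assumed layout, inferred only from "Lpd_1_c1_Bpf_6000.mp4":
# <module>_<channelConfigurationIndex>_c<coreSbrFrameLengthIndex>[_<testCase>]_<sampling>.<ext>
_NAME_PATTERN = re.compile(
    r"^(?P<module>Fd|Lpd|Cct|eSbr)"
    r"_(?P<channel_config>\d+)"
    r"_c(?P<core_sbr>\d+)"
    r"(?:_(?P<test_case>[A-Za-z]+))?"   # optional setup string
    r"_(?P<sampling>\d+)"               # index value, or Hz for the escape case
    r"\.(?P<ext>mp4|wav)$"
)


def parse_sequence_name(file_name):
    """Return the name parts as a dict of strings, or None if the name does not match."""
    match = _NAME_PATTERN.match(file_name)
    return match.groupdict() if match else None
```

The sampling field is returned as text because the name alone does not state whether it holds an index value or a rate in Hz.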
Table 2 — Test conditions and abbreviations
Module                Test condition                                                             Abbrev.
FD core mode          FD window switching test condition                                         Win
                      Noise filling test condition                                               Nf
                      Temporal Noise Shaping (TNS) test condition                                Tns
                      Varying max_sfb test condition                                             Sfb
                      Handling of extensions condition                                           Ex
                      Context adaptive arithmetic coder test condition                           Ac
                      Non-meaningful FD window switching test condition                          Nmf
                      M/S stereo test condition                                                  Ms
                      Complex prediction stereo test condition                                   Cp
LPD core mode         Linear predictive coding (LPC) test condition                              Lpc
                      Algebraic code excited linear prediction (ACELP) core mode test condition  Ace
                      Transform coded excitation (TCX) and noise filling test condition          Tcx
                      LPD mode coverage and FAC test condition                                   Lpd
                      Bass-post filter test condition                                            Bpf
                      Algebraic vector quantizer (AVQ) test condition                            Avq
Combined core coding  FD-LPD transition and FAC test condition                                   Flt
                      FD/TCX noise filling test condition                                        Cnf
                      Bass-post filter test condition                                            Cbf
                      synchr. FD-LPD transition and FAC test condition                           Flts
                      asynchr. FD-LPD transition and FAC test condition                          Flta
                      Context adaptive arithmetic coder test condition                           CAc
eSBR                  Quadrature mirror filter (QMF) accuracy test condition                     Qma
                      Envelope adjuster accuracy and SBR preprocessing test condition            Eaa
                      Header and grid control test condition                                     Hgt
                      Inverse filtering test condition                                           Ift
                      Additional sine test (missing harmonics) test condition                    Ast
                      Sampling rate test condition                                               Sr
                      Channel mode test condition                                                Cm
                      eSbr interTes test condition                                               Tes
                      Predictive vector coding (PVC) test condition                              Pvc
                      Harmonic transposition (QMF) test condition                                Htq
                      Harmonic transposition (crossproducts) test condition                      Xp
                      Transposer toggle test condition                                           Ttt
                      Envelope shaping toggle (PVC on/off) test condition                        Est
                      Varying crossover frequency test condition                                 Xo
MPEG Surround 212     stereoConfigIndex test condition                                           Mps
                      Transient steering decorrelator (TSD) test condition                       Tsd
                      Rate mode test condition                                                   Rm
                      Phase coding test condition                                                Pc
                      Decorrelator configuration test condition                                  Dc
                      Downmix (DMX) gain test condition                                          Dm
                      Bands phase test condition                                                 Bp
                      Pseudo lr test condition                                                   Plr
                      Residual bands test condition                                              Rb
                      Temporal Shaping Enabling test condition                                   Tse
                      Smoothing mode test condition                                              Smg
Bitstream extensions  AudioPreRoll() and streamID condition, immediate play-out frame (IPF)      I-foo-
                      Loudness normalization test condition                                      Ln
                      Dynamic range control test condition                                       Drc
Table 3 — Decoder setting conditions
Decoder setting Abbrev.
Target loudness Lou-
DRC effect type request Eff-
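As an informative illustration, a setup string formed by concatenating abbreviations can be split back into its components as sketched below in Python. The sketch assumes that the abbreviations are joined without any separator and uses only the FD core mode abbreviations of Table 2. The dictionary and function names are hypothetical.

```python
# FD core mode abbreviations as listed in Table 2.
FD_ABBREVIATIONS = {
    "Win": "FD window switching",
    "Nf": "Noise filling",
    "Tns": "Temporal Noise Shaping (TNS)",
    "Sfb": "Varying max_sfb",
    "Ex": "Handling of extensions",
    "Ac": "Context adaptive arithmetic coder",
    "Nmf": "Non-meaningful FD window switching",
    "Ms": "M/S stereo",
    "Cp": "Complex prediction stereo",
}


def split_setup_string(setup, table):
    """Split a concatenated setup string into known abbreviations, longest match first."""
    keys = sorted(table, key=len, reverse=True)
    parts = []
    pos = 0
    while pos < len(setup):
        for key in keys:
            if setup.startswith(key, pos):
                parts.append(key)
                pos += len(key)
                break
        else:
            # No abbreviation of the table starts at this position.
            raise ValueError(f"unknown abbreviation at position {pos} in {setup!r}")
    return parts
```

Trying the longest abbreviations first avoids reading a longer abbreviation as a shorter one followed by unknown characters.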
4.3 USAC bitstreams
4.3.1 General
4.3.1.1 Characteristics
Characteristics of bitstreams specify the constraints that are applied by the encoder in generating the
bitstreams. These syntactic and semantic constraints may for example restrict the range or the values
of parameters that are encoded directly or indirectly in the bitstreams. The constraints applied to a
given bitstream may or may not be known a priori.
4.3.1.2 Test procedure
Each USAC bitstream shall meet the syntactic and semantic requirements specified in this document.
The present subclause defines the conformance criteria that shall be fulfilled by a compliant bitstream.
These criteria are specified for the syntactic elements of the bitstream and for some parameters
decoded from the USAC bitstream payload.
For each tool, a set of semantic tests to be performed on the bitstreams is described. Verifying that
the syntax is correct is straightforward and is therefore not defined hereinafter. In the description of
the semantic tests, it is assumed that the tested bitstream contains no errors due to transmission or
other causes. For each test, the condition or conditions that shall be satisfied are given, as well as the
prerequisites or conditions in which the test can be applied.
4.3.2 USAC configuration
4.3.2.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) usacSamplingFrequencyIndex;
b) usacSamplingFrequency;
c) coreSbrFrameLengthIndex;
d) channelConfigurationIndex;
e) presence of configuration extensions;
f) numOutChannels;
g) bsOutputChannelPos;
h) numElements;
i) stereoConfigIndex;
j) use of time warped MDCT;
k) use of noise filling in FD mode;
l) use of the eSBR harmonic transposer;
m) use of the eSBR inter-TES tool;
n) use of the eSBR PVC tool;
o) SBR default header, for details see 4.3.7;
p) MPS config, for details see 4.3.10.
4.3.2.2 Test procedure
4.3.2.2.1 UsacConfig()
usacSamplingFrequencyIndex Shall be encoded with a non-reserved value specified in ISO/IEC 23003-3:2020, Table 72. For further profile and level dependent restrictions, see 4.3.14.
usacSamplingFrequency No restrictions apply. For profile and level dependent restrictions, see 4.3.14.
coreSbrFrameLengthIndex No restrictions apply.
channelConfigurationIndex Shall be encoded with a non-reserved value specified in ISO/IEC 23003-3:2020, Table 73. For further profile and level dependent restrictions, see 4.3.14. In the case of channelConfigurationIndex == 0, further restrictions apply as described in 4.3.2.2.2.
usacConfigExtensionPresent No restrictions apply.
4.3.2.2.2 UsacChannelConfig()
numOutChannels No restrictions apply. For profile and level dependent restrictions, see
4.3.14.
bsOutputChannelPos A bsOutputChannelPos of value 3 or 26 (LFE speaker positions) shall
be associated with an LFE channel. Any other value shall be associated
with a main audio channel.
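As an informative illustration, the association between bsOutputChannelPos and the channel type can be checked as sketched below in Python. The function name and the representation of the channel type as a Boolean flag per channel are assumptions made for this example.

```python
# bsOutputChannelPos values that denote LFE speaker positions.
LFE_POSITIONS = frozenset({3, 26})


def mismatched_channels(positions, is_lfe):
    """Return the indices of channels whose type contradicts their bsOutputChannelPos.

    positions: bsOutputChannelPos value of each output channel
    is_lfe:    True where the channel is carried as an LFE channel
    """
    return [
        index
        for index, (pos, lfe) in enumerate(zip(positions, is_lfe))
        if (pos in LFE_POSITIONS) != lfe
    ]
```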
4.3.2.2.3 UsacDecoderConfig()
numElements The value of this data element shall be such that the accumulated sum of all channels contained in the bitstream complies with the restrictions outlined in 4.3.2.2.1.
usacElementType No restrictions apply. For profile and level dependent restrictions, see 4.3.14.
4.3.2.2.4 UsacSingleChannelElementConfig()
No restrictions are applicable to this bitstream element.
4.3.2.2.5 UsacChannelPairElementConfig()
The UsacChannelPairElementConfig() element and all included elements can only be present when
coding more than one output channel (see restrictions applying to UsacConfig() in 4.3.2.2.1).
stereoConfigIndex No restrictions apply.
4.3.2.2.6 UsacLfeElementConfig()
No restrictions are applicable to this bitstream element.
4.3.2.2.7 UsacCoreConfig()
tw_mdct No restrictions apply. For profile and level dependent restrictions,
see 4.3.14.
noiseFilling No restrictions apply.
4.3.2.2.8 SbrConfig()
harmonicSBR No restrictions apply.
bs_interTes No restrictions apply.
bs_pvc No restrictions apply.
4.3.2.2.9 SbrDfltHeader()
dflt_start_freq No restrictions apply.
dflt_stop_freq No restrictions apply.
dflt_header_extra1 No restrictions apply.
dflt_header_extra2 No restrictions apply.
dflt_freq_scale No restrictions apply.
dflt_alter_scale No restrictions apply.
dflt_noise_bands No restrictions apply.
dflt_limiter_bands No restrictions apply.
dflt_limiter_gains No restrictions apply.
dflt_interpol_freq No restrictions apply.
dflt_smoothing_mode No restrictions apply.
4.3.2.2.10 Mps212Config()
bsFreqRes Shall not be encoded with a value of 0.
bsFixedGainDMX No restrictions apply.
bsTempShapeConfig No restrictions apply.
bsDecorrConfig Shall not be encoded with a value of 3.
bsHighRateMode No restrictions apply.
bsPhaseCoding No restrictions apply.
bsOttBandsPhasePresent No restrictions apply.
bsOttBandsPhase Shall not be encoded with a value larger than the value of numBands, which depends on bsFreqRes as given by ISO/IEC 23003-1:2007, 5.2, Table 39.
bsResidualBands Shall not be encoded with a value larger than the value of numBands, which depends on bsFreqRes as given by ISO/IEC 23003-1:2007, 5.2, Table 39.
bsPseudoLr No restrictions apply.
bsEnvQuantMode Shall be 0.
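As an informative illustration, the restrictions on Mps212Config() listed above can be collected in a single check, sketched below in Python. The dictionary representation of the parsed configuration and the function name are assumptions made for this example. The value of numBands is expected to be looked up by the caller from ISO/IEC 23003-1:2007, Table 39 and is not reproduced here.

```python
def check_mps212_config(config, num_bands):
    """Return a list of violated restrictions (empty if none is violated).

    config:    dict of parsed Mps212Config() fields, keyed by element name
    num_bands: numBands for config["bsFreqRes"], supplied by the caller
    """
    problems = []
    if config["bsFreqRes"] == 0:
        problems.append("bsFreqRes shall not be 0")
    if config["bsDecorrConfig"] == 3:
        problems.append("bsDecorrConfig shall not be 3")
    # These two elements are only present under certain conditions.
    for name in ("bsOttBandsPhase", "bsResidualBands"):
        value = config.get(name)
        if value is not None and value > num_bands:
            problems.append(f"{name} shall not exceed numBands ({num_bands})")
    if config["bsEnvQuantMode"] != 0:
        problems.append("bsEnvQuantMode shall be 0")
    return problems
```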
4.3.2.2.11 UsacExtElementConfig()
usacExtElementType No restrictions apply.
usacExtElementConfigLength No restrictions apply.
usacExtElementDefaultLengthPresent No restrictions apply.
usacExtElementDefaultLength No restrictions apply.
usacExtElementPayloadFrag No restrictions apply.
4.3.2.2.12 UsacConfigExtension()
numConfigExtensions No restrictions apply.
usacConfigExtType[] No restrictions apply.
usacConfigExtLength[] No restrictions apply.
fill_byte Should be ‘10100101’.
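As an informative illustration, the bit pattern ‘10100101’ corresponds to the byte value 0xA5. The Python sketch below reports the positions of fill bytes that deviate from it. Since the text uses "should", a deviation is treated here as a finding to report and not as an error. The function name is hypothetical.

```python
EXPECTED_FILL_BYTE = 0b10100101  # bit pattern '10100101'


def unexpected_fill_bytes(payload):
    """Return the positions of fill bytes that differ from the expected pattern."""
    return [i for i, value in enumerate(payload) if value != EXPECTED_FILL_BYTE]
```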
4.3.3 Framework
4.3.3.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) signalling of independently decodable frames;
b) presence of extension elements;
c) core_mode;
d) presence of TNS.
4.3.3.2 Test procedure
4.3.3.2.1 UsacFrame()
usacIndependencyFlag No restrictions apply.
4.3.3.2.2 UsacSingleChannelElement
No restrictions are applicable to this bitstream element.
4.3.3.2.3 UsacChannelPairElement
No restrictions are applicable to this bitstream element.
4.3.3.2.4 UsacLfeElement
No restrictions are applicable to this bitstream element.
4.3.3.2.5 UsacExtElement
usacExtElementPresent No restrictions apply.
usacExtElementUseDefaultLength No restrictions apply.
usacExtElementPayloadLength No restrictions apply.
usacExtElementStart No restrictions apply.
usacExtElementStop No restrictions apply.
usacExtElementSegmentData No restrictions apply.
4.3.3.2.6 UsacCoreCoderData
core_mode No restrictions apply.
tns_data_present No restrictions apply.
4.3.4 Frequency domain coding (FD mode)
4.3.4.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) use of noise filling;
b) window_shape;
c) M/S Stereo;
d) use of TNS;
e) complex prediction stereo coding;
f) max_sfb;
g) use of time warped MDCT;
h) use of long blocks;
i) use of short blocks.
4.3.4.2 Test procedure
4.3.4.2.1 fd_channel_stream
global_gain No restrictions apply.
noise_level No restrictions apply.
noise_offset No restrictions apply.
fac_data_present Shall be 0, if the core_mode of the preceding frame of the same
channel was 0 or if mod[3] of the preceding frame of the same
channel was > 0.
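As an informative illustration, the condition on fac_data_present can be written as a predicate, sketched below in Python. The function name and the convention of passing None for mod[] when the preceding frame carries no LPD data are assumptions made for this example.

```python
def fac_data_present_is_valid(fac_data_present, prev_core_mode, prev_mod):
    """Check fac_data_present against the preceding frame of the same channel.

    prev_core_mode: core_mode of the preceding frame
    prev_mod:       the four mod[] values of the preceding frame, or None
    """
    must_be_zero = prev_core_mode == 0 or (prev_mod is not None and prev_mod[3] > 0)
    return fac_data_present == 0 or not must_be_zero
```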
4.3.4.2.2 ics_info
window_sequence A conformant bitstream shall consist of only meaningful window_sequence transitions. However, decoders are required to handle non-meaningful window_sequence transitions as well. The meaningful window_sequence transitions are shown in ISO/IEC 23003-3:2020, Table 138.
window_shape A compliant bitstream shall set window_shape to 0 if the next block is encoded in LPD coding mode. However, decoders are required to handle both window_shapes for all transitions.
max_sfb Shall be <= num_swb_long or num_swb_short as appropriate for window_sequence and sampling frequency and core coder frame length.
scale_factor_grouping No restrictions apply.
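As an informative illustration, the bound on max_sfb can be checked as sketched below in Python. The function name and its parameters are hypothetical. The values of num_swb_long and num_swb_short depend on the sampling frequency and the core coder frame length and are expected to be supplied by the caller.

```python
def max_sfb_is_valid(max_sfb, uses_short_windows, num_swb_long, num_swb_short):
    """Check max_sfb against the band count that applies to the window_sequence."""
    limit = num_swb_short if uses_short_windows else num_swb_long
    return 0 <= max_sfb <= limit
```

For example, with illustrative band counts of 49 (long) and 14 (short), a max_sfb of 49 passes for long windows while 15 fails for short windows.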
4.3.4.2.3 tw_data
tw_data_present No restrictions apply.
tw_ratio No restrictions apply.
4.3.4.2.4 scale_factor_data
hcod_sf Shall only be encoded with the values listed in the scalefactor Huffman table. Shall be encoded such that the decoded scalefactors sf[g][sfb] are within the range of zero to 255, both inclusive.
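As an informative illustration, the range requirement on the decoded scalefactors can be checked as sketched below in Python. The sketch assumes that each hcod_sf codeword has already been decoded to a signed difference that is accumulated onto the previous scalefactor value. The function name, the parameters and this simplified decoding model are assumptions made for this example.

```python
def scalefactors_in_range(start_value, deltas, low=0, high=255):
    """Accumulate the differences and report whether all results stay in [low, high].

    Returns a tuple (all_in_range, decoded_values).
    """
    current = start_value
    values = []
    for delta in deltas:
        current += delta
        values.append(current)
    return all(low <= v <= high for v in values), values
```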
4.3.4.2.5 tns_data
n_filt No restrictions apply.
coef_res No restrictions apply.
length Shall be small enough such that the lower bound of the filtered region does not exceed the start of the array containing the spectral coefficients.
order Shall not exceed the values listed in ISO/IEC 23003-3:2020,
Table 135.
direction No restrictions apply.
coef_compress No restrictions apply.
coef No restrictions apply.
4.3.4.2.6 ac_spectral_data
arith_reset_flag No restrictions apply.
4.3.4.2.7 StereoCoreToolInfo
tns_active No restrictions apply.
common_window No restrictions apply.
common_max_sfb No restrictions apply.
max_sfb1 Shall be <= num_swb_long or num_swb_short as appropriate for window_sequence and sampling frequency and core coder frame length.
ms_mask_present No restrictions apply.
ms_used No restrictions apply.
common_tw No restrictions apply.
common_tns No restrictions apply.
tns_on_lr No restrictions apply.
tns_present_both No restrictions apply.
tns_data_present No restrictions apply.
4.3.4.2.8 cplx_pred_data
cplx_pred_all N
...
FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 23003-7
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2021-11-15
Voting terminates on: 2021-01-10
Information technology — MPEG audio technologies — Part 7: Unified speech and audio coding conformance testing
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number ISO/IEC FDIS 23003-7:2021(E)
© ISO/IEC 2021
FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 23003-7:2021(E)
Information technology — MPEG audio technologies —
Part 7:
Unified speech and audio coding conformance testing
1 Scope
This document specifies conformance criteria for both bitstreams and decoders compliant with the
MPEG-D Unified speech and audio coding standard as defined in ISO/IEC 23003-3. This is done to assist
implementers and to ensure interoperability.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 23003-3:2020, Information technology — MPEG audio technologies — Part 3: Unified speech and
audio coding
3 Terms and definitions
No terms and definitions are listed in this document.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
4 Conformance testing
4.1 General
This clause specifies conformance criteria for both bitstreams and decoders compliant with the USAC
standard as defined in this document. This is done to assist implementers and to ensure interoperability.
4.2 USAC conformance testing
4.2.1 Profiles
Profiles are defined in ISO/IEC 23003-3:2020, Subclause 4.5. Some conformance criteria apply to USAC
in general, while others are specific to certain profiles and their respective levels. Conformance shall be
tested for the level of the profile with which a given bitstream or decoder claims to comply.
In addition to the conformance requirements described in this clause, a decoder that claims to comply
with the Extended HE AAC Profile shall fulfil conformance for the HE AAC v2 profile according to
ISO/IEC 14496-26.
4.2.2 Conformance tools and test procedure
4.2.2.1 General
To test USAC compliant audio decoders, this document provides a number of conformance test
sequences. Supplied sequences cover all profiles as defined in ISO/IEC 23003-3:2020, Subclause 4.5. For
a given test sequence, testing can be performed by comparing the output of a decoder under test with
a reference waveform. For some test sequences, the decoder requires additional input parameters, so-called
decoder settings, which are defined in 4.5. In cases where the decoder under test is followed by
additional operations (e.g. quantizing a signal to a 16 bit output signal), the conformance point is prior
to such additional operations, i.e. it is permitted to use the actual decoder output (e.g. with more than
16 bit) for conformance testing.
Measurements are carried out relative to full scale where the output signals of the decoders are
normalized to be in the range between −1,0 and +1,0.
In ISO/IEC 14496-26, a set of test methods is defined to test the output of the decoder under test against
the reference output. RMS/LSB Measurement, Segmental SNR and PNS conformance criteria are used
for the comparison. A particular test method for a certain test sequence is specified in 4.5.
For elements producing output that cannot be tested with the methods described in ISO/IEC 14496-26,
specific conformance testing procedures are described in 4.5.
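As an illustration of the comparison described above, the following minimal Python sketch normalizes integer PCM output to full scale and computes the RMS and maximum absolute value of the difference between a decoder output and a reference waveform. The function names and the two limit parameters are assumptions made for this example; the actual thresholds and the exact RMS/LSB, Segmental SNR and PNS procedures are those of ISO/IEC 14496-26 and are not reproduced here.

```python
import math


def normalize_pcm(samples, bits):
    """Scale signed integer PCM samples so that full scale maps to -1.0 .. +1.0."""
    full_scale = float(1 << (bits - 1))
    return [s / full_scale for s in samples]


def difference_stats(decoded, reference):
    """Return (rms, max_abs) of the sample-wise difference of two normalized signals."""
    if len(decoded) != len(reference):
        raise ValueError("signals must have the same length")
    if not decoded:
        raise ValueError("signals must not be empty")
    diffs = [d - r for d, r in zip(decoded, reference)]
    rms = math.sqrt(sum(x * x for x in diffs) / len(diffs))
    max_abs = max(abs(x) for x in diffs)
    return rms, max_abs


def within_limits(decoded, reference, rms_limit, max_abs_limit):
    """Check the difference signal against caller-supplied limits (hypothetical values)."""
    rms, max_abs = difference_stats(decoded, reference)
    return rms <= rms_limit and max_abs <= max_abs_limit
```

Both signals are expected to be taken at the conformance point, i.e. before any additional operation such as requantization to 16 bit.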
4.2.2.2 Conformance data
All test sequences and a worksheet (“Usac_Conformance_Tables.xlsx”) that lists all test sequences for
each module are accessible at https://standards.iso.org/iso-iec/23003/-7/ed-1/en.
NOTE All conformance test sequences for ISO/IEC 23003-3 are accessible using this link. All previous
electronic attachments to this document (and its Amendments) are replaced by those at this link.
For all conformance test sequences, the file names are composed of several parts, which convey
information about:
— which module of the decoder is tested;
— which channelConfigurationIndex is employed;
— which test conditions apply to the test sequence;
— which coreSbrFrameLengthIndex applies to the test sequence;
— which sampling frequency is signalled in the test sequence.
The file naming convention given in Table 1 shall be used.
Table 1 — File name conventions
Module                                             File            File name
Frequency domain coding (FD mode), 4.3.4           compressed mp4  Fd__c__.mp4
                                                   reference wav   Fd__c__.wav
Linear predictive domain coding (LPD mode), 4.3.5  compressed mp4  Lpd__c__.mp4
                                                   reference wav   Lpd__c__.wav
Combined core coding tools, 4.3.6                  compressed mp4  Cct__c__.mp4
                                                   reference wav   Cct__c__.wav
Enhanced spectral band replication (eSBR), 4.3.7   compressed mp4  eSbr__c__.mp4
                                                   reference wav   eSbr__c__.wav
MPEG Surround 2-1-2, 4.3.10                        compressed mp4  Mps__c_fr_Sc__.mp4
                                                   reference wav   Mps__c_fr_Sc__.wav
Bitstream extensions                               compressed mp4  Ext__c__.mp4
                                                   reference wav   Ext__c____.wav
channelConfigurationIndex as described in ISO/IEC 23003-3:2020, Table 73.
Setup string. May consist of a concatenation of one or more abbreviations as
listed in Table 2. If no setup string is specified, the basic test conditions apply.
If no testCase is added, only one single underline character shall occur at that
position.
coreSbrFrameLengthIndex as described in .
usacSamplingFrequencyIndex as described in ISO/IEC 23003-3:2020,
Table 75. If the sampling rate is specified explicitly and signalled by means of
the escape value index, the sampling rate value in Hz is placed in the file name
instead of the index value, e.g. “Lpd_1_c1_Bpf_6000.mp4” for a sampling
frequency of 6000 Hz.
bsFreqRes as described in ISO/IEC 23003-1:2007, Table 39.
stereoConfigIndex as described in ISO/IEC 23003-3:2020, Table 77.
Setup string. May consist of a concatenation of one or more abbreviations as
listed in Table 3. If no decoderSetting is added, no underline character shall
occur after .
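The naming convention can be illustrated with a small parser. The sketch below is a hedged example only: it assumes the field order suggested by the example name “Lpd_1_c1_Bpf_6000.mp4” (module, channelConfigurationIndex, “c” followed by coreSbrFrameLengthIndex, optional testCase, sampling frequency index or rate in Hz), and it handles only the Fd, Lpd, Cct and eSbr modules. The class and function names are hypothetical, and MPEG Surround and bitstream extension names, which carry additional fields, are deliberately rejected.

```python
from dataclasses import dataclass
from typing import Optional

# Module prefixes covered by this sketch (assumption: same field layout for all four).
CORE_MODULES = {"Fd", "Lpd", "Cct", "eSbr"}


@dataclass
class SequenceName:
    module: str
    channel_config_index: int
    core_sbr_frame_length_index: int
    test_case: Optional[str]
    sampling: int  # usacSamplingFrequencyIndex, or the rate in Hz for the escape case
    extension: str


def parse_name(file_name):
    """Split a conformance file name into its fields (assumed layout, see lead-in)."""
    stem, dot, extension = file_name.rpartition(".")
    if not dot or extension not in ("mp4", "wav"):
        raise ValueError("expected a .mp4 or .wav file name")
    parts = stem.split("_")
    # Four fields when no testCase is present, five when it is.
    if len(parts) not in (4, 5):
        raise ValueError("unexpected number of fields")
    if parts[0] not in CORE_MODULES:
        raise ValueError("module not covered by this sketch")
    if not parts[2].startswith("c"):
        raise ValueError("third field must start with 'c'")
    test_case = parts[3] if len(parts) == 5 else None
    return SequenceName(parts[0], int(parts[1]), int(parts[2][1:]),
                        test_case, int(parts[-1]), extension)
```

For the example from the table notes, `parse_name("Lpd_1_c1_Bpf_6000.mp4")` yields module `Lpd`, channelConfigurationIndex 1, coreSbrFrameLengthIndex 1, test case `Bpf` and the value 6000.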
Table 2 — Test conditions and abbreviations
Module                Test condition                                                             Abbrev.
FD core mode          FD window switching test condition                                         Win
                      Noise filling test condition                                               Nf
                      Temporal Noise Shaping (TNS) test condition                                Tns
                      Varying max_sfb test condition                                             Sfb
                      Handling of extensions condition                                           Ex
                      Context adaptive arithmetic coder test condition                           Ac
                      Non-meaningful FD window switching test condition                          Nmf
                      M/S stereo test condition                                                  Ms
                      Complex prediction stereo test condition                                   Cp
LPD core mode         Linear predictive coding (LPC) test condition                              Lpc
                      Algebraic code excited linear prediction (ACELP) core mode test condition  Ace
                      Transform coded excitation (TCX) and noise filling test condition          Tcx
                      LPD mode coverage and FAC test condition                                   Lpd
                      Bass-post filter test condition                                            Bpf
                      Algebraic vector quantizer (AVQ) test condition                            Avq
Combined core coding  FD-LPD transition and FAC test condition                                   Flt
                      FD/TCX noise filling test condition                                        Cnf
                      Bass-post filter test condition                                            Cbf
                      synchr. FD-LPD transition and FAC test condition                           Flts
                      asynchr. FD-LPD transition and FAC test condition                          Flta
                      Context adaptive arithmetic coder test condition                           CAc
eSBR                  Quadrature mirror filter (QMF) accuracy test condition                     Qma
                      Envelope adjuster accuracy and SBR preprocessing test condition            Eaa
                      Header and grid control test condition                                     Hgt
                      Inverse filtering test condition                                           Ift
                      Additional sine test (missing harmonics) test condition                    Ast
                      Sampling rate test condition                                               Sr
                      Channel mode test condition                                                Cm
                      eSbr interTes test condition                                               Tes
                      Predictive vector coding (PVC) test condition                              Pvc
                      Harmonic transposition (QMF) test condition                                Htq
                      Harmonic transposition (crossproducts) test condition                      Xp
                      Transposer toggle test condition                                           Ttt
                      Envelope shaping toggle (PVC on/off) test condition                        Est
                      Varying crossover frequency test condition                                 Xo
MPEG Surround 2-1-2   stereoConfigIndex test condition                                           Mps
                      Transient steering decorrelator (TSD) test condition                       Tsd
                      Rate mode test condition                                                   Rm
                      Phase coding test condition                                                Pc
                      Decorrelator configuration test condition                                  Dc
                      Downmix (DMX) gain test condition                                          Dm
                      Bands phase test condition                                                 Bp
                      Pseudo lr test condition                                                   Plr
                      Residual bands test condition                                              Rb
                      Temporal Shaping Enabling test condition                                   Tse
                      Smoothing mode test condition                                              Smg
Bitstream extensions  AudioPreRoll() and streamID condition, immediate play-out frame (IPF)      I-foo-
                      Loudness normalization test condition                                      Ln
                      Dynamic range control test condition                                       Drc
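Because a setup string may be a concatenation of several abbreviations, a tool that reads the file names has to split it again. The following sketch is one possible way to do that, assuming the abbreviations are concatenated without any separator (an assumption; the text only says "concatenation"). The function name is hypothetical, and the parameterized "I-foo-" entry is left out of the lookup set.

```python
# Abbreviations from Table 2, without the parameterized "I-foo-" entry.
ABBREVIATIONS = frozenset({
    "Win", "Nf", "Tns", "Sfb", "Ex", "Ac", "Nmf", "Ms", "Cp",
    "Lpc", "Ace", "Tcx", "Lpd", "Bpf", "Avq",
    "Flt", "Cnf", "Cbf", "Flts", "Flta", "CAc",
    "Qma", "Eaa", "Hgt", "Ift", "Ast", "Sr", "Cm", "Tes", "Pvc",
    "Htq", "Xp", "Ttt", "Est", "Xo",
    "Mps", "Tsd", "Rm", "Pc", "Dc", "Dm", "Bp", "Plr", "Rb", "Tse", "Smg",
    "Ln", "Drc",
})


def split_setup_string(setup):
    """Return every way to split `setup` into known abbreviations.

    An empty list means the string cannot be split; more than one entry
    would mean the split is ambiguous.
    """
    if setup == "":
        return [[]]
    results = []
    for end in range(1, len(setup) + 1):
        head = setup[:end]
        if head in ABBREVIATIONS:
            for tail in split_setup_string(setup[end:]):
                results.append([head] + tail)
    return results
```

Trying every prefix rather than matching greedily keeps pairs such as "Flt" and "Flts", or "Ac" and "Ace", from being confused.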
Table 3 — Decoder setting conditions
Decoder setting Abbrev.
Target loudness Lou-
DRC effect type request Eff-
4.3 USAC bitstreams
4.3.1 General
4.3.1.1 Characteristics
Characteristics of bitstreams specify the constraints that are applied by the encoder in generating the
bitstreams. These syntactic and semantic constraints may for example restrict the range or the values
of parameters that are encoded directly or indirectly in the bitstreams. The constraints applied to a
given bitstream may or may not be known a priori.
4.3.1.2 Test procedure
Each USAC bitstream shall meet the syntactic and semantic requirements specified in this document.
The present subclause defines the conformance criteria that shall be fulfilled by a compliant bitstream.
These criteria are specified for the syntactic elements of the bitstream and for some parameters
decoded from the USAC bitstream payload.
For each tool, a set of semantic tests to be performed on the bitstreams is described. Verifying whether
the syntax is correct is straightforward and is therefore not defined hereinafter. In the description of
the semantic tests, it is assumed that the tested bitstream contains no errors due to transmission or
other causes. For each test, the condition or conditions that shall be satisfied are given, as well as the
prerequisites or conditions in which the test can be applied.
4.3.2 USAC configuration
4.3.2.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) usacSamplingFrequencyIndex;
b) usacSamplingFrequency;
c) coreSbrFrameLengthIndex;
d) channelConfigurationIndex;
e) presence of configuration extensions;
f) numOutChannels;
g) bsOutputChannelPos;
h) numElements;
i) stereoConfigIndex;
j) use of time warped MDCT;
k) use of noise filling in FD mode;
l) use of the eSBR harmonic transposer;
m) use of the eSBR inter-TES tool;
n) use of the eSBR PVC tool;
o) SBR default header, for details see 4.3.7;
p) MPS config, for details see 4.3.10.
4.3.2.2 Test procedure
4.3.2.2.1 UsacConfig()
usacSamplingFrequencyIndex Shall be encoded with a non-reserved value specified in ISO/IEC 23003-
3:2020, Table 72. For further profile and level dependent restrictions see
4.3.11.
usacSamplingFrequency No restrictions apply. For profile and level dependent restrictions, see
4.3.11.
coreSbrFrameLengthIndex No restrictions apply.
channelConfigurationIndex Shall be encoded with a non-reserved value specified in ISO/IEC 23003-
3:2020, Table 73. For further profile and level dependent restrictions see
4.3.11. In the case of channelConfigurationIndex == 0 further restrictions
apply as described in 4.3.2.2.2.
usacConfigExtensionPresent No restrictions apply.
4.3.2.2.2 UsacChannelConfig()
numOutChannels No restrictions apply. For profile and level dependent restrictions, see
4.3.11.
bsOutputChannelPos A bsOutputChannelPos of value 3 or 26 (LFE speaker positions) shall be
associated with an LFE channel. Any other value shall be associated with
a main audio channel.
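The bsOutputChannelPos rule can be expressed as a short check. This is a minimal sketch: the function names and the input representation (a list of pairs giving the position value and whether the channel is carried as an LFE channel) are assumptions made for the example.

```python
# Speaker position values that denote LFE positions, as stated in the rule above.
LFE_POSITIONS = frozenset({3, 26})


def expected_is_lfe(bs_output_channel_pos):
    """Return True if the position value shall be associated with an LFE channel."""
    return bs_output_channel_pos in LFE_POSITIONS


def find_position_violations(channels):
    """Return the indices of channels whose type contradicts their position.

    `channels` is a list of (bsOutputChannelPos, is_lfe_channel) pairs.
    """
    return [
        index
        for index, (pos, is_lfe) in enumerate(channels)
        if expected_is_lfe(pos) != is_lfe
    ]
```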
4.3.2.2.3 UsacDecoderConfig()
numElements The value of this data element shall be such that the accumulated sum
of all channels contained in the bitstream complies with the restrictions
outlined in 4.3.2.2.1.
usacElementType No restrictions apply. For profile and level dependent restrictions, see
4.3.11.
4.3.2.2.4 UsacSingleChannelElementConfig()
No restrictions are applicable to this bitstream element.
4.3.2.2.5 UsacChannelPairElementConfig()
The UsacChannelPairElementConfig() element and all included elements can only be present when
coding more than one output channel (see restrictions applying to UsacConfig() in 4.3.2.2.1).
stereoConfigIndex No restrictions apply.
4.3.2.2.6 UsacLfeElementConfig()
No restrictions are applicable to this bitstream element.
4.3.2.2.7 UsacCoreConfig()
tw_mdct No restrictions apply. For profile and level dependent restrictions, see
4.3.11.
noiseFilling No restrictions apply.
4.3.2.2.8 SbrConfig()
harmonicSBR No restrictions apply.
bs_interTes No restrictions apply.
bs_pvc No restrictions apply.
4.3.2.2.9 SbrDfltHeader()
dflt_start_freq No restrictions apply.
dflt_stop_freq No restrictions apply.
dflt_header_extra1 No restrictions apply.
dflt_header_extra2 No restrictions apply.
dflt_freq_scale No restrictions apply.
dflt_alter_scale No restrictions apply.
dflt_noise_bands No restrictions apply.
dflt_limiter_bands No restrictions apply.
dflt_limiter_gains No restrictions apply.
dflt_interpol_freq No restrictions apply.
dflt_smoothing_mode No restrictions apply.
4.3.2.2.10 Mps212Config()
bsFreqRes Shall not be encoded with a value of 0.
bsFixedGainDMX No restrictions apply.
bsTempShapeConfig No restrictions apply.
bsDecorrConfig Shall not be encoded with a value of 3.
bsHighRateMode No restrictions apply.
bsPhaseCoding No restrictions apply.
bsOttBandsPhasePresent No restrictions apply.
bsOttBandsPhase Shall not be encoded with a value larger than the value of numBands
as given by ISO/IEC 23003-1:2007, 5.2, Table 39 and depends on
bsFreqRes.
bsResidualBands Shall not be encoded with a value larger than the value of numBands
as given by ISO/IEC 23003-1:2007, 5.2, Table 39 and depends on
bsFreqRes.
bsPseudoLr No restrictions apply.
bsEnvQuantMode Shall be 0.
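The Mps212Config() restrictions listed above can be collected into one validation routine. The sketch below is illustrative only: the dictionary representation of the parsed configuration and the function name are assumptions, and numBands is passed in by the caller because the mapping from bsFreqRes (ISO/IEC 23003-1:2007, Table 39) is not reproduced here. Elements that are absent from the dictionary are treated as not transmitted and are skipped.

```python
def check_mps212_config(cfg, num_bands):
    """Return a list of violated restrictions for a parsed Mps212Config().

    `cfg` maps element names to decoded values; `num_bands` is the numBands
    value that corresponds to cfg["bsFreqRes"] (supplied by the caller).
    """
    errors = []
    if cfg.get("bsFreqRes") == 0:
        errors.append("bsFreqRes shall not be 0")
    if cfg.get("bsDecorrConfig") == 3:
        errors.append("bsDecorrConfig shall not be 3")
    # Optional elements: only checked when they are present in the configuration.
    if cfg.get("bsOttBandsPhase", 0) > num_bands:
        errors.append("bsOttBandsPhase shall not exceed numBands")
    if cfg.get("bsResidualBands", 0) > num_bands:
        errors.append("bsResidualBands shall not exceed numBands")
    if cfg.get("bsEnvQuantMode", 0) != 0:
        errors.append("bsEnvQuantMode shall be 0")
    return errors
```

Returning the full list of violations rather than stopping at the first one makes it easier to report on a non-conformant bitstream in a single pass.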
4.3.2.2.11 UsacExtElementConfig()
usacExtElementType No restrictions apply.
usacExtElementConfigLength No restrictions apply.
usacExtElementDefaultLengthPresent No restrictions apply.
usacExtElementDefaultLength No restrictions apply.
usacExtElementPayloadFrag No restrictions apply.
4.3.2.2.12 UsacConfigExtension()
numConfigExtensions No restrictions apply.
usacConfigExtType[] No restrictions apply.
usacConfigExtLength[] No restrictions apply.
fill_byte Should be ‘10100101’.
4.3.3 Framework
4.3.3.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) signalling of independently decodable frames;
b) presence of extension elements;
c) core_mode;
d) presence of TNS.
4.3.3.2 Test procedure
4.3.3.2.1 UsacFrame()
usacIndependencyFlag No restrictions apply.
4.3.3.2.2 UsacSingleChannelElement
No restrictions are applicable to this bitstream element.
4.3.3.2.3 UsacChannelPairElement
No restrictions are applicable to this bitstream element.
4.3.3.2.4 UsacLfeElement
No restrictions are applicable to this bitstream element.
4.3.3.2.5 UsacExtElement
usacExtElementPresent No restrictions apply.
usacExtElementUseDefaultLength No restrictions apply.
usacExtElementPayloadLength No restrictions apply.
usacExtElementStart No restrictions apply.
usacExtElementStop No restrictions apply.
usacExtElementSegmentData No restrictions apply.
4.3.3.2.6 UsacCoreCoderData
core_mode No restrictions apply.
tns_data_present No restrictions apply.
4.3.4 Frequency domain coding (FD mode)
4.3.4.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) use of noise filling;
b) window_shape;
c) M/S Stereo;
d) use of TNS;
e) complex prediction stereo coding;
f) max_sfb;
g) use of time warped MDCT;
h) use of long blocks;
i) use of short blocks.
4.3.4.2 Test procedure
4.3.4.2.1 fd_channel_stream
global_gain No restrictions apply.
noise_level No restrictions apply.
noise_offset No restrictions apply.
fac_data_present Shall be 0, if the core_mode of the preceding frame of the same
channel was 0 or if mod[3] of the preceding frame of the same
channel was > 0.
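The fac_data_present rule depends on the state of the preceding frame of the same channel, so a checker has to carry that state along. The following sketch shows one way to express the condition; the function names are hypothetical, and `prev_mod3` stands for mod[3] of the preceding frame, passed as `None` when that frame carried no mod[] values (an assumption about how a caller would represent this).

```python
def fac_data_may_be_present(prev_core_mode, prev_mod3):
    """Return True if fac_data_present is allowed to be 1 in the current FD frame."""
    if prev_core_mode == 0:
        # The preceding frame of this channel was coded in FD mode.
        return False
    if prev_mod3 is not None and prev_mod3 > 0:
        # mod[3] of the preceding frame was greater than 0.
        return False
    return True


def check_fac_data_present(fac_data_present, prev_core_mode, prev_mod3):
    """Return True if the decoded fac_data_present value satisfies the rule."""
    return fac_data_present == 0 or fac_data_may_be_present(prev_core_mode, prev_mod3)
```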
4.3.4.2.2 ics_info
window_sequence A conformant bitstream shall consist of only meaningful
window_sequence transitions. However, decoders are required to handle
non-meaningful window_sequence transitions as well. The meaningful
window_sequence transitions are shown in ISO/IEC 23003-3:2020,
Table 138.
window_shape A compliant bitstream shall set window_shape to 0 if the next block
is encoded in LPD coding mode. However, decoders are required to
handle both window_shapes for all transitions.
max_sfb Shall be <= num_swb_long or num_swb_short as appropriate for
window_sequence and sampling frequency and core coder frame
length.
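A minimal sketch of the max_sfb test follows. The function name and the boolean flag that selects between the long and short limits are assumptions; num_swb_long and num_swb_short are expected to have been looked up by the caller for the applicable sampling frequency and core coder frame length.

```python
def check_max_sfb(max_sfb, uses_short_windows, num_swb_long, num_swb_short):
    """Return True if max_sfb does not exceed the limit for the window type in use."""
    limit = num_swb_short if uses_short_windows else num_swb_long
    return max_sfb <= limit
```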
scale_factor_grouping No restrictions apply.
4.3.4.2.3 tw_data
tw_data_present No restrictions apply.
tw_ratio No restrictions apply.
4.3.4.2.4 scale_factor_data
hcod_sf Shall only be encoded with the values listed in the scalefactor Huffman
table. Shall be encoded such that the decoded scalefactors sf[g][sfb]
are within the range of zero to 255, both inclusive.
4.3.4.2.5 tns_data
n_filt No restrictions apply.
coef_res No restrictions apply.
length Shall be small enough such that the lower bound of the filtered
region does not exceed the start of the array containing the spectral
coefficients.
order Shall not exceed the values listed in ISO/IE
...