ISO/IEC 23003-7:2022
Information technology — MPEG audio technologies — Part 7: Unified speech and audio coding conformance testing
This document specifies conformance criteria for both bitstreams and decoders compliant with the MPEG-D Unified speech and audio coding standard as defined in ISO/IEC 23003‑3. This is done to assist implementers and to ensure interoperability.
INTERNATIONAL STANDARD ISO/IEC 23003-7
First edition 2022-04
Information technology — MPEG audio technologies — Part 7: Unified speech and audio coding conformance testing
Reference number
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2022 – All rights reserved
Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Conformance testing
4.1 General
4.2 USAC conformance testing
4.2.1 Profiles
4.2.2 Conformance tools and test procedure
4.3 USAC bitstreams
4.3.1 General
4.3.2 USAC configuration
4.3.3 Framework
4.3.4 Frequency domain coding (FD mode)
4.3.5 Linear predictive domain coding (LPD mode)
4.3.6 Common core coding tools
4.3.7 Enhanced spectral band replication (eSBR)
4.3.8 eSBR – Predictive vector coding (PVC)
4.3.9 eSBR – Inter temporal envelope shaping (inter-TES)
4.3.10 MPEG Surround 2-1-2
4.3.11 Configuration Extensions
4.3.12 AudioPreRoll
4.3.13 DRC
4.3.14 Restrictions depending on profiles and levels
4.4 USAC Decoders
4.4.1 General
4.4.2 FD core mode tests
4.4.3 LPD core mode tests
4.4.4 Combined core coding tests
4.4.5 eSBR Tests
4.4.6 MPEG Surround 212 Tests
4.4.7 Bitstream Extensions
4.5 Decoder settings
4.5.1 General
4.5.2 Target loudness [Lou-]
4.5.3 DRC effect type request [Eff-]
4.6 Decoding of MPEG-4 file format parameters to support exact time alignment in file-to-file coding
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance
are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria
needed for the different types of document should be noted. This document was drafted in
accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 23003 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.
INTERNATIONAL STANDARD ISO/IEC 23003-7:2022(E)
Information technology — MPEG audio technologies —
Part 7:
Unified speech and audio coding conformance testing
1 Scope
This document specifies conformance criteria for both bitstreams and decoders compliant with the
MPEG-D Unified speech and audio coding standard as defined in ISO/IEC 23003-3. This is done to assist
implementers and to ensure interoperability.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 23003-3, Information technology — MPEG audio technologies — Part 3: Unified speech and audio
coding
3 Terms and definitions
No terms and definitions are listed in this document.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
4 Conformance testing
4.1 General
This clause specifies conformance criteria for both bitstreams and decoders compliant with the USAC
standard as defined in ISO/IEC 23003-3. This is done to assist implementers and to ensure interoperability.
4.2 USAC conformance testing
4.2.1 Profiles
Profiles are defined in ISO/IEC 23003-3:2020, Subclause 4.5. Some conformance criteria apply to USAC
in general, while others are specific to certain profiles and their respective levels. Conformance shall be
tested for the level of the profile with which a given bitstream or decoder claims to comply.
In addition to the conformance requirements described in this clause, a decoder, which claims to
comply with the Extended HE AAC Profile, shall fulfil conformance for the HE AAC v2 profile according
to ISO/IEC 14496-26.
4.2.2 Conformance tools and test procedure
4.2.2.1 General
To test USAC compliant audio decoders, this document provides a number of conformance test
sequences. Supplied sequences cover all profiles as defined in ISO/IEC 23003-3:2020, Subclause 4.5. For
a given test sequence, testing can be performed by comparing the output of a decoder under test with
a reference waveform. For some test sequences, the decoder requires additional input parameters, so-
called decoder settings, which are defined in 4.5. In cases where the decoder under test is followed by
additional operations (e.g. quantizing a signal to a 16 bit output signal) the conformance point is prior
to such additional operations, i.e. it is permitted to use the actual decoder output (e.g. with more than
16 bit) for conformance testing.
Measurements are carried out relative to full scale where the output signals of the decoders are
normalized to be in the range between −1,0 and +1,0.
In ISO/IEC 14496-26, a set of test methods is defined to test the output of the decoder under test against
the reference output. RMS/LSB Measurement, Segmental SNR and PNS conformance criteria are used
for the comparison. A particular test method for a certain test sequence is specified in 4.5.
For elements producing output that cannot be tested with the methods described in ISO/IEC 14496-26
specific conformance testing procedures are described in 4.5.
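The following Python sketch is an illustration only. The normative test methods are those of ISO/IEC 14496-26, which this sketch does not reproduce. It merely shows the general idea of normalizing a decoder output to full scale and comparing it with a reference waveform by means of an RMS difference. The function names and the 16-bit scaling factor are assumptions made for this example.

```python
import math


def normalize_pcm16(samples):
    """Scale 16-bit integer PCM samples to the range [-1.0, +1.0)."""
    return [s / 32768.0 for s in samples]


def rms_difference(reference, decoded):
    """Return the RMS of the sample-wise difference, relative to full scale."""
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")
    if not reference:
        raise ValueError("signals must not be empty")
    squared = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return math.sqrt(squared / len(reference))
```

Both inputs are expected to be normalized already, so that the result can be read relative to full scale. Any pass or fail threshold has to be taken from the applicable test method and is not part of this sketch.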
4.2.2.2 Conformance data
All test sequences and a worksheet (“ISO_IEC_23003-7_Conformance_Tables.xlsx”) that lists all test
sequences for each module are accessible at https://standards.iso.org/iso-iec/23003/-7/ed-1/en.
NOTE All conformance test sequences for this document are accessible using this link.
For all conformance test sequences, the file names are composed of several parts, which convey
information about:
— which module of the decoder is tested;
— which channelConfigurationIndex is employed;
— which test conditions apply to the test sequence;
— which coreSbrFrameLengthIndex applies to the test sequence;
— which sampling frequency is signalled in the test sequence.
The file naming convention given in Table 1 shall be used.
Table 1 — File name conventions
Module                                             File            File name
Frequency domain coding (FD mode), 4.3.4           compressed mp4  Fd__c__.mp4
                                                   reference wav   Fd__c__.wav
Linear predictive domain coding (LPD mode), 4.3.5  compressed mp4  Lpd__c__.mp4
                                                   reference wav   Lpd__c__.wav
Combined core coding tools, 4.3.6                  compressed mp4  Cct__c__.mp4
                                                   reference wav   Cct__c__.wav
Enhanced spectral band replication (eSBR), 4.3.7   compressed mp4  eSbr__c__.mp4
                                                   reference wav   eSbr__c__.wav
MPEG Surround 2-1-2, 4.3.10                        compressed mp4  Mps__c_fr_Sc__.mp4
                                                   reference wav   Mps__c_fr_Sc__.wav
Bitstream extensions                               compressed mp4  Ext__c__.mp4
                                                   reference wav   Ext__c____.wav
channelConfigurationIndex as described in ISO/IEC 23003-3:2020, Table 73.
Setup string. May consist of a concatenation of one or more abbreviations as
listed in Table 2. If no setup string is specified, the basic test conditions apply.
If no testCase is added, only one single underline character shall occur at that
position.
coreSbrFrameLengthIndex as described in .
usacSamplingFrequencyIndex as described in ISO/IEC 23003-3:2020,
Table 75. If the sampling rate is specified explicitly and signalled by means of
the escape value index, the sampling rate value in Hz is placed in the file name
instead of the index value, e.g. “Lpd_1_c1_Bpf_6000.mp4” for a sampling
frequency of 6000 Hz.
bsFreqRes as described in ISO/IEC 23003-1:2007, Table 39.
stereoConfigIndex as described in ISO/IEC 23003-3:2020, Table 77.
Setup string. May consist of a concatenation of one or more abbreviations as
listed in Table 3. If no decoderSetting is added, no underline character shall
occur after .
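As a minimal sketch, a conformance file name can be split into its module prefix, its underscore-separated fields and its extension as shown below. The set of prefixes is taken from Table 1. The function name is hypothetical, and the sketch deliberately does not assign a meaning to the individual fields, since that mapping has to be taken from Table 1 and its notes.

```python
# Module prefixes as they appear in the file names of Table 1.
KNOWN_MODULE_PREFIXES = {"Fd", "Lpd", "Cct", "eSbr", "Mps", "Ext"}


def split_conformance_name(file_name):
    """Split a file name into (module prefix, list of fields, extension)."""
    stem, dot, extension = file_name.rpartition(".")
    if not dot or extension not in ("mp4", "wav"):
        raise ValueError("expected a .mp4 or .wav file name")
    module, *fields = stem.split("_")
    if module not in KNOWN_MODULE_PREFIXES:
        raise ValueError("unknown module prefix: " + module)
    # Empty strings are kept, so an omitted setup string stays visible
    # as an empty field between two underline characters.
    return module, fields, extension
```

For the example name given above, split_conformance_name("Lpd_1_c1_Bpf_6000.mp4") returns the prefix "Lpd", the fields "1", "c1", "Bpf" and "6000", and the extension "mp4".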
Table 2 — Test conditions and abbreviations
Module                Test condition                                                             Abbrev.
FD core mode          FD window switching test condition                                         Win
                      Noise filling test condition                                               Nf
                      Temporal Noise Shaping (TNS) test condition                                Tns
                      Varying max_sfb test condition                                             Sfb
                      Handling of extensions condition                                           Ex
                      Context adaptive arithmetic coder test condition                           Ac
                      Non-meaningful FD window switching test condition                          Nmf
                      M/S stereo test condition                                                  Ms
                      Complex prediction stereo test condition                                   Cp
LPD core mode         Linear predictive coding (LPC) test condition                              Lpc
                      Algebraic code excited linear prediction (ACELP) core mode test condition  Ace
                      Transform coded excitation (TCX) and noise filling test condition          Tcx
                      LPD mode coverage and FAC test condition                                   Lpd
                      Bass-post filter test condition                                            Bpf
                      Algebraic vector quantizer (AVQ) test condition                            Avq
Combined core coding  FD-LPD transition and FAC test condition                                   Flt
                      FD/TCX noise filling test condition                                        Cnf
                      Bass-post filter test condition                                            Cbf
                      synchr. FD-LPD transition and FAC test condition                           Flts
                      asynchr. FD-LPD transition and FAC test condition                          Flta
                      Context adaptive arithmetic coder test condition                           CAc
eSbr                  Quadrature mirror filter (QMF) accuracy test condition                     Qma
                      Envelope adjuster accuracy and SBR preprocessing test condition            Eaa
                      Header and grid control test condition                                     Hgt
                      Inverse filtering test condition                                           Ift
                      Additional sine test (missing harmonics) test condition                    Ast
                      Sampling rate test condition                                               Sr
                      Channel mode test condition                                                Cm
                      interTes test condition                                                    Tes
                      Predictive vector coding (PVC) test condition                              Pvc
                      Harmonic transposition (QMF) test condition                                Htq
                      Harmonic transposition (crossproducts) test condition                      Xp
                      Transposer toggle test condition                                           Ttt
                      Envelope shaping toggle (PVC on/off) test condition                        Est
                      Varying crossover frequency test condition                                 Xo
Mpeg surround 212     stereoConfigIndex test condition                                           Mps
                      Transient steering decorrelator (TSD) test condition                       Tsd
                      Rate mode test condition                                                   Rm
                      Phase coding test condition                                                Pc
                      Decorrelator configuration test condition                                  Dc
                      Downmix (DMX) gain test condition                                          Dm
                      Bands phase test condition                                                 Bp
                      Pseudo lr test condition                                                   Plr
                      Residual bands test condition                                              Rb
                      Temporal Shaping Enabling test condition                                   Tse
                      Smoothing mode test condition                                              Smg
Bitstream extensions  AudioPreRoll() and streamID condition, immediate play-out frame (IPF)      I-foo-
                      Loudness normalization test condition                                      Ln
                      Dynamic range control test condition                                       Drc
Table 3 — Decoder setting conditions
Decoder setting Abbrev.
Target loudness Lou-
DRC effect type request Eff-
4.3 USAC bitstreams
4.3.1 General
4.3.1.1 Characteristics
Characteristics of bitstreams specify the constraints that are applied by the encoder in generating the
bitstreams. These syntactic and semantic constraints may for example restrict the range or the values
of parameters that are encoded directly or indirectly in the bitstreams. The constraints applied to a
given bitstream may or may not be known a priori.
4.3.1.2 Test procedure
Each USAC bitstream shall meet the syntactic and semantic requirements specified in this document.
The present subclause defines the conformance criteria that shall be fulfilled by a compliant bitstream.
These criteria are specified for the syntactic elements of the bitstream and for some parameters
decoded from the USAC bitstream payload.
For each tool, a set of semantic tests to be performed on the bitstreams is described. Verifying whether
the syntax is correct is straightforward and is therefore not defined hereinafter. In the description of
the semantic tests, it is assumed that the tested bitstream contains no errors due to transmission or
other causes. For each test, the condition or conditions that shall be satisfied are given, as well as the
prerequisites or conditions in which the test can be applied.
4.3.2 USAC configuration
4.3.2.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) usacSamplingFrequencyIndex;
b) usacSamplingFrequency;
c) coreSbrFrameLengthIndex;
d) channelConfigurationIndex;
e) presence of configuration extensions;
f) numOutChannels;
g) bsOutputChannelPos;
h) numElements;
i) stereoConfigIndex;
j) use of time warped MDCT;
k) use of noise filling in FD mode;
l) use of the eSBR harmonic transposer;
m) use of the eSBR inter-TES tool;
n) use of the eSBR PVC tool;
o) SBR default header, for details see 4.3.7;
p) MPS config, for details see 4.3.10.
4.3.2.2 Test procedure
4.3.2.2.1 UsacConfig()
usacSamplingFrequencyIndex Shall be encoded with a non-reserved value specified in
ISO/IEC 23003-3:2020, Table 72. For further profile and level dependent
restrictions, see 4.3.14.
usacSamplingFrequency No restrictions apply. For profile and level dependent restrictions,
see 4.3.14.
coreSbrFrameLengthIndex No restrictions apply.
channelConfigurationIndex Shall be encoded with a non-reserved value specified in
ISO/IEC 23003-3:2020, Table 73. For further profile and level dependent
restrictions, see 4.3.14. In the case of channelConfigurationIndex == 0,
further restrictions apply as described in 4.3.2.2.2.
usacConfigExtensionPresent No restrictions apply.
4.3.2.2.2 UsacChannelConfig()
numOutChannels No restrictions apply. For profile and level dependent restrictions, see
4.3.14.
bsOutputChannelPos A bsOutputChannelPos of value 3 or 26 (LFE speaker positions) shall
be associated with an LFE channel. Any other value shall be associated
with a main audio channel.
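The association rule for bsOutputChannelPos can be sketched as a simple consistency check. The helper names and the representation of a channel as a pair (position, LFE flag) are assumptions made for this illustration and are not taken from ISO/IEC 23003-3.

```python
# bsOutputChannelPos values that denote LFE speaker positions.
LFE_POSITIONS = frozenset({3, 26})


def is_lfe_position(bs_output_channel_pos):
    """Return True if the position value denotes an LFE speaker position."""
    return bs_output_channel_pos in LFE_POSITIONS


def find_channel_type_violations(channels):
    """Return the indices of channels whose type contradicts their position.

    channels is a sequence of (bsOutputChannelPos, is_lfe_channel) pairs.
    """
    return [
        index
        for index, (position, is_lfe_channel) in enumerate(channels)
        if is_lfe_position(position) != is_lfe_channel
    ]
```

An empty result means that every LFE position is associated with an LFE channel and every other position with a main audio channel.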
4.3.2.2.3 UsacDecoderConfig()
numElements The value of this data element shall be such that the accumulated sum
of all channels contained in the bitstream complies with the
restrictions outlined in 4.3.2.2.1.
usacElementType No restrictions apply. For profile and level dependent restrictions, see
4.3.14.
4.3.2.2.4 UsacSingleChannelElementConfig()
No restrictions are applicable to this bitstream element.
4.3.2.2.5 UsacChannelPairElementConfig()
The UsacChannelPairElementConfig() element and all included elements can only be present when
coding more than one output channel (see restrictions applying to UsacConfig() in 4.3.2.2.1).
stereoConfigIndex No restrictions apply.
4.3.2.2.6 UsacLfeElementConfig()
No restrictions are applicable to this bitstream element.
4.3.2.2.7 UsacCoreConfig()
tw_mdct No restrictions apply. For profile and level dependent restrictions,
see 4.3.14.
noiseFilling No restrictions apply.
4.3.2.2.8 SbrConfig()
harmonicSBR No restrictions apply.
bs_interTes No restrictions apply.
bs_pvc No restrictions apply.
4.3.2.2.9 SbrDfltHeader()
dflt_start_freq No restrictions apply.
dflt_stop_freq No restrictions apply.
dflt_header_extra1 No restrictions apply.
dflt_header_extra2 No restrictions apply.
dflt_freq_scale No restrictions apply.
dflt_alter_scale No restrictions apply.
dflt_noise_bands No restrictions apply.
dflt_limiter_bands No restrictions apply.
dflt_limiter_gains No restrictions apply.
dflt_interpol_freq No restrictions apply.
dflt_smoothing_mode No restrictions apply.
4.3.2.2.10 Mps212Config()
bsFreqRes Shall not be encoded with a value of 0.
bsFixedGainDMX No restrictions apply.
bsTempShapeConfig No restrictions apply.
bsDecorrConfig Shall not be encoded with a value of 3.
bsHighRateMode No restrictions apply.
bsPhaseCoding No restrictions apply.
bsOttBandsPhasePresent No restrictions apply.
bsOttBandsPhase Shall not be encoded with a value larger than the value of numBands,
which depends on bsFreqRes as given by ISO/IEC 23003-1:2007, 5.2,
Table 39.
bsResidualBands Shall not be encoded with a value larger than the value of numBands,
which depends on bsFreqRes as given by ISO/IEC 23003-1:2007, 5.2,
Table 39.
bsPseudoLr No restrictions apply.
bsEnvQuantMode Shall be 0.
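The restrictions listed for Mps212Config() can be collected into one check, sketched below. The dictionary representation of the parsed configuration and the function name are assumptions made for this example. The value of numBands is passed in by the caller, who is expected to derive it from bsFreqRes according to ISO/IEC 23003-1:2007, Table 39.

```python
def check_mps212_config(config, num_bands):
    """Return a list of violated restrictions for a parsed Mps212Config().

    config is a dict keyed by data element name. Optional elements that
    are not present in the bitstream may simply be left out.
    """
    errors = []
    if config["bsFreqRes"] == 0:
        errors.append("bsFreqRes shall not be 0")
    if config["bsDecorrConfig"] == 3:
        errors.append("bsDecorrConfig shall not be 3")
    if config.get("bsOttBandsPhase", 0) > num_bands:
        errors.append("bsOttBandsPhase exceeds numBands")
    if config.get("bsResidualBands", 0) > num_bands:
        errors.append("bsResidualBands exceeds numBands")
    if config["bsEnvQuantMode"] != 0:
        errors.append("bsEnvQuantMode shall be 0")
    return errors
```

An empty list indicates that none of the restrictions above is violated. The sketch does not verify the elements for which no restrictions apply.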
4.3.2.2.11 UsacExtElementConfig()
usacExtElementType No restrictions apply.
usacExtElementConfigLength No restrictions apply.
usacExtElementDefaultLengthPresent No restrictions apply.
usacExtElementDefaultLength No restrictions apply.
usacExtElementPayloadFrag No restrictions apply.
4.3.2.2.12 UsacConfigExtension()
numConfigExtensions No restrictions apply.
usacConfigExtType[] No restrictions apply.
usacConfigExtLength[] No restrictions apply.
fill_byte Should be ‘10100101’.
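Since the value of fill_byte is a recommendation rather than a requirement, a checker would typically report deviations as warnings. A minimal sketch, assuming the fill bytes have already been extracted into a bytes object (the function name is hypothetical):

```python
# Recommended fill_byte value, binary 10100101 (hexadecimal A5).
EXPECTED_FILL_BYTE = 0b10100101


def deviating_fill_bytes(payload):
    """Return the offsets of all bytes that differ from the recommended value."""
    return [
        offset
        for offset, value in enumerate(payload)
        if value != EXPECTED_FILL_BYTE
    ]
```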
4.3.3 Framework
4.3.3.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) signalling of independently decodable frames;
b) presence of extension elements;
c) core_mode;
d) presence of TNS.
4.3.3.2 Test procedure
4.3.3.2.1 UsacFrame()
usacIndependencyFlag No restrictions apply.
4.3.3.2.2 UsacSingleChannelElement
No restrictions are applicable to this bitstream element.
4.3.3.2.3 UsacChannelPairElement
No restrictions are applicable to this bitstream element.
4.3.3.2.4 UsacLfeElement
No restrictions are applicable to this bitstream element.
4.3.3.2.5 UsacExtElement
usacExtElementPresent No restrictions apply.
usacExtElementUseDefaultLength No restrictions apply.
usacExtElementPayloadLength No restrictions apply.
usacExtElementStart No restrictions apply.
usacExtElementStop No restrictions apply.
usacExtElementSegmentData No restrictions apply.
4.3.3.2.6 UsacCoreCoderData
core_mode No restrictions apply.
tns_data_present No restrictions apply.
4.3.4 Frequency domain coding (FD mode)
4.3.4.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) use of noise filling;
b) window_shape;
c) M/S Stereo;
d) use of TNS;
e) complex prediction stereo coding;
f) max_sfb;
g) use of time warped MDCT;
h) use of long blocks;
i) use of short blocks.
4.3.4.2 Test procedure
4.3.4.2.1 fd_channel_stream
global_gain No restrictions apply.
noise_level No restrictions apply.
noise_offset No restrictions apply.
fac_data_present Shall be 0, if the core_mode of the preceding frame of the same
channel was 0 or if mod[3] of the preceding frame of the same
channel was > 0.
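The condition on fac_data_present in an FD frame can be sketched as follows. The function name and the convention of passing None for mod[] when the preceding frame was not coded in LPD mode are assumptions made for this illustration.

```python
def fd_fac_data_present_allowed(fac_data_present, prev_core_mode, prev_mod=None):
    """Check fac_data_present of an FD frame against the preceding frame.

    prev_core_mode is the core_mode of the preceding frame of the same
    channel. prev_mod is the mod[] array of that frame, or None if the
    preceding frame was not coded in LPD mode.
    """
    must_be_zero = prev_core_mode == 0 or (
        prev_mod is not None and prev_mod[3] > 0
    )
    return fac_data_present == 0 or not must_be_zero
```

The function returns False only when the bitstream signals FAC data in a situation where the text above requires fac_data_present to be 0.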
4.3.4.2.2 ics_info
window_sequence A conformant bitstream shall consist of only meaningful
window_sequence transitions. However, decoders are required to
handle non-meaningful window_sequence transitions as well.
The meaningful window_sequence transitions are shown in
ISO/IEC 23003-3:2020, Table 138.
window_shape A compliant bitstream shall set window_shape to 0 if the next
block is encoded in LPD coding mode. However, decoders are
required to handle both window_shapes for all transitions.
max_sfb Shall be <= num_swb_long or num_swb_short as appropriate for
window_sequence and sampling frequency and core coder frame
length.
scale_factor_grouping No restrictions apply.
4.3.4.2.3 tw_data
tw_data_present No restrictions apply.
tw_ratio No restrictions apply.
4.3.4.2.4 scale_factor_data
hcod_sf Shall only be encoded with the values listed in the scalefactor
Huffman table. Shall be encoded such that the decoded scalefactors
sf[g][sfb] are within the range of zero to 255, both inclusive.
4.3.4.2.5 tns_data
n_filt No restrictions apply.
coef_res No restrictions apply.
length Shall be small enough such that the lower bound of the filtered
region does not exceed the start of the array containing the spectral
coefficients.
order Shall not exceed the values listed in ISO/IEC 23003-3:2020,
Table 135.
direction No restrictions apply.
coef_compress No restrictions apply.
coef No restrictions apply.
4.3.4.2.6 ac_spectral_data
arith_reset_flag No restrictions apply.
4.3.4.2.7 StereoCoreToolInfo
tns_active No restrictions apply.
common_window No restrictions apply.
common_max_sfb No restrictions apply.
max_sfb1 Shall be <= num_swb_long or num_swb_short as appropriate for
window_sequence and sampling frequency and core coder frame length.
ms_mask_present No restrictions apply.
ms_used No restrictions apply.
common_tw No restrictions apply.
common_tns No restrictions apply.
tns_on_lr No restrictions apply.
tns_present_both No restrictions apply.
tns_data_present No restrictions apply.
4.3.4.2.8 cplx_pred_data
cplx_pred_all No restrictions apply.
cplx_pred_used No restrictions apply.
pred_dir No restrictions apply.
complex_coef No restrictions apply.
use_prev_frame Shall be 0 if the core transform length of previous frame is different
from the core transform length of the current frame or if the core_mode
of the previous frame is 1.
delta_code_time No restrictions apply.
hcod_sf No restrictions apply.
4.3.5 Linear predictive domain coding (LPD mode)
4.3.5.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) acelp_core_mode;
b) lpd_mode (use of ACELP, short TCX, medium TCX, and long TCX);
c) activation of bass-post filter.
4.3.5.2 Test procedure
4.3.5.2.1 lpd_channel_stream
acelp_core_mode Shall be encoded with a value in the range of 0 to 5, both inclusive.
lpd_mode Shall be encoded with a non-reserved value listed in
ISO/IEC 23003-3:2020, Table 94.
bpf_control_info No restrictions apply.
core_mode_last Shall be encoded with the value of data element core_mode of the
previous frame.
fac_data_present Shall be 0, if the core_mode of the preceding frame of the same
channel was 0 and mod[0] of the current frame is > 0, or if mod[0] of the
current frame is > 0 and mod[3] of the preceding frame of the same
channel was > 0.
short_fac_flag Shall be encoded with a value of 1 if the window_sequence of the
previous frame was 2 (EIGHT_SHORT_SEQUENCE). Otherwise,
short_fac_flag shall be encoded with a value of 0.
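Three of the restrictions above depend only on simple values and on the preceding frame, so they can be checked together as sketched below. The dictionary layout, the function name and the use of None for a missing window_sequence are assumptions made for this example. The sketch also assumes that short_fac_flag is present in the frame under test.

```python
EIGHT_SHORT_SEQUENCE = 2  # window_sequence value named in the text above


def check_lpd_channel_stream(frame, prev_core_mode, prev_window_sequence=None):
    """Return a list of violated restrictions for an lpd_channel_stream().

    frame is a dict with the keys acelp_core_mode, core_mode_last and
    short_fac_flag. prev_window_sequence is None if the previous frame
    carried no window_sequence.
    """
    errors = []
    if not 0 <= frame["acelp_core_mode"] <= 5:
        errors.append("acelp_core_mode out of range 0..5")
    if frame["core_mode_last"] != prev_core_mode:
        errors.append("core_mode_last differs from previous core_mode")
    expected = 1 if prev_window_sequence == EIGHT_SHORT_SEQUENCE else 0
    if frame["short_fac_flag"] != expected:
        errors.append("short_fac_flag shall be %d" % expected)
    return errors
```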
4.3.5.2.2 lpc_data
lpc_first_approximation_index No restrictions apply.
4.3.5.2.3 qn_data
qn The codebook number shall be encoded as described in
ISO/IEC 23003-3:2020, 7.13.7.2.
qn_base No restrictions apply.
qn_ext No restrictions apply.
4.3.5.2.4 get_mode_lpc
binary_code Shall be encoded with the values listed in ISO/IEC 23003-3:2020,
Table 148 in the column Binary Code.
4.3.5.2.5 code_book_indices
code_book_index No restrictions apply.
kv No restrictions apply.
4.3.5.2.6 acelp_coding
mean_energy No restrictions apply.
acb_index The adaptive codebook index shall be encoded as described in
ISO/IEC 23003-3:2020, 7.14.5.1.
ltp_filtering_flag No restrictions apply.
icb_index The innovation codebook excitation shall be encoded as described
in ISO/IEC 23003-3:2020, 7.14.5.2.
gains No restrictions apply.
4.3.5.2.7 tcx_coding
noise_factor No restrictions apply.
global_gain No restrictions apply.
arith_reset_flag No restrictions apply.
4.3.6 Common core coding tools
4.3.6.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) use of context adaptive arithmetic coder reset.
4.3.6.2 Test procedure
4.3.6.2.1 arith_data
acod_m Shall be encoded as described in ISO/IEC 23003-3:2020, 7.4.3.
acod_r Shall be encoded as described in ISO/IEC 23003-3:2020, 7.4.3.
s No restrictions apply.
4.3.6.2.2 fac_data
fac_gain No restrictions apply.
4.3.7 Enhanced spectral band replication (eSBR)
4.3.7.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) use of the eSBR harmonic transposer;
b) use of Crossproducts in eSBR harmonic transposer;
c) use of the eSBR inter-TES tool;
d) choice of SBR ratio;
e) choice of amplitude resolution;
f) choice of SBR crossover band;
g) use of SBR preprocessing (prewhitening);
h) use of the eSBR PVC tool.
4.3.7.2 Test procedure
4.3.7.2.1 General
The present subclause defines the conformance criteria that shall be fulfilled by a compliant bitstream
that utilizes the Enhanced SBR tool.
4.3.7.2.2 UsacSbrData
sbrInfoPresent No restrictions apply.
sbrHeaderPresent No restrictions apply.
sbrUseDfltHeader No restrictions apply.
4.3.7.2.3 SbrInfo
bs_amp_res No restrictions apply.
bs_xover_band Shall define a value that does not exceed the limits defined in
ISO/IEC 14496-3:2009, 4.6.18.3.6.
bs_sbr_preprocessing No restrictions apply.
bs_pvc_mode Shall be encoded with a non-reserved value specified in
ISO/IEC 23003-3:2020, Table 101.
4.3.7.2.4 SbrHeader
bs_start_freq Shall define a frequency band that does not exceed the limits
defined in ISO/IEC 23003-3:2020, 7.5.5 and ISO/IEC 14496-3:2009,
4.6.18.3.6.
bs_stop_freq Shall define a frequency band that does not exceed the limits
defined in ISO/IEC 23003-3:2020, 7.5.5 and ISO/IEC 14496-3:2009,
4.6.18.3.6.
bs_header_extra1 No restrictions apply.
bs_header_extra2 No restrictions apply.
bs_freq_scale No restrictions apply.
bs_alter_scale No restrictions apply.
bs_noise_bands Shall define a value that does not exceed the limits defined in
ISO/IEC 14496-3:2009, 4.6.18.3.6.
bs_limiter_bands No restrictions apply.
bs_limiter_gains No restrictions apply.
bs_interpol_freq No restrictions apply.
bs_smoothing_mode No restrictions apply.
4.3.7.2.5 sbr_single_channel_element
sbrPatchingMode No restrictions apply.
sbrOversamplingFlag No restrictions apply.
sbrPitchInBinsFlag No restrictions apply.
sbrPitchInBins No restrictions apply.
bs_add_harmonic_flag No restrictions apply.
4.3.7.2.6 sbr_channel_pair_element
bs_coupling No restrictions apply.
sbrPatchingMode No restrictions apply.
sbrOversamplingFlag No restrictions apply.
sbrPitchInBinsFlag No restrictions apply.
sbrPitchInBins No restrictions apply.
bs_add_harmonic_flag No restrictions apply.
4.3.7.2.7 sbr_grid
bs_frame_class Shall define a value that does not exceed the limits defined in
ISO/IEC 23003-3:2020, 7.5.1.3 and ISO/IEC 14496-3:2009, 4.6.18.3.6.
tmp (Determines bs_num_env), no restrictions apply.
bs_freq_res No restrictions apply.
bs_pointer Shall be encoded with a value listed in ISO/IEC 14496-3:2009,
Table 4.174.
The restrictions defined in ISO/IEC 14496-26:2010, 7.17.1.2.1.3 sbr_grid() shall be applied to the
following corresponding bitstream elements:
bs_var_bord_0
bs_var_bord_1
bs_num_rel_0
bs_num_rel_1
bs_noise_position Shall be chosen so that the time slot borders for noise floors fall
within the leading and trailing SBR frame borders (i.e. the SBR
frame boundaries).
bs_var_len_hf Shall be encoded with a non-reserved value specified in
ISO/IEC 23003-3:2020, Table 102.
4.3.7.2.8 sbr_envelope
bs_env_start_value_balance No restrictions apply.
bs_env_start_value_level No restrictions apply.
bs_codeword Shall be encoded as defined in sbr_huff_dec() in
ISO/IEC 14496-3:2009, 4.A.6.1.
Additionally, the restrictions defined in ISO/IEC 14496-26:2010, 7.17.1.2.1.5 sbr_envelope() apply.
4.3.7.2.9 dtdf
bs_df_env No restrictions apply.
bs_df_noise No restrictions apply.
4.3.7.2.10 sbr_sinusoidal_coding
bs_add_harmonic No restrictions apply.
bs_sinusoidal_position_flag No restrictions apply.
bs_sinusoidal_position Shall be chosen so that the position of the starting time slot for
sinusoids falls within the SBR frame boundaries.
4.3.7.2.11 sbr_invf
No restrictions are applicable to this bitstream element.
4.3.7.2.12 sbr_noise
The restrictions defined in ISO/IEC 14496-26:2010, 7.17.1.2.1.6 sbr_noise() apply.
4.3.8 eSBR – Predictive vector coding (PVC)
4.3.8.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) activation of PVC;
b) use of IDs from the previous frame;
c) length.
4.3.8.2 Test procedures for pvc_envelope
divMode No restrictions apply.
nsMode No restrictions apply.
reuse_pvcID Shall be 0 if the bs_pvc_mode of the preceding SBR frame was 0.
pvcID No restrictions apply.
length Shall be chosen so that the time slot borders for pvcID fall within
the SBR frame boundaries.
grid_info The first grid_info (grid_info[0]) shall be 1 if the bs_pvc_mode of
the preceding SBR frame was 0.
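The two restrictions that depend on the bs_pvc_mode of the preceding SBR frame can be sketched as one check. The function name and the convention of passing None for an element that is not present in the frame under test are assumptions made for this illustration.

```python
def check_pvc_envelope(prev_bs_pvc_mode, reuse_pvc_id=None, grid_info=None):
    """Return a list of violated restrictions for a pvc_envelope().

    reuse_pvc_id is the value of the reuse flag and grid_info is the list
    of grid_info values. Either may be None if it is not present.
    """
    errors = []
    if prev_bs_pvc_mode != 0:
        return errors  # both restrictions only apply after a frame without PVC
    if reuse_pvc_id is not None and reuse_pvc_id != 0:
        errors.append("reuse flag shall be 0 after a frame with bs_pvc_mode 0")
    if grid_info and grid_info[0] != 1:
        errors.append("grid_info[0] shall be 1 after a frame with bs_pvc_mode 0")
    return errors
```

The restriction on length is not covered here, because it requires knowledge of the SBR frame boundaries.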
4.3.9 eSBR – Inter temporal envelope shaping (inter-TES)
4.3.9.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) activation of inter-TES.
4.3.9.2 Test procedure for sbr_envelope
bs_temp_shape No restrictions apply.
bs_inter_temp_shape_mode No restrictions apply.
4.3.10 MPEG Surround 2-1-2
4.3.10.1 Characteristics
Encoders may apply restrictions to the following parameters of the bitstream:
a) use of phase coding;
b) use of residual coding;
c) use of pseudo LR;
d) use of Transient Steering Decorrelator.
4.3.10.2 Test procedure
4.3.10.2.1 Mps212Data
bsIndependencyFlag No restrictions apply.
4.3.10.2.2 FramingInfo
bsFramingType No restrictions apply.
bsNumParamSets Shall have a value not larger than (numSlots-1)/4, where the division
shall be interpreted as an ANSI C integer division.
bsParamSlot Shall be in the range [0…numSlots-1].
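The two FramingInfo limits can be sketched as follows, reading the text above literally. The function name is hypothetical. Python's floor division gives the same result as an ANSI C integer division here, because numSlots-1 is not negative.

```python
def check_framing_info(bs_num_param_sets, bs_param_slots, num_slots):
    """Return a list of violated restrictions for FramingInfo()."""
    if num_slots < 1:
        raise ValueError("num_slots must be at least 1")
    errors = []
    # For non-negative operands, // matches ANSI C integer division.
    max_param_sets = (num_slots - 1) // 4
    if bs_num_param_sets > max_param_sets:
        errors.append("bsNumParamSets exceeds %d" % max_param_sets)
    for slot in bs_param_slots:
        if not 0 <= slot <= num_slots - 1:
            errors.append("bsParamSlot %d outside [0, %d]" % (slot, num_slots - 1))
    return errors
```

For numSlots equal to 32, for example, the upper limit computed by this sketch is 7.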
4.3.10.2.3 OttData
bsPhaseMode No restrictions apply.
bsOPDSmoothingMode No restrictions apply.
4.3.10.2.4 SmgData
bsSmoothMode No restrictions apply.
bsSmoothTime No restrictions apply.
bsFreqResStrideSmg No restrictions apply.
bsSmgData No restrictions apply.
4.3.10.2.5 TempShapeData
bsTsdEnable No restrictions apply.
bsTempShapeEnable No restrictions apply.
bsTempShapeEnableChannel No restrictions apply.
4.3.10.2.6 TsdData
bsTsdNumTrSlots Shall be encoded with 4 or 5 bits depending on numSlots.
bsTsdCodedPos No restrictions apply.
bsTsdTrPhaseData No restrictions apply.
4.3.10.2.7 EcData
bsXXXdataMode Shall fulfil the requirements outlined in ISO/IEC 23003-1:2007,
6.1.13. Shall not be encoded with a value of 2 if residual coding is
applied. Shall have the value 0 or 3 if ps == 0 and bsIndependencyFlag
is set to 1.
bsDataPairXXX Shall have the value 0 if setidx == data sets-1. No further
restrictions apply.
bsQuantCoarseXXX No restrictions apply.
bsFreqResStrideXXX No restrictions apply.
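The restrictions on bsXXXdataMode and bsDataPairXXX that are stated directly in this subclause can be sketched as follows. This is a hedged Python illustration only: the function and parameter names are hypothetical, the additional requirements of ISO/IEC 23003-1:2007, 6.1.13 are not checked, and "data sets" is assumed to mean the number of data sets passed in as num_data_sets.

```python
def check_ec_data_mode(data_mode, ps, independency_flag, residual_coding):
    """Check one bsXXXdataMode value against the restrictions stated above."""
    errors = []
    if residual_coding and data_mode == 2:
        errors.append("bsXXXdataMode shall not be 2 if residual coding is applied")
    if ps == 0 and independency_flag == 1 and data_mode not in (0, 3):
        errors.append(
            "bsXXXdataMode shall be 0 or 3 if ps == 0 and bsIndependencyFlag is 1"
        )
    return errors


def check_data_pair(data_pair, set_idx, num_data_sets):
    """Check one bsDataPairXXX value: it shall be 0 for the last data set."""
    if set_idx == num_data_sets - 1 and data_pair != 0:
        return ["bsDataPairXXX shall be 0 for the last data set"]
    return []
```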
4.3.10.2.8 EcDataPair
bsPcmCodingXXX No restrictions apply.
4.3.10.2.9 GroupedPcmData
bsPcmWord No restrictions apply.
4.3.10.2.10 DiffHuffData
bsDiffType No restrictions apply.
bsCodingScheme No restrictions apply.
bsPairing No restrictions apply.
bsDiffTimeDirection No restrictions apply.
4.3.10.2.11 HuffData1D
hcodFirstband_XXX bsCodeW shall have a value out of a set of values as defined by
column 'codeword' in ISO/IEC 23003-1:2007, Tables A.2 and A.3, for
CLD and ICC respectively. For IPD, in Table A.2. Shall have a length
as defined by the corresponding entry in column 'length'.
hcod1D_XXX_YY bsCodeW shall have a value out of a set of values as defined by column
'codeword' in ISO/IEC 23003-1:2007, Tables A.5 and A.6, for CLD and
ICC respectively. For IPD, in Table A.3. Shall have a length as defined
by the corresponding entry in column 'length'.
bsSign Does not apply to the encoding of IPD parameters. No further
restrictions apply.
4.3.10.2.12 HuffData2DFreqPair, HuffData2DTimePair
hcodLavIdx bsCodeW shall have a value out of a set of values as defined by
column 'codeword' in ISO/IEC 23003-1:2007, Table A.24, and
shall have a length as defined by the corresponding entry in
column 'length'.
hcod2D_XXX_YY_ZZ_LL_escape bsCodeW shall have a value out of a set of values as defined by
column 'codeword' in ISO/IEC 23003-1:2007, Tables A.8 and A.9, for
CLD and ICC respectively. For IPD, in Table A.4. Shall have a length
as defined by the corresponding entry in column 'length'.
hcod2D_XXX_YY_ZZ_LL bsCodeW shall have a value out of a set of values as defined by
column 'codeword' of the applicable table in ISO/IEC 23003-1:2007,
Tables A.11 to A.18, for CLD and ICC. For IPD, in Tables A.5 to A.8.
Shall have a length as defined by the corresponding entry in column
'length'.
4.3.10.2.13 SymmetryData
bsSymBit No restrictions apply.
4.3.10.2.14 LsbData
bsLsb No restrictions apply.
4.3.11 Configuration Extensions
4.3.11.1 streamId()
streamIdentifier No restrictions apply.
4.3.11.2 loudnessInfoSet()
The loudnessInfoSet() bitstream structure shall be restricted as specified in ISO/IEC 23003-4.
4.3.12 AudioPreRoll
4.3.12.1 Recursive presence of AudioPreRoll extension payload
An access unit, which is part of an AudioPreRoll, shall not have usacExtElementPresent equal to 1 for
the extension payload type ID_EXT_ELE_AUDIOPREROLL. That means there shall be no recursively
embedded AudioPreRoll extension payload.
4.3.12.2 AudioPreRoll()
configLen No restrictions apply.
applyCrossfade No restrictions apply.
reserved Should be 0.
numPreRollFrames Shall not be larger than 3.
auLen No restrictions apply.
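The AudioPreRoll restrictions of 4.3.12.1 and 4.3.12.2 can be combined into one check. The Python sketch below is illustrative and non-normative. The function name and parameters are hypothetical, and nested_pre_roll_flags is assumed to hold one boolean per embedded access unit that is True when that access unit has usacExtElementPresent equal to 1 for ID_EXT_ELE_AUDIOPREROLL. Because reserved is only recommended to be 0 ("should"), a non-zero value is reported as a warning rather than an error.

```python
def check_audio_pre_roll(reserved, num_pre_roll_frames, nested_pre_roll_flags):
    """Return (errors, warnings) for one AudioPreRoll() payload."""
    errors, warnings = [], []
    if reserved != 0:
        warnings.append("reserved should be 0")
    if num_pre_roll_frames > 3:
        errors.append("numPreRollFrames shall not be larger than 3")
    # 4.3.12.1: no recursively embedded AudioPreRoll extension payload.
    for i, nested in enumerate(nested_pre_roll_flags):
        if nested:
            errors.append(f"pre-roll access unit {i} carries an AudioPreRoll payload")
    return errors, warnings
```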
4.3.13 DRC
4.3.13.1 uniDrcConfig()
The uniDrcConfig bitstream structure shall be restricted as specified in ISO/IEC 23003-4.
4.3.13.2 uniDrcGain()
The uniDrcGain bitstream structure shall be restricted as specified in ISO/IEC 23003-4.
4.3.14 Restrictions depending on profiles and levels
4.3.14.1 General
Depending on the profile and level associated with the USAC bitstream, further restrictions may apply.
4.3.14.2 Baseline USAC profile
4.3.14.2.1 usacSamplingFrequencyIndex
For Baseline USAC Profile usacSamplingFrequencyIndex shall be encoded with a value specified in
Table 4.
Table 4 — Specification of usacSamplingFrequencyIndex and usacSamplingFrequency in Baseline USAC Profile
Level                          1            2            3            4            5
usacSamplingFrequencyIndex /   0x03…0x0c,   0x03…0x0c,   0x03…0x0c,   0x00…0x0c,   N / A
usacSamplingFrequency          0x11…0x1b    0x11…0x1b    0x11…0x1b    0x0f…0x1b
                               0x1f         0x1f         0x1f         0x1f
                               / ≤ 48000    / ≤ 48000    / ≤ 48000    / ≤ 96000
Furthermore, for the Baseline USAC Profile the employed sampling rates shall be one out of those listed
in ISO/IEC 23003-3:2020, Table 3.
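A lookup of the permitted usacSamplingFrequencyIndex values per level could look like the following Python sketch. It is not part of the standard and rests on several assumptions: the four value columns of Table 4 are taken to belong to Levels 1 to 4 in order, with Level 5 marked N / A; the upper bound after the slash is applied to the explicitly signalled usacSamplingFrequency when the index is 0x1f; and the further requirement that the rate be listed in ISO/IEC 23003-3:2020, Table 3 is not checked, since that table is not reproduced here. All names are hypothetical.

```python
_LOW_RATE_INDICES = set(range(0x03, 0x0C + 1)) | set(range(0x11, 0x1B + 1)) | {0x1F}
_ALL_RATE_INDICES = set(range(0x00, 0x0C + 1)) | set(range(0x0F, 0x1B + 1)) | {0x1F}

# Allowed index sets and sampling frequency bounds per level (assumed mapping).
ALLOWED_SF_INDEX = {
    1: _LOW_RATE_INDICES,
    2: _LOW_RATE_INDICES,
    3: _LOW_RATE_INDICES,
    4: _ALL_RATE_INDICES,
}
MAX_SAMPLING_FREQUENCY = {1: 48000, 2: 48000, 3: 48000, 4: 96000}


def sampling_frequency_index_allowed(level, index, sampling_frequency=None):
    """Return True if the index (and explicit rate, if any) fits Table 4."""
    allowed = ALLOWED_SF_INDEX.get(level)
    if allowed is None or index not in allowed:
        return False  # level without an entry (N / A) or index not listed
    if index == 0x1F:
        # Escape value: the rate is signalled explicitly and must obey the bound.
        return (
            sampling_frequency is not None
            and sampling_frequency <= MAX_SAMPLING_FREQUENCY[level]
        )
    return True
```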
4.3.14.2.2 channelConfigurationIndex
For Baseline USAC Profile channelConfigurationIndex shall be encoded with a value specified in Table 5.
Table 5 — Specification of channelConfigurationIndex in Baseline USAC Profile
Level                        1       2            3            4            5
channelConfigurationIndex    0, 1    0, 1, 2, 8   0…6, 8…10    0…6, 8…10    N / A
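The per-level restriction on channelConfigurationIndex lends itself to a simple set lookup, sketched below in Python. This is an illustration under stated assumptions rather than a normative check: the Level 3 and Level 4 entries of Table 5 are read as the ranges 0 to 6 and 8 to 10, Level 5 is treated as having no entry (N / A), and the names used are hypothetical.

```python
_UP_TO_SIX_CHANNELS = set(range(0, 7)) | set(range(8, 11))

# Allowed channelConfigurationIndex values per level (assumed reading of Table 5).
ALLOWED_CHANNEL_CONFIG = {
    1: {0, 1},
    2: {0, 1, 2, 8},
    3: _UP_TO_SIX_CHANNELS,
    4: _UP_TO_SIX_CHANNELS,
}


def channel_configuration_index_allowed(level, index):
    """Return True if the index is listed for the given level."""
    return index in ALLOWED_CHANNEL_CONFIG.get(level, set())
```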
4.3.14.2.3 numOutChannels
For Baseline USAC Profile numOutChannels shall be encoded with a value specified in Table 6. Further
restrictions apply to the number of main audio channels (channels conveyed in UsacSCEs and UsacCPEs)
and LFE channels (conveyed in UsacLFEs) as shown in Table 6.
Table 6 — Specification of numOutChannels
for Baseline USAC Profile
Level
1 2 3 4 5
numOutChannels ≤ 1 ≤ 2 ≤ 6 ≤ 6 N / A
number of main audio channels ≤ 1 ≤ 2 ≤ 5
...







