3D Audio baseline profile, corrections and improvements

Profil de base audio 3D, corrections et améliorations

General Information

Status
Published
Publication Date
02-Nov-2020
Current Stage
5060 - Close of voting Proof returned by Secretariat
Start Date
28-Oct-2020
Completion Date
27-Oct-2020

Standard
ISO/IEC 23008-3:2019/Amd 2:2020 - 3D Audio baseline profile, corrections and improvements
English language
18 pages
Draft
ISO/IEC 23008-3:2019/FDAmd 2:Version 13-Oct-2020 - 3D Audio baseline profile, corrections and improvements
English language
18 pages

Standards Content (sample)

INTERNATIONAL STANDARD
ISO/IEC 23008-3
Second edition 2019-02
AMENDMENT 2 2020-11
Information technology — High efficiency coding and media delivery in heterogeneous environments —
Part 3: 3D audio
AMENDMENT 2: 3D Audio baseline profile, corrections and improvements
Technologies de l'information — Codage à haute efficacité et livraison des medias dans des environnements hétérogènes —
Partie 3: Audio 3D
AMENDEMENT 2: Profil de base audio 3D, corrections et améliorations
Reference number: ISO/IEC 23008-3:2019/Amd.2:2020(E)

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2020

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO's member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of document should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC list of patent declarations received (see https://patents.iec.ch).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.

This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

A list of all parts in the ISO/IEC 23008 series can be found on the ISO website.

Any feedback or questions on this document should be directed to the user's national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.

Information technology — High efficiency coding and media delivery in heterogeneous environments —
Part 3: 3D audio
AMENDMENT 2: 3D Audio baseline profile, corrections and improvements
Subclause 4.8.2 (Profiles)
After list item 3, add:

4) The baseline profile is a subset of the low-complexity profile which supports channel and object signals.
Replace Table 2 with:

Table 2 — Summary of the location of and normative reference to the definitions of MPEG-H 3D audio profiles

Tool/Module | Defined in ISO/IEC | Subclause | Marks (columns: USAC 23003-3, MPEG-H 3D audio High profile, MPEG-H 3D audio Low-complexity profile, MPEG-H 3D audio Baseline profile) | Footnote
block switching | 14496-3 | 4.6.11 | X X X X |
window shapes: AAC based | 14496-3 | 4.6.11 | X X X X |
window shapes: additional windows | 23003-3 | 6.2.9.3 | X X X X |
filter bank: AAC based | 14496-3 | 4.6.11 | X X X X |
filter bank: additional USAC | 23003-3 | 7.9 | X X X X |
TNS | 14496-3 | 4.6.9 | X X X X |
intensity | 14496-3 | 4.6.8.2 | |
coupling | 14496-3 | 4.6.8.3 | |
perceptual noise synthesis: PNS | 14496-3 | 4.6.13 | |
perceptual noise synthesis: noise filling | 23003-3 | 7.2 | X X X X |
basic mid/side coding | 14496-3 | 4.6.8.1 | X X X X |
MDCT based complex prediction | 23003-3 | 7.7.2 | X X X X |
quantization: non-uniform | 14496-3 | 4.6.1 | X X X X |
quantization: uniform | 23003-3 | 7.1 | X X X |
entropy coding: Huffman | 14496-3 | 4.6.3 | |
entropy coding: context adaptive arithmetic coding | 23003-3 | 7.4 | X X X X |
SBR: base | 14496-3 | 4.6.18 | X X |
SBR: enhanced | 23003-3 | 7.5 | X X |
parametric stereo extension: parametric stereo | 14496-3 | 8.6.4 / 8.A | |
parametric stereo extension: MPEG surround 2-1-2 (incl. residual coding) | 23003-3 | 6.2.13 | X X |
quad channel element | 23008-3 | 5,5 | X |
ACELP | 23003-3 | 7.14 | X X X |
frequency domain noise shaping: scale factor based | 14496-3 | 4.6.2 | X X X X |
frequency domain noise shaping: LPC based | 23003-3 | | X X X |
intelligent gap filling: IGF for FD | 23008-3 | | X X X |
improved LPD coding: IGF for TCX and TBE in ACELP | 23008-3 | | X X |
improved LPD coding: LPD stereo | 23008-3 | | X X |
predictors for FD: frequency-domain prediction and time-domain post-filtering | 23008-3 | | X X X |
predictors for TCX: frequency-domain prediction and time-domain post-filtering | 23008-3 | | X X |
discrete multi-channel coding: MCT | 23008-3 | | X X X |
format converter: generic downmix | 23008-3 | 10, 24 | X X X | d, d
immersive rendering: immersive rendering within format converter | 23008-3 | 11, 25 | X X X | d, d
static metadata: metadata audio elements (MAE) and audio scene information (ASI) decoder and renderer | 23008-3 | 15 | X X X |
dynamic object metadata: object audio metadata (OAM) decoder and renderer | 23008-3 | 7, 8 | X X X |
MPS: MPEG surround extension | 23003-1 | 10 | X |
SAOC-3D: decoder and renderer | 23008-3 | 9 | X |
HOA: decoder and renderer | 23008-3 | 12 | X X |
HOA: near field compensation | 23008-3 | | X X |
HOA: subband directional prediction | 23008-3 | | X |
HOA: parametric ambiance replication (PAR) | 23008-3 | | X |
HOA: phase-based decorrelation | 23008-3 | | X |
Binaural: FD-binaural, TD-binaural | 23008-3 | 13 | X X X | b, b
Binaural: HOA2Binaural H2B | 23008-3 | | X X |
DRC: DRC-1 | 23003-4 | | X X X | c, c
DRC: DRC-2 (single band) | 23003-4 | | X X X |
DRC: DRC-2 (multi band) | 23003-4 | | |
DRC: DRC-3 (single band) | 23003-4 | | X X X |
sample rate converter | 23008-3 | | X X X |
peak limiter: unguided clipping prevention | 23008-3, 23003-4 | D | X X X |
loudness: loudness metadata and handling | 23003-4 | 6 | X X X |
loudness: loudness compensation | 23008-3 | | X X X |
MHAS: MPEG-H 3D Audio stream | 23008-3 | 14 | X X X |
MHAS: truncation message and CRC packet type, ASI packet type | 23008-3 | | X X X |
file format: carriage of MPEG-H 3D audio in ISO base media file format | 23008-3 | | | f
interfaces and processing: interfaces and processing for interaction data and local setup info | 23008-3 | 17, 18 | X X X |
carriage of generic data: carriage of generic data for the interaction with system engine | 23008-3 | | X X |
TCC: tonal component coding | 23008-3 | | X |
IC: internal channel | 23008-3 | | X |
HREP: high resolution envelope processing | 23008-3 | | X |

a) Restrictions apply dependent on the levels.
b) Implementation of binaural rendering is only mandated if headphone reproduction is supported.
c) Multi-band DRC-1 shall be applied in the STFT domain of the TD format converter.
d) The TD format converter downmix shall be applied for downmixing.
e) In order to achieve target complexity for the LC profile at a given level, study Annex G.
f) File format encapsulation is independent of the profile that is used for the bitstream. A profile level indicator is part of the file format specification (see subclause 20.4).

The baseline profile is a subset of the low-complexity profile. If a decoder implementation supports decoding of low complexity profile level 3 bitstreams and supports the configuration extension CompatibleProfileLevelSet(), then the decoder shall support decoding of bitstreams encoded according to the baseline profile level 3. Bitstreams complying with the baseline profile may be signalled using:

— the mpegh3daProfileLevelIndication field set to indicate baseline profile as specified in Table 64, or alternatively,
— the mpegh3daProfileLevelIndication field set to indicate low complexity profile as specified in Table 64 and the CompatibleProfileLevelSet configuration extension for indicating compatibility to baseline profile, as described in Annex P.

Additionally, it is strongly recommended that low complexity profile bitstreams conforming to the baseline profile are signalled using the profile and level values for mpegh3daProfileLevelIndication and CompatibleSetIndication given in Table P.1.
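
The two signalling options above can be sketched as a small check. This is an illustration only, not part of the standard: the Profile enum and the (profile, level) tuples are hypothetical stand-ins for the numeric code points of Table 64 and Table P.1, which are not reproduced in this excerpt.

```python
from enum import Enum
from typing import Iterable, Tuple


class Profile(Enum):
    # Hypothetical symbolic names; the real code points live in Table 64.
    HIGH = "high"
    LOW_COMPLEXITY = "low_complexity"
    BASELINE = "baseline"


ProfileLevel = Tuple[Profile, int]


def signals_baseline(primary: ProfileLevel,
                     compatible_sets: Iterable[ProfileLevel] = ()) -> bool:
    """Return True if a stream is signalled as baseline content by one of
    the two options described in the text."""
    profile, _level = primary
    # Option 1: mpegh3daProfileLevelIndication indicates baseline directly.
    if profile is Profile.BASELINE:
        return True
    # Option 2: the field indicates low complexity and the
    # CompatibleProfileLevelSet() extension lists a baseline entry.
    if profile is Profile.LOW_COMPLEXITY:
        return any(p is Profile.BASELINE for p, _ in compatible_sets)
    return False
```

A real decoder would first map the 8-bit indication values to such pairs using the tables of the standard.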
Subclause 4.8.2.4
Add new subclauses 4.8.2.5, 4.8.2.6 and 4.8.2.7 after subclause 4.8.2.4:
4.8.2.5 Levels of the baseline profile
4.8.2.5.1 General

Table AMD2.1 — Levels and their corresponding restrictions for the baseline profile

Level | Max. sampling rate | Max. number of core channels in compressed data stream | Max. number of decoder processed core channels | Max. number of channels in referenceLayout
1 | 48000 | 10 | 5 | 5
2 | 48000 | 18 | 9 | 9
3 | 48000 | 32 | 16 (a) or 24 (b) | 16 (a) or 24 (b)
4 | 48000 | 56 | 28 | 24
5 | 96000 | 56 | 28 | 24

a) No additional complexity restrictions are applied.
b) Additional complexity restrictions given in 4.8.2.5.2 are applied.

— The use of switch groups determines the subset of core channels from the core channels in the bitstream that shall be decoded.
— If the mae_AudioSceneInfo() contains switch groups (mae_numSwitchGroups>0), then the elementLengthPresent flag shall be 1.
— The number of channels of the signalled referenceLayout shall not exceed the values defined in the levels in Table AMD2.1.
— Object renderer and binaural renderer that perform at least as well as the object and binaural renderer specified in Clauses 8 and 13 may be integrated using the output interfaces for un-rendered channels and objects described in subclause 17.10.

NOTE The performance recommendation covers the behaviour of the decoder over the complete decoding and rendering chain, especially for the case of configuration changes as described in subclause 5.5.6, mixing of channel and object content or DRC processing, loudness compensation and user interactivity.

— For Level 3 the maximum number of decoder processed core channels and maximum number of channels signalled in referenceLayout is:
a) 16 if no additional complexity restrictions are applied,
b) 24 if all the complexity restrictions in 4.8.2.5.2 are applied.
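
The level limits of Table AMD2.1 lend themselves to a lookup table. The sketch below is a minimal illustration, assuming a caller already knows the stream's sampling rate and channel counts; the names LevelLimits, limits_for and fits_level are hypothetical and do not come from the standard.

```python
from typing import NamedTuple


class LevelLimits(NamedTuple):
    max_sampling_rate: int          # Hz
    max_stream_core_channels: int   # core channels in compressed data stream
    max_decoded_core_channels: int  # decoder processed core channels
    max_reference_layout: int       # channels in referenceLayout


# Values transcribed from Table AMD2.1 (Level 3 without extra restrictions).
BASELINE_LEVELS = {
    1: LevelLimits(48000, 10, 5, 5),
    2: LevelLimits(48000, 18, 9, 9),
    3: LevelLimits(48000, 32, 16, 16),
    4: LevelLimits(48000, 56, 28, 24),
    5: LevelLimits(96000, 56, 28, 24),
}
# Level 3 when all additional complexity restrictions are applied.
LEVEL3_RESTRICTED = LevelLimits(48000, 32, 24, 24)


def limits_for(level: int, restricted: bool = False) -> LevelLimits:
    """Look up the limits for a level; 'restricted' only matters at Level 3."""
    if level == 3 and restricted:
        return LEVEL3_RESTRICTED
    return BASELINE_LEVELS[level]


def fits_level(level: int, sampling_rate: int, stream_core_channels: int,
               decoded_core_channels: int, reference_layout_channels: int,
               restricted: bool = False) -> bool:
    """Check the four numeric limits of one level."""
    lim = limits_for(level, restricted)
    return (sampling_rate <= lim.max_sampling_rate
            and stream_core_channels <= lim.max_stream_core_channels
            and decoded_core_channels <= lim.max_decoded_core_channels
            and reference_layout_channels <= lim.max_reference_layout)
```

For example, a 24-object stream at 48 kHz fits Level 3 only when the restricted limits apply; otherwise it needs Level 4.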

4.8.2.5.2 Complexity restrictions for Level 3 with more than 16 decoder processed core channels

— signalGroupType in Signals3d() shall indicate SignalGroupTypeObject (Objects only).
— usacElementType[elemIdx] in mpegh3daDecoderConfig() shall indicate ID_USAC_SCE or ID_USAC_EXT.
— noiseFilling and enhancedNoiseFilling in mpegh3daCoreConfig() shall be set to "0".
— usacExtElementType in mpegh3daExtElementConfig() shall not be set to ID_EXT_ELE_MCT.
— Long term prediction filter shall not be used, i.e., ltpf_data_present and common_ltpf shall be set to "0".
— Frequency domain predictor shall not be used, i.e., fdp_data_present shall be set to "0".
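
As a rough illustration, the six restrictions above can be evaluated over a summary of the parsed stream. The StreamSummary class is a hypothetical flattened view that mixes configuration fields and per-frame flags for brevity; a real decoder would collect these values from the respective syntax structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StreamSummary:
    # Hypothetical container; attribute names mirror the fields listed above.
    signal_group_types: List[str]
    usac_element_types: List[str]
    usac_ext_element_types: List[str] = field(default_factory=list)
    noise_filling: int = 0
    enhanced_noise_filling: int = 0
    ltpf_data_present: int = 0
    common_ltpf: int = 0
    fdp_data_present: int = 0


def meets_level3_restrictions(s: StreamSummary) -> bool:
    """True if every restriction for more than 16 core channels is met."""
    return (
        # Objects only.
        all(t == "SignalGroupTypeObject" for t in s.signal_group_types)
        # Only single channel elements and extension elements.
        and all(t in ("ID_USAC_SCE", "ID_USAC_EXT")
                for t in s.usac_element_types)
        # No noise filling of either kind.
        and s.noise_filling == 0 and s.enhanced_noise_filling == 0
        # No multi-channel coding tool extension.
        and "ID_EXT_ELE_MCT" not in s.usac_ext_element_types
        # No long term prediction filter, no frequency domain predictor.
        and s.ltpf_data_present == 0 and s.common_ltpf == 0
        and s.fdp_data_present == 0
    )
```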

4.8.2.6 Restrictions for the baseline profile and levels

All restrictions defined for low complexity profile in subclause 4.8.2.2 shall apply.

The LPD path of the core coder and HOA path are not supported.
Restrictions defined in Table AMD2.2 shall apply.
Table AMD2.2 — Baseline profile restrictions

MPEG-H 3D audio bit field | Structure | Use description
phaseAlignStrength | downmixConfig() | Shall have the value "0"
SignalGroupType[grp] | Signals3d() | Shall have the value "SignalGroupTypeChannels" or "SignalGroupTypeObject"
qceIndex | mpegh3daChannelPairElementConfig() | Shall have the value "0"
lpdStereoIndex | mpegh3daChannelPairElementConfig() | Shall have the value "0"
tw_mdct | mpegh3daCoreConfig() | Shall have the value "0"
fullbandLpd | mpegh3daCoreConfig() | Shall have the value "0"
core_mode[ch] | mpegh3daCoreCoderData() | Shall have the value "0"
common_max_sfb | StereoCoreToolInfo() | Shall have the value "1"
tns_on_lr | StereoCoreToolInfo() | Shall have the value "1"
common_tw | StereoCoreToolInfo() | Shall have the value "0"
fac_data_present | fd_channel_stream() | Shall have the value "0"
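
Table AMD2.2 can be applied mechanically as a table-driven check. The sketch below is illustrative only: it assumes the caller supplies a plain dictionary of already-parsed field values (one value per field, ignoring the [grp] and [ch] indices), and the function name baseline_violations is hypothetical.

```python
# Allowed values per bit field, transcribed from Table AMD2.2.
BASELINE_ALLOWED = {
    "phaseAlignStrength": {0},
    "SignalGroupType": {"SignalGroupTypeChannels", "SignalGroupTypeObject"},
    "qceIndex": {0},
    "lpdStereoIndex": {0},
    "tw_mdct": {0},
    "fullbandLpd": {0},
    "core_mode": {0},
    "common_max_sfb": {1},
    "tns_on_lr": {1},
    "common_tw": {0},
    "fac_data_present": {0},
}


def baseline_violations(fields: dict) -> list:
    """Return the sorted names of fields whose value is not allowed.

    Fields absent from the input (for example, not transmitted in the
    stream) are not checked.
    """
    return sorted(
        name for name, value in fields.items()
        if name in BASELINE_ALLOWED and value not in BASELINE_ALLOWED[name]
    )
```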
4.8.2.7 Signalling of profile and level compatibility sets
MPEG-H 3D audio bitstreams may comply with multiple profiles and levels, and the CompatibleProfileLevelSet() syntax element defined in Table AMD2.3 may be used to signal the compatibility to multiple profiles.

The CompatibleProfileLevelSet() syntax element contains a list of profile-level numbers the content is compatible with. Only the lowest level per profile needs to be present, as higher level decoders are inherently compatible with lower level content.
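
The "lowest level per profile" rule can be illustrated with a decoder-side check. This is a sketch under assumptions: profiles are represented by hypothetical string keys and levels by integers, whereas real streams carry the numeric profile-level codes defined in the standard.

```python
from typing import Dict, Iterable, Tuple


def can_decode(decoder_max_level: Dict[str, int],
               signalled: Iterable[Tuple[str, int]]) -> bool:
    """Decide whether a decoder can play a stream.

    decoder_max_level maps each supported profile to the highest level the
    decoder implements. signalled holds the primary profile/level pair plus
    every pair listed in the compatibility set.
    """
    for profile, level in signalled:
        supported = decoder_max_level.get(profile)
        # A decoder of level N also handles content of any level up to N,
        # so listing only the lowest compatible level per profile suffices.
        if supported is not None and supported >= level:
            return True
    return False
```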
Table AMD2.3 — Syntax of CompatibleProfileLevelSet()

Syntax                                                  No. of bits   Mnemonic
CompatibleProfileLevelSet()
{
    bsNumCompatibleSets;                                4             uimsbf
    numCompatibleSets = bsNumCompatibleSets + 1;
    reserved;                                           4             uimsbf
    for (idx = 0; idx < numCompatibleSets; idx++) {
        CompatibleSetIndication;                        8             uimsbf
    }
}
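
A parser for this syntax element might look as follows. It is a minimal sketch that assumes the element starts on a byte boundary and that its payload is already available as a bytes object; the function name is hypothetical.

```python
from typing import List


def parse_compatible_profile_level_set(payload: bytes) -> List[int]:
    """Return the list of CompatibleSetIndication values (8 bits each)."""
    if not payload:
        raise ValueError("empty payload")
    # First byte: bsNumCompatibleSets in the upper 4 bits (uimsbf),
    # followed by 4 reserved bits that are ignored here.
    bs_num_compatible_sets = payload[0] >> 4
    num_compatible_sets = bs_num_compatible_sets + 1
    if len(payload) < 1 + num_compatible_sets:
        raise ValueError("payload shorter than signalled number of sets")
    # One byte per CompatibleSetIndication.
    return list(payload[1:1 + num_compatible_sets])
```

Because the count is coded as bsNumCompatibleSets + 1, the element always carries at least one entry and at most sixteen.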
Subclause 5.2.2.3
Replace Table 24 with:
Table 24 — Syntax of mpegh3daConfigExtension()

Syntax                                                            No. of bits   Mnemonic
mpegh3daConfigExtension()
{
    numConfigExtensions = escapedValue(2,4,8) + 1;
    for (confExtIdx=0; confExtIdx<numConfigExtensions; confExtIdx++) {
        usacConfigExtType[confExtIdx] = escapedValue(4,8,16);
        usacConfigExtLength[confExtIdx] = escapedValue(4,8,16);
        switch (usacConfigExtType[confExtIdx]) {
        case ID_CONFIG_EXT_FILL:
            while (usacConfigExtLength[confExtIdx]--) {
                fill_byte[i]; /* should be '10100101' */          8             uimsbf
            }
            break;
        case ID_CONFIG_EXT_DOWNMIX:
            downmixConfig();
            break;
        case ID_CONFIG_EXT_LOUDNESS_INFO:
            mpegh3daLoudnessInfoSet();
            break;
        case ID_CONFIG_EXT_AUDIOSCENE_INFO:
            mae_AudioSceneInfo();
            break;
        case ID_CONFIG_EXT_HOA_MATRIX:
            HoaRenderingMatrixSet();
            break;
        case ID_CONFIG_EXT_ICG:
            ICGConfig();
            break;
        case ID_CONFIG_
...
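
Table 24 relies on escapedValue(), which this excerpt does not define. The sketch below follows the escape mechanism as it is commonly described for USAC (ISO/IEC 23003-3): read nBits1 bits, and if they are all ones, add a further nBits2-bit value, and likewise an nBits3-bit value. Treat this as an assumption to be checked against that standard; the BitReader class is a hypothetical helper.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object (illustrative)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def escaped_value(reader: BitReader, n1: int, n2: int, n3: int) -> int:
    """Read a value with up to two escape extensions."""
    value = reader.read(n1)
    if value == (1 << n1) - 1:        # all ones: first escape
        add = reader.read(n2)
        value += add
        if add == (1 << n2) - 1:      # all ones again: second escape
            value += reader.read(n3)
    return value
```

With this reading, numConfigExtensions = escapedValue(2,4,8) + 1 costs only two bits when there are at most three extensions and grows only when more are present.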

FINAL DRAFT AMENDMENT
ISO/IEC 23008-3:2019 FDAM 2
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2020-09-01
Voting terminates on: 2020-10-27
Information technology — High efficiency coding and media delivery in heterogeneous environments —
Part 3: 3D audio
AMENDMENT 2: 3D Audio baseline profile, corrections and improvements
Technologies de l'information — Codage à haute efficacité et livraison des medias dans des environnements hétérogènes —
Partie 3: Audio 3D
AMENDEMENT 2
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number: ISO/IEC 23008-3:2019/FDAM 2:2020(E)
© ISO/IEC 2020
...
