ISO/IEC 23003-4:2020
Information technology - MPEG audio technologies - Part 4: Dynamic range control
This document specifies technology for loudness and dynamic range control. It is applicable to most MPEG audio technologies. It offers flexible solutions to efficiently support the widespread demand for technologies such as loudness normalization and dynamic range compression for various playback scenarios.
Technologies de l'information — Technologies audio MPEG — Partie 4: Contrôle de gamme dynamique
Overview - ISO/IEC 23003-4:2020 (MPEG Audio - Dynamic Range Control)
ISO/IEC 23003-4:2020 specifies technologies for dynamic range control (DRC) and loudness management within MPEG audio ecosystems. As Part 4 of the MPEG audio technologies series, it defines decoder behavior, payload formats and processing models to support loudness normalization and dynamic range compression across a wide range of playback scenarios. The standard is applicable to most MPEG audio technologies and provides flexible mechanisms to meet streaming, broadcast and consumer playback needs.
Key technical topics and requirements
The standard’s structure and mandatory elements include, among others:
- DRC decoder architecture - logical blocks, configuration, and decoder processing models.
- DRC gain payloads and syntax - bitstream formats for transmitting dynamic gain information.
- DRC set selection and application rules - pre-selection, request-based selection, album mode, ducking and precedence.
- Time-domain and sub-band DRC application - framing, time alignment, interpolation (including spline), look-ahead, and node reservoir handling.
- Generation of DRC gain values - algorithms for producing gain sequences at the decoder and combining parametric/non-parametric DRCs.
- Loudness normalization - methods for applying target loudness and equalization support.
- Equalization tools - EQ payloads, filter elements and EQ set selection.
- Complexity management - estimating processing cost for DRC and downmixing, and EQ complexity.
- Syntax, reference software and conformance - normative syntax tables, reference implementations, conformance tests and profiles/levels.
The document contains normative and informative annexes covering tables, interfaces, codec-specific guidance, gain encoding, test data and reference software.
Practical applications and users
ISO/IEC 23003-4 is intended for organizations and professionals who implement or integrate MPEG audio with consistent loudness and dynamic behavior:
- Audio codec and encoder/decoder developers implementing MPEG-D DRC features.
- Streaming and broadcast engineers needing consistent loudness normalization and compression across platforms.
- Consumer device manufacturers (smart TVs, set-top boxes, mobile devices, soundbars) implementing playback DRC.
- Content production and post-production teams aiming for predictable loudness behavior on varied playback systems.
- Accessibility and UX designers ensuring intelligibility for hearing-impaired listeners.
- Test labs and standards bodies performing conformance testing and interoperability verification.
Related standards
- Part of the broader ISO/IEC 23003 (MPEG audio technologies) family; cross-references to codec-specific guidance and other MPEG audio parts are included in the annexes.
Keywords: ISO/IEC 23003-4, MPEG audio, Dynamic Range Control, DRC decoder, loudness normalization, dynamic range compression, DRC gain payload, equalization, conformance.
Frequently Asked Questions
ISO/IEC 23003-4:2020 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - MPEG audio technologies - Part 4: Dynamic range control". This standard covers: This document specifies technology for loudness and dynamic range control. It is applicable to most MPEG audio technologies. It offers flexible solutions to efficiently support the widespread demand for technologies such as loudness normalization and dynamic range compression for various playback scenarios.
ISO/IEC 23003-4:2020 is classified under the following ICS (International Classification for Standards) categories: 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 23003-4:2020 is linked to the following related documents: ISO/IEC 23003-4:2020/Amd 2:2023, ISO/IEC 23003-4:2020/Amd 1:2022, ISO/IEC 23003-4:2025, ISO/IEC 23003-4:2015/Amd 1:2017, ISO/IEC 23003-4:2015/Amd 2:2017 and ISO/IEC 23003-4:2015. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 23003-4
Second edition
2020-06
Information technology — MPEG audio technologies —
Part 4:
Dynamic range control
Technologies de l'information — Technologies audio MPEG —
Partie 4: Contrôle de gamme dynamique
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO/IEC 2020 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions and mnemonics
3.1 Terms and definitions
3.2 Mnemonics
4 Symbols (and abbreviated terms)
5 Technical overview
6 DRC decoder
6.1 DRC decoder configuration
6.1.1 Overview
6.1.2 Description of logical blocks
6.1.3 Derivation of peak and loudness values
6.2 Dynamic DRC gain payload
6.3 DRC set selection
6.3.1 Overview
6.3.2 Pre-selection based on Signal Properties and Decoder Configuration
6.3.3 Selection based on requests
6.3.4 Final selection
6.3.5 Applying multiple DRC sets
6.3.6 Album mode
6.3.7 Ducking
6.3.8 Precedence
6.4 Time domain DRC application
6.4.1 Overview
6.4.2 Framing
6.4.3 Time resolution
6.4.4 Time alignment
6.4.5 Decoding
6.4.6 Gain modifications and interpolation
6.4.7 Spline interpolation
6.4.8 Look-ahead in decoder
6.4.9 Node reservoir
6.4.10 Applying the compression
6.4.11 Dynamic equalization
6.4.12 Multi-band DRC filter bank
6.5 Sub-band domain DRC
6.6 Generation of DRC gain values at the decoder
6.6.1 Overview
6.6.2 Description of logical blocks
6.6.3 Algorithmic details
6.6.4 Combining parametric and non-parametric DRCs
6.7 Loudness equalization support
6.8 Equalization tool
6.8.1 Overview
6.8.2 EQ payloads
6.8.3 EQ filter elements
6.8.4 EQ set selection
6.8.5 Application of EQ set
6.9 Complexity management
6.9.1 General
6.9.2 DRC and downmixing complexity estimation
6.9.3 EQ complexity estimation
6.10 Loudness normalization
6.10.1 Overview
6.10.2 Loudness normalization based on target loudness
6.11 DRC in streaming scenarios
6.11.1 DRC configuration
6.11.2 Error handling
6.12 DRC configuration changes during active processing
7 Syntax
7.1 Syntax of DRC payload
7.2 Syntax of DRC gain payload
7.3 Syntax of static DRC payload
7.4 Syntax of DRC gain sequence
7.5 Syntax of parametric DRC tool
7.6 Syntax of equalization tools
8 Reference software
8.1 Reference software structure
8.1.1 General
8.2 Bitstream decoding software
8.2.1 General
8.2.2 MPEG-D DRC decoding software
9 Conformance
9.1 General
9.2 Conformance testing
9.2.1 Conformance test data and test procedure
9.2.2 Naming conventions
9.2.3 File format definitions
9.3 Encoder Conformance for MPEG-D DRC bitstreams
9.3.1 Characteristics and test procedure
9.3.2 Configuration payload
9.3.3 Interface payload
9.3.4 Frame Payload
9.3.5 Requirements depending on profiles and levels
9.4 Decoder conformance test categories and conditions
9.4.1 General
9.4.2 Conformance test categories
9.4.3 Conformance test conditions
Annex A (normative) Tables
Annex B (normative) External Interface to DRC tool
Annex C (informative) Audio codec specific information
Annex D (informative) DRC gain generation and encoding
Annex E (informative) DRC set selection and adjustment at decoder
Annex F (informative) Loudness normalization
Annex G (informative) Peak limiter
Annex H (informative) Equalization
Annex I (normative) Profiles and levels
Annex J (informative) Reference software disclaimer
Annex K (informative) Reference software
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the
IEC list of patent declarations received (see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT),
see www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This second edition cancels and replaces the first edition (ISO/IEC 23003-4:2015), which has been
technically revised. It also incorporates the Amendments ISO/IEC 23003-4:2015/Amd.1:2017 and
ISO/IEC 23003-4:2015/Amd.2:2017. The main changes compared to the previous edition are as follows:
— Amendments to the previous edition that include enhancements, definitions of profiles and levels,
reference software, and conformance are integrated.
A list of all parts in the ISO/IEC 23003 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body.
A complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Consumer audio systems and devices are used in a large variety of configurations and acoustical
environments. For many of these scenarios, the audio reproduction quality can be improved by
appropriate control of content dynamics and loudness.
This document provides a universal dynamic range control tool that supports loudness normalization.
The DRC tool offers a bitrate efficient representation of dynamically compressed versions of an audio
signal. This is achieved by adding a low-bitrate DRC metadata stream to the audio signal. The DRC tool
includes dedicated sections for clipping prevention, ducking, and for generating a fade-in and fade-out
to supplement the main dynamic range compression functionality. The DRC effects available at the DRC
decoder are generated at the DRC encoder side. At the DRC decoder side, the audio signal may be played
back without applying the DRC tool, or an appropriate DRC tool effect is selected and applied based on
the given playback scenario.
Loudness normalization is fully integrated with DRC and peak control to avoid clipping. A metadata-
controlled equalization tool is provided to compensate for playback scenarios that impact the spectral
balance, such as downmix or DRC. Furthermore, the DRC tool supports metadata-based loudness
equalization to compensate for the effect of playback level changes on the spectral balance.
The International Organization for Standardization (ISO) and International Electrotechnical
Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may
involve the use of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of these patent rights.
The holders of these patent rights have assured ISO and IEC that they are willing to negotiate licences
under reasonable and non-discriminatory terms and conditions with applicants throughout the world.
In this respect, the statements of the holders of these patent rights are registered with ISO and IEC.
Information may be obtained from the patent database available at www.iso.org/patents.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.
INTERNATIONAL STANDARD ISO/IEC 23003-4:2020(E)
Information technology — MPEG audio technologies —
Part 4:
Dynamic range control
1 Scope
This document specifies technology for loudness and dynamic range control. It is applicable to most
MPEG audio technologies. It offers flexible solutions to efficiently support the widespread demand for
technologies such as loudness normalization and dynamic range compression for various playback
scenarios.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12, Information technology — Coding of audio-visual objects — Part 12: ISO base media
file format
ISO/IEC 14496-26:2010, Information technology — Coding of audio-visual objects — Part 26: Audio
Conformance
ISO/IEC 23008-3:2019, Information technology — High efficiency coding and media delivery in
heterogeneous environments — Part 3: 3D audio
ISO/IEC 23091-3, Information technology — Coding-independent code points — Part 3: Audio
3 Terms, definitions and mnemonics
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-12 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1.1
DRC sequence
series of DRC gain values that can be applied to one or more audio channels
3.1.2
DRC set
defined set of DRC sequences that produce a desired effect if applied to the audio signal
3.1.3
album
collection of audio recordings that are mastered in a consistent way
Note 1 to entry: For example, a collection of songs released on a Compact Disc traditionally belongs to
this category.
3.1.4
conformance test bitstream
bitstream used for testing the conformance of MPEG-D DRC compliant audio decoders
3.1.5
conformance test case
conformance test category and a combination of one or more conformance test conditions for which a
conformance test sequence is provided
3.1.6
conformance test condition
condition which applies to properties of a conformance test sequence in order to test a certain
functionality of the MPEG-D DRC decoder
3.1.7
conformance test criteria
one or more conformance test tools and corresponding parameters applied to verify the conformance
for a certain conformance test sequence
3.1.8
conformance test sequence
set of a conformance test bitstream, a decoder setting, an input audio file and a corresponding reference
file
3.1.9
decoder input parameters
input parameters that are supplied to an MPEG-D DRC decoder in addition to a conformance test
bitstream, a decoder interface bitstream and an input audio file
3.1.10
decoder setting
combination of a decoder interface bitstream and decoder input parameters that are supplied to an
MPEG-D DRC decoder
3.1.11
input DRC set selection parameters
input parameter set for testing of a DRC gain decoder instance
Note 1 to entry: This parameter set is solely used for conformance testing in the context of the DRC gain decoder
conformance test category (DrcGainDec).
3.1.12
reference audio file
decoded counterpart of a conformance test bitstream, a decoder setting and an input audio file
3.1.13
reference DRC set selection parameters
decoded counterpart of a conformance test bitstream and a decoder setting fed to the DRC set selection
process
Note 1 to entry: This parameter set is an intermediate result of an MPEG-D DRC compliant decoder
implementation solely used for conformance testing in the context of the DRC selection process test category
(DrcSelProc).
3.1.14
reference file
reference audio file or reference DRC set selection parameters
3.2 Mnemonics
bslbf bit string, left bit first, where “left” is the order in which bit strings are written in the
ISO/IEC 14496 series
NOTE Bit strings are written as a string of 1s and 0s within single quote marks, for
example '1000 0001'. Blanks within a bit string are for ease of reading and have no
significance.
byte_align() number of bits to fill for byte alignment at the offset of n bits:
byte_align(n) = 8 · ceil(n/8) − n
uimsbf unsigned integer, most significant bit first
vlclbf variable length code, left bit first, where “left” refers to the order in which the
variable length codes are written
bit(n) a bit string with n bits in the same format as bslbf
unsigned int(n) an unsigned integer with n bits in the same format as uimsbf
signed int(n) a signed integer with n bits, most significant bit first
mod modulo operator: (x mod y) = x − y · floor(x/y)
sizeof(x) size operator that returns the bit size of a field x
TRUE/FALSE values of Boolean data type, which correspond to numerical 1 and 0, respectively
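As an illustration only, the two arithmetic mnemonics above can be written out in Python. The function names byte_align and mod mirror the mnemonics; the choice of integer floor division is an assumption of this sketch and is not taken from the reference software.

```python
def byte_align(n: int) -> int:
    """Number of fill bits needed to reach the next byte boundary from bit offset n.

    Implements 8 * ceil(n / 8) - n using integer arithmetic only.
    """
    return 8 * ((n + 7) // 8) - n


def mod(x: int, y: int) -> int:
    """Modulo as defined by the mnemonic: x - y * floor(x / y)."""
    return x - y * (x // y)  # Python's // is floor division
```

With this definition, an offset that is already byte-aligned needs no fill bits, and the modulo result takes the sign of y.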
4 Symbols
a_i filter coefficient
b band index of DRC filter bank (starting at 0)
b_i filter coefficient
deltaTmin smallest permitted DRC gain sample interval in units of the audio sample interval
f_c cross-over frequency in Hz
f_{c,norm} cross-over frequency expressed as fraction of the audio sample rate
f_{c,norm,SB}(s) cross-over frequency of audio decoder sub-band s expressed as fraction of
the audio sample rate
NOTE The cross-over frequency is the upper band edge frequency of the sub-band.
f_s audio sample rate in Hz
NOTE If an audio decoder is present, it is the sample rate of the decoded time-domain audio signal.
M_DRC DRC frame size in units of the audio sample interval 1/f_s
N_DRC maximum permitted number of DRC samples per DRC frame
NOTE Identical to the number of intervals with a duration of deltaTmin per DRC frame.
N_Codec codec frame size in units of the audio sample interval 1/f_s
π ratio of a circle’s circumference to its diameter
s audio decoder sub-band index (starting at 0)
z complex variable of the z-transform
5 Technical overview
The technology described in this document is called the “DRC tool”. It provides efficient control of
dynamic range, loudness, and clipping based on metadata generated at the encoder. The decoder can
choose to selectively apply the metadata to the audio signal to achieve a desired result. Metadata for
dynamic range compression consists of encoded time-varying gain values that can be applied to the
audio signal. Hence, the main blocks of the DRC tool include a DRC gain encoder, a DRC gain decoder, a
DRC gain modification block, and a DRC gain application block. These blocks are exercised on a frame-
by-frame basis during audio processing. In addition to encoded time-varying gain values, the DRC gain
decoder can also receive parametric DRC metadata for generation of time-varying gain values at the
decoder. Various DRC configurations can be conveyed in a separate bitstream element, such as
configurations for a downmix or combined DRCs. The DRC set selection block decides based on the
playback scenario and the applicable DRC configurations which DRC gains to apply to the audio signal.
Moreover, the DRC tool supports loudness normalization based on loudness metadata.
A typical system for loudness and dynamic range control in the time domain is shown in Figure 1. A
more complex system including downmixer and peak limiter is shown in Figure 2. The decoder part of
the DRC tool is driven by metadata that efficiently represents the DRC gain samples and parameters for
interpolation. The gain samples can be updated as fast as necessary to accurately represent gain
changes down to at least 1 ms update intervals. In the following, the decoder part of the DRC tool is
referred to as “DRC decoder”, which includes everything except the audio decoder and associated
bitstream de-multiplexing.
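The idea of applying time-varying gain values to the audio signal can be sketched as follows. This is a simplified illustration, not the normative procedure: it assumes gain nodes given as (sample index, gain in dB) pairs and plain linear interpolation in the dB domain, whereas the interpolation actually required is specified in 6.4.6 and 6.4.7. The function names gains_per_sample_db and apply_drc are hypothetical.

```python
from typing import List, Sequence, Tuple

Node = Tuple[int, float]  # (sample index, gain in dB); hypothetical representation


def gains_per_sample_db(nodes: Sequence[Node], num_samples: int) -> List[float]:
    """Expand sparse gain nodes (sorted by sample index) to one gain value per sample."""
    out = []
    for t in range(num_samples):
        if t <= nodes[0][0]:
            out.append(nodes[0][1])       # hold the first gain before the first node
        elif t >= nodes[-1][0]:
            out.append(nodes[-1][1])      # hold the last gain after the last node
        else:
            for (t0, g0), (t1, g1) in zip(nodes, nodes[1:]):
                if t0 <= t < t1:
                    frac = (t - t0) / (t1 - t0)
                    out.append(g0 + frac * (g1 - g0))  # linear interpolation in dB
                    break
    return out


def apply_drc(audio: Sequence[float], nodes: Sequence[Node]) -> List[float]:
    """Multiply each audio sample by its interpolated gain, converted from dB to linear."""
    gains_db = gains_per_sample_db(nodes, len(audio))
    return [x * 10.0 ** (g / 20.0) for x, g in zip(audio, gains_db)]
```

For scale, a 1 ms update interval corresponds to 48 audio samples at a sample rate of 48 kHz, so the node spacing in such a sketch could be as small as a few dozen samples.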
Figure 1 — Block diagram of a typical system with audio decoder and DRC tool modules to
achieve loudness normalization (LN) and dynamic range control
Figure 2 — Block diagram of a more complex system including downmixer and peak limiter
(TD = time-domain, SD = subband-domain)
The DRC tool provides support for loudness equalization, sometimes called “loudness compensation”,
that can be applied to compensate for the effect of the playback level on the spectral balance. For this
purpose, time-varying loudness information can be recovered from DRC gain sequences to dynamically
control the compensation module. While the compensation module is out of scope, the interface
describes in which frequency ranges the loudness information should be applied.
A flexible tool for generic metadata-controlled equalization is provided. The tool can be used to reach
the desired spectral balance of the reproduced audio signal depending on a wide variety of playback
scenarios, such as downmix, DRC, or playback room size. It can operate in the sub-band domain of an
audio decoder and in the time domain.
The DRC tool is specified in Clause 6. The tool may be subject to profiles and levels that shall be in
accordance with Annex I. The bitstream field decoding of the DRC tool shall be in accordance with
Annex A. If an interface for external parameter control of the DRC tool is used, it shall conform to
Annex B.
6 DRC decoder
6.1 DRC decoder configuration
6.1.1 Overview
The DRC configuration information can be received in-stream using the static payloads uniDrcConfig()
and loudnessInfoSet() described below, or it can be delivered by a higher layer, such as in ISO/IEC
14496-12 (see Table 1). The basic decoding process of the static information is virtually the same. The
difference consists mainly in a few syntax changes and reduced field sizes to increase the bit rate
efficiency of the in-stream configuration. The syntax of the in-stream static payload is given in 7.3. The
associated metadata encoding is given in A.6. The static DRC payload is evaluated once at the beginning
of the decoding process and it is monitored subsequently. For static DRC payload changes during
playback, see 6.12.
Table 1 — Overview of configuration (setup) and separate metadata track in ISO/IEC 14496-12
| Sample entry code | Setup (in sample entry) | Track reference | Sample format
Audio track | As specified for the audio codec in use (unchanged) | DRCInstructions box using negative values for drcLocation | "adrc" referring to the metadata tracks carrying gain values | As specified for the audio codec in use (unchanged)
Metadata track | "unid" | (none) | (none) | Each sample is a uniDrcGain() payload
The static payload is divided into several logical blocks:
— channelLayout();
— downmixInstructions(), downmixInstructionsV1();
— drcCoefficientsBasic(), drcCoefficientsUniDrc(), drcCoefficientsUniDrcV1();
— drcInstructionsBasic(), drcInstructionsUniDrc(), drcInstructionsUniDrcV1();
— loudnessInfo(), loudnessInfoV1();
— drcCoefficientsParametricDrc();
— parametricDrcInstructions();
— loudEqInstructions();
— eqCoefficients();
— eqInstructions().
Except for the channelLayout(), drcCoefficientsParametricDrc(), and eqCoefficients(), multiple
instances of a logical block can appear. The DRC decoder combines the information of the matching
instances of the logical blocks for a given playback scenario. Matching instances are found by matching
several identifiers (labels) contained in the blocks.
From the static payload, the decoder can also extract information about the effect of a particular DRC
and various associated loudness information, if present. If multiple DRCs are available, this information
can be used to select a particular DRC based on target criteria for dynamics and loudness (see 6.3).
uniDrcConfig() contains all blocks except for the loudnessInfo() blocks which are bundled in
loudnessInfoSet(). The last part of the uniDrcConfig() payload can include future extension payloads. In
the event that a uniDrcConfigExtType value is received that is not equal to UNIDRCCONFEXT_TERM, the
DRC tool parser shall read and discard the bits (otherBit) of the extension payload. Similarly, the last
part of the loudnessInfoSet() payload can include future extension payloads. In the event that a
loudnessInfoSetExtType value is received that is not equal to UNIDRCLOUDEXT_TERM, the DRC tool
parser shall read and discard the bits (otherBit) of the extension payload. Each extension payload type
in uniDrcConfig() or loudnessInfoSet() shall not appear more than once in the bitstream if not stated
otherwise. An extension payload of type UNIDRCCONFEXT_V1 shall precede an extension payload of
type UNIDRCCONFEXT_PARAM_DRC in the bitstream if both payloads are present. For
ISO/IEC 14496-12, configuration extension payloads are provided according to Table 76.
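The read-and-discard rule for unrecognized extension payloads can be sketched as a small parser loop. The framing used here is purely hypothetical: a 4-bit type field, an 8-bit payload size in bits, and a terminator value of 0 are assumptions made for this illustration, and the normative field widths and type values are those given in the syntax of Clause 7 and in Annex A. BitReader and parse_extensions are likewise hypothetical names.

```python
from typing import Callable, Dict, List

EXT_TERM = 0x0   # ASSUMPTION: numeric value of the terminating extension type
TYPE_BITS = 4    # ASSUMPTION: width of the extension type field
SIZE_BITS = 8    # ASSUMPTION: width of the payload size field (size in bits)


class BitReader:
    """Reads unsigned integers, most significant bit first, from a byte string."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current read position in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            value = (value << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return value


def parse_extensions(reader: BitReader,
                     handlers: Dict[int, Callable[[BitReader, int], None]]) -> List[int]:
    """Parse extension payloads until the terminator; return the types that were skipped."""
    skipped = []
    while True:
        ext_type = reader.read(TYPE_BITS)
        if ext_type == EXT_TERM:
            break
        bit_size = reader.read(SIZE_BITS)
        end = reader.pos + bit_size
        handler = handlers.get(ext_type)
        if handler is not None:
            handler(reader, bit_size)      # known type: decode the payload
        else:
            for _ in range(bit_size):
                reader.read(1)             # unknown type: read and discard each otherBit
            skipped.append(ext_type)
        reader.pos = end                   # continue right after the declared payload
    return skipped
```

Skipping by the declared size is what lets a decoder written against this edition tolerate extension types defined later, since it never needs to understand the discarded bits.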
The top level fields of uniDrcConfig() include the audio sample rate, which is a fundamental parameter
for the decoding process (if not present, the audio sample rate is inherited from the employed audio
codec). Moreover, the top level fields of uniDrcConfig() include the number of instances of each of the
logical blocks, except for the channelLayout() block which appears only once. The top level fields of
loudnessInfoSet() only include the number of loudnessInfo() blocks. The logical blocks are described in
the following.
6.1.2 Description of logical blocks
6.1.2.1 channelLayout()
The channelLayout() block includes the channel count of the audio signal in the base layout. It may also
include the base layout unless it is specified elsewhere. For use cases where the base audio signal
represents objects or other audio content, the base channel count represents the total number of base
content channels. The base channel count value shall serve as the value of baseChannelCount for
parsing the downmixInstructions(), downmixInstructionsV1(), drcInstructionsUniDrc(),
drcInstructionsUniDrcV1() and eqInstructions() payloads as specified in Clause 7.
6.1.2.2 downmixInstructions() and downmixInstructionsV1()
This block includes a unique non-zero downmix identifier (downmixId) that can be used externally to
refer to this downmix. The targetChannelCount specifies the number of channels after downmixing to
the target layout. It may also contain downmix coefficients, unless they are specified elsewhere. For use
cases where the base audio signal represents objects or other audio content, the downmixId can be used
to refer to a specific target channel configuration of a present rendering engine. In contrast to
downmixInstructions(), the downmixInstructionsV1() payload includes an offset for all downmix
coefficients and the coefficient decoding does not depend on the LFE channel assignment. The
downmixInstructions() box for ISO/IEC 14496-12 contains the corresponding metadata of either one of
the in-stream payloads as indicated by the version parameter of the box.
6.1.2.3 drcCoefficientsBasic(), drcCoefficientsUniDrc(), and drcCoefficientsUniDrcV1()
A drcCoefficients block describes all available DRC gain sequences in one location. The block can have
the basic format or the uniDrc format. The basic format, drcCoefficientsBasic(), contains a subset of
information included in drcCoefficientsUniDrc() that can be used to describe DRCs other than the ones
specified in this document. drcCoefficientsUniDrc() contains for each sequence several indicators on
how it is encoded, the time resolution, time alignment, the number of DRC sub-bands and
corresponding crossover frequencies and DRC characteristics. The crossover frequencies shall increase
with increasing band index. Alternatively, explicit indices in a decoder sub-band domain can be
specified for the assignment of DRC sub-bands. The sub-band indices shall also increase with increasing
band index. If the DRC gains are applied in the time-domain by using the multi-band DRC filter bank
specified in 6.4.12, explicit index signalling is not allowed. The index of the DRC characteristic indicates
which compression characteristic was used to produce the gain sequence. The DRC location describes
where these gain sequences can be found in the bitstream. The DRC gain sequences in that location are
inherently enumerated according to their order of appearance starting with 1.
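Two of the constraints in this subclause lend themselves to a short check. The sketch below assumes that "shall increase" means strictly increasing and that gain sequences are held in a plain list in their order of appearance; both function names are hypothetical.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def check_increasing(values: Sequence[float], what: str) -> None:
    """Raise ValueError unless the values strictly increase with the band index."""
    for band, (prev, cur) in enumerate(zip(values, values[1:]), start=1):
        if cur <= prev:
            raise ValueError(f"{what} does not increase at band index {band}")


def index_gain_sequences(sequences: List[T]) -> Dict[int, T]:
    """Number the gain sequences of one location by order of appearance, starting with 1."""
    return dict(enumerate(sequences, start=1))
```

The same check applies to crossover frequencies and to explicit sub-band indices, since both are required to increase with the band index.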
The DRC location field encoding depends on the audio codec. A codec specification may include this
specification, and use values 1 to 4 to refer to codec-specific locations as indicated in Table 2. For
example, for AAC (ISO/IEC 14496-3), the codec-specific values of the DRC location field are encoded as
shown in Table 3.
Table 2 — Encoding of drcLocation for in-stream payload
drcLocation n    Payload
0                reserved
1                Location 1 (Codec-specific use)
2                Location 2 (Codec-specific use)
3                Location 3 (Codec-specific use)
4                Location 4 (Codec-specific use)
n > 4            reserved
Table 3 — Codec-specific encoding of drcLocation for MPEG-4 Audio
drcLocation n    Payload
1                uniDrc() (defined in Clause 7)
2                dyn_rng_sgn[i] / dyn_rng_ctl[i] in dynamic_range_info() (defined in ISO/IEC 14496-3:2009 subpart 4)
3                compression_value in MPEG4_ancillary_data() (defined in ISO/IEC 14496-3:2009/AMD 4:2013)
4                reserved
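
As an illustration of how Table 2 and Table 3 relate, the sketch below maps an in-stream drcLocation value first to its generic meaning and then to the AAC-specific payload. The function and constant names are hypothetical. Rejecting negative values is also an assumption, made because Table 2 does not list them.

```python
from typing import Optional

# Table 3 entries for MPEG-4 Audio (AAC); value 4 is reserved there.
AAC_DRC_LOCATIONS = {
    1: "uniDrc()",
    2: "dyn_rng_sgn[i] / dyn_rng_ctl[i] in dynamic_range_info()",
    3: "compression_value in MPEG4_ancillary_data()",
}


def describe_in_stream_location(drc_location: int) -> str:
    """Generic reading of Table 2 for an in-stream drcLocation value."""
    if drc_location < 0:
        # Assumption: Table 2 lists no negative values, so reject them.
        raise ValueError("negative drcLocation is not covered by Table 2")
    if 1 <= drc_location <= 4:
        return f"Location {drc_location} (codec-specific use)"
    return "reserved"  # 0 and every value above 4


def aac_payload_for_location(drc_location: int) -> Optional[str]:
    """AAC-specific payload from Table 3, or None if reserved or undefined."""
    return AAC_DRC_LOCATIONS.get(drc_location)
```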
The DRC frame size can optionally be specified. It shall be provided if the DRC frame size deviates from
the default size specified in 6.4.2. If not specified, the default frame size is used.
The in-stream drcCoefficient syntax is given in Table 65, Table 67 and Table 68. The syntax for the
corresponding block for ISO/IEC 14496-12 (ISO base media file format) is shown in Table 66 and
Table 69. The corresponding blocks carry essentially the same information. Values that are identically
included in both blocks are coded the same way except for drcLocation.
In ISO base media file format (see ISO/IEC 14496-12), for each codec that can be carried in MP4 files
and that also carries DRC information, there is a specific definition of how the location is coded, using
the DRC_location field (see Table 4). A negative value of DRC_location indicates that a DRC payload is in
an associated metadata track. That track is the n-th track linked via a track reference of type "adrc" (audio
DRC) from the audio track, where n = abs(DRC_location), and the sample-entry type in the metadata
track indicates in which format the coefficients are stored. Table 3 defines the specific entries of the
drcLocation field for AAC. Some example use cases are discussed in C.10.
If the uniDrc() payload is stored in a separate track in the ISO base media file format
(ISO/IEC 14496-12), then the track is a metadata track with the sample entry identifier "unid" (uniDrc),
with no required boxes added to the sample entry. The time synchronization with the linked audio track
is the same as if the payload was in-stream.
Table 4 — Encoding of drcLocation for ISO/IEC 14496-12
drcLocation n    Payload
n < 0            DRC payload located in |n|-th linked metadata track
0                reserved
1                Location 1 (Codec-specific use)
2                Location 2 (Codec-specific use)
3                Location 3 (Codec-specific use)
4                Location 4 (Codec-specific use)
n > 4            reserved
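
The file-format interpretation in Table 4 can be sketched as follows. This is an illustrative reading only: the function name, the returned tuples, and the representation of the "adrc" track references as an ordered list of track IDs are all assumptions, not definitions from ISO/IEC 14496-12 or from this document.

```python
from typing import List, Tuple


def resolve_file_format_location(
    drc_location: int, adrc_track_ids: List[int]
) -> Tuple[str, int]:
    """Interpret DRC_location as in Table 4.

    adrc_track_ids is assumed to hold the track IDs referenced by the audio
    track's "adrc" track reference, in order of appearance.
    """
    if drc_location < 0:
        n = abs(drc_location)  # the n-th linked track, counted from 1
        if n > len(adrc_track_ids):
            raise ValueError(f"no {n}-th 'adrc' track reference present")
        return ("metadata_track", adrc_track_ids[n - 1])
    if 1 <= drc_location <= 4:
        return ("codec_specific", drc_location)
    return ("reserved", drc_location)  # 0 and every value above 4
```

For example, with "adrc" references to tracks 7 and 9, a DRC_location of -2 resolves to the metadata track with ID 9.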
The drcCoefficientsUniDrcV1() payload is defined in Table 68. It contains the same information as
drcCoefficientsUniDrc() except for the assignment of DRC gain sequences to gain sets and the optional
specification of a number of parametric DRC characteristics. The drcCoefficientsUniDrc() payload
assigns gain sequences in order of transmission. In contrast, the drcCoefficientsUniDrcV1() payload
maps gain sequences to gainSets by index. This allows the same gain sequence to be referenced by
multiple DRC bands, which is not possible when using drcCoefficientsUniDrc(). If a
drcCoefficientsUniDrcV1() payload is present, any drcCoefficientsUniDrc() payload for the same
location is ignored.
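
The precedence rule at the end of this paragraph can be expressed as a small filter. The sketch below is hypothetical: the CoefficientsPayload class and its fields are illustrative stand-ins for parsed payloads, not structures defined in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CoefficientsPayload:
    drc_location: int
    is_v1: bool  # True for drcCoefficientsUniDrcV1(), False for drcCoefficientsUniDrc()


def effective_payloads(payloads: List[CoefficientsPayload]) -> List[CoefficientsPayload]:
    """Drop drcCoefficientsUniDrc() payloads whose location also has a V1 payload."""
    v1_locations = {p.drc_location for p in payloads if p.is_v1}
    return [p for p in payloads if p.is_v1 or p.drc_location not in v1_locations]
```

A decoder following this sketch would keep a non-V1 payload only for locations that have no V1 counterpart.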
The drcCoefficientsUniDrcV1() payload can also include information about dynamic equalization filters
if the field shapingFiltersPresent==1. There can be a number of filters that are indexed in order of
appearance. The DRC sets defined in drcInstructionsUniDrcV1() can refer to specific filters using their
indices (see 6.4.11).
6.1.2.4 drcInstructionsBasic(), drcInstructionsUniDrc(), and drcInstructionsUniDrcV1()
A drcInstructions block includes information about one specific DRC set that can be applied to achieve a
desired effect. This block can have the basic format or the uniDrc format. The basic format,
drcInstructionsBasic(), contains a subset of information included in drcInstructionsUniDrc() that can be
used to describe DRCs other than the ones specified in this document. The information included in
drcInstructionsUniDrc() consists mainly of pre-defined description elements such as the DRC set effect
and the DRC gain sequences that are applied. The drcSetEffect field contains several effect bits as listed
in Table A.45. Multi
...
The ISO/IEC 23003-4:2020 standard, titled "MPEG audio technologies - Part 4: Dynamic range control", sets out clear and precise guidelines for loudness and dynamic range control. Its scope covers most MPEG audio technologies, which makes it an essential document for professionals in the field. One of the standard's main strengths is the flexibility it offers. ISO/IEC 23003-4:2020 responds effectively to the growing demand for solutions such as loudness normalization and dynamic range compression. These capabilities have become indispensable for ensuring a consistent audio experience across different playback scenarios, whether for broadcast, streaming or other multimedia applications. In addition, the standard sits within a constantly evolving technological context, which underlines its relevance at a time when audio quality is paramount for end users. By offering robust solutions for dynamic control, ISO/IEC 23003-4:2020 allows developers and sound engineers to optimize audio rendering while meeting contemporary loudness requirements. In conclusion, ISO/IEC 23003-4:2020 is an indispensable document that contributes significantly to harmonizing practice in the field of MPEG audio technologies, while providing essential tools for loudness and dynamic range control.
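ISO/IEC 23003-4:2020 is an information technology standard that, as Part 4 of the MPEG audio technologies series, specifies technology for loudness and dynamic range control. The standard is an important element in improving the quality of audio media, and it provides flexible solutions that respond to broad needs, in particular loudness normalization and dynamic range compression. The strength of the standard lies in the breadth of its scope. ISO/IEC 23003-4 supports most MPEG audio technologies and works efficiently across a variety of playback scenarios. Because this makes it possible to maintain consistent sound quality across different playback environments and devices, it is an extremely important document for consumer electronics and the broadcasting industry. A further major advantage is that the technical requirements for loudness normalization and dynamic range compression are clearly defined, so developers and engineers can readily adopt the technology. ISO/IEC 23003-4 aims to be a reliable reference for everyone familiar with MPEG audio technologies. The standard is highly relevant to improving the user experience in the distribution and consumption of audio content, and it is an indispensable resource in the pursuit of consistent sound quality and listening comfort. In the field of loudness and dynamic range control, ISO/IEC 23003-4 provides sound guidance and should contribute to the development of the industry as a whole.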
ISO/IEC 23003-4:2020 is a significant document in the field of information technology that deals with MPEG audio technologies. The focus of the standard is the control of loudness and dynamic range, which is of crucial importance to today's audio industry. The scope of the standard extends to most MPEG audio technologies, making it highly relevant for the various applications and platforms that deliver audiovisual content. One of the strengths of the standard is its flexibility. ISO/IEC 23003-4:2020 offers practical solutions to support the growing demand for technologies such as loudness normalization and dynamic range compression. These solutions are particularly important for the various playback scenarios in which loudness fluctuations and dynamic content must be taken into account to ensure a consistent listening experience. The standard serves as a basic guide for audio technologists, developers and media content providers to ensure that their products conform to existing standards for loudness control and dynamic range. This not only contributes to harmony within the audio industry but also promotes the usability and overall quality of audio productions. In summary, ISO/IEC 23003-4:2020 is an essential resource that responds to the needs of a rapidly evolving market while ensuring high quality and compatibility in the use of MPEG audio technologies.
The ISO/IEC 23003-4:2020 standard, titled "Information technology - MPEG audio technologies - Part 4: Dynamic range control," offers a comprehensive framework for loudness and dynamic range control, specifically within the context of MPEG audio technologies. The scope of this standard is notably broad, as it is designed to address the growing need for effective loudness normalization and dynamic range compression across a variety of playback environments. This relevance is underscored by the increasing demand for audio technologies that can cater to diverse listening experiences, whether in home theaters, mobile devices, or professional audio settings. One of the strengths of the ISO/IEC 23003-4:2020 standard lies in its flexibility. By providing adaptive solutions for loudness and dynamic range control, the standard enables content creators and audio engineers to optimize their audio output for different dynamic ranges, enhancing listener experience without sacrificing sound quality. This adaptability is crucial in an era where content consumption occurs across multiple platforms and devices with varying audio capabilities. Furthermore, the standard's alignment with MPEG audio technologies reinforces its significance in the industry. As a recognized suite of standards for audio coding, MPEG continues to be integral to digital media. The integration of dynamic range control protocols outlined in ISO/IEC 23003-4:2020 into existing MPEG technologies solidifies its role as a key player in ensuring compatibility and efficacy in modern audio applications. In summary, ISO/IEC 23003-4:2020 stands out for its comprehensive approach to loudness and dynamic range control. Its flexibility, applicability, and alignment with MPEG standards make it a vital resource for stakeholders in the information technology and audio production sectors, addressing the essential needs of contemporary audio playback scenarios.
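The ISO/IEC 23003-4:2020 standard is an important document on information technology and MPEG audio technologies that specifies, in particular, technology for loudness and dynamic range control. The scope of the standard covers a variety of modern audio technologies, and it provides flexible solutions that efficiently support the widely demanded technologies of loudness normalization and dynamic range compression. The strength of the document is that it can be applied to most MPEG audio technologies. This provides the adaptability needed to meet the requirements of various playback scenarios and helps improve the user's audio experience. In addition, ISO/IEC 23003-4:2020 reflects audio requirements that change as technology advances, which makes it useful to both audio content producers and consumers. In conclusion, the standard reflects the latest trends in audio technology and provides essential criteria for dynamic range control and loudness adjustment, and it is therefore important in the current audio technology environment.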
The article discusses ISO/IEC 23003-4:2020, which is a standard that focuses on loudness and dynamic range control in MPEG audio technologies. The standard provides flexible solutions to meet the increasing demand for technologies like loudness normalization and dynamic range compression in different playback scenarios.







