Information technology — Coding of audio-visual objects — Part 26: Audio conformance

ISO/IEC 14496-26:2010 specifies how tests can be designed to verify whether compressed data and decoders meet requirements specified by ISO/IEC 14496-3. In ISO/IEC 14496-26:2010, encoders are not addressed specifically. An encoder may be said to be an ISO/IEC 14496 encoder if it generates compressed data compliant with the syntactic and semantic bitstream payload requirements specified in ISO/IEC 14496-3. Characteristics of compressed data and decoders are defined for ISO/IEC 14496-3. The compressed data characteristics define the subset of the standard that is exploited in the compressed data. Examples are the applied values or range of the sampling rate and bitrate parameters. Decoder characteristics define the properties and capabilities of the applied decoding process. An example of a property is the applied arithmetic accuracy. The capabilities of a decoder specify which compressed data the decoder can decode and reconstruct, by defining the subset of the standard that may be exploited in the decodable compressed data. Compressed data can be decoded by a decoder if the characteristics of the compressed data are within the subset of the standard specified by the decoder capabilities. Procedures are described for testing conformance of compressed data and decoders to the requirements defined in ISO/IEC 14496-3. Given the set of characteristics claimed, the requirements that must be met are fully determined by ISO/IEC 14496-3. ISO/IEC 14496-26:2010 summarises the requirements, cross references them to characteristics, and defines how conformance with them can be tested. Guidelines are given on constructing tests to verify decoder conformance. Some examples of compressed data implemented according to these guidelines are provided as an electronic annex to this document usually together with their uncompressed counterparts (reference waveforms).

Technologies de l'information — Codage des objets audiovisuels — Partie 26: Conformité audio

General Information

Status
Published
Publication Date
10-May-2010
Current Stage
9599 - Withdrawal of International Standard
Start Date
08-Nov-2024
Completion Date
30-Oct-2025

Standard
ISO/IEC 14496-26:2010 - Information technology -- Coding of audio-visual objects
English language
247 pages

Standards Content (Sample)


INTERNATIONAL STANDARD ISO/IEC 14496-26
First edition
2010-05-01
Information technology — Coding of
audio-visual objects —
Part 26:
Audio conformance
Technologies de l'information — Codage des objets audiovisuels —
Partie 26: Conformité audio
Reference number
© ISO/IEC 2010

©  ISO/IEC 2010
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Conformance Points
5 Profiles
6 Conformance data
6.1 File name conventions
6.2 Content
7 Audio Object Types
7.1 General
7.2 Null
7.3 AAC-based scalable configurations
7.4 AAC (main, LC, ER LC, SSR, LTP, ER LTP, ER LD, scalable, ER scalable)
7.5 TwinVQ and ER_TwinVQ
7.6 ER BSAC
7.7 CELP
7.8 ER CELP
7.9 HVXC
7.10 ER HVXC
7.11 ER HILN and ER Parametric
7.12 TTSI
7.13 General MIDI
7.14 Wavetable Synthesis
7.15 Algorithmic Synthesis and AudioFX
7.16 Main Synthetic
7.17 SBR
7.18 PS (Parametric Stereo)
7.19 SSC (SinuSoidal Coding)
7.20 DST (Lossless coding of oversampled audio)
7.21 Layer-3
7.22 ALS (Audio lossless coding)
7.23 SLS (Scalable Lossless Coding)
7.24 Layer-1 and Layer 2
7.25 Low Delay SBR
8 Audio EP tool
8.1 Compressed data
8.2 Decoders
9 Audio Composition
9.1 AudioBIFS v1
9.2 Advanced Audio BIFS nodes
9.3 AudioBIFS v3 Nodes
10 MPEG-4 audio transport stream
10.1 General
10.2 Compressed Data
10.3 Decoders
11 Upstream
11.1 Compressed data
11.2 Decoders
12 Conformance test sequence assignment to profiles and levels
12.1 Overview
12.2 Audio
12.3 Systems
Annex A (informative) Complexity measurement criteria and tool for level definitions of algorithmic synthesis and AudioFX Object Type
Annex B (informative) Test bitstreams for the CELP object type


Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 14496-26 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This part of ISO/IEC 14496 cancels and replaces:
⎯ ISO/IEC 14496-4:2004, Clause 6,
⎯ ISO/IEC 14496-4:2004/Cor.5,
⎯ ISO/IEC 14496-4:2004/Cor.6,
⎯ ISO/IEC 14496-4:2004/Amd.8:2005, including ISO/IEC 14496-4:2004/Amd.8:2005/Cor.1:2008,
⎯ ISO/IEC 14496-4:2004/Amd.11:2006, including ISO/IEC 14496-4:2004/Amd.11:2006/Cor.1:2008,
⎯ ISO/IEC 14496-4:2004/Amd.11:2006/Cor.2:2007,
⎯ ISO/IEC 14496-4:2004/Amd.11:2006/Cor.3:2008,
⎯ ISO/IEC 14496-4:2004/Amd.13:2007, including ISO/IEC 14496-4:2004/Amd.13:2007/Cor.1:2007,
⎯ ISO/IEC 14496-4:2004/Amd.13:2007/Cor.2:2007,
⎯ ISO/IEC 14496-4:2004/Amd.14:2007,
⎯ ISO/IEC 14496-4:2004/Amd.15:2007,
⎯ ISO/IEC 14496-4:2004/Amd.18:2007,
⎯ ISO/IEC 14496-4:2004/Amd.19:2007, including ISO/IEC 14496-4:2004/Amd.19:2007/Cor.1:2008,
⎯ ISO/IEC 14496-4:2004/Amd.20:2008, and
⎯ ISO/IEC 14496-4:2004/Amd.22:2008.

ISO/IEC 14496 consists of the following parts, under the general title Information technology — Coding of
audio-visual objects:
⎯ Part 1: Systems
⎯ Part 2: Visual
⎯ Part 3: Audio
⎯ Part 4: Conformance testing
⎯ Part 5: Reference software
⎯ Part 6: Delivery Multimedia Integration Framework (DMIF)
⎯ Part 7: Optimised reference software for coding of audio-visual objects
⎯ Part 8: Carriage of ISO/IEC 14496 contents over IP networks
⎯ Part 9: Reference hardware description
⎯ Part 10: Advanced Video Coding
⎯ Part 11: Scene description and application engine
⎯ Part 12: ISO base media file format
⎯ Part 13: Intellectual Property Management and Protection (IPMP) extensions
⎯ Part 14: MP4 file format
⎯ Part 15: Advanced Video Coding (AVC) file format
⎯ Part 16: Animation Framework eXtension (AFX)
⎯ Part 17: Streaming text format
⎯ Part 18: Font compression and streaming
⎯ Part 19: Synthesized texture stream
⎯ Part 20: Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF)
⎯ Part 21: MPEG-J Graphics Framework eXtensions (GFX)
⎯ Part 22: Open Font Format
⎯ Part 23: Symbolic Music Representation
⎯ Part 24: Audio and systems interaction [Technical Report]
⎯ Part 25: 3D Graphics Compression Model
⎯ Part 26: Audio conformance
⎯ Part 27: 3D Graphics conformance

Introduction
ISO/IEC 14496-3 specifies coded representations of audio information. ISO/IEC 14496-3 allows for large
flexibility, achieving suitability of ISO/IEC 14496 for many different applications. The flexibility is obtained by
including parameters in the bitstream that define the characteristics of coded bitstreams. Examples are the
audio sampling frequency, bitrate parameters, synchronisation timestamps, and the association of bitstreams and
synthetic objects within objects.
This part of ISO/IEC 14496 specifies how tests can be designed to verify whether bitstreams and decoders
meet the requirements as specified in ISO/IEC 14496-3 and allow interoperability with remote terminals in
interactive, broadcast and local (with stored contents) sessions. These tests can be used for various purposes
such as
⎯ manufacturers of encoders, and their customers, can use the tests to verify whether the encoder
produces bitstreams compliant with ISO/IEC 14496-3,
⎯ manufacturers of decoders and their customers can use the tests to verify whether the decoder meets the
requirements specified in ISO/IEC 14496-3 for the claimed decoder capabilities,
⎯ manufacturers and customers of terminals supporting interactive, broadcast and local sessions over a
multitude of transport protocols and networks, can use the tests to verify whether the claimed
functionalities are compliant with ISO/IEC 14496-6,
⎯ manufacturers of test equipment, and their customers, can use the tests to verify compliance with
ISO/IEC 14496-3.

INTERNATIONAL STANDARD ISO/IEC 14496-26:2010(E)

Information technology — Coding of audio-visual objects —
Part 26:
Audio conformance
1 Scope
This part of ISO/IEC 14496 specifies how tests can be designed to verify whether compressed data and
decoders meet requirements specified by ISO/IEC 14496-3. In this part of ISO/IEC 14496, encoders are not
addressed specifically. An encoder may be said to be an ISO/IEC 14496 encoder if it generates compressed
data compliant with the syntactic and semantic bitstream payload requirements specified in ISO/IEC 14496-3.
Characteristics of compressed data and decoders are defined for ISO/IEC 14496-3. The compressed data
characteristics define the subset of the standard that is exploited in the compressed data. Examples are the
applied values or range of the sampling rate and bitrate parameters. Decoder characteristics define the
properties and capabilities of the applied decoding process. An example of a property is the applied arithmetic
accuracy. The capabilities of a decoder specify which compressed data the decoder can decode and
reconstruct, by defining the subset of the standard that may be exploited in the decodable compressed data.
Compressed data can be decoded by a decoder if the characteristics of the compressed data are within the
subset of the standard specified by the decoder capabilities.
Procedures are described for testing conformance of compressed data and decoders to the requirements
defined in ISO/IEC 14496-3. Given the set of characteristics claimed, the requirements that must be met are
fully determined by ISO/IEC 14496-3. This part of ISO/IEC 14496 summarises the requirements, cross
references them to characteristics, and defines how conformance with them can be tested. Guidelines are
given on constructing tests to verify decoder conformance. Some examples of compressed data implemented
according to these guidelines are provided as an electronic annex to this document usually together with their
uncompressed counterparts (reference waveforms).
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO/IEC 11172-3, Information technology — Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s — Part 3: Audio
ISO/IEC 11172-4, Information technology — Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s — Part 4: Compliance testing
ISO/IEC 13818-3, Information technology — Generic coding of moving pictures and associated audio
information — Part 3: Audio
ISO/IEC 13818-4, Information technology — Generic coding of moving pictures and associated audio
information — Part 4: Conformance testing
ISO/IEC 13818-7, Information technology — Generic coding of moving pictures and associated audio
information — Part 7: Advanced Audio Coding (AAC)

ISO/IEC 14496-1, Information technology — Coding of audio-visual objects — Part 1: Systems
ISO/IEC 14496-3, Information technology — Coding of audio-visual objects — Part 3: Audio
ISO/IEC 14496-11, Information technology — Coding of audio-visual objects — Part 11: Scene description
and application engine
3 Terms and definitions
For the purposes of this document, the terms, definitions, symbols and abbreviated terms given in
ISO/IEC 14496-1, ISO/IEC 14496-3 and the following apply.
3.1
conformance data
conformance test sequences and conformance tools
3.2
conformance tool
tool to check certain conformance criteria
NOTE Conformance tools are provided in the electronic attachments to this part of ISO/IEC 14496.
3.3
conformance test sequence
superset of compressed data and its reference waveforms
NOTE Examples of conformance test sequences are provided in the electronic attachments to this part of
ISO/IEC 14496.
3.4
compressed data
data encoded in accordance with ISO/IEC 14496-3
3.5
reference waveform
decoded counterparts of the compressed data
4 Conformance Points
All audio decoders except the LATM-based decoders are part of the MPEG-4 framework. Table 1 gives an
overview of the interfaces that have to be provided to test the audio decoders using the MPEG-4 System.
Table 1 — Conformance points
conformance point/interface | data flow direction | description/reference
AudioSpecificConfig | in | audio related decoder specific information, see ISO/IEC 14496-3:2009 (1.6.2.1 AudioSpecificConfig)
audio access units | in | audio related bitstream payload, see ISO/IEC 14496-1:2004 (7.1.2.3 Access Units (AU))
BIFS/AudioSource node | in | see ISO/IEC 14496-11:2005 (7.2.2.15 Audio Source)
private test info | in | to control some elements which are usually generated by random number generators
audio composition units | out | see ISO/IEC 14496-1:2004 (7.2.8 Composition Units (CU))


Figure 1 gives an overview of the test bench (MPEG-4 System), the system under test (Audio decoder),
and the interfaces between them. Figure 2 gives a more detailed view of the audio decoder, which consists of
an error protection (EP) decoder and an audio core decoder.

[Figure 1: block diagram. The MPEG-4 System (*.mp4 file format, audio presentation) supplies AudioSpecificConfig, private test info, BIFS (AudioSource) node fields and audio access units (incl. time stamps for SA, TTS) to the Audio Decoder, which returns audio composition units.]

Figure 1 — Audio Conformance Points

[Figure 2: block diagram. The Audio Decoder contains an EP Decoder and an Audio Core Decoder; the diagram labels are directMapping and epConfig 2/3.]
Figure 2 — Audio decoder structure

Clause 7 describes:
The conformance criteria of the audio core decoder.
The conformance criteria of the compressed data not requiring the EP decoder (epConfig == 0 || epConfig
== 1).
The properties of the examples of compressed data with (epConfig == 0 || epConfig == 1).
Clause 8 describes:
The conformance criteria of the EP decoder
The conformance criteria of the compressed data requiring the EP decoder (epConfig == 2 || epConfig == 3).
The properties of the examples of compressed data with (epConfig == 2 || epConfig == 3).
Compressed data with different epConfig settings might be available referring to the same reference
waveforms. Here, the output of a conforming decoder shall be equal, independently of the used epConfig
setting.
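
The split between Clause 7 and Clause 8 depends only on the epConfig value. The following minimal C sketch expresses that split; the enum and function names are hypothetical and are not defined by this part of ISO/IEC 14496.

```c
#include <assert.h>

/* Hypothetical labels for the clause that covers a given epConfig value. */
enum conformance_clause {
    CLAUSE_7_CORE_DECODER = 7, /* epConfig 0 or 1: EP decoder not required */
    CLAUSE_8_EP_DECODER   = 8  /* epConfig 2 or 3: EP decoder required     */
};

/* Returns 1 if compressed data with this epConfig requires the EP decoder. */
int requires_ep_decoder(int ep_config)
{
    assert(ep_config >= 0 && ep_config <= 3); /* epConfig can be 0, 1, 2 or 3 */
    return ep_config == 2 || ep_config == 3;
}

/* Maps an epConfig value to the clause describing its conformance criteria. */
enum conformance_clause clause_for_ep_config(int ep_config)
{
    return requires_ep_decoder(ep_config) ? CLAUSE_8_EP_DECODER
                                          : CLAUSE_7_CORE_DECODER;
}
```

A test harness could use such a helper to decide whether the EP decoder has to be exercised for a given test sequence.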

For some of the compressed data containing scalable configurations, conformance points are defined at the
PCM output of the decoder for m layers being decoded from an n-layer input, where m is an integer in the
range 0 (base layer conformance) to n-1. The reference PCM decoder output signals corresponding to these
conformance points are listed in the respective conformance tables.
5 Profiles
ISO/IEC 14496-3 defines several profiles and several levels within each profile. Conformance is always tested
against a certain level within a certain profile. Audio profiles always comprise a set of audio object types.
Nevertheless the conformance criteria as described within this document are based on audio object types.
The assignment of object types to profiles as well as the level definitions can be found in ISO/IEC 14496-3.
The conformance of a certain level within a certain profile is fulfilled, if the conformance of each object type
belonging to this profile is fulfilled. The assignment of the provided test sequences to profiles and levels can
be found in Clause 12.
6 Conformance data
6.1 File name conventions
For all conformance test sequences, the file name convention given in Table 2 is used.
Table 2 — File name conventions
object type name/ tool name File Name (compressed) File Name (uncompressed)
AdvancedAudioBIFS aabper -- not applicable --
- perceptual approach
AdvancedAudioBIFS aabphy -- not applicable --
- physical approach
AudioBIFS ab_ ab_
AudioBIFS v3 ABv3_ -- not applicable --
AAC scalable ac ac[_lay]
AAC LC al_ al_[_cut_boost]
[_level][_]
AAC main am_ am_[_cut_boost]
[_level][_]
AAC LTP ap_ ap_
AAC SSR as_ as_[_]
CELP ce ce[_lay]
ER AAC scalable er_ac_ep er_ac[_lay]
[]
ER AAC LD er_ad__ep er_ad_
[]
ER AAC LC er_al__ep er_al_
[]
ER AAC LTP er_ap__ep er_ap_
[]
SBR (+AAC LC) al_sbr___[_fsaac al_sbr____
][_sig] [_fsaac][_sig][_]
SBR (+AAC LC with al960_sbr___ al960_sbr____
960 samples per frame ) [_fsaac][_sig] [_fsaac][_sig][_]
PS (+SBR+AAC LC) al_sbr_ps_ al_sbr_ps_[_]
SSC ssc__[_sig] ssc___[_sig][_]
DST dst__[_sig] dst___[_sig][_]
Layer-3 l3_ l3_

ER BSAC er_bs__ep er_bs_[_lay]
[]
ER CELP er_ce_ep er_ce[_lay]
[]
ER HILN er_hi_ep er_hi[_lay]
[] [_s][_p ER HVXC er_hv_ep er_hv[_lay]_
[]
ER Parametric er_pa_ep er_pa[_lay]_
[]
ER Twin VQ er_tv_ep er_tv[_lay]
[]
HVXC hv hv[_lay]_ref
Algorithmic Synthesis sy sy
and Audio FX
TTSI tts tts
TwinVQ tv tv[_lay]
ALS als__ als__
SLS sls__ sls__
Layer-1 l1_ l1_
Layer-2 l2_ l2_
ER AAC ELD er_eld__ep er_eld_
[]
can be 16 or 24 and indicates the bit resolution of the coded wavefile
indicates the channel for multi-channel sequences (f - number of the front channel,
b- number of the back channel, s - number of the side channel, l - number of
the LSF channel).
indicates the coder used to encode the content (ce – CELP, sa – Structured Audio, pcm – PCM)
refers to a certain audio coder setup. It is most likely a number, but might also contain
characters.
refers to the decoder delay, it can become “ld” (low delay) or “nd” (normal delay).
can be 0, 1, 2 or 3, depending on epConfig (defined in AudioSpecificConfig).
is required if (epConfig==2 || epConfig==3). It refers to a certain error protection setup.
sampling frequency (08, 11, 12, 16, 22, 24, 32, 44, 48, 64, 88 or 96).
_level refers to the level with regard to DRC.
_cut_boost refers to the cut and boost factors with regard to DRC.
_lay is required for any scalable configuration. It marks the highest layer of the scalable
configuration used for decoding (starting with 0 for the core layer).
_p is a number referring to the decoder configuration with regard to the pitch factor.
_ref is a number referring to the decoder configuration with regard to delay mode, speed and pitch
change.
_s is a number referring to the decoder configuration with regard to the speed factor.

indicates the SBR module mainly targeted by the test sequence. Possible values are “e” for testing the
envelope adjuster “s” for testing sine addition, “gh” for testing time-grid transitions in combination with changes
of SBR header data, “i” for testing inverse filtering, “qmf” for testing the QMF implementation, “cm” for testing
various channel modes, “sig” for testing SBR signaling, “twi” for QMF identification, and “sr” for testing various
combinations of sampling rates.
is the abbreviation of one of the AudioBIFS v3 node names.
corresponds to the number of channels present in the conformance test sequence. It is either a
single integer, in which case it refers to the number of main audio channels, or two integers separated by a “.”,
in which case the first integer equals the number of main audio channels, while the second number equals the
number of low frequency enhancement channels.
fsaac corresponds to the sampling rate of the underlying AAC-LC data. If it is omitted, it is half the
sampling rate given as output sampling rate.
is an integer describing the kind of signalling used according to the table below. If this value is omitted,
backwards compatible explicit signalling of SBR is used.
file name conventions
sig Signalling method used
0 Implicit signalling of SBR
1 Hierarchical explicit signalling of SBR
2 Backwards compatible explicit signalling of SBR

is either “hq” or “lp” for the high quality or the low power version of the SBR decoding algorithm
respectively.
is either “bl” or “ur” for the baseline or the unrestricted version of the parametric stereo decoding
algorithm respectively.
With respect to file extensions, the following rules are applied:
Compressed MPEG-4 file format .mp4
Compressed native MPEG-1/2 Audio storage format .mpg
Compressed Audio data interchange format .adif
Compressed Audio data transport stream .adts
Compressed AudioSyncStream .ass
Compressed EPAudioSyncStream .ess
Compressed AudioPointerStream .aps
Uncompressed HILN Conformance Test Parameters .ctp
Uncompressed WAVE format (uncompressed PCM format) .wav
Uncompressed TTSI decoded text and control digits .txt

6.2 Content
The test set includes a set of sine sweeps, a set of musical/speech test sequences and a set of noise-like test
sequences. The supplied sine sweeps with an amplitude of -20dB relative to full scale have an absolute
amplitude of +/- 0.1.

7 Audio Object Types
7.1 General
This Clause lists all audio object types. It starts with a general description, which may be related to more than
one object type.
This Clause contains general descriptions for conformance testing on compressed data and decoders. Unless
explicitly restricted, these descriptions are related to all object types.
7.1.1 Compressed Data
7.1.1.1 Characteristics
Characteristics of compressed data specify the constraints that are applied by the encoder in generating the
compressed data. These syntactic and semantic constraints may, for example, restrict the range or the values
of parameters that are encoded directly or indirectly in the compressed data. The constraints applied to a
given compressed data may or may not be known a priori.
Decoder relevant compressed data may consist of the following parts:
decoder specific information (AudioSpecificConfig)
BIFS/AudioSource node (field information)
audio access units (establishing the bitstream payload)
7.1.1.1.1 ESC instance configuration
In case of epConfig=1, each instance of each sensitivity category belonging to one frame is stored separately
within a single access unit, i.e. there exist as many elementary streams as instances defined within a frame.
Note: In case of epConfig=3, the mapping between EP classes and ESC instances is signaled by the data
element directMapping. In case of directMapping=1, the restrictions regarding the ESC instance configuration
apply accordingly to the EP class configuration.
The following table gives an overview of the valid configurations:

Table 3 — Number of ESC instances that build a frame in case of epConfig==1
Audio object type | number of ESC instances to build a frame
ER AAC | see Table 4
ER Twin VQ | non-scalable or base layer: 2; any enhancement layer: 2
ER BSAC | base layer: 2; any large-step enhancement layer: 1
ER CELP | base layer: 5; any enhancement layer: 1
ER HVXC | 2 kbit/s, non-scalable or base layer: 4; 4 kbit/s, non-scalable: 5; any enhancement layer: 3
ER HILN | base layer: 5; any enhancement/extension layer: 1
ER Parametric | PARAmode==0,1 base layer: 5; PARAmode==2,3 base layer: 15; any enhancement/extension layer: 1

Table 4 — Number of ESC instances that build elements/layers of
an ER AAC frame in the case of epConfig==1
element/layer | aacScalefactorDataResilienceFlag = 0 | aacScalefactorDataResilienceFlag = 1
single channel element (SCE) / mono layer | 3 | 4
channel pair element (CPE) / stereo layer | 7 | 9
extension payload (EPL) | 2 (independent of the flag)
Depending on the value of the data element channelConfiguration, an AAC frame might cover several
instances of SCE, CPE or EPL. This leads to the following valid configurations:
Table 5 — Number of ESC instances that build an ER AAC frame/layer in the case of epConfig==1
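AOT 17 19 20 23 | channelConfiguration | main payload (aacScalefactorDataResilienceFlag = 0) | main payload (aacScalefactorDataResilienceFlag = 1) | N extension payloads
x x x x | 1 | 3 | 4 | +2*N
x x x x | 2 | 7 | 9 | +2*N
x x x | 3 | 3+7 | 4+9 | +2*N
x x x | 4 | 3+7+3 | 4+9+4 | +2*N
x x x | 5 | 3+7+7 | 4+9+9 | +2*N
x x x | 6 | 3+7+7+3 | 4+9+9+4 | +2*N
x x x | 7 | 3+7+7+7+3 | 4+9+9+9+4 | +2*N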
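
The sums in Table 5 can be reproduced from the per-element counts in Table 4. The following C sketch does so under stated assumptions: the function name is hypothetical, the SCE/CPE counts per channelConfiguration are read off the sums in Table 5, each extension payload is taken to add 2 instances, and the restriction of some channelConfiguration values to certain audio object types is not checked.

```c
#include <assert.h>

/* Number of ESC instances that build one ER AAC frame/layer for epConfig == 1.
 * Returns -1 if channel_configuration is outside the range 1..7 of Table 5. */
int er_aac_esc_instances(int channel_configuration,
                         int sf_resilience_flag,     /* aacScalefactorDataResilienceFlag */
                         int num_extension_payloads) /* N in Table 5 */
{
    /* Instances per element, from Table 4. */
    const int sce = sf_resilience_flag ? 4 : 3;
    const int cpe = sf_resilience_flag ? 9 : 7;
    const int epl = 2;
    /* Elements per channelConfiguration (index 0 unused), from the sums in Table 5. */
    static const int n_sce[8] = {0, 1, 0, 1, 2, 1, 2, 2};
    static const int n_cpe[8] = {0, 0, 1, 1, 1, 2, 2, 3};

    assert(num_extension_payloads >= 0);
    if (channel_configuration < 1 || channel_configuration > 7) {
        return -1;
    }
    return n_sce[channel_configuration] * sce
         + n_cpe[channel_configuration] * cpe
         + epl * num_extension_payloads;
}
```

For example, channelConfiguration 6 with the flag set to 0 and no extension payload gives 3+7+7+3 = 20 instances.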
7.1.1.2 Test procedure
All compressed data shall meet the syntactic and semantic requirements specified in ISO/IEC 14496-3. For
each audio object type, a set of semantic tests to be performed on the compressed data is described. Verifying
that the syntax is correct is straightforward and is therefore not defined hereinafter. In the description of the
semantic tests, it is assumed that the tested compressed data contains no errors due to transmission or other
causes. For each test, the condition or conditions that must be satisfied are given, as well as the prerequisites
or conditions under which the test can be applied.

7.1.2 Decoders
7.1.2.1 Characteristics
The decoder characteristics are defined by the profiles and levels being tested.
7.1.2.2 Test procedure
To test audio decoders, ISO/IEC JTC 1/SC 29/WG 11 supplies a number of test sequences. Supplied
sequences cover all profile decoders. For a supplied test sequence, testing can be done by comparing the
output of a decoder under test with a reference output also supplied by ISO/IEC JTC 1/SC 29/WG 11. In
cases where the decoder under test is followed by additional operations (e.g. quantizing a signal to a 16 bit
output signal) the conformance point is prior to such additional operations, i.e. it is permitted to use the actual
decoder output (e.g. with more than 16 bit) for conformance testing.
Measurements are carried out relative to full scale where the output signals of the decoders are normalized to
be in the range between −1.0 and +1.0.
The following Subclauses define a set of test methods. A particular test method for a certain test sequence is
specified in the object type specific subclauses.
For elements producing output that cannot be tested with the methods described below, specific conformance
testing procedures are described in the object type specific subclauses.
7.1.2.2.1 RMS/LSB Measurement
To fulfill the “RMS/LSB Measurement” test at an accuracy level of “K bit”, an ISO/IEC 14496-3 decoder shall
provide an output waveform such that the RMS level of the difference signal between the output of the
decoder under test and the supplied reference output is less than $2^{-(K-1)}/\sqrt{12}$. In addition, the difference
signal shall have a maximum absolute value of at most $2^{-(K-2)}$ relative to full-scale. The “RMS/LSB
Measurement” test shall be carried out for an accuracy level of K=16 bit unless a different accuracy level is
explicitly stated.
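
The two limits depend only on the accuracy level K. As a minimal C sketch (function names are hypothetical), they can be computed as follows, with all values relative to full scale:

```c
#include <assert.h>
#include <math.h>

/* Upper bound (exclusive) for the RMS level of the difference signal:
 * 2^-(K-1) / sqrt(12). */
double rms_limit(int k_bits)
{
    assert(k_bits >= 2);
    return ldexp(1.0, -(k_bits - 1)) / sqrt(12.0);
}

/* Upper bound (inclusive) for the absolute value of any difference sample:
 * 2^-(K-2). */
double max_abs_diff_limit(int k_bits)
{
    assert(k_bits >= 2);
    return ldexp(1.0, -(k_bits - 2));
}
```

For the default K=16, the maximum absolute difference is 2^-14 and the RMS limit is 2^-15/sqrt(12), which is roughly 8.81e-6.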
7.1.2.2.1.1 Calculation of RMS
For the calculation of the RMS level, all measurements are carried out relative to full scale where the output
signals of the decoder and supplied test sequences are normalized to be in the range between -1.0 and +1.0.
The supplied reference waveforms have a precision (P) of 24 bits, where the most significant bit (MSB) will be
labeled bit 0 and the least-significant bit (LSB) will be labeled bit 23. The most significant bit (bit 0) represents
the value of –1, the second most significant bit (bit 1) represents the value of +1/2, etc.
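\[
\begin{aligned}
\text{value of bit 0 (MSB)} &= -\frac{1}{2^{0}} = -1 \\
\text{value of bit 1} &= \frac{1}{2^{1}} = \frac{1}{2} \\
\text{value of bit 2} &= \frac{1}{2^{2}} = \frac{1}{4} \\
&\;\;\vdots \\
\text{value of bit 23 (LSB)} &= \frac{1}{2^{23}} = \frac{1}{8\,388\,608}
\end{aligned}
\]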
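
With these bit weights, a 24-bit two's-complement sample corresponds to its integer value divided by 2^23. A minimal C sketch of that normalization, with a hypothetical function name, is:

```c
#include <assert.h>
#include <stdint.h>

/* Converts a 24-bit two's-complement sample, held in an int32_t, to the
 * normalized range [-1.0, +1.0): the MSB weighs -1 and the LSB weighs 2^-23. */
double pcm24_to_normalized(int32_t sample)
{
    assert(sample >= -8388608 && sample <= 8388607);
    return (double)sample / 8388608.0; /* 8388608 = 2^23 */
}
```

The most negative sample maps to -1.0, and a sample with only the LSB set maps to 1/8 388 608.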

The output waveform of the decoder under test is required to be in the same format. In the case that the
output of the decoder has a precision of P' bits and if P' is smaller than 24, then the output is extended to 24
bits by setting bit P' through bit 23 to zero. In the next step, the difference (diff) of the samples of these signals
has to be calculated. Every channel of a multichannel waveform shall be tested. The total number of samples
for each channel is N.
\[
\mathrm{diff}(n) = \text{output signal of decoder under test}(n) - \text{supplied test sequence}(n), \quad n = 1, \dots, N
\]
The values of all difference samples shall be squared, summed, divided by N and then the square-root shall
be calculated. This calculation finally gives the RMS level.
\[
\mathrm{rms} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \mathrm{diff}^{2}(n)}
\]
This test only verifies the computational accuracy of an implementation.
Software is provided for performing this verification procedure.
7.1.2.2.2 Segmental SNR
This criterion is designed to test decoders decoding the object types CELP, ER CELP, HVXC, ER HVXC,
TwinVQ, ER TwinVQ and ER HILN.
Definition:
$x_a(i)$ : i-th sample of reference output signal (normalized in a range between -1.0 and 1.0).
$x_b(i)$ : i-th sample of output signal of a decoder under test (normalized in a range between -1.0 and 1.0).
$L$ : the length of segment
$N$ : the total number of segments
$SS(k)$ : SNR of the k-th segment
$SSNR$ : segmental SNR
\[
SS(k) = \log_{10}\left( 1 + \frac{\sum_{i=0}^{L-1} x_a(k \times L + i)^{2}}{10^{-13} L + \sum_{i=0}^{L-1} \bigl( x_a(k \times L + i) - x_b(k \times L + i) \bigr)^{2}} \right)
\]
\[
SSNR = 10 \times \log_{10}\left( 10^{\,\sum_{k=0}^{N-1} SS(k) / N} - 1.0 \right)
\]
7.1.2.2.3 Frequency domain criterion based on cepstrum analysis
This criterion is designed to test decoders decoding the object types CELP, ER CELP, TwinVQ, ER TwinVQ
and ER HILN.

The cepstrum analysis procedure is defined by means of the functions lpc2cepstrum
and calculate_lpc provided in pseudo C code below.

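#include <math.h>    /* cos, log           */
#include <stdio.h>   /* printf             */
#include <stdlib.h>  /* malloc, free, exit */

#define LPC_ORDER       16       /*  LPC order               */
#define CEPSTRUM_ORDER  32       /*  Cepstrum order          */
#define BW              0.0125F  /*  Bandwidth scalefactor   */

/* Helper functions defined further below */
void hamwdw (float wdw[], int n);
void lagwdw (float wdw[], int n, float h);
void sigcor (float *sig, int n, float *_pow, float cor[], int p);
void corref (int p, float cor[], float alf[]);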

void lpc2cepstrum (float  lpc_coef[],  /*  in:  LPC coefficients (a-parameters)  */
                   float  C[])         /*  out:  LPC cepstrum                    */
{
    float  ss;
    int    i, m;

    /* it is assumed that lpc_coef[0] is 1 ! */
    C[1] = -lpc_coef[1];
    for (m = 2; m <= LPC_ORDER; m++)
    {
        ss = -lpc_coef[m] * m;
        for (i = 1; i < m; i++)
        {
            ss -= lpc_coef[i] * C[m-i];
        }
        C[m] = ss;
    }
    for (m = LPC_ORDER + 1; m <= CEPSTRUM_ORDER; m++)
    {
        ss = 0.0F;
        for (i = 1; i <= LPC_ORDER; i++)
        {
            ss -= lpc_coef[i] * C[m-i];
        }
        C[m] = ss;
    }
    for (m = 2; m <= CEPSTRUM_ORDER; m++)
    {
        C[m] /= m;
    }
}

void calculate_lpc (float  *in,         /*  in:  input PCM audio data              */
                    int    frame_size,  /*  in:  analysis frame length in samples  */
                    float  *lpc_coef)   /*  out:  LPC coefficients                 */
{
    int    ip;
    float  wvpowfr, cor[LPC_ORDER + 1];
    float  wlag[LPC_ORDER + 1];
    float  *wdw;

    wdw = (float *) malloc (sizeof (float) * frame_size);
    if (wdw == NULL)
    {
        printf ("Memory allocation error in calculate_lpc.\n");
        exit (1);
    }

    hamwdw (wdw, frame_size);
    for (ip = 0; ip < frame_size; ip++)
    {
        in[ip] *= wdw[ip];
    }
    sigcor (in, frame_size, &wvpowfr, cor, LPC_ORDER);

    lagwdw (wlag, LPC_ORDER, BW);
    for (ip = 1; ip <= LPC_ORDER; ip++)
    {
        cor[ip] *= wlag[ip];
    }
    corref (LPC_ORDER, cor, lpc_coef);

    free (wdw);
}

void hamwdw (float  wdw[],
             int    n)
{
    int    i;
    float  d, pi = 3.141592653589793F;

    d = (float) (2.0 * pi / n);
    for (i = 0; i < n; i++)
    {
        wdw[i] = (float) (0.54 - 0.46 * cos (d * i));
    }
}

void lagwdw (float  wdw[],
             int    n,
             float  h)
{
    int    i;
    float  pi = 3.141592653589793F;
    float  a, b, w;

    a = (float) (log (0.5) * 0.5 / log (cos (0.5 * pi * h)));
    a = (float) ((int) a);
    w = 1.0F;
    b = a;
    wdw[0] = 1.0F;
    for (i = 1; i <= n; i++)
    {
        b += 1.0F;
        w *= a / b;
        wdw[i] = w;
        a -= 1.0F;
    }
}

void sigcor (float  *sig,
             int    n,
             float  *_pow,
             float  cor[],
             int    p)
{
    int    k, ij;
    float  c, dsqsum;
    float  sqsum = 1.0e-35F;

    if (n > 0)
    {
        for (ij = 0; ij < n; ij++)
        {
            sqsum += (sig[ij] * sig[ij]);
        }
        dsqsum = (float) (1.0 / sqsum);

        for (k = 1; k <= p; k++)
        {
            c = 0.0;
            for (ij = k; ij < n; ij++)
            {
                c += (sig[ij - k] * sig[ij]);
            }
            cor[k] = c * dsqsum;
        }
        k = p;
    }
    *_pow = (float) ((sqsum - 1.e-35) / (float) n);
    cor[0] = 1.0F;
}

void corref (int    p,      /*  in:  LPC analysis order              */
             float  cor[],  /*  in:  correlation coefficients        */
             float  alf[])  /*  out:  linear predictive coefficients */
{
    int    i, j, k;
    float  resid, r, a;
    float  ref[LPC_ORDER + 1];

    ref[1] = cor[1];
    alf[1] = -ref[1];
    resid = (float) ((1.0 - ref[1]) * (1.0 + ref[1]));

    for (i = 2; i <= p; i++)
    {
        r = cor[i];
        for (j = 1; j < i; j++)
        {
            r += alf[j] * cor[i-j];
        }
        alf[i] = -(ref[i] = (r /= resid));
        j = 0;
        k = i;
        while (++j <= --k)
        {
            a = alf[j];
            alf[j] -= r * alf[k];
            if (j < k)
            {
                alf[k] -= r * a;
            }
        }
        resid = (float) (resid * (1.0 - r) * (1.0 + r));
    }
}

7.1.2.2.4 PNS conformance criteria
Two tests
...


INTERNATIONAL STANDARD
ISO/IEC 14496-26
First edition
2010-05-01
Information technology — Coding of
audio-visual objects —
Part 26:
Audio conformance
Technologies de l'information — Codage des objets audiovisuels —
Partie 26: Conformité audio
©  ISO/IEC 2010
All rights reserved. Unless required for installation or otherwise specified, no part of these DVDs may be reproduced, stored in a retrieval
system or transmitted in any form or by any means without prior permission from ISO. Requests for permission to reproduce this product
should be addressed to
ISO copyright office • Case postale 56 • CH-1211 Geneva 20 • Switzerland
Internet copyright@iso.org
Reproduction may be subject to royalty payments or a licensing agreement.
Violators may be prosecuted.
Published in Switzerland

These DVDs contain:
1) the publication ISO/IEC 14496-26:2010 in portable document format (PDF), which can be viewed
using Adobe® Acrobat® Reader;
2) electronic attachments for audio conformance.
Adobe and Acrobat are trademarks of Adobe Systems Incorporated.

Installation
If this publication has been packaged as a zipped file, do NOT open the file from the DVD, but copy it to the
desired location in your local environment. Once the file has been copied to your local environment, open the
file to unzip its contents. For compound documents (e.g. HTML documents comprising several files or folders,
documents that have been subdivided owing to the total file size, etc.), in order for the links between
documents to function properly, the file and folder names must be maintained and all the files stored in the
same folder.
Where the zip file contains a Readme file, it is essential to consult this file to understand the way in which the
document has been structured.
© ISO
...
