Information technology — Coding of audio-visual objects — Part 3: Audio — Amendment 1: Bandwidth extension

Technologies de l'information — Codage des objets audiovisuels — Partie 3: Codage audio — Amendement 1: Extension de largeur de bande

General Information

Status
Withdrawn
Publication Date
03-Nov-2003
Withdrawal Date
03-Nov-2003
Current Stage
9599 - Withdrawal of International Standard
Completion Date
14-Mar-2006
Standard
ISO/IEC 14496-3:2001/Amd 1:2003 - Bandwidth extension
English language
120 pages

Standards Content (Sample)

INTERNATIONAL ISO/IEC
STANDARD 14496-3
Second edition
2001-12-15
AMENDMENT 1
2003-11-01


Information technology — Coding of
audio-visual objects —
Part 3:
Audio
AMENDMENT 1: Bandwidth extension
Technologies de l'information — Codage des objets audiovisuels —
Partie 3: Codage audio
AMENDEMENT 1: Extension de largeur de bande




Reference number
ISO/IEC 14496-3:2001/Amd.1:2003(E)
©
ISO/IEC 2003

©  ISO/IEC 2003
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 1 to ISO/IEC 14496-3:2001 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Introduction
This document specifies the first Amendment to the ISO/IEC 14496-3:2001 standard. The document specifies
the normative syntax of the SBR tool and the decoding process. An informative encoder description is given
as well. Furthermore, this document specifies two new profiles, one based on the AAC LC Audio Object Type
and one based on AAC in combination with SBR.

Information technology — Coding of audio-visual objects —
Part 3:
Audio
AMENDMENT 1: Bandwidth extension
In ISO/IEC 14496-3:2001, Introduction, MPEG-4 general audio coding tools, add:

MPEG-4 SBR (Spectral Band Replication) is a bandwidth extension tool used in combination with the AAC
general audio codec. When integrated into the MPEG AAC codec, it significantly improves coding
performance, which can be used either to lower the bitrate or to improve the audio quality. This is achieved
by replicating the highband, i.e. the high frequency part of the spectrum. A small amount of data representing
a parametric description of the highband is encoded and used in the decoding process. This data rate is far
below the data rate required when coding the highband with conventional AAC.

Amendment Subpart 1

In Part 3: Audio, Subpart 1, in subclause 1.3 Terms and Definitions, add:

206. SBR: Spectral Band Replication.

and increase the index-number of subsequent entries.

In Part 3: Audio, Subpart 1, in subclause 1.5.1.1 Audio object type definition, replace table 1.1 with the
following table:

Table 1.1 – Audio object definition


Tools/
Modules



Audio Object Type
Null                 0
AAC main X X X X X X X X X  X        2) 1
AAC LC X X X X X X X X  X        2
AAC SSR X X X  X X X X X X  X        3
AAC LTP X X X X X X X X X  X        2) 4
SBR                X 5
AAC Scalable X X X X X X  X X X X X X        6) 6
TwinVQ  X X X X X   X  X        7
CELP            X     8
HVXC             X    9
(Reserved)                 10
(Reserved)                 11
TTSI                X  12
Main synthetic              X X X  3) 13
Wavetable              X X  4) 14
synthesis
General MIDI               X   15
Algorithmic              X    16
Synthesis and
Audio FX
ER AAC LC X X X X X  X X  X X X X      17
(Reserved)                 18
ER AAC LTP X X X X X X  X X  X X X X     5) 19
ER AAC scalable X X X X X  X X X X X X X X X     6) 20
ER TwinVQ X X X X   X  X X X      21
ER BSAC X X X X X  X X   X X X      22
ER AAC LD  X X X X X  X X  X X X X      23
ER CELP           X X X X     24
ER HVXC           X X X X    25
ER HILN           X X    X  26
ER Parametric           X X X X  X  27
(Reserved)                 28
(Reserved)                 29
(Reserved)                 30
(Reserved)                 31




Column headings of Table 1.1 (Tools/Modules, left to right): gain control; block switching; window shapes - standard; window shapes - AAC LD; filterbank - standard; filterbank - SSR; TNS; LTP; intensity; coupling; MPEG-2 prediction; PNS; MS; SIAQ; FSS; upsampling filter tool; quantisation&coding - AAC; quantisation&coding - TwinVQ; quantisation&coding - BSAC; AAC ER Tools; ER payload syntax; EP Tool 1); CELP; Silence Compression; HVXC; HVXC 4kbs VR; SA tools; SASBF; MIDI; HILN; TTSI; SBR. The last two columns are Remark and Object Type ID.

In Part 3: Audio, Subpart 1, in subclause 1.5.1.2 Description, after 1.5.1.2.5, add:

1.5.1.2.6 SBR-Object
The SBR Object contains the SBR-Tool and can be combined with the audio object types indicated in Table
1.2A.
Table 1.2A – Audio object types that can be combined with the SBR Tool
Audio Object Type | Combination with SBR Tool permitted | Object Type ID
Null |  | 0
AAC main | X | 1
AAC LC | X | 2
AAC SSR | X | 3
AAC LTP | X | 4
SBR |  | 5
AAC Scalable | X | 6
TwinVQ |  | 7
CELP |  | 8
HVXC |  | 9
(Reserved) |  | 10
(Reserved) |  | 11
TTSI |  | 12
Main synthetic |  | 13
Wavetable synthesis |  | 14
General MIDI |  | 15
Algorithmic Synthesis and Audio FX |  | 16
ER AAC LC | X | 17
(Reserved) |  | 18
ER AAC LTP | X | 19
ER AAC scalable | X | 20
ER TwinVQ |  | 21
ER BSAC |  | 22
ER AAC LD |  | 23
ER CELP |  | 24
ER HVXC |  | 25
ER HILN |  | 26
ER Parametric |  | 27
(Reserved) |  | 28
(Reserved) |  | 29
(Reserved) |  | 30
(Reserved) |  | 31

In Part 3: Audio, Subpart 1, subclause 1.5.2.1 (Profiles), replace:

Eight Audio Profiles have been defined:

with

Ten Audio Profiles have been defined:

and add after item 8:

9. The AAC Profile contains the audio object type 2 (AAC-LC).
10. The High Efficiency AAC Profile contains the audio object types 5 (SBR) and 2 (AAC LC). The High
Efficiency AAC Profile is a superset of the AAC Profile.


In Part 3: Audio, Subpart 1, replace Table 1.2 (Audio Profiles definition) with the following table:

Table 1.2 – Audio Profiles definition

Columns (left to right): Audio Object Type | Main Audio Profile | Scalable Audio Profile | Speech Audio Profile | Synthetic Audio Profile | High Quality Audio Profile | Low Delay Audio Profile | Natural Audio Profile | Mobile Audio Internetworking Profile | AAC Profile | High Efficiency AAC Profile | Object Type ID
Null      0
AAC main X   X  1
AAC LC X X  X X X X 2
AAC SSR X   X  3
AAC LTP X X  X X  4
SBR     X 5
AAC Scalable X X  X X  6
TwinVQ X X   X  7
CELP X X X X X X  8
HVXC X X X  X X  9
(reserved)      10
(reserved)      11
TTSI X X X X X X  12
Main synthetic X  X    13
Wavetable      14
synthesis
General MIDI      15
Algorithmic      16
Synthesis and
Audio FX
ER AAC LC   X X X  17
(reserved)      18
ER AAC LTP   X X  19
ER AAC Scalable   X X X  20
ER TwinVQ    X X  21
ER BSAC    X X  22
ER AAC LD   X X X  23
ER CELP   X X X  24
ER HVXC   X X  25
ER HILN    X  26
ER Parametric    X  27
(reserved)      28
(reserved)      29
(reserved)      30
(reserved)      31


In Part 3: Audio, Subpart 1, subclause 1.5.2.2 (Complexity units), replace table 1.3 by the table below:

Table 1.3 – Complexity of Audio Object Types and SR conversion
Object Type Parameters PCU (MOPS) RCU Remarks
AAC Main fs = 48 kHz 5 5 1)
AAC LC fs = 48 kHz 3 3 1)
AAC SSR fs = 48 kHz 4 3 1)
AAC LTP fs = 48 kHz 4 4 1)
SBR fs = 24/48 kHz (in/out) 3 2.5 1)
(SBR tool)
fs = 24/48 kHz (in/out) 2 1.5 1)
(Low Power SBR tool)
fs = 48/48 kHz (in/out) 4.5 2.5 1)
(Down Sampled SBR tool)
fs = 48/48 kHz (in/out) 3 1.5 1)
(Low Power Down Sampled
SBR tool)
AAC Scalable fs = 48 kHz 5 4 1), 2)
TwinVQ fs = 24 kHz 2 3 1)
CELP fs = 8 kHz 1 1
CELP fs = 16 kHz 2 1
CELP fs = 8/16 kHz 3 1
(bandwidth scalable)
HVXC fs = 8 kHz 2 1
TTSI - - 4)
General MIDI 4 1
Wavetable Synthesis fs = 22.05 kHz depends on depends on
bitstreams (3) bitstreams (3)
Main Synthetic depends on depends on
bitstreams (3) bitstreams (3)
Algorithmic Synthesis depends on depends on
and AudioFX bitstreams (3) bitstreams (3)
Sampling Rate rf = 2, 3, 4, 6 2 0.5 7)
Conversion
ER AAC LC fs = 48 kHz 3 3 1)
ER AAC LTP fs = 48 kHz 4 4 1)
ER AAC Scalable fs = 48 kHz 5 4 1), 2)
ER TwinVQ fs = 24 kHz 2 3 1)
ER BSAC fs = 48 kHz 4 4 1)
(input buffer size=26000bits)
fs = 48 kHz 4 8
(input buffer size=106000bits)
ER AAC LD fs = 48 kHz 3 2 1)
ER CELP fs = 8 kHz 2 1
fs = 16 kHz 3 1
ER HVXC fs = 8 kHz 2 1
ER HILN fs = 16 kHz, ns=93 15 2 6)
fs = 16 kHz, ns=47 8 2
ER Parametric fs = 8 kHz, ns=47 4 2 5),6)

In Part 3: Audio, Subpart 1, subclause 1.5.2.3 (Levels within the profiles), add at the end:

• Levels for the AAC Profile
Table 1.7A - Levels for the AAC Profile
Level | Max. channels/object | Max. sampling rate [kHz] | Max. PCU | Max. RCU
1 | 2 | 24 | 3 | 5
2 | 2 | 48 | 6 | 5
3 | NA | NA | NA | NA
4 | 5 | 48 | 19 | 15
5 | 5 | 96 | 38 | 15

For the audio object type 2 (AAC LC), mono or stereo mixdown elements are not permitted.
The NA (Not Applicable) levels are introduced to emphasize the hierarchical structure of the AAC Profile and
the High Efficiency AAC Profile. Hence, a decoder supporting the High Efficiency AAC Profile at a given level
can decode an AAC Profile stream of the same or a lower level. The NA levels are not indicated in the
audioProfileLevelIndication table (Table 1.7z).
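
As a rough illustration of how the limits in Table 1.7A might be applied, the following Python sketch picks the lowest AAC Profile level whose channel and sampling-rate limits cover a given stream. The function name is hypothetical, and the sketch deliberately ignores the PCU and RCU limits, so it is a simplification rather than a conformance check.

```python
# Limits copied from Table 1.7A: level -> (max channels/object, max sampling rate in kHz).
# Level 3 is "NA" for the AAC Profile and is therefore omitted.
AAC_PROFILE_LEVELS = {
    1: (2, 24),
    2: (2, 48),
    4: (5, 48),
    5: (5, 96),
}


def lowest_aac_profile_level(channels, sampling_rate_khz):
    """Return the lowest level covering the stream, or None (hypothetical helper)."""
    for level in sorted(AAC_PROFILE_LEVELS):
        max_channels, max_rate_khz = AAC_PROFILE_LEVELS[level]
        if channels <= max_channels and sampling_rate_khz <= max_rate_khz:
            return level
    return None  # outside every level listed in the table
```

For example, a stereo 44.1 kHz stream exceeds the 24 kHz limit of level 1 and therefore lands on level 2 under these assumptions.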

• Levels for the High Efficiency AAC Profile
Table 1.8A - Levels for the High Efficiency AAC Profile
Level | Max. channels/object | Max. AAC sampling rate, SBR not present [kHz] | Max. AAC sampling rate, SBR present [kHz] | Max. SBR sampling rate [kHz] (in/out) | Max. PCU | Max. RCU | Max. PCU, Low power SBR | Max. RCU, Low power SBR
1 | NA | NA | NA | NA | NA | NA | NA | NA
2 | 2 | 48 | 24 | 24/48 | 9 | 10 | 7 | 8
3 | 2 | 48 | 48 | 48/48 (Note 1) | 15 | 10 | 12 | 8
4 | 5 | 48 | 24/48 (Note 2) | 48/48 (Note 1) | 25 | 28 | 20 | 23
5 | 5 | 96 | 48 | 48/96 | 49 | 28 | 39 | 23
Note 1: For level 3 and level 4 decoders, it is mandatory to operate the SBR tool in downsampled mode if
the sampling rate of the AAC core is higher than 24kHz. Hence, if the SBR tool operates on a 48kHz AAC
signal, the internal sampling rate of the SBR tool will be 96kHz, however, the output signal will be
downsampled by the SBR tool to 48kHz.
Note 2: For one or two channels the maximum AAC sampling rate, with SBR present, is 48kHz. For more
than two channels the maximum AAC sampling rate, with SBR present, is 24kHz.
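
The interaction between Table 1.8A and its two notes can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and return shape are hypothetical, PCU/RCU limits are not checked, and the sketch assumes the SBR tool runs in normal dual-rate mode whenever Note 1 does not force downsampled mode.

```python
# Per-level limits taken from Table 1.8A (level 1 is "NA" and omitted).
MAX_CHANNELS = {2: 2, 3: 2, 4: 5, 5: 5}
MAX_AAC_RATE_WITH_SBR_KHZ = {2: 24, 3: 48, 4: 48, 5: 48}


def sbr_rates_khz(level, aac_rate_khz, channels):
    """Return (internal SBR rate, output rate, downsampled?) for an AAC+SBR stream."""
    if level not in MAX_CHANNELS:
        raise ValueError("level not defined for the High Efficiency AAC Profile")
    if channels > MAX_CHANNELS[level]:
        raise ValueError("too many channels for this level")

    limit_khz = MAX_AAC_RATE_WITH_SBR_KHZ[level]
    if level == 4 and channels > 2:
        limit_khz = 24  # Note 2: more than two channels caps the AAC rate at 24 kHz
    if aac_rate_khz > limit_khz:
        raise ValueError("AAC sampling rate too high for this level with SBR present")

    # Note 1: level 3 and 4 decoders must run SBR downsampled above 24 kHz.
    downsampled = level in (3, 4) and aac_rate_khz > 24
    internal_khz = 2 * aac_rate_khz  # the SBR tool runs at twice the AAC core rate
    output_khz = aac_rate_khz if downsampled else internal_khz
    return internal_khz, output_khz, downsampled
```

With a 48 kHz AAC core at level 3, the sketch reports an internal SBR rate of 96 kHz and a 48 kHz output, which matches the case described in Note 1.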

For the audio object type 2 (AAC LC), mono or stereo mixdown elements are not permitted.

In Part 3: Audio, Subpart 1, subclause 1.5.2.4 (Table 1.7z - audioProfileLevelIndication Values), replace the
row:

0x28-0x7F reserved for ISO use -

with

0x28 AAC Profile L1
0x29 AAC Profile L2
0x2A AAC Profile L4
0x2B AAC Profile L5
0x2C High Efficiency AAC Profile L2
0x2D High Efficiency AAC Profile L3
0x2E High Efficiency AAC Profile L4
0x2F High Efficiency AAC Profile L5
0x30-0x7F reserved for ISO use -
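
A decoder or analysis tool might hold the new audioProfileLevelIndication rows in a small lookup table. The Python sketch below covers only the values listed above; the function name is hypothetical, and values outside 0x28 to 0x7F belong to rows of Table 1.7z that are not reproduced here, so the sketch returns None for them.

```python
# Rows added to Table 1.7z by this amendment: value -> (profile name, level).
AUDIO_PROFILE_LEVEL_INDICATION = {
    0x28: ("AAC Profile", 1),
    0x29: ("AAC Profile", 2),
    0x2A: ("AAC Profile", 4),
    0x2B: ("AAC Profile", 5),
    0x2C: ("High Efficiency AAC Profile", 2),
    0x2D: ("High Efficiency AAC Profile", 3),
    0x2E: ("High Efficiency AAC Profile", 4),
    0x2F: ("High Efficiency AAC Profile", 5),
}


def describe_profile_level(value):
    """Return a readable label for the values listed above (hypothetical helper)."""
    if value in AUDIO_PROFILE_LEVEL_INDICATION:
        name, level = AUDIO_PROFILE_LEVEL_INDICATION[value]
        return f"{name} L{level}"
    if 0x30 <= value <= 0x7F:
        return "reserved for ISO use"
    return None  # not covered by the rows shown in this amendment
```

Note that the NA levels (AAC Profile L3, High Efficiency AAC Profile L1) have no entry, consistent with the remark that NA levels are not indicated in the table.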

In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 AudioSpecificConfig, replace table 1.8 with the following
table:

Table 1.8 – Syntax of AudioSpecificConfig()
Syntax No. of bits Mnemonic
AudioSpecificConfig ()
{
audioObjectType; 5 uimsbf
samplingFrequencyIndex; 4 uimsbf
if ( samplingFrequencyIndex==0xf )
 samplingFrequency; 24 uimsbf
channelConfiguration; 4 uimsbf

sbrPresentFlag = -1;

if ( audioObjectType == 5 ) {

 extensionAudioObjectType = audioObjectType;
 sbrPresentFlag = 1;
 extensionSamplingFrequencyIndex; 4 uimsbf
 if ( extensionSamplingFrequencyIndex==0xf )

  extensionSamplingFrequency; 24 uimsbf
 audioObjectType; 5 uimsbf
}
else {
 extensionAudioObjectType = 0;
}

if ( audioObjectType == 1 || audioObjectType == 2 ||
 audioObjectType == 3 || audioObjectType == 4 ||
 audioObjectType == 6 || audioObjectType == 7 )
 GASpecificConfig();

if ( audioObjectType == 8 )

 CelpSpecificConfig();
if ( audioObjectType == 9 )
 HvxcSpecificConfig();
if ( audioObjectType == 12 )
 TTSSpecificConfig();
if ( audioObjectType == 13 || audioObjectType == 14 ||
 audioObjectType == 15 || audioObjectType==16)
 StructuredAudioSpecificConfig();

/* the following Objects are Amendment 1 Objects */
if ( audioObjectType == 17 || audioObjectType == 19 ||

 audioObjectType == 20 || audioObjectType == 21 ||
 audioObjectType == 22 || audioObjectType == 23 )
 GASpecificConfig();
if ( audioObjectType == 24)
 ErrorResilientCelpSpecificConfig();
if ( audioObjectType == 25)

 ErrorResilientHvxcSpecificConfig();

if ( audioObjectType == 26 || audioObjectType == 27)

 ParametricSpecificConfig();

if ( audioObjectType == 17 || audioObjectType == 19 ||
 audioObjectType == 20 || audioObjectType == 21 ||
 audioObjectType == 22 || audioObjectType == 23 ||
 audioObjectType == 24 || audioObjectType == 25 ||
 audioObjectType == 26 || audioObjectType == 27 ) {
 epConfig; 2 uimsbf
 if ( epConfig == 2 || epConfig == 3 ) {
  ErrorProtectionSpecificConfig();
 }
 if ( epConfig == 3 ) {
  directMapping; 1 uimsbf
  if ( ! directMapping ) {
  /* tbd */
  }
 }
}

if ( extensionAudioObjectType != 5 &&
  bits_to_decode() >= 16 ) {
 syncExtensionType; 11 bslbf
 if (syncExtensionType == 0x2b7) {
  extensionAudioObjectType; 5 uimsbf
  if ( extensionAudioObjectType == 5 ) {
   sbrPresentFlag; 1 uimsbf
   if (sbrPresentFlag == 1) {
    extensionSamplingFrequencyIndex; 4 uimsbf
    if ( extensionSamplingFrequencyIndex == 0xf )
     extensionSamplingFrequency; 24 uimsbf
   }
  }
 }
}
}
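
The SBR-related branches of the syntax above can be sketched in Python as follows. This is a minimal illustration, not a complete parser: `BitReader`, `parse_audio_specific_config` and the `read_specific_config` callback are hypothetical names, the callback stands in for GASpecificConfig() and the other object-specific configurations (whose syntax is not part of this table), the error-resilient object types (17 and above) with their epConfig handling are left out, and `bits_left()` plays the role of bits_to_decode() on the assumption that `data` holds exactly one AudioSpecificConfig().

```python
SBR_AOT = 5
SYNC_EXTENSION_TYPE = 0x2B7


class BitReader:
    """Reads MSB-first bit fields from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def bits_left(self):
        return len(self.data) * 8 - self.pos


def parse_audio_specific_config(data, read_specific_config):
    r = BitReader(data)
    aot = r.read(5)
    sf_index = r.read(4)
    sf = r.read(24) if sf_index == 0xF else None
    channel_config = r.read(4)

    sbr_present = -1  # -1: flag not conveyed, implicit signaling may occur
    ext_aot, ext_sf_index, ext_sf = 0, None, None

    if aot == SBR_AOT:  # explicit hierarchical signaling (2.A)
        ext_aot = aot
        sbr_present = 1
        ext_sf_index = r.read(4)
        if ext_sf_index == 0xF:
            ext_sf = r.read(24)
        aot = r.read(5)  # underlying audio object type

    if aot >= 17:
        raise NotImplementedError("ER object types are outside this sketch")
    read_specific_config(r, aot, channel_config)  # caller-supplied, hypothetical

    # Explicit backward compatible signaling (2.B) appended at the end.
    if ext_aot != SBR_AOT and r.bits_left() >= 16:
        if r.read(11) == SYNC_EXTENSION_TYPE:
            ext_aot = r.read(5)
            if ext_aot == SBR_AOT:
                sbr_present = r.read(1)
                if sbr_present == 1:
                    ext_sf_index = r.read(4)
                    if ext_sf_index == 0xF:
                        ext_sf = r.read(24)

    return {
        "audioObjectType": aot,
        "samplingFrequencyIndex": sf_index,
        "samplingFrequency": sf,
        "channelConfiguration": channel_config,
        "extensionAudioObjectType": ext_aot,
        "sbrPresentFlag": sbr_present,
        "extensionSamplingFrequencyIndex": ext_sf_index,
        "extensionSamplingFrequency": ext_sf,
    }


def skip_three_config_bits(reader, aot, channel_config):
    # Assumption for the examples only: the object-specific configuration
    # occupies exactly three bits. Real configurations may be longer.
    reader.read(3)
```

Under that three-bit assumption, `parse_audio_specific_config(bytes([0x13, 0x90, 0x56, 0xE5, 0xA0]), skip_three_config_bits)` walks the backward compatible branch: the syncword 0x2b7 is found after the core configuration, and the extension object type and sampling frequency index are then read from the trailing bits.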


In Part 3: Audio, Subpart 1, in subclause 1.6.2.2.1 Overview, replace table 1.9 by the following table:

Table 1.9 – Audio Object Types
Audio Object Type Object definition of elementary stream Mapping of audio payloads to
Type ID payloads and detailed syntax access units and elementary
streams
AAC MAIN 1 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
AAC LC 2 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
AAC SSR 3 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
AAC LTP 4 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
SBR 5 ISO/IEC 14496-3 subpart 4
AAC scalable 6 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.3
TwinVQ 7 ISO/IEC 14496-3 subpart 4
CELP 8 ISO/IEC 14496-3 subpart 3
HVXC 9 ISO/IEC 14496-3 subpart 2
TTSI 12 ISO/IEC 14496-3 subpart 6
Main synthetic 13 ISO/IEC 14496-3 subpart 5
Wavetable synthesis 14 ISO/IEC 14496-3 subpart 5
General MIDI 15 ISO/IEC 14496-3 subpart 5
Algorithmic Synthesis 16 ISO/IEC 14496-3 subpart 5
and Audio FX
ER AAC LC 17 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER AAC LTP 19 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER AAC scalable 20 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER Twin VQ 21 ISO/IEC 14496-3 subpart 4
ER BSAC 22 ISO/IEC 14496-3 subpart 4
ER AAC LD 23 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER CELP 24 ISO/IEC 14496-3 subpart 3
ER HVXC 25 ISO/IEC 14496-3 subpart 2
ER HILN 26 ISO/IEC 14496-3 subpart 7
ER Parametric 27 ISO/IEC 14496-3 subpart 2 and 7



In Part 3: Audio, Subpart 1, under 1.6.3 Semantics, after 1.6.3.6 Direct Mapping add:

1.6.3.7 extensionSamplingFrequencyIndex
A four bit field indicating the output sampling frequency of the extension tool corresponding to the
extensionAudioObjectType, according to Table 1.10.
1.6.3.8 extensionSamplingFrequency
The output sampling frequency of the extension tool corresponding to the extensionAudioObjectType. Either
transmitted directly, or coded in the form of extensionSamplingFrequencyIndex.
1.6.3.9 bits_to_decode
A helper function; returns the number of bits not yet decoded in the current AudioSpecificConfig(), if the length
of this element has been signaled by a system/transport layer. If the length of this element is unknown,
bits_to_decode() returns 0.
1.6.3.10 syncExtensionType
Syncword which marks the beginning of appended extension configuration data. This configuration data
corresponds to an extension tool of which the coded data is embedded (in a backward compatible manner) in
that of the underlying audioObjectType. If syncExtensionType is present, the configuration data of the
extension tool is separated from that of the underlying audioObjectType, which allows for backward
compatible signaling (see subclause 1.6.5). Decoders that do not support the extension tool can ignore the
extension tool configuration data. Note that this backward compatible signaling can only be used in MPEG-4
based systems that convey the length of the AudioSpecificConfig().
1.6.3.11 sbrPresentFlag
A flag indicating the presence or absence of SBR data in case of extensionAudioObjectType==5 (i.e. explicit
SBR signaling, see subclause 1.6.5). The value –1 indicates that the sbrPresentFlag was not conveyed in the
AudioSpecificConfig(). In this case, a High Efficiency AAC Profile decoder shall be able to detect the presence
of SBR data in the Elementary Stream (i.e. implicit SBR signaling, see subclause 1.6.5).
1.6.3.12 extensionAudioObjectType
A five bit field indicating the extension audio object type. This object type corresponds to an extension tool,
which is used to enhance the underlying audioObjectType.

In Part 3: Audio, Subpart 1, after 1.6.4 Upstream, add the following subclause:

1.6.5 Signaling of SBR
1.6.5.1 Generating and Signaling AAC+SBR Content
The SBR tool in combination with the AAC coder provides a significant increase of audio compression
efficiency. At the same time it allows for compatibility with existing AAC-only decoders. However, the audio
quality for decoders without the SBR tool will of course be significantly lower than for those supporting the
SBR tool. Therefore, depending on the application, a content provider or content creator will want to choose
between the two alternatives given below. In general, the SBR data is always embedded in the AAC stream in
an AAC-compatible way (in the extension_payload), and SBR is a pure post-processing step in the decoder.
Therefore, compatibility can be achieved. However, by means of different signaling the content creator can
select between the full-quality mode and the backward compatibility mode as follows.
1.6.5.1.1 Ensuring Full Audio Quality of AAC+SBR for the Listener
To ensure that all listeners get the full audio quality of AAC+SBR, the stream generated shall only play on
SBR capable decoders (decoders that support the HE AAC Profile, hereinafter referred to as HE AAC Profile
decoders). This is achieved by indicating the HE AAC profile and using the explicit, hierarchical signalling
(signaling 2.A. as described below). As a result, decoders without SBR support will not play such streams.
With regard to AAC-only streams, an HE AAC Profile decoder will decode all AAC Profile streams of the
appropriate level, as the HE AAC Profile is a superset of the AAC Profile.
1.6.5.1.2 Achieving Backward Compatibility with Existing AAC-only Decoders
The aim of this mode is to get all AAC-based decoders to play the stream, even if they don't support the SBR
tool. Compatible streams can be created using the following two signaling methods:
a) indicating a profile containing AAC (e.g. the AAC Profile), except the HE AAC Profile, and using the
explicit backward compatible signalling (2.B. as described below). This method is recommended for
all MPEG-4 based systems in which the length of the AudioSpecificConfig() is known in the decoder.
As this is not the case for LATM with audioMuxVersion==0 (see clause 1.7), this method cannot be
used for LATM with audioMuxVersion==0. In explicit backward compatible signaling, SBR-specific
configuration data is added at the end of the AudioSpecificConfig(). Decoders that do not know about
SBR will ignore these parts, while HE AAC Profile decoders will detect its presence and configure the
decoder accordingly.
b) indicating a profile containing AAC (e.g. the AAC Profile, or an MPEG-2 AAC profile), except the HE
AAC Profile, and using implicit signalling. In this mode, there is no explicit indication of the presence
of SBR data. Instead, decoders check the presence while decoding the stream and use the SBR tool
if SBR data is found. This is possible because SBR can be decoded without SBR-specific
configuration data if a certain way of handling decoder output sample rate is obeyed, as described
below for HE AAC Profile decoders.
Both methods lead to the result that the AAC part of an AAC+SBR stream will be decoded by AAC-only
decoders. AAC+SBR decoders will detect the presence of SBR and decode the full quality AAC+SBR stream.

1.6.5.2 Implicit and Explicit Signaling of SBR
This subclause outlines the different signaling methods of SBR, and the decoder behavior for different types of
signaling.
There are several ways to signal the presence of SBR data:
1. implicit signaling: If EXT_SBR_DATA or EXT_SBR_DATA_CRC extension_payload() elements are
detected in the bitstream, this implicitly signals the presence of SBR data. The ability to detect and
decode implicitly signaled SBR is mandatory for all High Efficiency AAC Profile (HE AAC Profile)
decoders.
2. explicit signaling: The presence of SBR data is signaled explicitly by means of the SBR Audio
Object Type in the AudioSpecificConfig(). When explicit signaling is used, implicit signaling shall not
occur. Two different types of explicit signaling are available:
2.A. hierarchical signaling: If the first audioObjectType (AOT) signaled is the SBR AOT, a second
audio object type is signaled which indicates the underlying audio object type. This signaling
method is not backward compatible.
2.B. backward compatible signaling: The extensionAudioObjectType is signaled at the end of the
AudioSpecificConfig(). This method shall only be used in systems that convey the length of the
AudioSpecificConfig(). Hence, it shall not be used for LATM with audioMuxVersion==0.
Table 1.15A shows the decoder behavior depending on profile and audio object type indication when implicit
or explicit signaling is used.
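
The three signaling variants listed above can be told apart from two values seen while reading the AudioSpecificConfig(). The Python sketch below is one possible way to express this; the function name and the returned labels are hypothetical, and `first_aot` means the first audioObjectType read from the configuration, before any second object type is read.

```python
SBR_AOT = 5


def classify_sbr_signaling(first_aot, extension_aot):
    """Map configuration values to the signaling variants 1, 2.A and 2.B."""
    if first_aot == SBR_AOT:
        # The SBR AOT comes first and the underlying AOT follows it.
        return "explicit, hierarchical (2.A)"
    if extension_aot == SBR_AOT:
        # The SBR AOT was found in the extension data at the end of the config.
        return "explicit, backward compatible (2.B)"
    # No SBR AOT anywhere: SBR data, if any, can only be signaled implicitly.
    return "implicit (1)"
```

In the implicit case the result says nothing about whether SBR data is actually present; that can only be determined from the extension_payload() elements in the bitstream.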
Table 1.15A – SBR Signaling and Corresponding Decoder Behavior
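Columns 1 to 4 are bitstream characteristics; columns 5 and 6 are decoder behavior (Note 4).
Profile indication | extensionAudioObjectType | sbrPresentFlag | raw_data_block | AAC decoders not supporting HE AAC Profile | AAC decoders supporting HE AAC Profile
Profiles with AAC support, other than High Efficiency AAC Profile | != SBR (signaling 1) | -1 (Note 1) | AAC | Play AAC | Play AAC
Profiles with AAC support, other than High Efficiency AAC Profile | != SBR (signaling 1) | -1 (Note 1) | AAC+SBR | Play AAC | Play at least AAC, should play AAC+SBR
Profiles with AAC support, other than High Efficiency AAC Profile | == SBR (signaling 2.B) | 0 (Note 2) | AAC | Play AAC | Play AAC
Profiles with AAC support, other than High Efficiency AAC Profile | == SBR (signaling 2.B) | 1 (Note 3) | AAC+SBR | Play AAC | Play at least AAC, should play AAC+SBR
High Efficiency AAC Profile | == SBR (signaling 2.A or 2.B) | 1 (Note 3) | AAC+SBR | Unsupported Profile - Don't play | Play AAC+SBR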
Note 1: Implicit signaling, check payload in order to determine output sampling frequency, or assume
the presence of SBR data in the payload, giving an output sampling frequency of twice the sampling
frequency indicated by samplingFrequency in the AudioSpecificConfig() (unless the down sampled SBR
Tool is operated, or twice the sampling frequency indicated by samplingFrequency exceeds the
maximum allowed output sampling frequency of the current level, in which case the output sampling
frequency is the same as indicated by samplingFrequency).
Note 2: Explicitly signals that there is no SBR data, hence no implicit signaling is present, and the output
sampling frequency is given by samplingFrequency in the AudioSpecificConfig().
Note 3: Output sampling frequency is the extensionSamplingFrequency in AudioSpecificConfig().
Note 4: In all cases a decoder has to support the Profile and Level indicated in the bitstream in order to
be able to decode and play the content of the bitstream.

The upper part of Table 1.15A displays bitstream characteristics and decoder behavior if the profile indication
is any profile with AAC, apart from the High Efficiency AAC Profile. The lower part displays bitstream
characteristics and decoder behavior if the profile indication is the High Efficiency AAC Profile.
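
The decoder-behavior columns of Table 1.15A can be read as a small decision function. The Python sketch below is one possible encoding; the function and parameter names are hypothetical, the returned strings simply repeat the table cells, and the sketch assumes (per Note 4) that the decoder otherwise supports the indicated profile and level.

```python
def expected_decoder_behavior(profile_is_he_aac, ext_aot_is_sbr,
                              sbr_present_flag, payload_has_sbr,
                              decoder_supports_he_aac):
    """Look up the Table 1.15A cell for one bitstream/decoder combination."""
    if profile_is_he_aac:
        # Lower part of the table: explicit signaling with sbrPresentFlag == 1.
        if decoder_supports_he_aac:
            return "Play AAC+SBR"
        return "Unsupported Profile - Don't play"

    # Upper part of the table: any profile with AAC other than HE AAC.
    if not decoder_supports_he_aac:
        return "Play AAC"

    if ext_aot_is_sbr:
        # Signaling 2.B: the flag states explicitly whether SBR data exists.
        sbr_available = sbr_present_flag == 1
    else:
        # Signaling 1: implicit, so the payload itself has to be inspected.
        sbr_available = payload_has_sbr

    if sbr_available:
        return "Play at least AAC, should play AAC+SBR"
    return "Play AAC"
```

For example, an AAC Profile stream with implicit signaling and SBR data in the payload gives "Play AAC" on an AAC-only decoder and "Play at least AAC, should play AAC+SBR" on a decoder supporting the HE AAC Profile.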

1.6.5.3 HE AAC Profile Decoder Behavior in Case of Implicit Signaling
If the presence of SBR data is implicitly signaled in a backward compatible way (signaling 1 in the list above),
the extensionAudioObjectType is not the SBR AOT and the sbrPresentFlag is set to –1, indicating that implicit
signaling may occur.
Since the HE AAC Profile decoder is a dual rate system, with the SBR Tool operating at twice the sample rate
of the underlying AAC decoder, the output sample rate cannot be assumed to be that of the AAC decoder just
because SBR is not explicitly signaled. The decoder shall determine the output sample rate by either of the
following two methods:
• Check for the presence of SBR data in the bitstream prior to decoding. If no SBR data is found, the
output sample rate is equal to that signaled as samplingFrequency in
...
