Information technology — Coding of audio-visual objects — Part 3: Audio — Amendment 3: Scalable Lossless Coding (SLS)

Technologies de l'information — Codage des objets audiovisuels — Partie 3: Codage audio — Amendement 3: Codage extensible sans perte (SLS)

General Information

Status
Withdrawn
Publication Date
08-Jun-2006
Withdrawal Date
08-Jun-2006
Current Stage
9599 - Withdrawal of International Standard
Completion Date
26-Aug-2009
Standard
ISO/IEC 14496-3:2005/Amd 3:2006 - Scalable Lossless Coding (SLS)
English language
73 pages

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 14496-3
Third edition, 2005-12-01
AMENDMENT 3, 2006-06-01

Information technology — Coding of
audio-visual objects —
Part 3:
Audio
AMENDMENT 3: Scalable Lossless Coding
(SLS)
Reference number: ISO/IEC 14496-3:2005/Amd.3:2006(E)
© ISO/IEC 2006



©  ISO/IEC 2006
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 3 to ISO/IEC 14496-3:2005 was prepared by Joint Technical Committee
ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and
hypermedia information.
This Amendment specifies Audio Scalable Lossless Coding (SLS).

Information technology — Coding of audio-visual objects —
Part 3:
Audio
AMENDMENT 3: Scalable Lossless Coding (SLS)
In ISO/IEC 14496-3, Introduction, add the following to the end of the subclause "MPEG-4 general audio
coding tools":
MPEG-4 SLS (Scalable Lossless Coding) is a tool used in combination with optional MPEG-4 General Audio
coding tools to provide fine-grain scalable to numerically lossless coding of digital audio waveforms.

In Part 3: Audio, Subpart 1, in subclause 1.3 Terms and Definitions, add:
SLS: Audio Scalable to Lossless Coding
and increase the index-number of subsequent entries.

In Part 3: Audio, Subpart 1, in subclause 1.5.1.1 Audio object type definition, amend table 1.1 with the updates
in the table below:


Columns: Audio Object Type; Tools/Modules (..., Error Mapping (*), Integer TNS (*), Integer M/S (*), IntMDCT (*), BPGC/CBAC/LEMC (*)); Remark; Object Type ID

(escape)        X            31
...
SLS             X X X X X    37
SLS non-core    X X          38
...
Note: (*) marks new columns

In Part 3: Audio, Subpart 1, subclause 1.4 (Symbols and Abbreviations) add the following subclause:
1.4.9 Arithmetic data types
INT32 32 bit signed integer using two’s complement
INT64 64 bit signed integer using two’s complement

In Part 3: Audio, Subpart 1, subclause 1.5 add the following subclauses:
1.5.1.2.31 SLS object type
The SLS object is supported by the scalable to lossless tool which provides fine-grain scalable to lossless
enhancement of MPEG perceptual audio codecs, such as AAC, allowing multiple enhancement steps from the
audio quality of the core codec up to near-lossless and lossless signal representation. It also provides stand-
alone lossless audio coding when the core audio codec is omitted.

1.5.1.2.32 SLS Non-Core object type
The SLS non-core object is supported by the scalable to lossless tool. It is similar to the SLS object type but
the core audio codec is omitted.

In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 AudioSpecificConfig, amend table 1.8 with the updates in the
table below:
Syntax No. of bits Mnemonic
AudioSpecificConfig ()
{

switch (audioObjectType) {
case 37:
case 38:
 SLSSpecificConfig();
 break;

}

}

In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 add the following subclause:
1.6.2.1.13 SLSSpecificConfig
Defined in ISO/IEC 14496-3 subpart 12.

In Part 3: Audio, Subpart 1, in subclause 1.6.2.2.1 Overview, add the following to table 1.14:
Audio Object Type   Object Type ID   Definition of elementary stream   Mapping of audio payloads to
                                     payloads and detailed syntax      access units and elementary streams

SLS                 37               ISO/IEC 14496-3 subpart 12
SLS non-core        38               ISO/IEC 14496-3 subpart 12
Create Part 3: Audio, Subpart 12:
Subpart 12: Technical description of scalable lossless coding
12.1 Scope
This subpart of ISO/IEC 14496-3 describes the MPEG-4 scalable lossless coding algorithm for audio signals.
This description partially relies on the specification as given in subpart 4.
12.2 Terms and definitions
12.2.1 Definitions
The following definitions are used in this subpart.
Core Layer The MPEG-4 GA T/F coder used as the first layer in SLS. The audio object
types AAC LC, AAC Scalable (without LTP), ER AAC LC, ER AAC Scalable
and ER BSAC are supported.
LLE Layer Lossless enhancement layer used in SLS to enhance the quality of the core
layer towards lossless coding.
Bit-Plane Position of a specific bit in a binary data word, starting with 0 as the position of
the least significant bit (LSB). For example, the binary bit-plane symbols from
bit-planes 0, 1, 2, and 3 of the data word 0011 1101 in binary (0x3d) are 1, 0, 1, and 1
respectively.
BPGC Bit-Plane Golomb Code
CBAC Context Based Arithmetic Code
LEMC Low Energy Mode Code
Implicit Band A scale factor band for which the quantized spectral data presented in the
core layer bit-stream will be used in determining part of the necessary side
information for the LLE layer.
Explicit Band A scale factor band for which the quantized spectral data presented in the
core layer bit-stream will not be used in determining the necessary side
information for the LLE layer. All the side information will be coded explicitly
in the LLE payload.
Oversampling Factor (osf) Ratio between sampling rates of LLE Layer and Core Layer, possible values
are 1, 2 and 4.
Oversampling Range High frequency range covered only by the LLE Layer. It comprises
(osf-1)*1024 frequency values per long window or (osf-1)*128 frequency values per short window.
Reserved All fields labelled Reserved are reserved for future standardization. All
Reserved fields must be set to zero.
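
The Bit-Plane definition above can be checked with a one-line helper. This is a minimal sketch; the function name bit_plane_symbol is illustrative and does not appear in the standard.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the binary symbol of `word` in bit-plane `bp`,
 * where bit-plane 0 is the least significant bit.
 * Illustrative helper, not defined by the standard. */
static unsigned bit_plane_symbol(uint32_t word, unsigned bp)
{
    assert(bp < 32);
    return (unsigned)((word >> bp) & 1u);
}
```

For the data word 0x3d used in the definition, bit-planes 0 to 3 yield the symbols 1, 0, 1 and 1.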
12.2.2 Notations
In order to make the description stringent, the following notations are used in this subpart:
• Vectors are indicated by bold lower-case names, e.g. vector.
• Matrices (and vectors of vectors) are indicated by bold upper-case single letter names, e.g. M.
• Variables are indicated by italics, e.g. variable.
• Functions are indicated as func(x)
12.2.3 Definitions
DIV(m,n) Integer division with truncation of the result of m/n to an integer value towards −∞.
⌊x⌋ The floor operation. Returns the largest integer that is less than or equal to the real-valued
argument x.
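
Note that the C operator `/` truncates towards zero, so DIV(m,n) needs a correction step when the remainder is negative. The following is a minimal sketch, assuming a positive divisor; the name floor_div is illustrative and not part of the standard.

```c
#include <assert.h>
#include <stdint.h>

/* DIV(m, n): integer division rounded towards minus infinity.
 * Illustrative helper; assumes n > 0. */
static int32_t floor_div(int32_t m, int32_t n)
{
    int32_t q;
    assert(n > 0);
    q = m / n;              /* C truncates towards zero */
    if ((m % n) < 0) {
        q -= 1;             /* step down when the remainder is negative */
    }
    return q;
}
```

With this helper, DIV(-7,4) gives -2, whereas plain C division gives -1.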
12.3 Payloads for the audio object
Table 12.1 – Syntax of SLSSpecificConfig
Syntax No. of bits Mnemonics
SLSSpecificConfig(samplingFrequencyIndex,
                  channelConfiguration,
                  audioObjectType)
{
    pcmWordLength;                  3    uimsbf
    aac_core_present;               1    uimsbf
    lle_main_stream;                1    uimsbf
    reserved_bit;                   1    uimsbf
    frameLength;                    3    uimsbf
    if (!channelConfiguration) {
        program_config_element();
    }
}
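
The fixed part of Table 12.1 occupies 9 bits. The sketch below reads those five fields with a small MSB-first bit reader. The BitReader type, read_uimsbf and parse_sls_specific_config are illustrative names that do not come from the standard, and parsing of program_config_element() for channelConfiguration == 0 is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader (illustrative, not part of the standard). */
typedef struct {
    const uint8_t *data;
    size_t size_bits;
    size_t pos_bits;
} BitReader;

/* Reads n bits as an unsigned integer, most significant bit first.
 * Bits past the end of the buffer are read as 0. */
static uint32_t read_uimsbf(BitReader *br, unsigned n)
{
    uint32_t v = 0;
    assert(n <= 32);
    while (n--) {
        uint32_t bit = 0;
        if (br->pos_bits < br->size_bits) {
            bit = (br->data[br->pos_bits >> 3] >> (7u - (br->pos_bits & 7u))) & 1u;
            br->pos_bits++;
        }
        v = (v << 1) | bit;
    }
    return v;
}

typedef struct {
    unsigned pcmWordLength;     /* 3 bits */
    unsigned aac_core_present;  /* 1 bit  */
    unsigned lle_main_stream;   /* 1 bit  */
    unsigned reserved_bit;      /* 1 bit  */
    unsigned frameLength;       /* 3 bits */
} SLSSpecificConfig;

/* Reads the five fixed fields in the order given by Table 12.1. */
static SLSSpecificConfig parse_sls_specific_config(BitReader *br)
{
    SLSSpecificConfig c;
    c.pcmWordLength    = read_uimsbf(br, 3);
    c.aac_core_present = read_uimsbf(br, 1);
    c.lle_main_stream  = read_uimsbf(br, 1);
    c.reserved_bit     = read_uimsbf(br, 1);
    c.frameLength      = read_uimsbf(br, 3);
    return c;
}
```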

Table 12.2 – Top layer payload for lle stream
Syntax No. of bits Mnemonics
lle_element()
{
 for (ch=0; ch < …; ) {
  if (is_channel_pair(ch)) {
   lle_channel_pair_element();
   ch += 2;
  } else {
   lle_single_channel_element();
   ch++;
  }
 }
}

Table 12.3 – Syntax of lle_single_channel_element
Syntax No. of bits Mnemonics
lle_single_channel_element()
{
lle_individual_channel_stream(1);
}

Table 12.4 – Syntax of lle_channel_pair_element
Syntax No. of bits Mnemonics
lle_channel_pair_element()
{
lle_individual_channel_stream(1);
lle_individual_channel_stream(0);
}

Table 12.5 – Syntax of lle_individual_channel_stream
Syntax No. of bits Mnemonics
lle_individual_channel_stream(is_first_channel)
{
lle_ics_length; 16 uimsbf
if (is_first_channel) {
 element_instance_tag; 4 uimsbf
}
lle_reserved_bit; 1 uimsbf
if (lle_main_stream) {
 lle_header(is_first_channel);
 lle_side_info();
}
lle_data();
byte_align();
}

Table 12.6 – Syntax of lle_header()
Syntax No. of bits Mnemonics
lle_header(is_first_channel)
{
    if (lle_channel_pair_element && common_window &&
        is_first_channel) {
        use_stereo_intmdct;                      1    uimsbf
    }
    if (aac_core_present) {
        band_type_signaling;                     2    uimsbf
        if (band_type_signaling==1) {
            for (g=0; g < …; g++) {
                for (sfb=0; sfb < …; sfb++) {
                    band_type[g][sfb];           1    uimsbf
                }
            }
        }
    } else {
        if (is_first_channel) {
            window_sequence;                     2    uimsbf
        }
    }
}

Table 12.7 – Syntax of lle_side_info
Syntax No. of bits Mnemonics
lle_side_info()
{
    for (g=0; g < …; g++) {
        for (sfb=0; sfb < …; sfb++) {
            if (band_type[g][sfb]==Explicit_Band) {
                vcod_dpcm_max_bp[g][sfb];        1..17    bslbf
            }
            if (max_bp[g][sfb] != -1) {
                vcod_lazy_bp[g][sfb];            1..2     bslbf
            }
        }
    }
    cb_cbac;                                     1        uimsbf
}

Table 12.8 – Syntax of lle_data
Syntax No. of bits Mnemonics
lle_data()
{
BPGC/CBAC data; varies bslbf
LEMC data; varies bslbf
}

12.4 Semantics
Data elements:
aac_core_present Indicates whether the lossless enhancement operates on top of an MPEG-4
GA T/F core (aac_core_present=1) or in non-core mode
(aac_core_present=0).
lle_main_stream Indicates whether the current stream represents an LLE main stream
including all the necessary side information or an LLE extension stream that
extends the previous LLE stream.
pcmWordLength Quantization word length of the original PCM waveform.
Table 12.9 – Word length of original PCM waveform
pcmWordLength   Word length of original PCM waveform
0               8
1               16
2               20
3               24
4-7             Reserved

frameLength Length of the IntMDCT frame in the LLE layer.
Table 12.10 – Length of the IntMDCT frame
frameLength   Length of the IntMDCT frame   Oversampling factor of the IntMDCT filterbank (osf)
0             1024                          1
1             2048                          2
2             4096                          4
3-7           Reserved                      Reserved
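
Tables 12.9 and 12.10 translate directly into small lookup functions. The sketch below is one possible mapping; the function names are illustrative, and returning -1 for reserved codes is an assumption of this example rather than behaviour required by the standard.

```c
#include <assert.h>

/* Table 12.9: PCM word length in bits, or -1 for reserved codes 4..7. */
static int sls_pcm_word_length(unsigned pcmWordLength)
{
    static const int bits[4] = {8, 16, 20, 24};
    assert(pcmWordLength <= 7);     /* 3-bit field */
    return pcmWordLength < 4 ? bits[pcmWordLength] : -1;
}

/* Table 12.10: oversampling factor (1, 2 or 4), or -1 for reserved codes 3..7. */
static int sls_osf(unsigned frameLength)
{
    assert(frameLength <= 7);       /* 3-bit field */
    return frameLength < 3 ? (1 << frameLength) : -1;
}

/* Table 12.10: IntMDCT frame length, i.e. osf * 1024, or -1 if reserved. */
static int sls_frame_length(unsigned frameLength)
{
    int osf = sls_osf(frameLength);
    return osf > 0 ? osf * 1024 : -1;
}
```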

element_instance_tag Unique instance tag for syntactic elements. All syntactic elements containing
instance tags may occur more than once, but must have a unique
element_instance_tag in each audio frame. When the MPEG-4 GA T/F core
is present, syntactic elements of SLS and MPEG-4 GA T/F from the same
audio channel use the same element_instance_tag.
lle_ics_length Length of LLE individual channel stream (LLE_ICS) for the current frame; in
bytes.
band_type_signaling By default, the band type for a scale factor band is defined as follows: A scale
factor band that is in a section coded with the zero codebook (ZERO_HCB),
Intensity Stereo (IS) coded, or Perceptual Noise Substitution (PNS) coded is
an Explicit_Band. Otherwise it is an Implicit_Band.
Scale factor bands above max_sfb and in the oversampling range are always
Explicit_Band.
This default band type can be overridden by band_type_signaling in the
following way:
Table 12.11 – Band type signaling
Value of band_type_signaling   Band type
00                             Use default
01                             Band type signaling for each sfb follows
10                             All sfb are Explicit_Band
11                             Reserved

band_type[g][sfb] Band type signaling for each scale factor band when
band_type_signaling==01. A scale factor band is set to Explicit_Band if
band_type[g][sfb] is 0.
Table 12.12 – Band type
Value   Band type
0       Explicit_Band
1       Default
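
The interaction between the default band type and band_type_signaling (Tables 12.11 and 12.12) can be sketched as follows. The enum values, the function names and the boolean inputs are illustrative assumptions; in a real decoder the inputs would be derived from the core layer section data.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { IMPLICIT_BAND = 0, EXPLICIT_BAND = 1 } BandType;

/* Default rule: bands coded with ZERO_HCB, IS or PNS, and bands above
 * max_sfb or in the oversampling range, are Explicit_Band. */
static BandType default_band_type(bool zero_hcb, bool intensity_stereo,
                                  bool pns, bool above_max_sfb_or_osf_range)
{
    if (zero_hcb || intensity_stereo || pns || above_max_sfb_or_osf_range) {
        return EXPLICIT_BAND;
    }
    return IMPLICIT_BAND;
}

/* Applies band_type_signaling (Table 12.11) on top of the default.
 * band_type_bit is band_type[g][sfb] and is only used when signaling == 1. */
static BandType resolve_band_type(unsigned band_type_signaling,
                                  unsigned band_type_bit, BandType dflt)
{
    assert(band_type_signaling <= 2);   /* value 3 is reserved */
    switch (band_type_signaling) {
    case 1:
        return band_type_bit == 0 ? EXPLICIT_BAND : dflt;
    case 2:
        return EXPLICIT_BAND;
    default:
        return dflt;
    }
}
```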

vcod_dpcm_max_bp[g][sfb] The variable length coded maximum bit-plane for scale factor band sfb and
group g.

vcod_lazy_bp[g][sfb] The variable length coded lazy bit-plane for non-zero scale factor band sfb
and group g.

cb_cbac Indication of frequency table that will be used in the LLE decoding process.

Table 12.13 – cb_cbac table
cb_cbac Frequency table
0 BPGC
1 CBAC

bpgc/cbac_data The binary bit-stream of the bpgc/cbac coded residual spectrum data

low_energy_mode_data The binary bit-stream of the LEMC mode coded residual spectrum data

12.5 SLS decoder tool
12.5.1 Overview
The block diagram of the scalable lossless (SLS) decoder is given in Figure 12.1. The core layer MPEG-4 GA
stream is decoded by a deterministic Core Layer decoder. Its output, which is a deterministic spectrum in the
MDCT domain, is sent to the inverse error mapping process. Meanwhile, the residual IntMDCT spectrum,
carried in the LLE layer streams, is decoded and sent to the inverse error mapping process to reconstruct the
IntMDCT spectrum. An inverse integer Mid/Side (M/S) and an inverse integer TNS process are then invoked
and performed on the IntMDCT coefficients if necessary. Finally, its output is inversely transformed by using
the inverse IntMDCT process to produce the PCM audio samples. A detailed description of each process is
given in the subsequent sections.


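[Block diagram: the bitstream payload parser splits the MPEG-4 SLS bitstream into the MPEG-4 GA stream, decoded by the deterministic MPEG-4 GA decoder, and the LLE stream, decoded by the BPGC/CBAC decoder and the low energy mode decoder. Both paths feed the inverse error mapping, followed by inverse integer M/S, inverse integer TNS and inverse IntMDCT, which produces the output PCM samples.]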

Figure 12.1 – SLS decoder block diagram
12.5.1.1 Non-core Mode
In the non-core mode SLS works as a stand-alone codec without AAC core. In case of the SLS audio object
type this is signalled by aac_core_present=0 for the non-core mode and aac_core_present=1 for the core-
based mode. In case of the SLS non-core audio object type it is always aac_core_present=0.
In the non-core mode the following default values are used:
• window_shape = 0 (sine window)
• if (lle_channel_pair_element) common_window = 1 (on)
• if (use_stereo_intmdct) all M/S flags are on, else all M/S flags are off
• if (window_sequence == EIGHT_SHORT_SEQUENCE) grouping = {2,2,2,2}


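[Block diagram: the bitstream payload parser passes the LLE stream of the SLS bitstream to the BPGC/CBAC decoder and the low energy mode decoder. The resulting osf*1024 spectral values feed the inverse IntMDCT, which produces the output PCM samples.]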

Figure 12.2 – SLS non-core decoder block diagram

12.5.2 Oversampling technique
The core layer is allowed to operate at a lower sampling rate than the LLE layers. The following table shows
some possible sampling rate combinations.
Table 12.14 – Example combinations of sampling rates for Core and LLE layers
                Core @ 48 kHz   Core @ 96 kHz   Core @ 192 kHz
LLE @ 48 kHz    X (osf = 1)
LLE @ 96 kHz    X (osf = 2)     X (osf = 1)
LLE @ 192 kHz   X (osf = 4)     X (osf = 2)     X (osf = 1)

This technique is referred to as “Oversampling” in the following.
The scalability of the codec using different sampling rates is achieved by changing the length of the inverse
IntMDCT in the decoder accordingly. While the AAC core processes 1024 values in each frame, the SLS
codec needs to process osf*1024 values per frame. This is achieved by extending the length of the inverse
IntMDCT in the decoder to osf*1024 spectral lines. The 1024 inverse quantized spectral values from the AAC
core are added to the 1024 low-frequency values of the SLS residual spectrum. This is illustrated in Figure
12.3.

[Block diagram: as Figure 12.1, with the deterministic MPEG-4 GA decoder delivering 1024 AAC spectral values that are added to the LLE path, while the BPGC/CBAC decoder, low energy mode decoder, inverse error mapping, inverse integer M/S, inverse integer TNS and inverse IntMDCT each operate on osf*1024 values.]

Figure 12.3 – Structure of SLS decoder with oversampling
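
The combination of core and residual spectra described in this subclause can be sketched as below for long windows. This is only an illustration of the "1024 low-frequency values" statement: it treats the combination as a plain addition, as the text above does, and does not reproduce the inverse error mapping specified elsewhere in this subpart. The function name and the in-place buffer layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CORE_LINES = 1024 };

/* Adds the 1024 core values to the low-frequency part of a residual
 * spectrum holding osf*1024 lines (in place). The oversampling range
 * above line 1023 is left untouched. Returns the total number of lines. */
static size_t add_core_spectrum(int32_t *residual, unsigned osf,
                                const int32_t *core)
{
    size_t k;
    assert(osf == 1 || osf == 2 || osf == 4);
    for (k = 0; k < CORE_LINES; k++) {
        residual[k] += core[k];
    }
    return (size_t)osf * CORE_LINES;
}
```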

12.5.3 SLS with Scalable AAC Core
If the core layer is AAC Scalable, the spectral data decoded from the SLS layers are added to the spectral
data decoded from the AAC Scalable streams with a deterministic inverse AAC quantizer. The resulting
spectral data is then processed with inverse integer M/S and inverse integer TNS process if necessary.
Finally, the output is transformed by the inverse IntMDCT to produce the PCM audio samples. The decoding
process is illustrated in Figure 12.4.


[Block diagram: the bitstream demultiplexer feeds deterministic Scalable AAC inverse quantization (SIAQ) for the mono, left/mid and right/side layers (one or more layers each) together with the SLS left/mid and right/side layers (one or more layers each). The signal path continues through FSS, integer inverse M/S, integer inverse TNS and integer inverse MDCT blocks to the left and right output time signals.]

Figure 12.4 – Structure of SLS decoder with Scalable AAC core layer streams


12.5.4 Decoding of lle_single_channel_element (LLE_SCE) and lle_channel_pair_element (LLE_CPE)
12.5.4.1 Definitions
lle_ics_length Length of LLE individual channel stream (LLE_ICS) in bytes.
vcod_dpcm_max_bp[g][sfb] The variable length coded maximum bit-plane for scale factor band sfb
and group g. This element is only present for insignificant scale factor
bands.
vcod_lazy_bp[g][sfb] The variable length coded lazy bit-plane for non-zero scale factor band
sfb and group g.
g Group index.
sfb Scale factor band within group.
win Window index.
bin Frequency bin index.
num_window_groups Number of groups of windows which share one set of scale factors.
num_sfb Number of scale factor bands per short window in case of
EIGHT_SHORT_SEQUENCE, number of scale factor bands for long
windows otherwise.
num_osf_sfb Number of scale factor bands per window in the oversampling range. The
oversampling range is covered by (osf-1)*16 bands with a width of 64 in
case of long windows, or by (osf-1)*4 bands with a width of 32 in case of
short windows.
max_bp[g][sfb] The maximum bit-plane for group g and scale factor band sfb.
lazy_bp[g][sfb] The lazy bit-plane for group g and scale factor band sfb.
read_bits(n) Read n consecutive bits from the input bit-stream in the order of bslbf.
quant[g][win][sfb][bin] AAC quantized spectral data.
interval[g][win][sfb][k] Quantization intervals in the core AAC encoder.
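
The band counts given for num_osf_sfb can be cross-checked against the size of the oversampling range. A minimal sketch, with illustrative function names:

```c
#include <assert.h>
#include <stdbool.h>

/* Number of scale factor bands per window in the oversampling range:
 * (osf-1)*16 for long windows, (osf-1)*4 for short windows. */
static int num_osf_sfb(int osf, bool short_window)
{
    assert(osf == 1 || osf == 2 || osf == 4);
    return (osf - 1) * (short_window ? 4 : 16);
}

/* Number of frequency values per window covered by those bands
 * (band width 64 for long windows, 32 for short windows). */
static int osf_range_bins(int osf, bool short_window)
{
    return num_osf_sfb(osf, short_window) * (short_window ? 32 : 64);
}
```

The results agree with the (osf-1)*1024 and (osf-1)*128 values stated in the definition of the oversampling range.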

12.5.4.2 Decoding process
12.5.4.2.1 LLE_SCE and LLE_CPE
An LLE_SCE is composed of an lle_individual_channel_stream (LLE_ICS) while an LLE_CPE has two
lle_individual_channel_streams (LLE_ICS).
12.5.4.2.2 Decoding an LLE_ICS
In the LLE_ICS, the order of the decoding process is given in the following flowchart:

[Flowchart: Get lle_ics_length, then Get LLE decoding side information, then Get BPGC/CBAC data, then Get LEMC data.]

Figure 12.5 – Process of decoding LLE_ICS

For an SLS bit-stream composed of an lle_main stream (lle_main_stream = 1) and one or more lle_extension
streams (lle_main_stream = 0), the lle_data() of each LLE_ICS is constructed by concatenating the lle_data()
elements from the lle_main stream and all the available lle_extension streams in sequence, as shown in the
following figure:


[Diagram: an LLE_ICS of length lle_ics_length, consisting of the LLE decoding side information followed by the lle_data() of the lle_main stream and the lle_data() of lle_extension layers 1 to N.]

Figure 12.6 – Construction of LLE_ICS from multiple LLE streams

If an intermediate lle_extension stream is missing, the data in lle_data() of the subsequent streams
cannot be used.
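
The concatenation rule, including the cut-off at the first missing extension stream, can be sketched as follows. The LleChunk type and concat_lle_data are illustrative names, and the example assumes that each lle_data() element is available as a whole number of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* lle_data() bytes of one stream; data == NULL marks a missing stream.
 * Illustrative type, not part of the standard. */
typedef struct {
    const uint8_t *data;
    size_t size;
} LleChunk;

/* chunks[0] is the lle_main stream, chunks[1..n-1] are the extension
 * layers in order. Copies data until the first missing stream or until
 * `out` is full. Returns the number of bytes written. */
static size_t concat_lle_data(const LleChunk *chunks, size_t n,
                              uint8_t *out, size_t out_cap)
{
    size_t used = 0;
    size_t i;
    assert(chunks != NULL && out != NULL);
    for (i = 0; i < n && used < out_cap; i++) {
        size_t n_copy = chunks[i].size;
        if (chunks[i].data == NULL) {
            break;                  /* later layers cannot be used */
        }
        if (n_copy > out_cap - used) {
            n_copy = out_cap - used;
        }
        memcpy(out + used, chunks[i].data, n_copy);
        used += n_copy;
    }
    return used;
}
```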
12.5.4.2.3 Recovering BPGC/CBAC side information
For each scale factor band of band type Explicit_Band, a maximum bit-plane (max_bp) is transmitted. In
addition, for each scale factor band, a lazy bit-plane (lazy_bp) is transmitted unless the residual spectral data
is all zero for this scale factor band (which is signalled by max_bp = -1). The max_bp is coded
using variable length coded DPCM relative to the previously transmitted maximum bit-plane. The first value in
each window group is coded using 5-bit PCM. The DPCM value of max_bp is coded in unary representation,
followed by a sign bit (s) when the value is non-zero. The following table gives some examples of how the
DPCM value of max_bp is coded.
Table 12.15 – Codeword for decoding the DPCM value of max_bp
DPCM max_bp   Codeword         Codeword length
0             1                1
(s)1          01(s)            3
(s)2          001(s)           4
…             …                …
(s)10         00000000001(s)   12
… … …

The difference between max_bp and lazy_bp, whose value is within the range {1, 2, 3} is decoded as follows:
Table 12.16 – Codeword for decoding the difference between max_bp and lazy_bp
max_bp - lazy_bp   Codeword   Codeword length
1                  10         2
2                  0          1
3                  11         2

The following pseudo code illustrates the decoding process for max_bp and lazy_bp.
for (g = 0; g < num_window_groups; g++) {
  init = 0;
  for (sfb = 0; sfb < …; sfb++) {
    if (band_type[g][sfb] == Explicit_Band) {
      if (!init) {
        /* first Explicit_Band of the group: 5-bit PCM, offset by 1 */
        max_bp[g][sfb] = read_bits(5) - 1;
        init++;
      }
      else {
        /* unary magnitude terminated by a '1' bit, then a sign bit if non-zero */
        m = 0;
        while (read_bits(1) == 0) m++;
        if (m) {
          if (read_bits(1)) m = -m;
        }
        max_bp[g][sfb] = m0 - m;
      }
      m0 = max_bp[g][sfb];
    }
    if (max_bp[g][sfb] >= 0) {
      /* max_bp - lazy_bp: '0' means 2, '10' means 1, '11' means 3 */
      if (read_bits(1) == 0)
        lazy_bp[g][sfb] = max_bp[g][sfb] - 2;
      else {
        if (read_bits(1) == 0) lazy_bp[g][sfb] = max_bp[g][sfb] - 1;
        else lazy_bp[g][sfb] = max_bp[g][sfb] - 3;
      }
    }
  }
}
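
A runnable sketch of this side information decoding for a single window group is given below. The BitReader type, the function names and the explicit_band array are illustrative assumptions; the number of scale factor bands is passed in by the caller, and max_bp of Implicit_Bands is assumed to be filled in beforehand. Unlike the pseudo code, the unary loop stops when the input is exhausted, because this read_bits() returns 0 once no bits are left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t size_bits;
    size_t pos_bits;
} BitReader;

/* Reads n bits MSB first; returns 0 bits once the input is exhausted. */
static unsigned read_bits(BitReader *br, unsigned n)
{
    unsigned v = 0;
    assert(n <= 16);
    while (n--) {
        unsigned bit = 0;
        if (br->pos_bits < br->size_bits) {
            bit = (br->data[br->pos_bits >> 3] >> (7u - (br->pos_bits & 7u))) & 1u;
            br->pos_bits++;
        }
        v = (v << 1) | bit;
    }
    return v;
}

/* Decodes max_bp and lazy_bp for the n_sfb bands of one window group. */
static void decode_group_side_info(BitReader *br, size_t n_sfb,
                                   const bool *explicit_band,
                                   int *max_bp, int *lazy_bp)
{
    bool init = false;
    int m0 = 0;
    size_t sfb;
    for (sfb = 0; sfb < n_sfb; sfb++) {
        if (explicit_band[sfb]) {
            if (!init) {
                max_bp[sfb] = (int)read_bits(br, 5) - 1;   /* 5-bit PCM */
                init = true;
            } else {
                int m = 0;
                /* unary magnitude; guard against an exhausted stream */
                while (br->pos_bits < br->size_bits && read_bits(br, 1) == 0) {
                    m++;
                }
                if (m != 0 && read_bits(br, 1)) {
                    m = -m;                                /* sign bit */
                }
                max_bp[sfb] = m0 - m;
            }
            m0 = max_bp[sfb];
        }
        if (max_bp[sfb] >= 0) {
            if (read_bits(br, 1) == 0) {
                lazy_bp[sfb] = max_bp[sfb] - 2;
            } else if (read_bits(br, 1) == 0) {
                lazy_bp[sfb] = max_bp[sfb] - 1;
            } else {
                lazy_bp[sfb] = max_bp[sfb] - 3;
            }
        }
    }
}
```

As a usage example, the 14 bits 00110 0 010 10 1 11 (bytes 0x31 0x5C) describe three explicit bands with max_bp = 5, 4, 4 and lazy_bp = 3, 3, 1.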

For Implicit_Bands, max_bp[g][sfb] is calculated from the quantization thresholds of the core layer quantizer
as follows:
As the first step, the maximum bit-plane M for each residual spectral bin for significant scale factor bands can
be calculated from
M[g][win][sfb][bin] = INT[ log2{ interval[g][win][sfb][bin] } ]
where interval[g][win][sfb][bin] is the quantization interval that is given by:
interval[g][win][sfb][bin] = thr(quant[g][win][sfb][bin] + 1) − thr(quant[g][win][sfb][bin]) + 1.
Here thr(x) and inv_quant(x) are, respectively, the deterministic quantization threshold and the corresponding
deterministic inverse quantization for AAC quantizer. They are calculated as in the following pseudo code:
if (x == 0)
    thr(x) = 0;
else
    thr(x) = (thrMantissa(|x|-1, scale_res)) << (12 + scale_int);

inv_quant(x) = (invQuantMantissa(|x|, scale_res)) << (12 + scale_int);

where
scale_int = DIV(scale, 4),
scale_res = scale - scale_int*4, and
scale = scale_factor(sfb) + core_scaling_factor + scale_osf - 118.
The value of core_scaling_factor is given in Table 12.17.
Table 12.17 – Table for core_scaling_factor
sfb Type                        Word Length
                                16    20    24
Long Window (2048), M/S          0    16    32
Long Window (2048), non M/S      2    18    34
Short Window (256), M/S          6    22    38
Short Window (256), non M/S      8    24    40

Table 12.18 – Table for scale_osf
osf 1 2 4
scale_osf 0 2 4
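
The derivation of scale, scale_int and scale_res from Tables 12.17 and 12.18 can be sketched as follows. The function and type names are illustrative. The closed-form expression in core_scaling_factor() is an observation about the table entries made for this example, not a formula given by the standard.

```c
#include <assert.h>
#include <stdbool.h>

/* Table 12.17, reproduced arithmetically: the entries grow by 16 per
 * 4 bits of word length, by 6 for short windows and by 2 for non-M/S. */
static int core_scaling_factor(int word_length, bool short_window, bool ms)
{
    int v;
    assert(word_length == 16 || word_length == 20 || word_length == 24);
    v = (word_length - 16) * 4;
    if (short_window) v += 6;
    if (!ms) v += 2;
    return v;
}

/* Table 12.18: scale_osf is 0, 2, 4 for osf = 1, 2, 4. */
static int scale_osf(int osf)
{
    assert(osf == 1 || osf == 2 || osf == 4);
    return osf == 1 ? 0 : osf;
}

typedef struct {
    int scale_int;
    int scale_res;
} ScaleSplit;

/* scale = scale_factor + core_scaling_factor + scale_osf - 118, split
 * into scale_int = DIV(scale, 4) and the remainder scale_res (0..3). */
static ScaleSplit split_scale(int scale_factor, int core_scaling, int osf_scale)
{
    ScaleSplit s;
    int scale = scale_factor + core_scaling + osf_scale - 118;
    /* floor division by 4; C's '/' would truncate towards zero */
    s.scale_int = scale >= 0 ? scale / 4 : -((-scale + 3) / 4);
    s.scale_res = scale - s.scale_int * 4;
    return s;
}
```

Because DIV rounds towards −∞, scale_res always lies in the range 0 to 3, also for negative values of scale.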

The functions thrMantissa() and invQuantMantissa() are defined in 12.5.11.
For scalefactor bands coded with IS or PNS the value of inv_quant(x) is set to 0.
The maximum bit-plane max_bp for each sfb is the maximum value of M for spectral data that belongs to that
sfb:
max_bp[g][sfb] = max( M[g][win][sfb][bin] )
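
The maximum over all bins of a scale factor band can be sketched as below. This example assumes that INT denotes truncation to an integer, so that M equals floor(log2(interval)) for interval >= 1; that reading, like the function names and the flat interval array, is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* floor(log2(v)) for v >= 1, computed with integer shifts. */
static int ilog2_floor(int64_t v)
{
    int n = -1;
    assert(v >= 1);
    while (v > 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* max_bp of one Implicit_Band: the largest M over the `count`
 * quantization intervals that belong to the band. */
static int implicit_max_bp(const int64_t *interval, size_t count)
{
    int best;
    size_t i;
    assert(interval != NULL && count > 0);
    best = ilog2_floor(interval[0]);
    for (i = 1; i < count; i++) {
        int m = ilog2_floor(interval[i]);
        if (m > best) {
            best = m;
        }
    }
    return best;
}
```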

12.5.5 Decoding of lle_data
12.5.5.1 Definitions
lle_data() Part of the bit-stream which contains the coded residual spectrum data.
window_group_len[g] Number of windows in each group.
is_lle_ics_eof() An auxiliary function to detect the end of LLE_ICS.
read_bits(n) Read n consecutive bits from the input bit-stream in the order of bslbf. If there
exists no bit to be fed in the bitstream, it returns ‘0’ by default.
cur_bp[g][sfb] The current decoded bit-plane.
res[g][win][sfb][k] The reconstructed residual integer spectral data vector.
amp[g][win][sfb][k] The amplitude of the reconstructed residual integer spectral data vector.
sign[g][win][sfb][k] The sign of the reconstructed residual integer spectral data vector.
determine_frequency() The function to determine the probability of the symbol '1' according to either
the CBAC or the BPGC frequency table.
ambiguity_check(f) The function to detect ambiguity for the arithmetic decoding. The argument f
indicates the probability of the symbol '1'.
terminate_decoding() The function to terminate decoding of the LLE data when ambiguity occurs.
smart_decoding_cbac_bpgc() The function to decode additional symbols in the absence of incoming bits in
the cbac/bpgc mode decoding. This decoding continues up to the point where
there exists no ambiguity. It includes ambiguity_check(f) and
terminate_decoding().
smart_decoding_low_energy() The function to decode additional symbols in the absence of incoming bits in
the low energy mode decoding. It also includes ambiguity_check(f) and
terminate_decoding().
12.5.5.2 Decoding process
12.5.5.2.1 Overview
The residual integer spectral data vector is decoded from the LLE data stream lle_data(). Firstly, all scale
factor bands with lazy_bp > 0 are BPGC/CBAC decoded, where the amplitude of the residual spectral data
res is bit-plane decoded starting from the maximum bit-plane max_bp and progressing to lower bit-planes until
bit-plane 0 for each scale factor band. Subsequently, the low energy mode decoding is invoked to decode the
remaining scale factor bands with lazy_bp <= 0.
The SLS decoder provides fine-grain scalability (FGS) by truncating the LLE bitstream.
Moreover, additional symbols can be decoded beyond the point of truncation by exploiting the properties of
arithmetic coding.

12.5.5.2.2 BPGC/CBAC decoding process
The BPGC decoding or CBAC decoding process is performed on scale factor bands for which lazy_bp>0. The
BPGC/CBAC bit-plane decoding process is used to decode the
...
