ISO/IEC 14496-3:2001/Amd 6:2005
(Amendment)Information technology — Coding of audio-visual objects — Part 3: Audio — Amendment 6: Lossless coding of oversampled audio
Technologies de l'information — Codage des objets audiovisuels — Partie 3: Codage audio — Amendement 6: Codage sans perte d'audio suréchantillonné
INTERNATIONAL STANDARD ISO/IEC 14496-3
Second edition
2001-12-15
AMENDMENT 6
2005-07-01
Information technology — Coding of
audio-visual objects —
Part 3:
Audio
AMENDMENT 6: Lossless coding of
oversampled audio
Technologies de l'information — Codage des objets audiovisuels —
Partie 3: Codage audio
AMENDEMENT 6: Codage sans perte d'audio suréchantillonné
Reference number
ISO/IEC 14496-3:2001/Amd.6:2005(E)
© ISO/IEC 2005
© ISO/IEC 2005
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 6 to ISO/IEC 14496-3:2001 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Introduction
This document specifies the sixth Amendment to the ISO/IEC 14496-3:2001 standard. It contains the text for
the Final Draft Amendment on lossless coding of 1-bit oversampled audio signals, including the DSD and
DST definitions as described in the Super Audio CD Specification Version 1.3. Note that in the context of
SA-CD, only an oversampling ratio of 64 is defined. This description also supports oversampling ratios
other than 64.
Information technology — Coding of audio-visual objects —
Part 3:
Audio
AMENDMENT 6: Lossless coding of oversampled audio
In ISO/IEC 14496-3:2001, Introduction, add:
MPEG-4 DST, Direct Stream Transfer for lossless coding of oversampled audio signals.
Amendment subpart 1
In Part 3: Audio, Subpart 1, in subclause 1.3 Terms and Definitions, add:
DST: Direct Stream Transfer
and increase the index-number of subsequent entries.
In Part 3: Audio, Subpart 1, in subclause 1.5.1.1 Audio object type definition, amend table 1.1 according to the
update in the table below:
Table 1.1 — Audio object definition
Audio Object Type                   Tools/Modules  Remark  Object Type ID
                                    SBR  SSC  DST
Null                                                       0
AAC main                                           2)      1
AAC LC                                                     2
AAC SSR                                                    3
AAC LTP                                            2)      4
SBR                                 X                      5
AAC Scalable                                       6)      6
TwinVQ                                                     7
CELP                                                       8
HVXC                                                       9
(Reserved)                                                 10
(Reserved)                                                 11
TTSI                                                       12
Main synthetic                                     3)      13
Wavetable synthesis                                4)      14
General MIDI                                               15
Algorithmic Synthesis and Audio FX                         16
ER AAC LC                                                  17
(Reserved)                                                 18
ER AAC LTP                                         5)      19
ER AAC scalable                                    6)      20
ER TwinVQ                                                  21
ER BSAC                                                    22
ER AAC LD                                                  23
ER CELP                                                    24
ER HVXC                                                    25
ER HILN                                                    26
ER Parametric                                              27
SSC                                      X                 28
(Reserved)                                                 29
(Reserved)                                                 30
(Reserved)                                                 31
(Reserved)                                                 32
(Reserved)                                                 33
DST                                           X            35
---------------------- Page: 6 ----------------------
ISO/IEC 14496-3:2001/Amd.6:2005(E)
In Part 3: Audio, Subpart 1, replace Table 1.2 (Audio Profiles definition) with the following table:
Table 1.2 — Audio Profiles definition
Columns, left to right: Audio Object Type; Main Audio Profile; Scalable Audio Profile; Speech Audio Profile;
Synthetic Audio Profile; High Quality Audio Profile; Low Delay Audio Profile; Natural Audio Profile;
Mobile Audio Internetworking Profile; AAC Profile; High Efficiency AAC Profile; Object Type ID
Null 0
AAC main X X 1
AAC LC X X X X X X 2
AAC SSR X X 3
AAC LTP X X X X 4
SBR X 5
AAC Scalable X X X X 6
TwinVQ X X X 7
CELP X X X X X X 8
HVXC X X X X X 9
(reserved) 10
(reserved) 11
TTSI X X X X X X 12
Main synthetic X X 13
Wavetable synthesis 14
General MIDI 15
Algorithmic Synthesis and Audio FX 16
ER AAC LC X X X 17
(reserved) 18
ER AAC LTP X X 19
ER AAC Scalable X X X 20
ER TwinVQ X X 21
ER BSAC X X 22
ER AAC LD X X X 23
ER CELP X X X 24
ER HVXC X X 25
ER HILN X 26
ER Parametric X 27
SSC 28
(reserved) 29
(reserved) 30
(reserved) 31
(reserved) 32
(reserved) 33
DST 35
In Part 3: Audio, Subpart 1, in subclause 1.6.2.1 AudioSpecificConfig, adapt Table 1.8 according to the
modification in the table below:
Table 1.8 — Syntax of AudioSpecificConfig()
Syntax No. of bits Mnemonic
AudioSpecificConfig ()
{
----
if ( audioObjectType == 26 || audioObjectType == 27)
ParametricSpecificConfig();
if ( audioObjectType == 35)
DSTSpecificConfig();
if ( audioObjectType == 17 || audioObjectType == 19 ||
audioObjectType == 20 || audioObjectType == 21 ||
audioObjectType == 22 || audioObjectType == 23 ||
audioObjectType == 24 || audioObjectType == 25 ||
audioObjectType == 26 || audioObjectType == 27 ) {
----
}
In Part 3: Audio, Subpart 1, in subclause 1.6.2.2.1 Overview, replace Table 1.9 by the following table:
Table 1.9 — Audio Object Types
Audio Object Type   Object Type ID   Definition of elementary stream payloads and detailed syntax   Mapping of audio payloads to access units and elementary streams
AAC MAIN 1 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
AAC LC 2 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
AAC SSR 3 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
AAC LTP 4 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.2
SBR 5 ISO/IEC 14496-3 subpart 4
AAC scalable 6 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.3
TwinVQ 7 ISO/IEC 14496-3 subpart 4
CELP 8 ISO/IEC 14496-3 subpart 3
HVXC 9 ISO/IEC 14496-3 subpart 2
TTSI 12 ISO/IEC 14496-3 subpart 6
Main synthetic 13 ISO/IEC 14496-3 subpart 5
Wavetable synthesis 14 ISO/IEC 14496-3 subpart 5
General MIDI 15 ISO/IEC 14496-3 subpart 5
Algorithmic Synthesis and Audio FX 16 ISO/IEC 14496-3 subpart 5
ER AAC LC 17 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
18
ER AAC LTP 19 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER AAC scalable 20 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER Twin VQ 21 ISO/IEC 14496-3 subpart 4
ER BSAC 22 ISO/IEC 14496-3 subpart 4
ER AAC LD 23 ISO/IEC 14496-3 subpart 4 see subclause 1.6.2.2.2.1.4
ER CELP 24 ISO/IEC 14496-3 subpart 3
ER HVXC 25 ISO/IEC 14496-3 subpart 2
ER HILN 26 ISO/IEC 14496-3 subpart 7
ER Parametric 27 ISO/IEC 14496-3 subpart 2 and 7
SSC 28 ISO/IEC 14496-3 subpart 8
…
DST 35 ISO/IEC 14496-3 subpart 10
Create Part 3: Audio, Subpart 10:
Subpart 10: Technical description of lossless coding of
oversampled audio
10.1 Scope
This part of ISO/IEC 14496 describes the MPEG-4 lossless coding algorithm for oversampled audio signals.
10.2 Terms and definitions
The following definitions are used in this document.
Audio Channel The stream of DSD bits intended for one loudspeaker.
Audio Frame A Frame containing Audio data.
Audio Channel Number The sequence number assigned to an Audio Channel. Audio Channel Numbers are
contiguously numbered starting with one.
Frame A block of data belonging to a certain Time Code. The playing time of a Frame is
1/75 Sec.
Reserved All fields labelled Reserved are reserved for future standardization. All Reserved
fields must be set to zero.
Silence Pattern A digitally generated DSD pattern with the following properties:
• All Audio Bytes (see 10.6.1.1) have the same value.
• Each Audio Byte must contain 4 bits equal to zero and 4 bits equal to one.
Direct Stream Digital A one bit oversampled representation of the audio signal.
Direct Stream Transfer The lossless coding technique used for DSD signals in Super Audio CD.
DSD See Direct Stream Digital.
DST See Direct Stream Transfer.
Half Probability Half Probability defines for each Audio Channel in an Audio Frame whether the first
DSD bits are arithmetically encoded using the Ptable values, or using a probability
equal to ½.
Mapping Mapping defines, for each Segment, the Prediction Filter and Probability Table that
is used.
Prediction Filter A Prediction Filter is a transversal filter used to predict the value of the next DSD bit.
A Prediction Filter is characterized by a prediction order and by coefficients.
Probability Table A Probability Table contains the probability that the value of a DSD bit is predicted
erroneously for a given output of the prediction filter.
Ptable See Probability Table.
Sampling Frequency The sampling frequency of the DSD signal shall be 64 * 44.1 kHz, 128 * 44.1 kHz or
256 * 44.1 kHz.
Segmentation Each Audio Channel in an Audio Frame can be partitioned into Segments.
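As an illustration only, the two properties of a Silence Pattern defined above can be checked with a few lines of Python. The function name is_silence_pattern is hypothetical and not part of this specification, and the treatment of an empty input is an assumption.

```python
def is_silence_pattern(audio_bytes: bytes) -> bool:
    """Check the two Silence Pattern properties on a run of Audio Bytes."""
    if not audio_bytes:
        return False  # assumption: an empty run is not treated as a pattern
    first = audio_bytes[0]
    # Property 1: all Audio Bytes have the same value.
    if any(b != first for b in audio_bytes):
        return False
    # Property 2: the byte holds 4 bits equal to one, hence 4 equal to zero.
    return bin(first).count("1") == 4
```

For example, a run of $69 bytes (binary 01101001) satisfies both properties, while a run of $FF bytes does not.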
10.3 Conventions
In this document the conventions as described in this subclause are used.
10.3.1 Arithmetic and bit operations
a>>b Right shift a over b bits. The new msb bits are set to ‘0’.
a<<b Left shift a over b bits. The new lsb bits are set to ‘0’.
a | b Bitwise OR of a and b.
a & b Bitwise AND of a and b.
min(a,b) Minimum value of a and b.
max(a,b) Maximum value of a and b.
a mod b Value of a modulo b.
trunc(a) Value of a, rounded downwards.
| a | Absolute value of a.
a==b Evaluate if a is equal to b.
a!=b Evaluate if a is not equal to b.
a=b Variable a is set to the value of b.
a++ a = a + 1.
a -= b a = a – b.
a += b a = a + b.
10.3.2 Bit ordering
The graphical representation of all multiple-bit quantities is such that the most significant bit (msb) is on the
left, and the least significant bit (lsb) is on the right. Figure 10.1 defines the bit position in a Byte.
msb lsb
b7 b6 b5 b4 b3 b2 b1 b0
Figure 10.1 — Bit ordering in a Byte
10.3.3 Bit sequence
In all places where a bit sequence is used, a most significant bit first notation is used.
10.3.4 Decimal notation
All Decimal values are preceded by a blank space or the range indicator (.) when included in a range. The
most significant digit is on the left, the least significant digit is on the right.
10.3.5 DSD bit order
The first sampled DSD bit is stored in the most significant bit of a byte. See subclause 10.6.1.1.
10.3.6 DSD Polarity
A DSD bit equal to one means "plus". A DSD bit equal to zero means "minus".
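The bit order and polarity conventions of 10.3.5 and 10.3.6 can be combined in a short sketch. The helper name dsd_byte_to_polarity and the use of +1 and -1 for "plus" and "minus" are illustrative assumptions.

```python
def dsd_byte_to_polarity(audio_byte: int) -> list:
    """Return 8 values in sampling order: +1 for a one bit, -1 for a zero bit."""
    if not 0 <= audio_byte <= 0xFF:
        raise ValueError("audio_byte must fit in one byte")
    # Bit b7 (msb) is the first sampled bit, b0 (lsb) the last.
    return [1 if (audio_byte >> shift) & 1 else -1 for shift in range(7, -1, -1)]
```

With this mapping, the byte $80 yields one "plus" followed by seven "minus" values.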
10.3.7 Hex notation
All Hexadecimal values are preceded by a $. The most significant nibble is on the left, the least significant
nibble is on the right.
10.3.8 Range
Constant_1.Constant_2 denotes the range from and including Constant_1 up to and including Constant_2, in
increments of 1.
10.3.9 Until
Until is used in figures to indicate that for a structure Byte Positions are used up to (not including) a given value.
At Byte Position B1, the expression “until B2” specifies B2-B1 bytes. At Byte Position B1, the expression “until
eos” specifies the number of bytes from B1 up to and including the last byte of the current Sector. Note that
Byte Position is specified relative to the start of the current, or a previous, Sector.
10.4 Basic Types
10.4.1 BsMsbf
Bit Sequence, Most Significant Bit First, must be interpreted as a Bit String.
10.4.2 Char
A one-byte character, encoded according to ISO 646. The NUL ($00) character is not allowed for Char.
10.4.3 SiMsbf
Bit sequence, Most Significant Bit First, must be interpreted as Signed Integer using two’s complement
notation.
10.4.4 UiMsbf
Bit sequence, Most Significant Bit First, must be interpreted as Unsigned Integer.
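A minimal sketch of a reader for the UiMsbf and SiMsbf types is given below. The class BitReader and its method names are hypothetical; the sketch only illustrates most-significant-bit-first reading and two's complement interpretation.

```python
class BitReader:
    """Reads fields most significant bit first from a bytes object."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # position in bits from the start of data

    def read_uimsbf(self, n: int) -> int:
        """Read n bits as an unsigned integer (UiMsbf)."""
        if n < 0 or self.pos + n > 8 * len(self.data):
            raise ValueError("not enough bits")
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def read_simsbf(self, n: int) -> int:
        """Read n bits as a two's complement signed integer (SiMsbf)."""
        if n < 1:
            raise ValueError("a signed field needs at least one bit")
        value = self.read_uimsbf(n)
        if value >= 1 << (n - 1):
            value -= 1 << n
        return value
```

A 9-bit SiMsbf field, as used for filter coefficients in Table 10.12, therefore covers the values -256 to 255.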
10.4.5 Uintn
An n bit, binary encoded, unsigned numerical value.
10.4.6 Uint8
An 8 bit, binary encoded, unsigned numerical value. A Uint8 value must be recorded in a one-byte field.
10.4.7 Uint16
A 16-bit, binary encoded, unsigned numerical value. A Uint16 value, represented by the hexadecimal
representation $wxyz, must be recorded in a two-byte field as $wx $yz (most significant byte first).
10.4.8 Uint32
A 32-bit, binary encoded, unsigned numerical value. A Uint32 value, represented by the hexadecimal
representation $stuvwxyz, must be recorded in a four-byte field as $st $uv $wx $yz (most significant byte first).
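The most-significant-byte-first recording of Uint16 and Uint32 corresponds to big-endian byte order. The following sketch uses Python's built-in int.to_bytes and int.from_bytes; the function names are illustrative assumptions.

```python
def encode_uint16(value: int) -> bytes:
    """Record a Uint16 in a two-byte field, most significant byte first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return value.to_bytes(2, "big")


def encode_uint32(value: int) -> bytes:
    """Record a Uint32 in a four-byte field, most significant byte first."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return value.to_bytes(4, "big")


def decode_uint(field: bytes) -> int:
    """Interpret a recorded field as an unsigned, big-endian integer."""
    return int.from_bytes(field, "big")
```

For instance, the Uint16 value $1234 is recorded as the two bytes $12 $34.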
10.5 Payloads for the audio object
Table 10.1 — Syntax of Audio_Frame()
Syntax Bits Mnemonics
DSTSpecificConfig( channelConfiguration ) {
if (DSD_Coded)
{
DSD() DSD
}
if (DST_Coded)
{
DST() DST
}
}
Table 10.2 – Syntax of DSD
Syntax Bits Mnemonics
DSD() {
For (Byte_Nr=0; Byte_Nr<Frame_Length; Byte_Nr++)
{
For (Channel_Nr=1; Channel_Nr<=N_Channels;
Channel_Nr++)
{
DSD_Byte[Channel_Nr][Byte_Nr] 1 Audio_Byte
}
}
}
Table 10.3 – Syntax of DST
Syntax Bits Mnemonics
DST() {
Processing_Mode 1 BsMsbf
if (Processing_Mode == 0)
{
DST_X_Bit 1 BsMsbf
Reserved 6 BsMsbf
DSD() DSD
}
else
{
Segmentation() Segmentation
Mapping() Mapping
Half_Probability() Half_Probability
Filter_Coef_Sets() Filter_Coef_Sets
Probability_Tables() Probability_Tables
Arithmetic_Coded_Data() Arithmetic_Coded_Data
}
}
Table 10.4 — Syntax of Segmentation
Syntax Bits Mnemonics
Segmentation() {
Same_Segmentation 1
if(Same_Segmentation == 0)
{
Filter_Segmentation() Segment_Alloc
Ptable_Segmentation() Segment_Alloc
}
else
{
Filter_And_Ptable_Segmentation() Segment_Alloc
}
}
Table 10.5 — Syntax of Segments
Syntax Bits Mnemonics
Segment_Alloc() {
Resolution_Read = false
Same_Segm_For_All_Channels 1
if(Same_Segm_For_All_Channels == 0)
{
for (Channel_Nr=1; Channel_Nr<=N_Channels;
Channel_Nr++)
{
Channel_Segmentation()[Channel_Nr] Channel_Segmentation
}
}
else
{
Channel_Segmentation()[1] Channel_Segmentation
}
}
Table 10.6 — Syntax of Channel_Segmentation
Syntax Bits Mnemonics
Channel_Segmentation() {
Nr_Of_Segments = 1
Start[1] = 0
End_Of_Channel_Segm 1
while(End_Of_Channel_Segm == 0)
{
if (Resolution_Read == false)
{
Resolution 13 UiMsbf
Resolution_Read = true
}
Scaled_Length[Nr_Of_Segments] 1.13 UiMsbf
Segment_Length[Nr_Of_Segments] = Resolution *
Scaled_Length[Nr_Of_Segments]
Start[Nr_Of_Segments+1] = Start[Nr_Of_Segments] +
Segment_Length[Nr_Of_Segments]
Nr_Of_Segments++
End_Of_Channel_Segm 1
}
Segment_Length[Nr_Of_Segments] =
Frame_Length - Start[Nr_Of_Segments]
}
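The arithmetic of Table 10.6 can be sketched separately from the bit-level parsing. The function below assumes that Resolution and the Scaled_Length values have already been decoded; the name segment_layout, the 0-based lists and the rejection of an empty last Segment are assumptions made for illustration.

```python
def segment_layout(resolution: int, scaled_lengths: list, frame_length: int):
    """Return (starts, segment_lengths) following the assignments of Table 10.6.

    scaled_lengths holds one decoded Scaled_Length per Segment except the last.
    Lists are 0-based here, whereas the table numbers Segments from 1.
    """
    starts = [0]
    lengths = []
    for scaled in scaled_lengths:
        length = resolution * scaled          # Segment_Length = Resolution * Scaled_Length
        lengths.append(length)
        starts.append(starts[-1] + length)    # Start of the next Segment
    last = frame_length - starts[-1]          # the last Segment takes the remainder
    if last <= 0:
        raise ValueError("segments exceed the frame")  # assumption: last Segment is non-empty
    lengths.append(last)
    return starts, lengths
```

With a resolution of 8, Scaled_Length values 100 and 50 and a frame length of 4704, the three Segments have lengths 800, 400 and 3504.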
Table 10.7 — Syntax of Mapping
Syntax Bits Mnemonics
Mapping() {
Same_Mapping 1
if(Same_Mapping == 0)
{
Filter_Mapping() Maps
Ptable_Mapping() Maps
}
else
{
Filter_And_Ptable_Mapping() Maps
}
}
Table 10.8 — Syntax of Maps
Syntax Bits Mnemonics
Maps() {
Nr_Of_Elements = 0
Same_Maps_For_All_Channels 1
if(Same_Maps_For_All_Channels == 0)
{
for (Channel_Nr=1; Channel_Nr<=N_Channels;
Channel_Nr++)
{
Channel_Mapping()[Channel_Nr] Channel_Mapping
}
}
else
{
Channel_Mapping()[1] Channel_Mapping
}
}
Table 10.9 — Syntax of Channel_Mapping
Syntax Bits Mnemonics
Channel_Mapping() {
for(Seg_Nr=1; Seg_Nr<=Nr_Of_Segments[Channel_Nr]; Seg_Nr++)
{
Element[Channel_Nr][Seg_Nr] 0.4 UiMsbf
if (Element[Channel_Nr][Seg_Nr] == Nr_Of_Elements)
{
Nr_Of_Elements++
}
}
}
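The counting rule of Table 10.9 increments Nr_Of_Elements whenever a Segment refers to the next unused index. A small sketch, assuming the Element values are already decoded and using the hypothetical name count_elements:

```python
def count_elements(elements: list, nr_of_elements: int = 0) -> int:
    """Apply the Nr_Of_Elements update of Table 10.9 to one channel.

    nr_of_elements is the running count carried over from earlier channels,
    since Table 10.8 initialises it once before the channel loop.
    """
    for element in elements:
        if element == nr_of_elements:
            nr_of_elements += 1
    return nr_of_elements
```

For the Element sequence 0, 1, 1, 2, 0 the function returns 3, because the indices 0, 1 and 2 are each introduced once.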
Table 10.10 — Syntax of Half_Probability
Syntax Bits Mnemonics
Half_Probability() {
for (Channel_Nr=1; Channel_Nr<=N_Channels; Channel_Nr++)
{
Half_Prob[Channel_Nr] 1 BsMsbf
}
}
Table 10.11 — Syntax of Arithmetic_Coded_Data
Syntax Bits Mnemonics
Arithmetic_Coded_Data() {
j=0
do
{
A_Data[j] 1 BsMsbf
j++
} until end of Audio_Frame
}
Table 10.12 — Syntax of Filter_Coef_Sets
Syntax Bits Mnemonics
Filter_Coef_Sets() {
for (Filter_Nr=0; Filter_Nr
{
Coded_Pred_Order 7 UiMsbf
Pred_Order[Filter_Nr]=Coded_Pred_Order+1
Coded_Filter_Coef_Set 1 BsMsbf
if (Coded_Filter_Coef_Set==0)
{
for (Coef_Nr=0; Coef_Nr
{
Coef[Filter_Nr][Coef_Nr] 9 SiMsbf
}
}
else
{
CC_Method 2 BsMsbf
for (Coef_Nr=0; Coef_Nr
{
Coef[Filter_Nr][Coef_Nr] 9 SiMsbf
}
CCM 3 UiMsbf
for (Coef_Nr=CCPO; Coef_Nr
{
Run_Length=0
do
{
RL_Bit 1 BsMsbf
if (RL_Bit==0)
{
Run_Length++
}
} while (RL_Bit==0)
LSBs 0.6 UiMsbf
Delta=(Run_Length<
if (Delta!=0)
{
Sign 1 BsMsbf
if (Sign==1)
{
Delta = −Delta
}
}
Coef[Filter_Nr][Coef_Nr] = Delta
Delta8 = 0
for (Tap_Nr=0; Tap_Nr
{
Delta8 += 8*CCPC[Tap_Nr]*Coef[Filter_Nr][Coef_Nr-Tap_Nr-1]
}
if (Delta8>=0)
{
Coef[Filter_Nr][Coef_Nr] -= trunc((Delta8+4)/8)
}
else
{
Coef[Filter_Nr][Coef_Nr] += trunc((-Delta8+3)/8)
}
}
}
}
}
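Two steps of the coded coefficient path in Table 10.12 lend themselves to a short sketch: the unary run-length read (RL_Bit values of zero are counted until a one is read) and the rounding of the prediction Delta8. The combination of Run_Length and LSBs into Delta is not shown. Both function names are hypothetical, and trunc is taken as rounding downwards as defined in 10.3.1.

```python
def read_run_length(bits) -> int:
    """Count RL_Bit values equal to 0 up to the terminating 1.

    bits is any iterator yielding 0 or 1 in bitstream order.
    """
    run_length = 0
    for bit in bits:
        if bit == 1:
            return run_length
        run_length += 1
    raise ValueError("bit sequence ended before a terminating 1")


def apply_prediction(delta: int, delta8: int) -> int:
    """Combine a decoded Delta with the prediction Delta8, as in Table 10.12."""
    if delta8 >= 0:
        return delta - (delta8 + 4) // 8
    # -delta8 + 3 is positive here, so floor division matches trunc().
    return delta + (-delta8 + 3) // 8
```

For example, a Delta of 5 with Delta8 equal to 12 gives 5 - 2 = 3, and with Delta8 equal to -12 it gives 5 + 1 = 6.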
Table 10.13 — Syntax of Probability_Tables
Syntax Bits Mnemonics
Probability_Tables() {
for (Ptable_Nr=0; Ptable_Nr
{
Coded_Ptable_Len 6 UiMsbf
Ptable_Len[Ptable_Nr] = Coded_Ptable_Len+1
if (Ptable_Len[Ptable_Nr] == 1)
{
P_one[Ptable_Nr][0] = 128
}
else
{
Coded_Ptable 1 BsMsbf
if (Coded_Ptable==0)
{
for (Entry_Nr=0; Entry_Nr
Entry_Nr++)
{
Coded_P_one 7 UiMsbf
P_one[Ptable_Nr][Entry_Nr] = Coded_P_one+1
}
}
else
{
PC_Method 2 BsMsbf
for (Entry_Nr=0; Entry_Nr
{
Coded_P_one 7 UiMsbf
P_one[Ptable_Nr][Entry_Nr] = Coded_P_one+1
}
PCM 3 UiMsbf
for (Entry_Nr=PCPO;Entry_Nr
{
Run_Length=0
do
{
RL_Bit 1 BsMsbf
if (RL_Bit==0)
{
Run_Length++
}
} while (RL_Bit==0)
LSBs 0.4 UiMsbf
Delta = (Run_Length<
if (Delta != 0)
{
Sign 1 BsMsbf
if (Sign==1)
{
Delta = − Delta
}
}
P_one[Ptable_Nr][Entry_Nr] = Delta
for (Tap_Nr=0; Tap_Nr
{
P_one[Ptable_Nr][Entry_Nr] −=
PCPC[Tap_Nr]*P_one[Ptable_Nr][Entry_Nr-Tap_Nr-1]
}
}
}
}
}
}
10.6 Semantics
10.6.1 Audio Streams
An Audio Stream contains the DSD audio signal, either as Plain DSD or as DST coded DSD. An Audio Stream is the
concatenation of all Audio Frames in a Byte Stream.
10.6.1.1 DSD Sampled Bit Stream Data
For a 2-Channel Audio signal, the DSD Sampled Bit sequence is defined as follows: L0, L1, L2, L3, L4, L5, L6, L7,
R0, R1, R2, R3, R4, R5, R6, R7, L8, L9, L10, … where L0 is the first sampled bit of the left channel of an Audio
Frame.
Per Audio Channel, eight successively sampled bits are grouped into one Audio Byte. The most significant bit
of an Audio Byte is the first sampled bit of that byte.
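The grouping into Audio Bytes and the 2-Channel ordering described above can be sketched as follows. The function names pack_audio_bytes and interleave_stereo are illustrative assumptions, and sampled bits are represented as Python integers 0 and 1.

```python
def pack_audio_bytes(bits: list) -> bytes:
    """Group sampled bits into Audio Bytes, first sampled bit in the msb."""
    if len(bits) % 8:
        raise ValueError("the number of bits must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | (bit & 1)
        out.append(value)
    return bytes(out)


def interleave_stereo(left_bits: list, right_bits: list) -> bytes:
    """Produce the order L0..L7, R0..R7, L8..L15, R8..R15, and so on."""
    left = pack_audio_bytes(left_bits)
    right = pack_audio_bytes(right_bits)
    if len(left) != len(right):
        raise ValueError("both channels need the same number of bits")
    out = bytearray()
    for l_byte, r_byte in zip(left, right):
        out.append(l_byte)
        out.append(r_byte)
    return bytes(out)
```

The result alternates one Audio Byte of the left channel with one Audio Byte of the right channel.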
10.6.1.2 Structure of the Audio Stream
DST coded Audio Frames have a variable length. The reference model of the DST Decoder is defined in
section 10.7.
10.6.1.3 Audio_Frame
Audio_Frame contains the DST or Plain DSD coded audio information for one Frame. The maximum size of
an Audio_Frame is equal to the size of a Plain DSD coded Audio_Frame plus one byte. The syntax of an
Audio_Frame is defined in Table 10.1.
10.6.1.3.1 DSD
DSD contains the audio data for one Plain DSD Audio_Frame. The syntax of DSD is defined in Table 10.2. An
example of a 5 channel DSD Frame is given in Figure 10.2. The definition of Audio_Byte is given in subclause
10.6.1.1.
Channel_Nr is the number of the Audio Channel.
N_Channels is the number of audio channels used as given by the channelConfiguration.
Frame_Length is the length of an Audio Frame in bytes per audio channel. The Frame_Length can be
calculated from the Sampling Frequency with the following formula:
Frame_Length = Sampling Frequency [Hz] / (75 * 8) bytes per Audio Channel
The Sampling Frequency can be: 64*44100 Hz, 128*44100 Hz or 256*44100 Hz. The relation between the
Frame_Length and Sampling Frequency is provided in Table 10.14.
Table 10.14 — Frame_Length versus Sampling Frequency
Sampling Frequency [Hz] Frame_Length [bytes]
64*44100 4704
128*44100 9408
256*44100 18816
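The values in Table 10.14 follow directly from the formula: 75 Frames per second and 8 bits per Audio Byte. The sketch below reproduces them and also derives the maximum Audio_Frame size from 10.6.1.3, reading "the size of a Plain DSD coded Audio_Frame" as N_Channels times Frame_Length bytes. That reading and the function names are assumptions.

```python
FRAMES_PER_SECOND = 75
BITS_PER_BYTE = 8
ALLOWED_FREQUENCIES_HZ = (64 * 44100, 128 * 44100, 256 * 44100)


def frame_length(sampling_frequency_hz: int) -> int:
    """Frame_Length in bytes per Audio Channel."""
    if sampling_frequency_hz not in ALLOWED_FREQUENCIES_HZ:
        raise ValueError("unsupported Sampling Frequency")
    return sampling_frequency_hz // (FRAMES_PER_SECOND * BITS_PER_BYTE)


def max_audio_frame_size(sampling_frequency_hz: int, n_channels: int) -> int:
    """Plain DSD size of all channels plus one byte (assumed reading of 10.6.1.3)."""
    return n_channels * frame_length(sampling_frequency_hz) + 1
```

For a 5 channel signal at 64*44100 Hz this gives 5 * 4704 + 1 = 23521 bytes.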
Figure 10.2 — Example of a 5 channel DSD Frame, with Sampling Frequency 64*44100Hz
10.6.1.3.1.1 DSD_Byte
DSD_Byte[Channel_Nr][Byte_Nr] contains the DSD signal as defined in subclause 10.6.1.1.
10.6.1.3.2 DST
DST contains the audio data for one DST coded Audio_Frame. The syntax of DST is defined in Table 10.3.
10.6.1.3.2.1 Processing_Mode
If the Processing_Mode bit is set to one, the Audio_Frame contains the DST_X_Bit and the DSD signal in a
lossless coded form. If the Processing_Mode bit is set to zero, the Audio_Frame contains the DST_X_Bit and
the DSD signal without lossless coding.
10.6.1.3.2.2 DST_X_Bit
If Frame_Format is DST_Coded, each Audio_Frame contains one DST_X_Bit. At encoding the DST_X_Bit shall
be set to zero. A reader shall ignore the content of the DST_X_Bit.
10.6.1.3.2.3 Reserved
This value shall be set to zero.
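The first byte of a DST element can be inspected as sketched below, following the field order of Table 10.3. The function name parse_dst_header and the returned dictionary layout are hypothetical, and rejecting non-zero Reserved bits is a design choice of this sketch rather than a stated reader requirement.

```python
def parse_dst_header(frame: bytes) -> dict:
    """Inspect the first byte of a DST() element, following Table 10.3."""
    if not frame:
        raise ValueError("empty frame")
    first = frame[0]
    processing_mode = first >> 7           # 1 bit, msb first
    if processing_mode == 1:
        # Lossless coded: Segmentation() starts at the very next bit.
        return {"processing_mode": 1}
    dst_x_bit = (first >> 6) & 1           # readers ignore its content
    reserved = first & 0x3F                # 6 Reserved bits
    if reserved != 0:
        raise ValueError("Reserved bits are expected to be zero")  # assumption
    # The Plain DSD data follows byte-aligned after these 8 bits.
    return {"processing_mode": 0, "dst_x_bit": dst_x_bit, "dsd": frame[1:]}
```

Because Processing_Mode, DST_X_Bit and Reserved together occupy exactly 8 bits, the DSD data of an uncoded frame starts on a byte boundary.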
10.6.1.3.2.4 DSD
See subclause 10.6.1.3.1.
10.6.1.3.2.5 Segmentation
For each Audio Channel, the Audio Frame is partitioned into one or more Segments for Filters, and one or
more Segments for Ptables. Each Segment may use a different Prediction Filter / Ptable. An example of
Segmentation is shown in Figure 10.3. The syntax of Segmentation is defined in Table 10.4.
← Audio Frame →
Channel number   Segments
1 Segment 1 Segment 2 Segment 3 Segment 4
2 Segment 1 Segment 2 Segment 3 Segment 4
3 Segment 1
4 Segment 1 Segment 2
5 Segment 1
6 Segment 1 Segment 2 Segment 3
Figure 10.3 — Example of Segmentation
Filter_Segmentation
For each Audio Channel, the Audio Frame is partitioned into one or more Segments for Prediction Filters.
Each Segment may use a different Prediction Filter. The variables Nr_Of_Segments[] and
Segment_Length[][] from Segment_Alloc (see subclause 10.6.1.3.2.5.2) used for Filter_Segmentation are
referred to as:
• Filters.Nr_Of_Segments[Channel_Nr] and
• Filters.Segment_Length[Channel_Nr][1.Filters.Nr_Of_Segments[Channel_Nr]].
With Channel_Nr = 1.N_Channels.
Ptable_Segmentation
For each Audio Channel, the Audio Frame is partitioned into one or more Segments for Ptables. Each
Segment may use a different Ptable. The variables Nr_Of_Segments[] and Segment_Length[][] from
Segment_Alloc (see subclause 10.6.1.3.2.5.2) used for Ptable_Segmentation are referred to as:
• Ptables.Nr_Of_Segments[Channel_Nr] and
• Ptables.Segment_Length[Channel_Nr][1.Ptables.Nr_Of_Segments[Channel_Nr]].
With Channel_Nr = 1.N_Channels.
Filter_And_Ptable_Segmentation
For each Audio Channel, the Audio Frame is partitioned into one or more Segments. Each Segment may use
a different combination of Prediction Filter and Ptable. For each Audio Channel, the following equations must
be true:
Filters.Nr_Of_Segments[Channel_Nr] = Ptables.Nr_Of_Segments[Channel_Nr] =
Nr_Of_Segments[Channel_Nr]
Filters.Segment_Length[Channel_Nr][] = Ptables.Segment_Length[Channel_Nr][] =
Segment_Length[Channel_Nr][]
With Channel_Nr = 1.N_Channels.
10.6.1.3.2.5.1 Same_Segmentation
If Same_Segmentation is one, the Ptables and Prediction Filters use one and the same Segmentation. If
Same_Segmentation is zero, the partitioning for Prediction Filters is independent from the partitioning for
Ptables, for the Audio Frame.
10.6.1.3.2.5.2 Segment_Alloc
Segment_Alloc defines the Segmentation for the Prediction Filters and/or the Ptables. The syntax of
Segment_Alloc is defined in Table 10.5.
For each Audio Channel, the variables Nr_Of_Segments and Segment_Length[1.Nr_Of_Segments] from
Channel_Segmentation (see subclause 10.6.1.3.2.5.2.2) are referred to as:
• Nr_Of_Segments[Channel_Nr]
• Segment_Length[Channel_Nr][1.Nr_Of_Segments[Channel_Nr]]
In the syntax diagrams, syntax variables are shown in italics.
Resolution_Read indicates whether Resolution from Channel_Segmentation, see subclause 10.6.1.3.2.5.2.2,
has been read. Resolution_Read is set to true in the Channel_Segmentation of the first Audio Channel with
more than one Segment. Note that if Prediction Filters and Ptables use independent Segmentation, they also
use an independent Resolution_Read.
Channel_Nr is a local index variable.
N_Channels is the number of audio channels used.
10.6.1.3.2.5.2.1 Same_Segm_For_All_Channel
...