ISO/IEC 14496-4:2004/Amd 8:2005
Information technology — Coding of audio-visual objects — Part 4: Conformance testing — Amendment 8: High Efficiency Advanced Audio Coding, audio BIFS, and structured audio conformance
Technologies de l'information — Codage des objets audiovisuels — Partie 4: Essai de conformité — Amendement 8: Codage sonore avancé à haute efficacité, BIFS sonore et conformité sonore structurée
INTERNATIONAL STANDARD ISO/IEC 14496-4
Second edition 2004-12-15
AMENDMENT 8 2005-06-01
Information technology — Coding
of audio-visual objects —
Part 4:
Conformance testing
AMENDMENT 8: High Efficiency Advanced
Audio Coding, audio BIFS, and structured
audio conformance
Technologies de l'information — Codage des objets audiovisuels —
Partie 4: Essai de conformité
AMENDEMENT 8: Codage sonore avancé à haute efficacité, BIFS
sonore et conformité sonore structurée
Reference number
ISO/IEC 14496-4:2004/Amd.8:2005(E)
© ISO/IEC 2005
© ISO/IEC 2005
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 8 to ISO/IEC 14496-4:2004 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Introduction
This document specifies the eighth amendment to the ISO/IEC 14496-4:2004 standard. The amendment adds
the conformance testing for the SBR audio object type defined in 14496-3. It also specifies conformance
sequences for BIFS and Structured Audio.
Information technology — Coding of audio-visual objects —
Part 4:
Conformance testing
AMENDMENT 8: High Efficiency Advanced Audio Coding, audio
BIFS, and structured audio conformance
In subclause 6.5.1 File name conventions, insert the following row into Table 29 after the row for AAC LTP:
Table 29 – File name conventions
SBR (+AAC LC) al_sbr___[_fsaac][_sig] al_sbr____[_fsaac][_sig][_]
And add:
indicates the SBR module mainly targeted by the test sequence. Possible values are "e" for testing the
envelope adjuster, "s" for testing sine addition, "gh" for testing time-grid transitions in combination with
changes of SBR header data, "i" for testing inverse filtering, "qmf" for testing the QMF implementation, "cm"
for testing various channel modes, "sig" for testing SBR signaling, "twi" for QMF identification, and "sr" for
testing various combinations of sampling rates.
corresponds to the number of channels present in the conformance test sequence. It is either a
single integer, in which case it refers to the number of main audio channels, or two integers separated by a ‘.’,
in which case the first integer equals the number of main audio channels, while the second number equals the
number of low frequency enhancement channels.
fsaac corresponds to the sampling rate of the underlying AAC-LC data. If it is omitted, it is half the
sampling rate given as output sampling rate.
is an integer describing the kind of signalling used according to the table below. If this value is omitted,
backwards compatible explicit signalling of SBR is used.
Table 29A – File name conventions
sig Signalling method used
0 Implicit signalling of SBR
1 Hierarchical explicit signalling of SBR
2 Backwards compatible explicit signalling of SBR
is either "hq" or "lp" for the high quality or the low power version of the SBR decoding algorithm
respectively.
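The conventions for the channel field, the fsaac field and the signalling field can be illustrated with a short sketch. The function names are hypothetical and not part of the standard; only the rules described above (Table 29A, backwards compatible explicit signalling as the default, and half the output sampling rate when fsaac is omitted) are taken from the text.

```python
# Table 29A: value of the optional signalling field -> signalling method.
SIGNALLING_METHODS = {
    0: "Implicit signalling of SBR",
    1: "Hierarchical explicit signalling of SBR",
    2: "Backwards compatible explicit signalling of SBR",
}


def signalling_method(sig=None):
    """Return the signalling method; an omitted field means method 2."""
    return SIGNALLING_METHODS[2 if sig is None else sig]


def parse_channels(field):
    """Split a channel field such as '2' or '5.1' into (main, LFE) counts."""
    main, _, lfe = field.partition(".")
    return int(main), int(lfe) if lfe else 0


def aac_sampling_rate(output_rate, fsaac=None):
    """Rate of the underlying AAC-LC data; half the output rate if omitted."""
    return output_rate // 2 if fsaac is None else fsaac
```

For instance, a channel field of "5.1" denotes five main audio channels and one low frequency enhancement channel.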
In subclause 6.6.1.2.2 (Test procedure), replace:
...testing can be done by comparing the output of a decoder under test with a reference output also supplied
by ISO/IEC JTC 1/SC 29/WG 11.
with:
...testing can be done by comparing the output of a decoder under test with a reference output also supplied
by ISO/IEC JTC 1/SC 29/WG 11. In cases where the decoder under test is followed by additional operations
(e.g. quantizing a signal to a 16 bit output signal) the conformance point is prior to such additional operations,
i.e. it is permitted to use the actual decoder output (e.g. with more than 16 bit) for conformance testing.
In subclause 6.6.3.1.2.3 (Bitstream payload), add:
6.6.3.1.2.3.2 aac_scalable_main_header()
max_sfb: Shall not be smaller than last_max_sfb (helper variable specified in ISO/IEC 14496-3).
6.6.3.1.2.3.3 aac_scalable_extension_header()
max_sfb: Shall not be smaller than last_max_sfb (helper variable specified in ISO/IEC 14496-3).
In subclause 6.6.4.1.2.1.1 AudioSpecificConfig(), add:
extensionAudioObjectType: Shall be the Audio Object Type SBR (AOT == 5).
extensionSamplingFrequency: Shall be encoded with a value listed in Table 34, and the value shall be the
same as samplingFrequency, or twice the value of samplingFrequency.
extensionSamplingFrequencyIndex: Shall be encoded with a value listed in Table 34, and the value shall
indicate an extensionSamplingFrequency being the same as samplingFrequency as indicated by
samplingFrequencyIndex, or the value shall indicate an extensionSamplingFrequency being twice the value of
samplingFrequency.
sbrPresentFlag: Shall be encoded with the value zero if no SBR data is contained in the compressed MPEG-
4 data. If SBR data is present in the compressed MPEG-4 data the parameter shall be encoded with the value
one.
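A bitstream checker might test these AudioSpecificConfig() constraints roughly as follows. This is a minimal sketch: the function name and argument names are hypothetical, the configuration is assumed to be already parsed into plain integers, and the Table 34 membership check is deliberately left out.

```python
AOT_SBR = 5  # Audio Object Type SBR


def check_sbr_config(extension_aot, sampling_frequency,
                     extension_sampling_frequency, sbr_present_flag,
                     stream_has_sbr_data):
    """Return a list of violated constraints (empty when none are violated)."""
    problems = []
    if extension_aot != AOT_SBR:
        problems.append("extensionAudioObjectType is not SBR (AOT 5)")
    # The extension rate must equal the core rate or be exactly twice it.
    allowed = (sampling_frequency, 2 * sampling_frequency)
    if extension_sampling_frequency not in allowed:
        problems.append("extensionSamplingFrequency is neither 1x nor 2x "
                        "samplingFrequency")
    # sbrPresentFlag must mirror the actual presence of SBR data.
    if sbr_present_flag != (1 if stream_has_sbr_data else 0):
        problems.append("sbrPresentFlag does not match the presence of SBR data")
    return problems
```

Returning a list of findings rather than a single boolean lets a conformance report name every violated constraint at once.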
In subclause 6.6.4.1.2.1.1 AudioSpecificConfig(), add the following entries to Table 34:
Table 34 — Specification of samplingFrequencyIndex and samplingFrequency
samplingFrequencyIndex /  Level 1          Level 2          Level 3  Level 4          Level 5
samplingFrequency
AAC Profile               0x6..0xc, 0xf /  0x3..0xc, 0xf /  NA       0x3..0xc, 0xf /  0x0..0xc, 0xf /
                          <= 24000         <= 48000                  <= 48000         <= 96000
samplingFrequencyIndex /              Level 1  Level 2          Level 3          Level 4          Level 5
samplingFrequency
High Efficiency AAC  SBR present      NA       0x6..0xc, 0xf /  0x3..0xc, 0xf /  0x3..0xc, 0xf /  0x3..0xc, 0xf /
Profile                                        <= 24000         <= 48000         <= 48000         <= 48000
                                                                                 (Note 1)
                     SBR not present  NA       0x3..0xc, 0xf /  0x3..0xc, 0xf /  0x3..0xc, 0xf /  0x0..0xc, 0xf /
                                               <= 48000         <= 48000         <= 48000         <= 96000
Note 1: For Level 4, for one or two channels the maximum AAC sampling rate, with SBR present, is
48 kHz. For more than two channels the maximum AAC sampling rate, with SBR present, is 24 kHz
(0x6..0xc, 0xf / <= 24000).
extensionSamplingFrequencyIndex /  Level 1  Level 2          Level 3,4        Level 5
extensionSamplingFrequency
High Efficiency AAC Profile        NA       0x6..0xc, 0xf /  0x3..0xc, 0xf /  0x0..0xc, 0xf /
                                            <= 24000         <= 48000         <= 96000
In subclause 6.6.4.1.2.1.1 AudioSpecificConfig(), add the following entries to Table 35:
Table 35 — Specification of ChannelConfiguration
ChannelConfiguration         Level 1  Level 2  Level 3  Level 4  Level 5
AAC Profile                  0..2     0..2     NA       0..6     0..6
High Efficiency AAC Profile  NA       0..2     0..2     0..6     0..6
In subclause 6.6.4.1.2.2.4 (ics_info()), replace:
Test bitstreams al03 and as17 are provided respectively for Main and Low-Complexity profiles to test decoder
performance on non-meaningful transitions
with:
Test sequences al03 and as17 are provided respectively for AOT 2 (AAC LC) and AOT 3 (AAC SSR) to test
decoder performance on non-meaningful window sequence transitions (note that AOT 1 (AAC Main) and AOT
4 (AAC LTP) decoders also need to fulfil conformance for AOT 2)
In subclause 6.6.4.1.2.2.9 fill_element(), add:
Fill elements containing an extension_payload with an extension_type of EXT_SBR_DATA or
EXT_SBR_DATA_CRC shall not contain any other extension_payload of any other extension_type. For fill
elements containing an extension_payload with an extension_type of EXT_SBR_DATA or
EXT_SBR_DATA_CRC, the fill_element count field shall be set equal to the total length in bytes, including the
SBR enhancement data plus the extension_type field.
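The length rule for such fill elements can be sketched as below. The field widths used here (a 4-bit extension_type, a 4-bit count and an 8-bit esc_count combined as count + esc_count - 1 when count equals 15) are assumptions based on the fill_element() syntax of ISO/IEC 14496-3 and should be checked against that document; the function names are hypothetical.

```python
EXTENSION_TYPE_BITS = 4   # assumed width of the extension_type field
MAX_FILL_BYTES = 269      # 15 + 255 - 1 under the assumed field widths


def fill_element_length(sbr_data_bits):
    """Total length in bytes of extension_type plus SBR enhancement data."""
    return (EXTENSION_TYPE_BITS + sbr_data_bits + 7) // 8


def encode_count(total_bytes):
    """Split a byte length into (count, esc_count); esc_count is None if unused."""
    if not 0 <= total_bytes <= MAX_FILL_BYTES:
        raise ValueError("length does not fit into one fill element")
    if total_bytes < 15:
        return total_bytes, None
    # With count == 15 the decoder computes 15 + esc_count - 1.
    return 15, total_bytes - 14
```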
In subclause 6.6.15.5 Procedure to Test Decoder Conformance:
At the end of the second paragraph, replace the last sentence:
Conformant decoders must use the RMS Measurement criterion for sequences SY001 through SY004
with:
Conformant decoders must use the RMS Measurement criterion for sequences SY001 through SY004 and
SY016 through SY019
In the same subclause, before the last paragraph (Testing of non-normative effects.) add:
Testing of the bus width calculation and send/route mechanism shall be performed using test sequences
SY014 and SY015. A reference output is provided by ISO/IEC as an example. To be called an ISO/IEC 14496-3
audio decoder, the decoder shall provide two identical outputs for the two sequences, and in particular this
output shall be composed of two channels characterized by a reverberated sound in the first and a dry sound
in the second.
In subclause 6.6.15.6 Description of Conformance sequences, add the following sequence descriptions after
SY013:
Sequence SY014 "clarinet1.mp4"
A clarinet synthesized at 44.1 kHz in FM with linear interpolation is routed to a reverberation instrument. The
bus width is not set in the send statement, instead the sound is output twice to the bus, creating a two-channel
bus. In the reverberation instrument only the first channel is reverberated, the second is output as is.
Sequence SY015 "clarinet2.mp4"
A clarinet synthesized at 44.1 kHz in FM with linear interpolation is routed to a reverberation instrument. The
bus width is set to 2 in the send statement, the sound is output in mono to the bus, and it must then be
replicated identically on the second channel. In the reverberation instrument only the first channel is
reverberated; the second is output as is.
Sequence SY016 "fir.mp4"
In this sequence a sound synthesized at 32 kHz is low-pass filtered using both the fir and firt core opcodes. The
two filters have 16 coefficients and normalized cutoff frequencies of 0.5 and 0.25 respectively. They are used to
filter the left (fir) and right (firt) channels of the sound. This sequence also tests pitch converters and other
mathematical operators.
Sequence SY017 "vtone.mp4"
A sequence of monophonic sinusoidal tones with a sampling rate of 44.1 kHz is shaped according to
attack and release time. This test sequence exercises mathematical operators, pitch converters and the kline
signal generator. Interpolation is linear.
Sequence SY018 "ttone.mp4"
A sequence of monophonic sinusoidal tones with a sampling rate of 44.1 kHz is shaped according to
attack and release time. It is similar to SY017, but in this case tones are generated through tables instead of
by mathematical functions. This test sequence exercises table generators, mathematical operators, table
access and oscillators. Interpolation is linear.
Sequence SY019 "otone.mp4"
A sine tone is played in the left channel, with its octave tone in the right channel. Interpolation is linear and
the sampling rate is 44.1 kHz. This sequence especially exercises user-defined core opcodes (including
rate-polymorphic ones), parameter passing and array variables.
In the same subclause, after Table 92, add:
Table AMD8-1 — Algorithmic Synthesis and Audio Fx Object Type Test Sequences (continued)
File Name         SY014      SY015      SY016  SY017  SY018  SY019
Content           clarinet1  clarinet2  fir    vtone  ttone  otone
Processing level  All        All        ≥Med   All    All    All
RCU - RAM (KB)    < 4        < 4        < 4    < 4    < 4    < 4
After subclause 6.6.16 Main Synthetic, add the following subclauses:
6.6.17 SBR
6.6.17.1 Compressed data
6.6.17.1.1 Characteristics
For all applicable Audio Object Types the SBR extension_payload() elements should be placed last among
the extension_payload() elements, i.e. if another type of extension_payload() element is present it should be
placed prior to the SBR extension_payload() elements.
If the Audio Object Type SBR is used in combination with any of the Audio Object Types AAC main, AAC
LC, AAC SSR or AAC LTP, the compressed data shall be stored as outlined in ISO/IEC 14496-3, subclause
4.5.2.8.2.2 SBR Extension Payload for the Audio Object Types AAC main, AAC SSR, AAC LC and AAC LTP.
If the Audio Object Type SBR is used in combination with either of the Audio Object Types ER AAC LC or ER
AAC LTP, the compressed data shall be stored as outlined in ISO/IEC 14496-3, subclause 4.5.2.8.2.3 SBR
Extension Payload for the Audio Object Types ER AAC LC and ER AAC LTP. For these AOTs, DRC
extension_payload() elements are not permitted simultaneously with SBR extension_payload() elements
within one er_raw_data_block(). Moreover, SBR extension_payload() elements of the type
EXT_SBR_DATA_CRC shall not be used with these AOTs.
For the scalable AOTs (AAC scalable and ER AAC scalable), the SBR data should be transmitted and
devised according to ISO/IEC 14496-3, subclause 4.5.2.8.2.4 SBR extension payload for the Audio Object
Types AAC scalable and ER AAC scalable. Restrictions are here put on the frequency range of the SBR data
and in what layers of the scalable stream the SBR data is stored. Furthermore, SBR extension_payload()
elements of the type EXT_SBR_DATA_CRC shall not be used with the audio object type ER AAC scalable.
6.6.17.1.2 Test procedure
Each compressed data shall meet the syntactic and semantic requirements specified in ISO/IEC 14496-3. The
decoded data shall also meet the requirements defined in ISO/IEC 14496-3 subclause 4.6.18.3.6
Requirements. If a syntactic element is not listed below, no restrictions apply to that element. The
bs_reserved elements shall be encoded with the value zero.
6.6.17.1.2.1 Compressed MPEG-4 data payload
6.6.17.1.2.1.1 sbr_header()
The following parameters shall be encoded with values subsequently used in defining a frequency range, a
number of noise bands, a number of limiter bands, and a number of patches:
bs_start_freq
bs_stop_freq
bs_xover_band
bs_alter_scale
bs_noise_bands
bs_limiter_bands
The above parameters are used (in ISO/IEC 14496-3) to calculate the variables below:
$k_2$
$k_0$
$k_x$
$M$
$N_Q$
numPatches
numBands
numBands0
vDk0
vDk1
Conformant compressed MPEG-4 data shall have values for the above parameters that subsequently
evaluate to values of the above variables that satisfy the requirements outlined in ISO/IEC 14496-3, subclause
4.6.18.3.6 Requirements.
6.6.17.1.2.1.2 sbr_channel_pair_base_element()
bs_coupling: Shall be encoded with the value of 1
6.6.17.1.2.1.3 sbr_grid()
The following compressed MPEG-4 data elements shall be encoded so that a value of the number of SBR
envelopes for a SBR frame, for a given frame class, is within the limits defined in ISO/IEC 14496-3, subclause
4.6.18.3.6 Requirements:
bs_rel_bord_0
bs_rel_bord_1
bs_num_env
bs_var_bord
bs_num_rel_0
bs_var_bord_0
bs_var_bord_1
bs_num_rel_0
bs_num_rel_1
Conformant compressed MPEG-4 data shall have the above parameters chosen so that the leading border of
a given SBR frame (the frame boundary) coincides with the trailing border of the previous SBR frame (the
frame boundary of the previous frame). Furthermore, the above parameters shall be chosen so that the
envelope borders of the SBR envelopes in a given frame fall within the boundaries of the SBR frame. The
above parameters shall also be chosen so that every SBR envelope within the SBR frame has a duration
larger than zero.
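The three time-grid requirements above can be sketched as a simple check. The representation is an assumption made for illustration: the envelope borders of one SBR frame are given as a list of absolute time-slot indices, with the first entry as the leading border and the last as the trailing border. The function name is hypothetical.

```python
def check_sbr_grid(borders, previous_trailing_border):
    """Return a list of violated time-grid requirements for one SBR frame."""
    problems = []
    if len(borders) < 2:
        return ["a frame needs at least one SBR envelope"]
    leading, trailing = borders[0], borders[-1]
    # The leading border must coincide with the previous trailing border.
    if leading != previous_trailing_border:
        problems.append("leading border does not match previous trailing border")
    # Every envelope border must lie inside the frame boundaries.
    if any(not leading <= b <= trailing for b in borders):
        problems.append("envelope border outside the SBR frame")
    # Every envelope must have a duration larger than zero.
    for start, stop in zip(borders, borders[1:]):
        if stop <= start:
            problems.append(f"envelope [{start}, {stop}) has no positive duration")
    return problems
```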
6.6.17.1.2.1.4 sbr_dtdf()
bs_df_env[]: Shall be encoded with the value 0 for the first envelope of the present frame, if the compressed
MPEG-4 data element bs_header_flag has the value one (i.e. a new sbr_header is available), or if the
amp_res value has changed from the previous frame due to the rule specifying amp_res = 0 for a frame of
frame class FIXFIX with only one envelope.
bs_df_noise[]: Shall be encoded with the value 0 for the first noise floor of the present frame, if the
compressed MPEG-4 data element bs_header_flag has the value one, i.e. a new sbr_header is available.
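The conditions under which the first envelope and the first noise floor must use the value 0 can be written out as a small sketch. The function names are hypothetical and the frame class is assumed to be available as a string. Since bs_amp_res itself is carried in the SBR header, any change of amp_res without a new header is attributed here to the FIXFIX single-envelope rule.

```python
def effective_amp_res(bs_amp_res, bs_num_env, frame_class):
    """amp_res is forced to 0 for a FIXFIX frame with a single envelope."""
    if frame_class == "FIXFIX" and bs_num_env == 1:
        return 0
    return bs_amp_res


def first_env_df_must_be_zero(bs_header_flag, amp_res, previous_amp_res):
    """True if bs_df_env[] of the first envelope has to be 0."""
    return bs_header_flag == 1 or amp_res != previous_amp_res


def first_noise_df_must_be_zero(bs_header_flag):
    """True if bs_df_noise[] of the first noise floor has to be 0."""
    return bs_header_flag == 1
```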
6.6.17.1.2.1.5 sbr_envelope()
bs_codeword: Shall be encoded with the values listed in the corresponding Huffman table, defined in
ISO/IEC 14496-3, Annex 4.A.6.1
Conformant compressed MPEG-4 data shall have coded envelope scalefactors based on quantised
envelope scalefactors that satisfy the requirements outlined in ISO/IEC 14496-3, subclause 4.6.18.3.6
Requirements.
The quantised envelope scale factors $E$ for single channel elements and $E_0$ and $E_1$ for channel pair elements
shall be encoded with values that are within the following limits:
• For single channel elements: $0 \le E \le 2^{7 - \mathrm{amp\_res}}$
• For channel pair elements: $0 \le E_0 \le 2^{7 - \mathrm{amp\_res}}$ and $0 \le E_1 \le 2^{7 - \mathrm{amp\_res} - \mathrm{bs\_coupling}}$
where
$$\mathrm{amp\_res} = \begin{cases} 0, & \text{if bs\_num\_env} == 1 \text{ and frame\_class} == \text{FIXFIX} \\ \mathrm{bs\_amp\_res}, & \text{otherwise} \end{cases}$$
and where subscript zero indicates the firstly encoded channel in the channel pair element and subscript one
indicates the secondly encoded channel in the channel pair element.
6.6.17.1.2.1.6 sbr_noise()
bs_codeword: Shall be encoded with the values listed in the corresponding Huffman table, defined in
ISO/IEC 14496-3, Annex 4.A.6.1
Conformant compressed MPEG-4 data shall have coded noise floor scalefactors based on quantised noise
floor scalefactors that satisfy the requirements outlined in ISO/IEC 14496-3, subclause 4.6.18.3.6
Requirements.
6.6.17.2 Decoders
6.6.17.2.1 Characteristics
The object type SBR has the Object Type ID 5, and the compressed MPEG-4 data syntax is defined in
ISO/IEC 14496-3. The Audio Object Type SBR contains the SBR Tool. The SBR Tool can be implemented in
two different versions:
• High-Quality SBR Tool
• Low-Power SBR Tool
The different versions can also be operated in down-sampled SBR-mode.
The internal sampling rate of the SBR Tool shall always be twice the sampling rate indicated by
samplingFrequency or samplingFrequencyIndex in the AudioSpecificConfig().
A conformant implementation of the SBR tool that receives an SBR enhanced data stream shall operate in
upsampling mode only, until an sbr_header is received, ensuring that the SBR data can be decoded correctly.
6.6.17.2.1.1 HE-AAC
The ability to do down-sampled SBR is mandatory for levels 3 and 4 of the High Efficiency AAC Profile.
A decoder conforming to that profile and level shall operate the down-sampled SBR tool if one of the following
conditions is fulfilled:
• extensionSamplingFrequency is the same as samplingFrequency, or
• extensionSamplingFrequencyIndex is the same as samplingFrequencyIndex, or
• the output sampling rate would otherwise exceed the maximum allowed output sample rate for the given level.
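The decision can be sketched as follows, working on sampling frequencies rather than on their indices. The function name is hypothetical, and the maximum output rate of the level is passed in as a parameter because its value is not stated in this subclause. The sketch relies on the statement that the SBR Tool normally runs at twice the signalled sampling rate.

```python
def use_downsampled_sbr(sampling_frequency, extension_sampling_frequency,
                        max_output_rate):
    """Return True if the down-sampled SBR tool shall be operated."""
    # Condition 1: the extension rate equals the core AAC rate.
    if extension_sampling_frequency == sampling_frequency:
        return True
    # Condition 2: regular SBR output (twice the core rate) would exceed
    # the maximum output rate allowed for the level.
    return 2 * sampling_frequency > max_output_rate
```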
An HE-AAC profile decoder shall support implicit SBR signaling, as outlined in ISO/IEC 14496-3, subclause
1.6.5.3 HE AAC Profile Decoder Behavior in Case of Implicit Signaling.
An HE-AAC profile decoder shall support explicit SBR signaling as outlined in ISO/IEC 14496-3, subclause
1.6.5.4 HE AAC Profile Decoder Behavior in Case of Explicit Signaling.
6.6.17.2.2 SBR conformance test procedure
For the sake of simplicity, conformance testing of the AOT SBR is carried out in conjunction with AAC LC.
The conformance test procedure for the SBR tool internally creates a reference for comparison, given an input
compressed MPEG-4 data file and the output from the decoder under test. In order to accomplish this, apart
from the *_twi_* and *_qmf_* types, every SBR conformance compressed MPEG-4 data file is divided into two
parts as outlined in Figure AMD8-1, where the AAC data for the two parts is identical but the SBR header
does not arrive until the second part. This ensures that in the case of implicit signaling a conformant decoder
will recognise the SBR extension element at the beginning of the compressed MPEG-4 data, or in the case of
explicit signalling, by parsing the audioSpecificConfig(). Since no SBR header is present it cannot start SBR
decoding and will hence do up-sampling using the SBR QMF filterbank in anticipation of the SBR header.
[Figure: the file consists of a 1st half and a 2nd half, each with an AAC part and an SBR part; sbr_header = 0 in the 1st half and sbr_header = 1 in the 2nd half.]
Figure AMD8-1 — The disposition of the SBR conformance compressed MPEG-4 data files
The conformance test procedure stipulates:
• reading the compressed MPEG-4 data file;
• while no SBR header is present, taking the input decoded file (the output from the decoder under test)
and down-sampling the signal;
• storing the down-sampled signal.
Since this signal is just an up-sampled version of the output-signal from the AAC and hence the input signal to
the SBR decoder under test, it can be down-sampled, and by means of a polyphase correction filter, be
approximated to be the same signal as was used by the SBR Tool in the decoder under test.
In parallel to storing the signal it shall also be fed to the reference SBR decoder where, since no SBR header
is available, up-sampling is performed. The signal upsampled in the reference SBR decoder shall be
compared to the input signal, i.e. the output from the decoder under test. This serves as a QMF test of the first
half of the conformance file.
When the SBR header arrives, an SBR-processed reference signal based on the stored lowband signal (that is
a very close approximation of the signal that the SBR Tool in the decoder under test used) shall be calculated.
This means that it is possible to test the accuracy of the SBR part of the implementation, without having to
deal with the differences between the AAC implementation used in the decoder under test, and the AAC
implementation used for producing reference waveforms. Furthermore, the accuracy of the QMF
implementation of the decoder under test is tested separately for every conformance sequence.
If a complex QMF filter bank with a modified internal phase angle (hereinafter referred to as twiddles) is used
in the decoder under test, the al_sbr_twi_* sequences shall be tested first. Contrary to the other al_sbr_* test-
sequences, these sequences consist of zero signal AAC-data and SBR elements that trigger sine-addition in
the entire frequency range covered by SBR. The output-signal of this test sequence will contain sinusoids with
phase angles that depend on the implementation of the synthesis filterbank in the decoder under test. Based
on the obtained output signal from the decoder under test, the two parameters ϕ and β describing the phase
characteristics of the filter banks in the decoder under test are identified. Since different phase characteristics
may be used, depending on whether downsampled SBR is used or not, both al_sbr_twi_* sequences must be
run prior to conformance testing. The file sequence is only used to perform an examination and identification
of the filter bank. It does not test conformance of the QMF bank. The performance of the total QMF (analysis –
synthesis) shall be tested with the *_qmf_* sequences and the other informative QMF bank tests.
In order to ensure that the QMF is implemented correctly, the output from the QMF-specific test sequences
*_qmf_* is compared to the internal reference without down-sampling and storing the first half of the test
sequence. This is because it is possible, although very unlikely, to introduce errors in the QMF implementation
that, from the point of view of the SBR conformance test procedure, look like differences in the AAC
implementation. Omitting, for the QMF-specific sequences, the parts of the tool that are designed to neglect
differences in the AAC implementation ensures that the QMF is implemented correctly.
If the decoder under test passes the conformance criteria for the dedicated QMF test sequences, this is a
good indication that the QMF implementation is accurate. However, it is no definite guarantee, and hence it
could happen that a QMF implementation that barely passes the conformance for the QMF test, does not pass
conformance for other parts of the system due to the QMF implementation. Therefore, it is useful to observe
the result from the QMF test for the first half for any of the conformance sequences. This can give a good
indication of the origin of a potential error.
Figure AMD8-2 outlines the SBR conformance test procedure.
[Figure: block diagram with the blocks Compressed data, Read store/input, Delay, Reference SBR decoder, Reference waveform, Comparison test (maxDiff / maxRMS), Down-sample, Store, Twiddle-estimation and Test decoder output (waveform).]
Figure AMD8-2 — Block diagram of the SBR conformance test procedure
The essential modules are:
• Read store/input: This module parses the SBR part of the compressed MPEG-4 data and searches
for the SBR header. When no SBR header is available, the downsampled Test decoder output is
routed to both the Storage Module and through the Delay module to the Reference SBR decoder.
When an SBR header is available, the data stored in the Storage module is routed through the Delay
module to the Reference SBR decoder. The Test decoder output is routed to the comparison test
module.
• Down-sample: This module downsamples the Test decoder output signal, by decimation, and applies
a polyphase filter that approximates the inverse of the equivalent polyphase filter of the QMF-
upsampler. The delay imposed by the downsampler is given by
$\mathrm{delay} = 32 \cdot \frac{K-1}{2}$, where $K = 25$ is the length of the polyphase filter. If down-sampled SBR is
used, the down-sampler omits the decimation and does only the polyphase filtering. The polyphase
filter matrix $H(k,l)$ of size $32 \times K$ is tabulated in Table AMD8-2. The polyphase filtering step
consists of the operation which maps a time signal $x(n)$ to $y(n)$, where
$$y(k + 32i) = \sum_{l=0}^{K-1} H(k,l)\, x\bigl(k + 32(i - l)\bigr), \quad k = 0, 1, \ldots, 31.$$
• Twiddle-estimation: This module identifies the twiddles used in the synthesis filter bank in the codec
under test, and based on this computes the analysis filterbank twiddles in the codec under test. This
information is passed on to the reference SBR decoder where the QMF filter bank is given the same
distribution of the twiddle factors as used in the decoder under test.
• Reference SBR decoder is a reference SBR decoder according to the ISO specification. It generates
a reference signal based on the stored low-band signal and t
...
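The polyphase filtering step and the delay of the Down-sample module described above can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: the matrix H is passed in by the caller because Table AMD8-2 is not reproduced in this section, input samples before n = 0 are treated as zero, and the decimation step is left out. The function names are hypothetical.

```python
def downsampler_delay(K=25):
    """Delay in samples: 32 * (K - 1) / 2, with K the polyphase filter length."""
    return 32 * (K - 1) // 2


def polyphase_filter(x, H):
    """Compute y(k + 32i) = sum over l of H[k][l] * x(k + 32(i - l)).

    x is a list of samples; H has one row per phase k (32 rows in the
    text) and K columns. Samples before the start of x count as zero.
    """
    phases = len(H)
    K = len(H[0])
    y = [0.0] * len(x)
    for n in range(len(x)):
        k, i = n % phases, n // phases
        acc = 0.0
        for l in range(K):
            m = k + phases * (i - l)
            if m >= 0:
                acc += H[k][l] * x[m]
        y[n] = acc
    return y
```

With K = 25 the delay evaluates to 32 · 12 = 384 samples, which is the amount by which the Delay module has to align the stored signal.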