ISO/IEC 14496-2:2001/Amd 2:2002
(Amendment) Information technology - Coding of audio-visual objects - Part 2: Visual - Amendment 2: Streaming video profile
Technologies de l'information — Codage des objets audiovisuels — Partie 2: Codage visuel — Amendement 2: Cours du profil vidéo
ISO/IEC 14496-2:2001/Amd 2:2002 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Coding of audio-visual objects - Part 2: Visual - Amendment 2: Streaming video profile".
ISO/IEC 14496-2:2001/Amd 2:2002 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 14496-2:2001/Amd 2:2002 has the following relationships with other standards: it has inter-standard links to ISO 17892-6:2017, ISO/IEC 14496-2:2001, and ISO/IEC 14496-2:2004, and is excused to ISO/IEC 14496-2:2001. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
INTERNATIONAL STANDARD ISO/IEC 14496-2
Second edition 2001-12-01
AMENDMENT 2 2002-02-01
Information technology — Coding of
audio-visual objects —
Part 2:
Visual
AMENDMENT 2: Streaming video profile
Technologies de l'information — Codage des objets audiovisuels —
Partie 2: Codage visuel
AMENDEMENT 2: Cours du profil vidéo
Reference number
ISO/IEC 14496-2:2001/Amd.2:2002(E)
© ISO/IEC 2002
© ISO/IEC 2002
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body
in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission)
form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical committees established by the
respective organization to deal with particular fields of technical activity. ISO and IEC technical committees
collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in
liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have
established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.
The main task of the joint technical committee is to prepare International Standards. Draft International Standards
adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International
Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this Amendment may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 2 to International Standard ISO/IEC 14496-2:2001 was prepared by Joint Technical Committee
ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and
hypermedia information.
Information technology — Coding of audio-visual objects —
Part 2: Visual
AMENDMENT 2: Streaming video profile
1) Add the following text to the end of ‘Purpose’ of ‘Introduction’:
“
Two profiles are developed in response to the growing need for a video coding method for Streaming Video on
Internet applications. It provides the definition and description of Advanced Simple (AS) Profile and Fine Granularity
Scalable (FGS) Profile. AS Profile provides the capability to distribute single-layer frame based video at a wide
range of bit rates available for the distribution of video on Internet. FGS Profile uses AS Video Object in the base
layer and provides the description of two enhancement layer types - Fine Granularity Scalability (FGS) and FGS
Temporal Scalability (FGST). FGS Profile allows the coverage of a wide range of bit rates for the distribution of
video on Internet with the flexibility of using multiple layers, where there is a wide range of bandwidth variation.
“
2) Add the following text into ‘Introduction’ following ‘Error Resilience’:
“
Fine Granularity Scalability
Fine Granularity Scalability (FGS) provides quality scalability for each VOP. Figure AMD2-1 shows a basic FGS
decoder structure.
[Figure: block diagram. FGS Enhancement Decoding path: Enhancement Bitstream, Bit-plane VLD, Bit-plane Shift, IDCT, Clipping, Enhancement Video. Base layer path: Base Layer Bitstream, VLD, inverse quantization (Q⁻¹), IDCT, Clipping, Base Layer Video (optional output), with Motion Compensation and Frame Memory.]
Figure AMD2-1 — A Basic FGS Decoder Structure
To reconstruct the enhanced VOP, the enhancement bitstream is first decoded using bit-plane VLD. The decoded
block-bps are used to reconstruct the DCT coefficients in the DCT domain which are then right-shifted based on the
frequency weighting and selective enhancement shifting factors. The output of bit-plane shift is the DCT coefficients
of the image domain residues. After the IDCT, the image domain residues are reconstructed. They are added to the
reconstructed clipped base-layer pixels to reconstruct the enhanced VOP. The reconstructed enhanced VOP pixels
are limited into the value range between 0 and 255 by the clipping unit in the enhancement layer to generate the
final enhanced video. The reconstructed base layer video is available as an optional output since each base layer
reconstructed VOP needs to be stored in the frame buffer for motion compensation.
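The final reconstruction step described above (add the image domain residues to the clipped base-layer pixels, then clip to the range 0 to 255) can be sketched as follows. This is an illustrative sketch only, not part of the standard: the function names and the sample values are hypothetical, and the bit-plane VLD, bit-plane shift and IDCT stages are assumed to have already produced the residues.

```python
def clip(value, lo=0, hi=255):
    """Limit value to the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def reconstruct_enhanced_pixels(base_pixels, residues):
    """Add enhancement residues to clipped base-layer pixels, then clip again."""
    return [clip(clip(b) + r) for b, r in zip(base_pixels, residues)]


# Hypothetical sample values; 300 stands in for an unclipped base-layer result.
base_pixels = [250, 10, 128, 300]
residues = [10, -20, 3, -5]
enhanced = reconstruct_enhanced_pixels(base_pixels, residues)
print(enhanced)
```

The base layer is clipped before the addition because the text adds the residues to the clipped base-layer pixels; the second clip corresponds to the clipping unit in the enhancement layer.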
The basic FGS enhancement layer consists of FGS VOPs that enhance the quality of the base-layer VOPs as
shown in Figure AMD2-2.
[Figure: an FGS Layer of seven FGS VOPs, each shown above the corresponding base VOP in the Base Layer.]
Figure AMD2-2 — Basic FGS Enhancement Structure
When FGS temporal scalability (FGST) is used, there are two possible enhancement structures. One structure is to
have two separate enhancement layers for FGS and FGST as shown in Figure AMD2-3 and the other structure is
to have one combined enhancement layer for FGS and FGST as shown in Figure AMD2-4.
[Figure: three stacked layers. FGST Layer: three FGST VOPs. FGS Layer: four FGS VOPs. Base Layer: four base VOPs.]
Figure AMD2-3 — Two Separate Enhancement Layers for FGS and FGST
[Figure: one FGS-FGST Layer of seven VOPs alternating FGS VOP and FGST VOP, starting and ending with an FGS VOP, above a Base Layer of four base VOPs.]
Figure AMD2-4 — One Combined Enhancement Layer for FGS and FGST
In either one of these two structures that include FGS temporal scalability, the prediction for the FGS temporal
scalable VOPs can only be from the base layer. Each FGS temporal scalable VOP has two separate parts. The first
part contains motion vector data and the second part contains the DCT texture data. The syntax for the first part is
similar to that in the temporal scalability described in subclause 6.2. The DCT texture data in the second part are
coded using bit-plane coding in the same way as that in FGS. To distinguish the temporal scalability in subclause
6.2 and FGS temporal scalability, the FGS temporal scalability layer in Figure AMD2-3 is called “FGST layer”. The
combined FGS and FGST layer in Figure AMD2-4 is called “FGS-FGST layer”. The “FGS VOP” shown in Figure
AMD2-3 and Figure AMD2-4 is an fgs vop with fgs_vop_coding_type being ‘I’. The “FGST VOP” shown in Figure
AMD2-3 and Figure AMD2-4 is an fgs vop with fgs_vop_coding_type being ‘P’ or ‘B’.
The code value of profile_and_level_indication in VisualObjectSequence() has been extended to include the
profile and level indications for AS Profile and FGS Profile. The identifier for an enhancement layer is the syntax
video_object_type_indication in VideoObjectLayer(). A unique code is defined for FGS Object Type to indicate
that this VOL contains fgs vops. Another unique code is defined for AS Object Type to indicate that this VOL is the
base-layer. There is a syntax fgs_layer_type in VideoObjectLayer() to indicate whether this VOL is an FGS layer
as shown in Figure AMD2-2 and Figure AMD2-3, or an FGST layer as shown in Figure AMD2-3, or an FGS-FGST
layer as shown in Figure AMD2-4. Similar to the syntax structure in subclause 6.2, under each VOL for FGS, there
is a hierarchy of fgs vop, fgs macroblock, and fgs block. An fgs vop starts with a unique fgs_vop_start_code.
Within each fgs vop, there are multiple vop-bps. Each vop-bp in an fgs vop starts with an fgs_bp_start_code
whose last 5 bits indicate the ID of the vop-bp. In each fgs macroblock, there are 4 block-bps for the luminance
component (Y), 2 block-bps for the two chrominance components (U and V) for the 4:2:0 chrominance format. Each
block-bp is coded by VLC.
“
3) Add the following subclauses in clause 3
"
3.AMD2.1 block-bp: An array of 64 bits, one from each DCT coefficient at the same significant position of
accuracy in a zigzag scan order. When frequency weighting is used, block-bps are formed after the
weighting is applied to the DCT coefficients in an 8x8 block.
3.AMD2.2 end of plane; eop: A symbol to indicate whether a ‘1’ bit is the last ‘1’ bit of a block-bp.
3.AMD2.3 fgs block: An 8-row by 8-column matrix of bits, each from one DCT coefficient at the same significant
position of accuracy, or its coded representation. The usage is clear from the context.
3.AMD2.4 fgs macroblock: The four block-bps of luminance component (Y) and the two (for 4:2:0 chrominance
format) corresponding block-bps of chrominance components (U and V) with the same accuracy
significance coming from the DCT coefficients of a macroblock. It may also be used to refer to the
coded representation of the six block-bps. The usage is clear from the context.
3.AMD2.5 fgs macroblock number: A number for an fgs macroblock within a vop-bp. The fgs macroblock
number of the top-left fgs macroblock in each vop-bp shall be zero. The fgs macroblock number
increments from left to right and from top to bottom.
3.AMD2.6 fgs run: The number of ‘0’ bits preceding a ‘1’ bit within a block-bp.
3.AMD2.7 fgs temporal scalability; FGST: A type of scalability where an enhancement layer uses predictions
from sample data derived from the base layer using motion vectors. The VOP size in the enhancement
layer is the same as that in the base layer. FGST is a specific type of temporal scalability where all
DCT coefficients are coded using bit-plane coding as in FGS.
3.AMD2.8 fgs vop: The pixel differences between the original VOP and the reconstructed VOP in the base layer.
It may be used to refer to the DCT coefficients of the pixel differences or the original VOP. It may also
be used to refer to the coded representation of the DCT coefficients. In the context of FGST, fgs vop
refers to the original temporal scalable VOP. The usage is clear from the context.
3.AMD2.9 fine granularity scalability; FGS: A type of scalability where an enhancement layer uses prediction
from sample data of reconstructed VOP in the base layer. The encoded bitstream for each fgs vop can
be truncated into any number of bits. The truncated bitstream for each fgs vop can be decoded to
provide quality enhancement proportional to the amount of bits in the truncated bitstream of the fgs
vop. The fgs vop has the same size and VOP rate as those of the base layer.
3.AMD2.10 vop-bp: An array of block-bps with the same accuracy significance in an fgs vop. There are three color
components (Y, U, and V) in a vop-bp. Each color component in a vop-bp consists of all the block-bps
of that color.
“
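The definitions of block-bp, fgs run and eop above can be illustrated with a short sketch. It is not normative. It assumes that the 64 coefficients are already in zigzag scan order, that bit-planes are taken from the absolute values starting with the most significant plane, and that frequency weighting is not used. The function names and the sample coefficients are hypothetical.

```python
def block_bit_planes(coeffs):
    """Split 64 coefficient magnitudes (zigzag order) into block-bps, MSB first."""
    if len(coeffs) != 64:
        raise ValueError("expected 64 coefficients")
    mags = [abs(c) for c in coeffs]
    n_planes = max(mags).bit_length()  # 0 planes if every coefficient is zero
    return [[(m >> p) & 1 for m in mags] for p in range(n_planes - 1, -1, -1)]


def run_eop_symbols(block_bp):
    """Return one (fgs run, eop) pair per '1' bit of a block-bp."""
    ones = [i for i, bit in enumerate(block_bp) if bit]
    symbols = []
    prev = -1
    for k, pos in enumerate(ones):
        run = pos - prev - 1               # number of '0' bits before this '1' bit
        eop = int(k == len(ones) - 1)      # 1 only for the last '1' bit
        symbols.append((run, eop))
        prev = pos
    return symbols


# Hypothetical coefficients: 10 = 1010b, 6 = 0110b, 3 = 0011b.
coeffs = [10, 0, 6, 0, 0, 3] + [0] * 58
planes = block_bit_planes(coeffs)
symbols = [run_eop_symbols(bp) for bp in planes]
print(symbols)
```

An all-zero block-bp yields an empty symbol list in this sketch; how such a block-bp is signalled in the bitstream is outside what the sketch covers.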
4) Add the following subclause to subclause 5.2:
“
Definition of start_of_bit_plane() function
The function start_of_bit_plane() returns 1 if the next bit in the bitstream is the first bit of the codes associated with
a vop-bp. Otherwise it returns 0.
“
5) Add the following text to the end of subclause 6.1:
“
In a typical application of FGS, the bitstream at the input of an FGS decoder is a truncated version of the bitstream
at the output of an FGS encoder. It is likely that, at the end of each fgs vop before the next fgs_vop_start_code,
only partial bits of the fgs vop are at the input of the decoder due to truncation of the fgs vop bitstream. Decoding of
the truncated bitstream is not normative. An example of dealing with the truncated bitstream is described in Annex
S. The FGS syntax description in this clause is for a complete bitstream without truncation.
“
6) Replace Table 6-3 in subclause 6.2.1 with the following table:
“
Table 6-3. Start code values
name start code value (hexadecimal)
video_object_start_code 00 through 1F
video_object_layer_start_code 20 through 2F
reserved 30 through 3F
fgs_bp_start_code 40 through 5F
reserved 60 through AF
visual_object_sequence_start_code B0
visual_object_sequence_end_code B1
user_data_start_code B2
group_of_vop_start_code B3
video_session_error_code B4
visual_object_start_code B5
vop_start_code B6
slice_start_code B7
extension_start_code B8
fgs_vop_start_code B9
fba_object_start_code BA
fba_object_plane_start_code BB
mesh_object_start_code BC
mesh_object_plane_start_code BD
still_texture_object_start_code BE
texture_spatial_layer_start_code BF
texture_snr_layer_start_code C0
texture_tile_start_code C1
texture_shape_layer_start_code C2
reserved C3-C5
System start codes (see note) C6 through FF
NOTE System start codes are defined in ISO/IEC 14496-1
“
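As an illustration of how Table 6-3 might be applied, the sketch below maps a start code value to its name and, for an fgs_bp_start_code, extracts the vop-bp ID from the last 5 bits, as described in the Introduction text. It is a non-normative sketch: it assumes the caller has already stripped the start code prefix and passes only the final 8-bit value, and the function names are hypothetical.

```python
# (low, high, name) rows for the ranges in Table 6-3.
RANGES = [
    (0x00, 0x1F, "video_object_start_code"),
    (0x20, 0x2F, "video_object_layer_start_code"),
    (0x30, 0x3F, "reserved"),
    (0x40, 0x5F, "fgs_bp_start_code"),
    (0x60, 0xAF, "reserved"),
    (0xC3, 0xC5, "reserved"),
    (0xC6, 0xFF, "system start code"),
]

# Single-valued rows of Table 6-3, in order from B0 to C2.
SINGLE_NAMES = [
    "visual_object_sequence_start_code", "visual_object_sequence_end_code",
    "user_data_start_code", "group_of_vop_start_code",
    "video_session_error_code", "visual_object_start_code",
    "vop_start_code", "slice_start_code", "extension_start_code",
    "fgs_vop_start_code", "fba_object_start_code",
    "fba_object_plane_start_code", "mesh_object_start_code",
    "mesh_object_plane_start_code", "still_texture_object_start_code",
    "texture_spatial_layer_start_code", "texture_snr_layer_start_code",
    "texture_tile_start_code", "texture_shape_layer_start_code",
]
SINGLES = {0xB0 + i: name for i, name in enumerate(SINGLE_NAMES)}


def classify_start_code(value):
    """Return the Table 6-3 name for an 8-bit start code value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("start code value must fit in 8 bits")
    for low, high, name in RANGES:
        if low <= value <= high:
            return name
    return SINGLES[value]


def vop_bp_id(value):
    """Return the vop-bp ID carried in the last 5 bits of an fgs_bp_start_code."""
    if classify_start_code(value) != "fgs_bp_start_code":
        raise ValueError("not an fgs_bp_start_code")
    return value & 0x1F


print(classify_start_code(0xB9), vop_bp_id(0x45))
```

The 32 values from 40 through 5F match the 32 possible IDs of a 5-bit field, which is why masking with 0x1F is sufficient here.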
7) Replace VideoObjectLayer() in subclause 6.2.3:
“
VideoObjectLayer() { No. of bits Mnemonic
if(next_bits() == video_object_layer_start_code) {
short_video_header = 0
video_object_layer_start_code 32 bslbf
random_accessible_vol 1 bslbf
video_object_type_indication 8 uimsbf
is_object_layer_identifier 1 uimsbf
if (is_object_layer_identifier) {
video_object_layer_verid 4 uimsbf
video_object_layer_priority 3 uimsbf
}
aspect_ratio_info 4 uimsbf
if (aspect_ratio_info == “extended_PAR”) {
par_width 8 uimsbf
par_height 8 uimsbf
}
vol_control_parameters 1 bslbf
if (vol_control_parameters) {
chroma_format 2 uimsbf
low_delay 1 uimsbf
vbv_parameters 1 bslbf
if (vbv_parameters) {
first_half_bit_rate 15 uimsbf
marker_bit 1 bslbf
latter_half_bit_rate 15 uimsbf
marker_bit 1 bslbf
first_half_vbv_buffer_size 15 uimsbf
marker_bit 1 bslbf
latter_half_vbv_buffer_size 3 uimsbf
first_half_vbv_occupancy 11 uimsbf
marker_bit 1 bslbf
latter_half_vbv_occupancy 15 uimsbf
marker_bit 1 bslbf
}
}
video_object_layer_shape 2 uimsbf
if (video_object_layer_shape == "grayscale"
&& video_object_layer_verid != ‘0001’)
video_object_layer_shape_extension 4 uimsbf
marker_bit 1 bslbf
vop_time_increment_resolution 16 uimsbf
marker_bit 1 bslbf
fixed_vop_rate 1 bslbf
if (fixed_vop_rate)
fixed_vop_time_increment 1-16 uimsbf
if (video_object_layer_shape != “binary only”) {
if (video_object_layer_shape == “rectangular”) {
marker_bit 1 bslbf
video_object_layer_width 13 uimsbf
marker_bit 1 bslbf
video_object_layer_height 13 uimsbf
marker_bit 1 bslbf
}
interlaced 1 bslbf
obmc_disable 1 bslbf
if (video_object_layer_verid == ‘0001’)
sprite_enable 1 bslbf
else
sprite_enable 2 uimsbf
if (sprite_enable== “static” || sprite_enable == “GMC”) {
if (sprite_enable != “GMC”) {
sprite_width 13 uimsbf
marker_bit 1 bslbf
sprite_height 13 uimsbf
marker_bit 1 bslbf
sprite_left_coordinate 13 simsbf
marker_bit 1 bslbf
sprite_top_coordinate 13 simsbf
marker_bit 1 bslbf
}
no_of_sprite_warping_points 6 uimsbf
sprite_warping_accuracy 2 uimsbf
sprite_brightness_change 1 bslbf
if (sprite_enable != “GMC”)
low_latency_sprite_enable 1 bslbf
}
if (video_object_layer_verid != ‘0001’ &&
video_object_layer_shape != ”rectangular”)
sadct_disable 1 bslbf
not_8_bit 1 bslbf
if (not_8_bit) {
quant_precision 4 uimsbf
bits_per_pixel 4 uimsbf
}
if (video_object_layer_shape==”grayscale”) {
no_gray_quant_update 1 bslbf
composition_method 1 bslbf
linear_composition 1 bslbf
}
quant_type 1 bslbf
if (quant_type) {
load_intra_quant_mat 1 bslbf
if (load_intra_quant_mat)
intra_quant_mat 8*[2-64] uimsbf
load_nonintra_quant_mat 1 bslbf
if (load_nonintra_quant_mat)
nonintra_quant_mat 8*[2-64] uimsbf
if(video_object_layer_shape==”grayscale”) {
for(i=0; i
load_intra_quant_mat_grayscale 1 bslbf
if(load_intra_quant_mat_grayscale)
intra_quant_mat_grayscale[i] 8*[2-64] uimsbf
load_nonintra_quant_mat_grayscale 1 bslbf
if(load_nonintra_quant_mat_grayscale)
nonintra_quant_mat_grayscale[i] 8*[2-64] uimsbf
}
}
}
if (video_object_layer_verid != ‘0001’)
quarter_sample 1 bslbf
complexity_estimation_disable 1 bslbf
if (!complexity_estimation_disable)
define_vop_complexity_estimation_header()
resync_marker_disable 1 bslbf
data_partitioned 1 bslbf
if(data_partitioned)
reversible_vlc 1 bslbf
if(video_object_layer_verid != ’0001’) {
newpred_enable 1 bslbf
if (newpred_enable) {
requested_upstream_message_type 2 uimsbf
newpred_segment_type 1 bslbf
}
reduced_resolution_vop_enable 1 bslbf
}
scalability 1 bslbf
if (scalability) {
hierarchy_type 1 bslbf
ref_layer_id 4 uimsbf
ref_layer_sampling_direc 1 bslbf
hor_sampling_factor_n 5 uimsbf
hor_sampling_factor_m 5 uimsbf
vert_sampling_factor_n 5 uimsbf
vert_sampling_factor_m 5 uimsbf
enhancement_type 1 bslbf
if(video_object_layer == “binary” &&
hierarchy_type== ‘0’) {
use_ref_shape 1 bslbf
use_ref_texture 1 bslbf
shape_hor_sampling_factor_n 5 uimsbf
shape_hor_sampling_factor_m 5 uimsbf
shape_vert_sampling_factor_n 5 uimsbf
shape_vert_sampling_factor_m 5 uimsbf
}
}
}
else {
if(video_object_layer_verid !=”0001”) {
scalability 1 bslbf
if(scalability) {
shape_hor_sampling_factor_n 5 uimsbf
shape_hor_sampling_factor_m 5 uimsbf
shape_vert_sampling_factor_n 5 uimsbf
shape_vert_sampling_factor_m 5 uimsbf
}
}
resync_marker_disable 1 bslbf
}
next_start_code()
while ( next_bits()== user_data_start_code){
user_data()
}
if (sprite_enable == “static” &&
!low_latency_sprite_enable)
VideoObjectPlane()
do {
if (next_bits() == group_of_vop_start_code)
Group_of_VideoObjectPlane()
VideoObjectPlane()
} while ((next_bits() == group_of_vop_start_code) ||
(next_bits() == vop_start_code))
} else {
short_video_header = 1
do {
video_plane_with_short_header()
} while(next_bits() == short_video_start_marker)
}
}
“
with
“
VideoObjectLayer() { No. of bits Mnemonic
if(next_bits() == video_object_layer_start_code) {
short_video_header = 0
video_object_layer_start_code 32 bslbf
random_accessible_vol 1 bslbf
video_object_type_indication 8 uimsbf
if ( video_object_type_indication == “Fine Granularity Scalable” ) {
fgs_layer_type 2 uimsbf
video_object_layer_priority 3 uimsbf
aspect_ratio_info 4 uimsbf
if (aspect_ratio_info == “extended_PAR”) {
par_width 8 uimsbf
par_height 8 uimsbf
}
vol_control_parameters 1 bslbf
if (vol_control_parameters) {
chroma_format 2 uimsbf
low_delay 1 uimsbf
}
marker_bit 1 bslbf
vop_time_increment_resolution 16 uimsbf
marker_bit 1 bslbf
fixed_vop_rate 1 bslbf
if (fixed_vop_rate)
fixed_vop_time_increment 1-16 uimsbf
marker_bit 1 bslbf
video_object_layer_width 13 uimsbf
marker_bit 1 bslbf
video_object_layer_height 13 uimsbf
marker_bit 1 bslbf
interlaced 1 bslbf
if (fgs_layer_type ==“FGST” || fgs_layer_type ==“FGS_FGST”)
fgs_ref_layer_id 4 uimsbf
if (fgs_layer_type ==“FGS” || fgs_layer_type ==“FGS_FGST”) {
fgs_frequency_weighting_enable 1 bslbf
if ( fgs_frequency_weighting_enable ) {
load_fgs_frequency_weighting_matrix 1 bslbf
if (load_fgs_frequency_weighting_matrix)
fgs_frequency_weighting_matrix 3*[2-64] uimsbf
}
}
if (fgs_layer_type ==“FGST” || fgs_layer_type ==“FGS_FGST”)
{
fgst_frequency_weighting_enable 1 bslbf
if ( fgst_frequency_weighting_enable ) {
load_fgst_frequency_weighting_matrix 1 bslbf
if (load_fgst_frequency_weighting_matrix)
fgst_frequency_weighting_matrix 3*[2-64] uimsbf
}
}
quarter_sample 1 bslbf
fgs_resync_marker_disable 1 bslbf
do {
if (nextbits_bytealigned() == group_of_vop_start_code)
Group_of_VideoObjectPlane()
FGSVideoObjectPlane()
} while((nextbits_bytealigned()==group_of_vop_start_code)||
(nextbits_bytealigned()==fgs_vop_start_code))
} else {
is_object_layer_identifier 1 uimsbf
if (is_object_layer_identifier) {
video_object_layer_verid 4 uimsbf
video_object_layer_priority 3 uimsbf
}
aspect_ratio_info 4 uimsbf
if (aspect_ratio_info == “extended_PAR”) {
par_width 8 uimsbf
par_height 8 uimsbf
}
vol_control_parameters 1 bslbf
if (vol_control_parameters) {
chroma_format 2 uimsbf
low_delay 1 uimsbf
vbv_parameters 1 bslbf
if (vbv_parameters) {
first_half_bit_rate 15 uimsbf
marker_bit 1 bslbf
latter_half_bit_rate 15 uimsbf
marker_bit 1 bslbf
first_half_vbv_buffer_size 15 uimsbf
marker_bit 1 bslbf
latter_half_vbv_buffer_size 3 uimsbf
first_half_vbv_occupancy 11 uimsbf
marker_bit 1 bslbf
latter_half_vbv_occupancy 15 uimsbf
marker_bit 1 bslbf
}
}
video_object_layer_shape 2 uimsbf
if (video_object_layer_shape == "grayscale"
&& video_object_layer_verid != ‘0001’)
video_object_layer_shape_extension 4 uimsbf
marker_bit 1 bslbf
vop_time_increment_resolution 16 uimsbf
marker_bit 1 bslbf
fixed_vop_rate 1 bslbf
if (fixed_vop_rate)
fixed_vop_time_increment 1-16 uimsbf
if (video_object_layer_shape != “binary only”) {
if (video_object_layer_shape == “rectangular”) {
marker_bit 1 bslbf
video_object_layer_width 13 uimsbf
marker_bit 1 bslbf
video_object_layer_height 13 uimsbf
marker_bit 1 bslbf
}
interlaced 1 bslbf
obmc_disable 1 bslbf
if (video_object_layer_verid == ‘0001’)
sprite_enable 1 bslbf
else
sprite_enable 2 uimsbf
if (sprite_enable== “static” || sprite_enable == “GMC”) {
if (sprite_enable != “GMC”) {
sprite_width 13 uimsbf
marker_bit 1 bslbf
sprite_height 13 uimsbf
marker_bit 1 bslbf
sprite_left_coordinate 13 simsbf
marker_bit 1 bslbf
sprite_top_coordinate 13 simsbf
marker_bit 1 bslbf
}
no_of_sprite_warping_points 6 uimsbf
sprite_warping_accuracy 2 uimsbf
sprite_brightness_change 1 bslbf
if (sprite_enable != “GMC”)
low_latency_sprite_enable 1 bslbf
}
if (video_object_layer_verid != ‘0001’ &&
video_object_layer_shape != ”rectangular”)
sadct_disable 1 bslbf
not_8_bit 1 bslbf
if (not_8_bit) {
quant_precision 4 uimsbf
bits_per_pixel 4 uimsbf
}
if (video_object_layer_shape==”grayscale”) {
no_gray_quant_update 1 bslbf
composition_method 1 bslbf
linear_composition 1 bslbf
}
quant_type 1 bslbf
if (quant_type) {
load_intra_quant_mat 1 bslbf
if (load_intra_quant_mat)
intra_quant_mat 8*[2-64] uimsbf
load_nonintra_quant_mat 1 bslbf
if (load_nonintra_quant_mat)
nonintra_quant_mat 8*[2-64] uimsbf
if(video_object_layer_shape==”grayscale”) {
for(i=0; i
load_intra_quant_mat_grayscale 1 bslbf
if(load_intra_quant_mat_grayscale)
intra_quant_mat_grayscale[i] 8*[2-64] uimsbf
load_nonintra_quant_mat_grayscale 1 bslbf
if(load_nonintra_quant_mat_grayscale)
nonintra_quant_mat_grayscale[i] 8*[2-64] uimsbf
}
}
}
if (video_object_layer_verid != ‘0001’)
quarter_sample 1 bslbf
complexity_estimation_disable 1 bslbf
if (!complexity_estimation_disable)
define_vop_complexity_estimation_header()
resync_marker_disable 1 bslbf
data_partitioned 1 bslbf
if(data_partitioned)
reversible_vlc 1 bslbf
if(video_object_layer_verid != ’0001’) {
newpred_enable 1 bslbf
if (newpred_enable) {
requested_upstream_message_type 2 uimsbf
newpred_segment_type 1 bslbf
}
reduced_resolution_vop_enable 1 bslbf
}
scalability 1 bslbf
if (scalability) {
hierarchy_type 1 bslbf
ref_layer_id 4 uimsbf
ref_layer_sampling_direc 1 bslbf
hor_sampling_factor_n 5 uimsbf
hor_sampling_factor_m 5 uimsbf
vert_sampling_factor_n 5 uimsbf
vert_sampling_factor_m 5 uimsbf
enhancement_type 1 bslbf
if(video_object_layer == “binary” &&
hierarchy_type== ‘0’) {
use_ref_shape 1 bslbf
use_ref_texture 1 bslbf
shape_hor_sampling_factor_n 5 uimsbf
shape_hor_sampling_factor_m 5 uimsbf
shape_vert_sampling_factor_n 5 uimsbf
shape_vert_sampling_factor_m 5 uimsbf
}
}
} else {
if(video_object_layer_verid !=”0001”) {
scalability 1 bslbf
if(scalability) {
shape_hor_sampling_factor_n 5 uimsbf
shape_hor_sampling_factor_m 5 uimsbf
shape_vert_sampling_factor_n 5 uimsbf
shape_vert_sampling_factor_m 5 uimsbf
}
}
resync_marker_disable 1 bslbf
}
next_start_code()
while ( next_bits()== user_data_start_code) {
user_data()
}
if (sprite_enable == “static” && !low_latency_sprite_enable)
VideoObjectPlane()
do {
if (next_bits() == group_of_vop_start_code)
Group_of_VideoObjectPlane()
VideoObjectPlane()
} while ((next_bits() == group_of_vop_start_code) ||
(next_bits() == vop_start_code))
}
} else {
short_video_header = 1
do {
video_plane_with_short_header()
} while(next_bits() == short_video_start_marker)
}
}
“
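The way fgs_layer_type selects which FGS-specific elements appear in the replacement VideoObjectLayer() syntax can be summarized in a short sketch. It is illustrative only: the function name is hypothetical, the layer types are passed as the symbolic strings used in the syntax rather than as their 2-bit code values, and only the top-level conditional elements are listed.

```python
def fgs_vol_conditional_elements(fgs_layer_type):
    """List the top-level VOL elements whose presence depends on fgs_layer_type."""
    if fgs_layer_type not in ("FGS", "FGST", "FGS_FGST"):
        raise ValueError("unknown fgs_layer_type")
    elements = []
    if fgs_layer_type in ("FGST", "FGS_FGST"):
        elements.append("fgs_ref_layer_id")
    if fgs_layer_type in ("FGS", "FGS_FGST"):
        elements.append("fgs_frequency_weighting_enable")
    if fgs_layer_type in ("FGST", "FGS_FGST"):
        elements.append("fgst_frequency_weighting_enable")
    return elements


for layer_type in ("FGS", "FGST", "FGS_FGST"):
    print(layer_type, fgs_vol_conditional_elements(layer_type))
```

The order of the three tests mirrors the order in which the elements appear in the syntax, so the returned list is also in bitstream order.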
8) Add the following subclause 6.2.14 after subclause 6.2.13:
“
6.2.14 FGS Video Object
6.2.14.1 FGS Video Object Plane
FGSVideoObjectPlane() { No. of bits Mnemonic
fgs_vop_start_code 32 bslbf
fgs_vop_coding_type 2 uimsbf
do {
modulo_time_base 1 bslbf
} while (modulo_time_base != ‘0’)
marker_bit 1 bslbf
vop_time_increment 1-16 uimsbf
marker_bit 1 bslbf
fgs_vop_max_level_y 5 uimsbf
fgs_vop_max_level_u 5 uimsbf
fgs_vop_max_level_v 5 uimsbf
marker_bit 1 bslbf
fgs_vop_number_of_vop_bp_coded 5 uimsbf
fgs_vop_mc_bit_plane_used 5 uimsbf
fgs_vop_selective_enhancement_enable 1 bslbf
if ( fgs_vop_coding_type != “I” ) {
if (interlaced)
top_field_first 1 bslbf
vop_fcode_forward 3 uimsbf
if (fgs_vop_coding_type == “B”)
vop_fcode_backward 3 uimsbf
fgs_ref_select_code 2 uimsbf
do {
fgs_motion_macroblock()
} while( nextbits_bytealigned() != ‘000 0000 0000 0000
0000 0000’)
}
next_start_code()
if (nextbits_bytealigned () == fgs_bp_start_code) {
while(nextbits_bytealigned() != ‘000 0000 0000 0000
0000 0000’ || nextbits_bytealigned () == fgs_bp_start_code)
{
if ( start_of_bit_plane() )
fgs_bp_start_code 32 bslbf
else {
if ( ! fgs_resync_marker_disable &&
nextbits_bytealigned () == fgs_resync_marker )
{
next_resync_marker()
fgs_resync_marker 23 uimsbf
fgs_macroblock_number 1-14 vlclbf
}
}
fgs_macroblock()
}
next_start_code()
}
}
6.2.14.2 FGS Motion Macroblock
fgs_motion_macroblock() { No. of bits Mnemonic
if (fgs_vop_coding_type == “P”) {
fgs_not_coded 1 bslbf
if ( !fgs_not_coded ) {
fgs_p_mb_type 1 bslbf
if (interlaced)
fgs_motion_interlaced_information()
if ( fgs_p_mb_type == 0 ) {
fgs_motion_vector(“forward”)
if (interlaced && field_prediction)
fgs_motion_vector(“forward”)
}
if (fgs_p_mb_type == 1) {
for (j=0; j < 4; j++)
fgs_motion_vector(“forward”)
}
}
} else {
fgs_modb 1 bslbf
if ( !fgs_modb ) {
fgs_b_mb_type 1-4 vlclbf
if (interlaced)
fgs_motion_interlaced_information()
if ( fgs_b_mb_type==‘0001’ ||
fgs_b_mb_type==‘01’ ) {
fgs_motion_vector(“forward”)
if (interlaced && field_prediction)
fgs_motion_vector(“forward”)
}
if (fgs_b_mb_type == ‘01’ || fgs_b_mb_type ==
‘001’) {
fgs_motion_vector(“backward”)
if (interlaced && field_prediction)
fgs_motion_vector(“backward”)
}
if (fgs_b_mb_type == ‘1’)
fgs_motion_vector(“direct”)
}
}
}
6.2.14.3 FGS Motion Interlaced Information
fgs_motion_interlaced_information( ) { No. of bits Mnemonic
fgs_field_prediction 1 bslbf
if (fgs_field_prediction) {
if (fgs_vop_coding_type == “P” ||
(fgs_vop_coding_type==“B” &&
fgs_b_mb_type!=“001”)){
fgs_forward_top_field_reference 1 bslbf
fgs_forward_bottom_field_reference 1 bslbf
}
if ((fgs_vop_coding_type == “B”) &&
(fgs_b_mb_type != “1”) ) {
fgs_backward_top_field_reference 1 bslbf
fgs_backward_bottom_field_reference 1 bslbf
}
}
}
6.2.14.4 FGS Motion Vector
fgs_motion_vector ( mode ) { No. of bits Mnemonic
if ( mode == “direct” ) {
horizontal_mv_data 1-13 vlclbf
vertical_mv_data 1-13 vlclbf
}
else if ( mode == “forward” ) {
horizontal_mv_data 1-13 vlclbf
if ((vop_fcode_forward != 1)&&(horizontal_mv_data != 0))
horizontal_mv_residual 1-6 uimsbf
vertical_mv_data 1-13 vlclbf
if ((vop_fcode_forward != 1)&&(vertical_mv_data != 0))
vertical_mv_residual 1-6 uimsbf
}
else if ( mode == “backward” ) {
horizontal_mv_data 1-13 vlclbf
if ((vop_fcode_backward != 1)&&(horizontal_mv_data != 0))
horizontal_mv_residual 1-6 uimsbf
vertical_mv_data 1-13 vlclbf
if ((vop_fcode_backward != 1)&&(vertical_mv_data != 0))
vertical_mv_residual 1-6 uimsbf
}
}
6.2.14.5 FGS Macroblock
fgs_macroblock() { No. of bits Mnemonic
if ( fgs_vop_bp_id < 2 )
fgs_cbp 1-9 vlclbf
if ( fgs_vop_selective_enhancement_enable==1 ) {
if ( !mb_shift_factor_received && non_zero_macroblock )
fgs_shifted_bits 1-5 vlclbf
}
if ( interlaced==1 ) {
if ( !dct_type_received && non_zero_macroblock )
fgs_dct_type 1 bslbf
}
for ( i=0; i<6; i++ ) {
if ( start_decode==1 )
fgs_block()
}
}
NOTE 1 — In the syntax of fgs vop, there are three elements: fgs_vop_max_level_y, fgs_vop_max_level_u, and
fgs_vop_max_level_v. The maximum value of these three elements is the number of vop-bps in an fgs vop. If any one of the
three elements has a smaller value than the number of vop-bps in the fgs vop, the color component corresponding to the
smaller value element is absent in one or more vop-bps. If the difference between the number of vop-bps and the value of the
element is k, the corresponding color component is absent in the first k vop-bps. There is no need to decode fgs blocks of the
color component absent in the vop-bps. start_decode is defined to be a flag to indicate whether decoding should be performed
for an fgs block depending on whether the fgs block belongs to an absent color component or not. The value of start_decode is
0 if the fgs block belongs to an absent color component. Otherwise, the value of start_decode is 1 and decoding is performed
for the fgs block.
NOTE 2 — mb_shift_factor_received is a flag to indicate whether fgs_shifted_bits has been decoded in the previous fgs
macroblock with the same fgs macroblock number. It is initialized to 0 before decoding the first vop-bp. It is set to 1 after
fgs_shifted_bits is decoded.
NOTE 3 — non_zero_macroblock is a flag to indicate whether every block-bp in the fgs macroblock is an all-zero block-bp. If
every block-bp in the fgs macroblock is an all-zero block-bp, the value of this flag is 0. Otherwise, its value is 1.
NOTE 4 — dct_type_received is a flag to indicate whether fgs_dct_type has been decoded in the previous fgs macroblock with
the same fgs macroblock number. It is initialized to 0 before decoding the first vop-bp. It is set to 1 after fgs_dct_type is
decoded.
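As an informal illustration of NOTE 1 (not part of the normative text), the following Python sketch derives start_decode for the six blocks of an fgs macroblock. Two things are assumed rather than taken from this subclause: vop-bps are counted from 0 by a hypothetical index `vop_bp_index`, and the six blocks are ordered as four luminance blocks followed by U and V.

```python
def start_decode_flags(max_level_y, max_level_u, max_level_v, vop_bp_index):
    """Return the start_decode flag for each of the six blocks.

    Assumptions (hypothetical, for illustration only):
      * vop_bp_index counts vop-bps from 0 within the fgs vop;
      * block order is Y, Y, Y, Y, U, V.
    """
    levels = {"Y": max_level_y, "U": max_level_u, "V": max_level_v}
    # The number of vop-bps is the maximum of the three max-level elements.
    num_vop_bps = max(levels.values())
    block_components = ["Y", "Y", "Y", "Y", "U", "V"]

    flags = []
    for component in block_components:
        # A component whose max level is k below the number of vop-bps
        # is absent in the first k vop-bps.
        k = num_vop_bps - levels[component]
        flags.append(0 if vop_bp_index < k else 1)
    return flags


# Example: Y has 5 bit-planes, U has 3, V has 4.
for index in range(3):
    print(index, start_decode_flags(5, 3, 4, index))
```

In this example U is skipped in the first two vop-bps and V in the first one, so fgs_block() is only invoked for the luminance blocks in the very first vop-bp.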
6.2.14.6 FGS Block
fgs_block() { No. of bits Mnemonic
if (fgs_vop_bp_id>=2 && previous_fgs_msb_not_reached==1 )
fgs_msb_not_reached 1 bslbf
if ( fgs_msb_not_reached == 0 ) {
while ( eop == 0 ) {
fgs_run_eop_code 1-16 vlclbf
if (coeff_msb_not_reached ==1)
fgs_sign_bit 1 bslbf
}
}
}
NOTE 1 — previous_fgs_msb_not_reached is defined to be fgs_msb_not_reached decoded in the previous block-bp of the
same 8x8 DCT block.
NOTE 2 — eop is defined to be the EOP flag resulting from decoding the most recent fgs_run_eop_code. eop is reset to 0 at
the beginning of fgs_block().
NOTE 3 — coeff_msb_not_reached is defined to be an internal flag indicating, with a value ‘1’, that the MSB of the non-zero
DCT coefficient associated with the fgs_run_eop_code above was not reached. The value of this flag is changed to ‘0’ when the
MSB of the non-zero DCT coefficient is reached.
“
9) Replace Table 6-4 in subclause 6.3.2 with the following table:
“
Table 6-4 — Meaning of visual_object_verid
visual_object_verid Meaning
0000 reserved
0001 object type listed in Table 9-1
0010 object type listed in Table V2-39
0011 reserved
0100 object type listed in Table AMD1-40
0101 object type listed in Table AMD2-13
0110 - 1111 reserved
“
10) Replace Table 6-10 in subclause 6.3.3 with the following table:
“
Table 6-10 — FLC table for video_object_type_indication
Video Object Type Code
Reserved 00000000
Simple Object Type 00000001
Simple Scalable Object Type 00000010
Core Object Type 00000011
Main Object Type 00000100
N-bit Object Type 00000101
Basic Anim. 2D Texture 00000110
Anim. 2D Mesh 00000111
Simple Face 00001000
Still Scalable Texture 00001001
Advanced Real Time Simple 00001010
Core Scalable 00001011
Advanced Coding Efficiency 00001100
Advanced Scalable Texture 00001101
Simple FBA 00001110
Simple Studio 00001111
Core Studio 00010000
Advanced Simple 00010001
Fine Granularity Scalable 00010010
Reserved 00010011 - 11111111
“
11) Add the following to subclause 6.3.3 after Table 6-10:
“
fgs_layer_type – This is a 2-bit code indicating whether this layer is FGS only, FGST only, or a combination of
FGS and FGST. Table AMD2-1 shows the codes and the meanings.
Table AMD2-1 — Code for fgs_layer_type
Code Meaning
00 reserved
01 FGS
10 FGST
11 FGS-FGST
“
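For illustration, the codes of Table AMD2-1 can be mapped to symbolic names as in the following Python sketch. The names `FgsLayerType` and `parse_fgs_layer_type` are hypothetical, and rejecting the reserved code with an exception is one possible design choice, not a behaviour required by the text.

```python
from enum import Enum


class FgsLayerType(Enum):
    """Values of the 2-bit fgs_layer_type code (Table AMD2-1)."""
    FGS = 0b01
    FGST = 0b10
    FGS_FGST = 0b11


def parse_fgs_layer_type(code: int) -> FgsLayerType:
    """Map a 2-bit code to its meaning; code 00 is reserved."""
    if not 0 <= code <= 0b11:
        raise ValueError("fgs_layer_type is a 2-bit code")
    if code == 0b00:
        raise ValueError("fgs_layer_type code 00 is reserved")
    return FgsLayerType(code)


print(parse_fgs_layer_type(0b10))
```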
12) Replace Table 6-11 in subclause 6.3.3 with the following table:
“
Table 6-11 – Meaning of video_object_layer_verid
video_object_layer_verid Meaning
0000 reserved
0001 object type listed in Table 9-1
0010 object type listed in Table V2-39
0011 reserved
0100 object type listed in Table AMD1-40
0101 object type listed in Table AMD2-13
0110 - 1111 reserved
“
13) Replace the following in subclause 6.3.3:
“
video_object_layer_priority – This is a 3-bit code which specifies the priority of the video object layer. It takes
values between 1 and 7, with 1 representing the highest priority and 7, the lowest priority. The value of zero is
reserved.
“
with
“
video_object_layer_priority – This is a 3-bit code which specifies the priority of the video object layer. It takes
values between 1 and 7, with 1 representing the highest priority and 7 the lowest priority. The value of zero is
reserved. For the transmission of FGS and FGST in two VOLs, the relative transmission priority of an FGS VOL vs.
that of an FGST VOL can be specified by setting this parameter in the FGS VOL relative to the same parameter in
the FGST VOL.
“
14) Add the following to subclause 6.3.3 after ‘Interlaced’:
“
fgs_ref_layer_id – This is a 4-bit unsigned integer with value between 0 and 15. It indicates the layer to be used
as reference for prediction in the case of fgs_layer_type being FGST or FGS_FGST.
fgs_frequency_weighting_enable – This is a one-bit flag to indicate that frequency weighting is used in this VOL,
when set to ‘1’. Otherwise, when this flag is set to ‘0’, frequency weighting is not used. The default frequency
weighting matrix is an all zero matrix when fgs_frequency_weighting_enable is ‘1’.
load_fgs_frequency_weighting_matrix – This is a one-bit flag which is set to ‘1’ when
fgs_frequency_weighting_matrix follows. If it is set to ‘0’ then the default frequency weighting matrix is used.
fgs_frequency_weighting_matrix – This is a list of 2 to 64 three-bit unsigned integers. The integers are in zigzag
scan order representing the fgs_frequency_weighting_matrix. A value of 0 indicates that no more values are
transmitted and the remaining, non-transmitted values are set to zero.
fgst_frequency_weighting_enable – This is a one-bit flag to indicate that frequency weighting is used in this VOL,
when set to ‘1’. Otherwise, when this flag is set to ‘0’, frequency weighting is not used. The default frequency
weighting matrix is an all zero matrix when fgst_frequency_weighting_enable is ‘1’.
load_fgst_frequency_weighting_matrix – This is a one-bit flag which is set to ‘1’ when
fgst_frequency_weighting_matrix follows. If it is set to ‘0’ then the default matrix is used.
fgst_frequency_weighting_matrix – This is a list of 2 to 64 three-bit unsigned integers. The integers are in zigzag
scan order representing the fgst_frequency_weighting_matrix. A value of 0 indicates that no more values are
transmitted and the remaining, non-transmitted values are set to zero.
fgs_resync_marker_disable – This is a one-bit flag which when set to ‘1’ indicates that there is no
fgs_resync_marker in coded fgs vops of this VOL. When this flag is set to ‘0’, it indicates that fgs_resync_marker
may be used in coded fgs vops of this VOL.
“
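The reading rule for fgs_frequency_weighting_matrix and fgst_frequency_weighting_matrix can be sketched informally in Python as follows. The `BitReader` class and the function names are hypothetical helpers, the 8x8 zigzag order is generated in the conventional way, and the sketch assumes that no terminating zero is read once all 64 values have been transmitted.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def zigzag_order(n: int = 8):
    """Return (row, col) pairs of an n x n block in conventional zigzag order."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()  # even anti-diagonals run upwards
        order.extend(diagonal)
    return order


def read_frequency_weighting_matrix(reader: BitReader):
    """Read three-bit values in zigzag order until a 0 or 64 values are read."""
    matrix = [[0] * 8 for _ in range(8)]  # non-transmitted values stay zero
    for row, col in zigzag_order():
        value = reader.read(3)
        if value == 0:
            break  # no more values are transmitted
        matrix[row][col] = value
    return matrix


# Values 3, 2, 1 followed by the terminating 0: bits 011 010 001 000.
matrix = read_frequency_weighting_matrix(BitReader(bytes([0x68, 0x80])))
print(matrix[0][:3], matrix[1][:3])
```

A parser would call such a routine only when the corresponding load flag is ‘1’; otherwise the default all-zero matrix applies.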
15) Replace the following in subclause 6.3.3:
“
sprite_enable: When video_object_layer_verid == ‘0001’, this is a one-bit flag which when set to ‘1’ indicates the
usage of static (basic or low latency) sprite coding. When video_object_layer_verid == ‘0002’, this is a two-bit
unsigned integer which indicates the usage of static sprite coding or global motion compensation (GMC). Table V2-2
shows the meaning of various codewords. An S-VOP with sprite_enable == “GMC” is referred to as an S (GMC)-VOP
in this document.
Table V2-2 – Meaning of sprite_enable codewords
sprite_enable (video_object_layer_verid == ‘0001’)   sprite_enable (video_object_layer_verid == ‘0002’)   Sprite Coding Mode
0   00   sprite not used
1   01   static (Basic/Low Latency)
−   10   GMC (Global Motion Compensation)
−   11   reserved
“
with
“
sprite_enable: When video_object_layer_verid == ‘0001’, this is a one-bit flag which when set to ‘1’ indicates the
usage of static (basic or low latency) sprite coding. When video_object_layer_verid == ‘0010’ or
video_object_layer_verid == ‘0101’, this is a two-bit unsigned integer which indicates the usage of static sprite
coding or global motion compensation (GMC). Table V2-2 shows the meaning of various codewords. An S-VOP
with sprite_enable == “GMC” is referred to as an S (GMC)-VOP in this document.
Table V2-2 – Meaning of sprite_enable codewords
sprite_enable (video_object_layer_verid == ‘0001’)   sprite_enable (video_object_layer_verid == ‘0010’ || video_object_layer_verid == ‘0101’)   Sprite Coding Mode
0   00   sprite not used
1   01   static (Basic/Low Latency)
−   10   GMC (Global Motion Compensation)
−   11   reserved
“
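As an informal illustration of the revised sprite_enable semantics, the following Python sketch selects the field width from video_object_layer_verid and looks up the coding mode of Table V2-2. The function names are hypothetical, and raising an error for verid values that the text does not mention is an assumption of this sketch.

```python
SPRITE_MODES = {
    0b00: "sprite not used",
    0b01: "static (Basic/Low Latency)",
    0b10: "GMC (Global Motion Compensation)",
    0b11: "reserved",
}


def sprite_enable_width(video_object_layer_verid: int) -> int:
    """Return the number of bits of sprite_enable for a given verid."""
    if video_object_layer_verid == 0b0001:
        return 1
    if video_object_layer_verid in (0b0010, 0b0101):
        return 2
    # Other verid values are not covered by the text above (assumption).
    raise ValueError("sprite_enable width not specified for this verid")


def sprite_coding_mode(video_object_layer_verid: int, sprite_enable: int) -> str:
    """Look up the sprite coding mode for a decoded sprite_enable value."""
    width = sprite_enable_width(video_object_layer_verid)
    if not 0 <= sprite_enable < (1 << width):
        raise ValueError("sprite_enable does not fit in the field width")
    return SPRITE_MODES[sprite_enable]


print(sprite_coding_mode(0b0101, 0b10))
```

The one-bit values 0 and 1 share their meaning with the two-bit codes 00 and 01, so a single lookup table suffices.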
16) Replace the following in subclause 6.3.3:
“
quarter_sample: This is a one-bit flag which when set to ‘0’ indicates that half sample mode and when set to ‘1’
indicates that quarter sample mode shall be used for motion compensation of the luminance component.
“
with
“
quarter_sample: This is a one-bit flag which when set to ‘0’ indicates that half sample mode and when set to ‘1’
indicates that quarter sample mode shall be used for motion compensation of the luminance component. For FGST
or FGS_FGST enhancement layer, this flag shall be 0.
“
17) Add the following subclause 6.3.14 after subclause 6.3.13:
“
6.3.14 FGS Video Object
6.3.14.1 FGS Video Object Plane
fgs_vop_start_code – This is the bit string ‘000001B9’ in hexadecimal. It marks the start of an fgs vop.
fgs_vop_coding_type – The fgs_vop_coding_type identifies whether an fgs vop is an FGS coding type (I),
predictive-coded FGS coding type (P), or bidirectionally predictive-coded FGS coding type (B). The meaning of
fgs_vop_coding_type is defined in Tabl
...







