Information technology — Coding of audio-visual objects — Part 10: Advanced Video Coding — Amendment 2: MVC extensions for inclusion of depth maps

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé — Amendement 2: Extensions du codage vidéo multivues pour l'inclusion de cartes de profondeur

General Information

Status
Withdrawn
Publication Date
18-Sep-2013
Withdrawal Date
18-Sep-2013
Current Stage
9599 - Withdrawal of International Standard
Completion Date
27-Aug-2014
Standard
ISO/IEC 14496-10:2012/Amd 2:2013 - MVC extensions for inclusion of depth maps
English language
84 pages

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 14496-10
Seventh edition 2012-05-01
AMENDMENT 2 2013-09-15


Information technology — Coding of
audio-visual objects —
Part 10:
Advanced Video Coding
AMENDMENT 2: MVC extensions for
inclusion of depth maps
Technologies de l'information — Codage des objets audiovisuels —
Partie 10: Codage visuel avancé
AMENDEMENT 2: Extensions du codage vidéo multivues pour
l'inclusion de cartes de profondeur




Reference number: ISO/IEC 14496-10:2012/Amd.2:2013(E)
© ISO 2013


COPYRIGHT PROTECTED DOCUMENT


©  ISO 2013
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any
means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission.
Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 2 to ISO/IEC 14496-10:2012 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.


Information technology — Coding of audio-visual objects —
Part 10:
Advanced Video Coding
AMENDMENT 2: MVC extensions for inclusion of depth maps
In 0.6, add the following paragraph after the paragraph that starts with "Multiview video coding":
An extension of multiview video coding that additionally supports the inclusion of depth maps is specified in Annex I,
allowing the construction of bitstreams that represent multiple views with corresponding depth views. In a similar
manner as with the multiview video coding specified in Annex H, bitstreams encoded as specified in Annex I may also
contain sub-bitstreams that conform to this Specification.

In 0.7, add the following paragraph after the paragraph that starts with "Annex H specifies":
Annex I specifies MVC extensions for inclusion of depth maps, referred to as multiview video coding with depth
(MVCD). The reader is referred to Annex I for the entire decoding process for MVCD, which is specified there with
references being made to clauses 2-9 and Annexes A-E and Annex H. Subclause I.10 specifies one profile for MVCD
(Multiview and Depth).

In Clause 2, add the following additional normative reference:
– ISO 12232:2006, Photography – Digital still cameras – Determination of exposure index, ISO speed
ratings, standard output sensitivity, and recommended exposure index.
In Clause 4, add the following additional abbreviation:
MVCD Multiview Video Coding with Depth
In 7.3.1, replace the syntax table with:

nal_unit( NumBytesInNALunit ) { C Descriptor
forbidden_zero_bit All f(1)
nal_ref_idc All u(2)
nal_unit_type All u(5)
NumBytesInRBSP = 0
nalUnitHeaderBytes = 1

if( nal_unit_type = = 14 | | nal_unit_type = = 20 | |
 nal_unit_type = = 21 ) {
 svc_extension_flag All u(1)
 if( svc_extension_flag )
  nal_unit_header_svc_extension( ) /* specified in Annex G */ All
 else
  nal_unit_header_mvc_extension( ) /* specified in Annex H */ All
 nalUnitHeaderBytes += 3
}
for( i = nalUnitHeaderBytes; i < NumBytesInNALunit; i++ ) {
 if( i + 2 < NumBytesInNALunit && next_bits( 24 ) = = 0x000003 ) {
  rbsp_byte[ NumBytesInRBSP++ ] All b(8)
  rbsp_byte[ NumBytesInRBSP++ ] All b(8)
  i += 2
  emulation_prevention_three_byte /* equal to 0x03 */ All f(8)
 } else
  rbsp_byte[ NumBytesInRBSP++ ] All b(8)
}
}
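
The nal_unit( ) syntax above can be illustrated with a minimal Python sketch. It assumes the input is a single NAL unit with any start code already removed. The function name parse_nal_unit and the dictionary layout are hypothetical, and the three extension header bytes are skipped rather than decoded, apart from svc_extension_flag.

```python
def parse_nal_unit(nal: bytes):
    """Split one NAL unit into header fields and RBSP bytes (illustrative only)."""
    if not nal:
        raise ValueError("empty NAL unit")
    header = {
        "forbidden_zero_bit": nal[0] >> 7,
        "nal_ref_idc": (nal[0] >> 5) & 0x3,
        "nal_unit_type": nal[0] & 0x1F,
    }
    header_bytes = 1
    if header["nal_unit_type"] in (14, 20, 21):
        if len(nal) < 4:
            raise ValueError("truncated NAL unit header extension")
        # svc_extension_flag is the first bit after the one-byte header.
        header["svc_extension_flag"] = nal[1] >> 7
        header_bytes += 3  # extension bytes are skipped, not decoded, in this sketch
    rbsp = bytearray()
    i = header_bytes
    while i < len(nal):
        if i + 2 < len(nal) and nal[i:i + 3] == b"\x00\x00\x03":
            rbsp += b"\x00\x00"  # keep the two zero bytes
            i += 3               # drop emulation_prevention_three_byte
        else:
            rbsp.append(nal[i])
            i += 1
    return header, bytes(rbsp)
```

For example, the five bytes 0x67 0x00 0x00 0x03 0x01 give nal_unit_type 7 and the RBSP bytes 0x00 0x00 0x01.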

In 7.3.2.1.1, replace the syntax table with:
seq_parameter_set_data( ) { C Descriptor
profile_idc 0 u(8)
constraint_set0_flag 0 u(1)
constraint_set1_flag 0 u(1)
constraint_set2_flag 0 u(1)
constraint_set3_flag 0 u(1)
constraint_set4_flag 0 u(1)
constraint_set5_flag 0 u(1)
reserved_zero_2bits /* equal to 0 */ 0 u(2)
level_idc 0 u(8)
seq_parameter_set_id 0 ue(v)
if( profile_idc = = 100 | | profile_idc = = 110 | |
 profile_idc = = 122 | | profile_idc = = 244 | | profile_idc = = 44 | |
 profile_idc = = 83 | | profile_idc = = 86 | | profile_idc = = 118 | |
 profile_idc = = 128 | | profile_idc = = 138 ) {
 chroma_format_idc 0 ue(v)
 if( chroma_format_idc = = 3 )
  separate_colour_plane_flag 0 u(1)
 bit_depth_luma_minus8 0 ue(v)
 bit_depth_chroma_minus8 0 ue(v)
 qpprime_y_zero_transform_bypass_flag 0 u(1)
 seq_scaling_matrix_present_flag 0 u(1)
 if( seq_scaling_matrix_present_flag )
  for( i = 0; i < ( ( chroma_format_idc != 3 ) ? 8 : 12 ); i++ ) {
   seq_scaling_list_present_flag[ i ] 0 u(1)
   if( seq_scaling_list_present_flag[ i ] )
    if( i < 6 )
     scaling_list( ScalingList4x4[ i ], 16, UseDefaultScalingMatrix4x4Flag[ i ] ) 0
    else
     scaling_list( ScalingList8x8[ i − 6 ], 64, UseDefaultScalingMatrix8x8Flag[ i − 6 ] ) 0
  }
}
log2_max_frame_num_minus4 0 ue(v)
pic_order_cnt_type 0 ue(v)
if( pic_order_cnt_type = = 0 )
 log2_max_pic_order_cnt_lsb_minus4 0 ue(v)
else if( pic_order_cnt_type = = 1 ) {
 delta_pic_order_always_zero_flag 0 u(1)
 offset_for_non_ref_pic 0 se(v)
 offset_for_top_to_bottom_field 0 se(v)
 num_ref_frames_in_pic_order_cnt_cycle 0 ue(v)
 for( i = 0; i < num_ref_frames_in_pic_order_cnt_cycle; i++ )
  offset_for_ref_frame[ i ] 0 se(v)
}
max_num_ref_frames 0 ue(v)
gaps_in_frame_num_value_allowed_flag 0 u(1)
pic_width_in_mbs_minus1 0 ue(v)
pic_height_in_map_units_minus1 0 ue(v)
frame_mbs_only_flag 0 u(1)
if( !frame_mbs_only_flag )
 mb_adaptive_frame_field_flag 0 u(1)
direct_8x8_inference_flag 0 u(1)
frame_cropping_flag 0 u(1)
if( frame_cropping_flag ) {
 frame_crop_left_offset 0 ue(v)
 frame_crop_right_offset 0 ue(v)
 frame_crop_top_offset 0 ue(v)
 frame_crop_bottom_offset 0 ue(v)
}
vui_parameters_present_flag 0 u(1)
if( vui_parameters_present_flag )
 vui_parameters( ) 0
}
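
The descriptors u(n), ue(v) and se(v) used in the table above can be sketched in Python as follows. The Exp-Golomb mapping follows clause 9.1 of the base standard, which is not reproduced in this excerpt. The class BitReader and the function parse_sps_prefix are hypothetical names, the input is assumed to be RBSP data with emulation prevention bytes already removed, and only the first few fields of seq_parameter_set_data( ) are read.

```python
class BitReader:
    """Reads bits most-significant-bit first from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n: int) -> int:
        """u(n): unsigned integer using n bits."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self) -> int:
        """ue(v): unsigned Exp-Golomb code."""
        leading_zeros = 0
        while self.u(1) == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.u(leading_zeros)

    def se(self) -> int:
        """se(v): signed Exp-Golomb code (odd codeNum positive, even negative)."""
        k = self.ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)


def parse_sps_prefix(reader: BitReader) -> dict:
    """Read the fixed-position fields at the start of seq_parameter_set_data( )."""
    fields = {"profile_idc": reader.u(8)}
    for n in range(6):
        fields[f"constraint_set{n}_flag"] = reader.u(1)
    fields["reserved_zero_2bits"] = reader.u(2)
    fields["level_idc"] = reader.u(8)
    fields["seq_parameter_set_id"] = reader.ue()
    return fields
```

A full parser would continue with the conditional fields in the order given by the table; this sketch stops at seq_parameter_set_id.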


In 7.3.2.1.3, replace the syntax table with:
subset_seq_parameter_set_rbsp( ) { C Descriptor
seq_parameter_set_data( ) 0
if( profile_idc = = 83 | | profile_idc = = 86 ) {
 seq_parameter_set_svc_extension( ) /* specified in Annex G */ 0
 svc_vui_parameters_present_flag 0 u(1)
 if( svc_vui_parameters_present_flag = = 1 )
  svc_vui_parameters_extension( ) /* specified in Annex G */ 0
} else if( profile_idc = = 118 | | profile_idc = = 128 ) {
 bit_equal_to_one /* equal to 1 */ 0 f(1)
 seq_parameter_set_mvc_extension( ) /* specified in Annex H */ 0
 mvc_vui_parameters_present_flag 0 u(1)
 if( mvc_vui_parameters_present_flag = = 1 )
  mvc_vui_parameters_extension( ) /* specified in Annex H */ 0
} else if ( profile_idc = = 138 ) {
 bit_equal_to_one /* equal to 1 */ 0 f(1)
 seq_parameter_set_mvcd_extension( ) /* specified in Annex I */ 0
}
additional_extension2_flag 0 u(1)
if( additional_extension2_flag = = 1 )
 while( more_rbsp_data( ) )
  additional_extension2_data_flag 0 u(1)
rbsp_trailing_bits( ) 0
}
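
The profile_idc branching in subset_seq_parameter_set_rbsp( ) can be summarised by a small lookup. This is a sketch only: the function name subset_sps_extension is hypothetical, and it reports which extension structure the table selects without parsing it.

```python
from typing import Optional


def subset_sps_extension(profile_idc: int) -> Optional[str]:
    """Return the name of the extension structure selected by profile_idc."""
    if profile_idc in (83, 86):
        return "seq_parameter_set_svc_extension"   # specified in Annex G
    if profile_idc in (118, 128):
        return "seq_parameter_set_mvc_extension"   # Annex H, preceded by bit_equal_to_one
    if profile_idc == 138:
        return "seq_parameter_set_mvcd_extension"  # Annex I, preceded by bit_equal_to_one
    return None  # no profile-specific extension before additional_extension2_flag
```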

In 7.3.3, replace the syntax table with:
slice_header( ) { C Descriptor
first_mb_in_slice 2 ue(v)
slice_type 2 ue(v)
pic_parameter_set_id 2 ue(v)
if( separate_colour_plane_flag = = 1 )
 colour_plane_id 2 u(2)
frame_num 2 u(v)
if( !frame_mbs_only_flag ) {
 field_pic_flag 2 u(1)
 if( field_pic_flag )
  bottom_field_flag 2 u(1)
}
if( IdrPicFlag )
 idr_pic_id 2 ue(v)
if( pic_order_cnt_type = = 0 ) {
 pic_order_cnt_lsb 2 u(v)
 if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
  delta_pic_order_cnt_bottom 2 se(v)
}
if( pic_order_cnt_type = = 1 && !delta_pic_order_always_zero_flag ) {
 delta_pic_order_cnt[ 0 ] 2 se(v)
 if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
  delta_pic_order_cnt[ 1 ] 2 se(v)
}
if( redundant_pic_cnt_present_flag )
 redundant_pic_cnt 2 ue(v)
if( slice_type = = B )
 direct_spatial_mv_pred_flag 2 u(1)
if( slice_type = = P | | slice_type = = SP | | slice_type = = B ) {
 num_ref_idx_active_override_flag 2 u(1)
 if( num_ref_idx_active_override_flag ) {
  num_ref_idx_l0_active_minus1 2 ue(v)
  if( slice_type = = B )
  num_ref_idx_l1_active_minus1 2 ue(v)
 }
}
if( nal_unit_type = = 20 | | nal_unit_type = = 21 )
 ref_pic_list_mvc_modification( ) /* specified in Annex H */ 2
else
 ref_pic_list_modification( ) 2
if( ( weighted_pred_flag && ( slice_type = = P | | slice_type = = SP ) ) | |
 ( weighted_bipred_idc = = 1 && slice_type = = B ) )
 pred_weight_table( ) 2
if( nal_ref_idc != 0 )
 dec_ref_pic_marking( ) 2
if( entropy_coding_mode_flag && slice_type != I && slice_type != SI )
 cabac_init_idc 2 ue(v)
slice_qp_delta 2 se(v)
if( slice_type = = SP | | slice_type = = SI ) {
 if( slice_type = = SP )
  sp_for_switch_flag 2 u(1)
 slice_qs_delta 2 se(v)
}
if( deblocking_filter_control_present_flag ) {
 disable_deblocking_filter_idc 2 ue(v)
 if( disable_deblocking_filter_idc != 1 ) {
  slice_alpha_c0_offset_div2 2 se(v)
  slice_beta_offset_div2 2 se(v)
 }
}

if( num_slice_groups_minus1 > 0 &&
 slice_group_map_type >= 3 && slice_group_map_type <= 5)
 slice_group_change_cycle 2 u(v)
}
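
Two of the conditions in slice_header( ) can be written out as predicates. The sketch below assumes slice types are represented as the strings "P", "B", "I", "SP" and "SI"; both function names are hypothetical.

```python
def ref_pic_list_structure(nal_unit_type: int) -> str:
    """Name of the reference picture list modification structure that is parsed."""
    if nal_unit_type in (20, 21):
        return "ref_pic_list_mvc_modification"  # specified in Annex H
    return "ref_pic_list_modification"


def has_pred_weight_table(slice_type: str, weighted_pred_flag: int,
                          weighted_bipred_idc: int) -> bool:
    """True when pred_weight_table( ) is present in the slice header."""
    return bool(
        (weighted_pred_flag and slice_type in ("P", "SP"))
        or (weighted_bipred_idc == 1 and slice_type == "B")
    )
```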

Replace Table 7-1 with:
nal_unit_type | Content of NAL unit and RBSP syntax structure | C | Annex A NAL unit type class | Annex G and Annex H NAL unit type class | Annex I NAL unit type class
0 | Unspecified | | non-VCL | non-VCL | non-VCL
1 | Coded slice of a non-IDR picture; slice_layer_without_partitioning_rbsp( ) | 2, 3, 4 | VCL | VCL | VCL
2 | Coded slice data partition A; slice_data_partition_a_layer_rbsp( ) | 2 | VCL | not applicable | not applicable
3 | Coded slice data partition B; slice_data_partition_b_layer_rbsp( ) | 3 | VCL | not applicable | not applicable
4 | Coded slice data partition C; slice_data_partition_c_layer_rbsp( ) | 4 | VCL | not applicable | not applicable
5 | Coded slice of an IDR picture; slice_layer_without_partitioning_rbsp( ) | 2, 3 | VCL | VCL | VCL
6 | Supplemental enhancement information (SEI); sei_rbsp( ) | 5 | non-VCL | non-VCL | non-VCL
7 | Sequence parameter set; seq_parameter_set_rbsp( ) | 0 | non-VCL | non-VCL | non-VCL
8 | Picture parameter set; pic_parameter_set_rbsp( ) | 1 | non-VCL | non-VCL | non-VCL
9 | Access unit delimiter; access_unit_delimiter_rbsp( ) | 6 | non-VCL | non-VCL | non-VCL
10 | End of sequence; end_of_seq_rbsp( ) | 7 | non-VCL | non-VCL | non-VCL
11 | End of stream; end_of_stream_rbsp( ) | 8 | non-VCL | non-VCL | non-VCL
12 | Filler data; filler_data_rbsp( ) | 9 | non-VCL | non-VCL | non-VCL
13 | Sequence parameter set extension; seq_parameter_set_extension_rbsp( ) | 10 | non-VCL | non-VCL | non-VCL
14 | Prefix NAL unit; prefix_nal_unit_rbsp( ) | 2 | non-VCL | suffix dependent | suffix dependent
15 | Subset sequence parameter set; subset_seq_parameter_set_rbsp( ) | 0 | non-VCL | non-VCL | non-VCL
16..18 | Reserved | | non-VCL | non-VCL | non-VCL
19 | Coded slice of an auxiliary coded picture without partitioning; slice_layer_without_partitioning_rbsp( ) | 2, 3, 4 | non-VCL | non-VCL | non-VCL
20 | Coded slice extension; slice_layer_extension_rbsp( ) | 2, 3, 4 | non-VCL | VCL | VCL
21 | Coded slice extension for depth view components /* specified in Annex I */; slice_layer_extension_rbsp( ) /* specified in Annex I */ | 2, 3, 4 | non-VCL | non-VCL | VCL
22..23 | Reserved | | non-VCL | non-VCL | VCL
24..31 | Unspecified | | non-VCL | non-VCL | non-VCL
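
The NAL unit type classes of Table 7-1 can be expressed as a lookup, which may help when checking which NAL units count as VCL under each annex. This is a sketch: the function name nal_unit_type_class and the annex keys "A", "GH" and "I" are hypothetical, and the values are transcribed from the table as printed above.

```python
def nal_unit_type_class(nal_unit_type: int, annex: str) -> str:
    """Return the NAL unit type class for annex "A", "GH" (Annex G and H) or "I"."""
    if annex not in ("A", "GH", "I"):
        raise ValueError("annex must be 'A', 'GH' or 'I'")
    if not 0 <= nal_unit_type <= 31:
        raise ValueError("nal_unit_type is a 5-bit value")
    if nal_unit_type in (1, 5):
        return "VCL"
    if nal_unit_type in (2, 3, 4):  # data partitions A, B, C
        return "VCL" if annex == "A" else "not applicable"
    if nal_unit_type == 14:         # prefix NAL unit
        return "non-VCL" if annex == "A" else "suffix dependent"
    if nal_unit_type == 20:         # coded slice extension
        return "non-VCL" if annex == "A" else "VCL"
    if nal_unit_type in (21, 22, 23):  # depth view components and reserved types
        return "VCL" if annex == "I" else "non-VCL"
    return "non-VCL"
```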

In 7.4.1, make the following changes:
Replace the following:
svc_extension_flag indicates whether a nal_unit_header_svc_extension( ) or nal_unit_header_mvc_extension( ) will
follow next in the syntax structure.
with:
svc_extension_flag indicates whether a nal_unit_header_svc_extension( ) or nal_unit_header_mvc_extension( ) will
follow next in the syntax structure. When nal_unit_type is equal to 21, svc_extension_flag shall be equal to 0 and the
semantics of svc_extension_flag equal to 1 are reserved for future specification by ITU-T | ISO/IEC.

Add the following paragraph after the semantics of svc_extension_flag just before the semantics of
rbsp_byte[ i ].
The value of svc_extension_flag shall be equal to 0 for coded video sequences conforming to one or more profiles
specified in Annex I. Decoders conforming to one or more profiles specified in Annex I shall ignore (remove from the
bitstream and discard) NAL units for which nal_unit_type is equal to 14, 20, or 21 and svc_extension_flag is equal to 1.
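
The discard rule above can be sketched as a filter. The representation of a NAL unit as a (nal_unit_type, svc_extension_flag) pair and both function names are assumptions made for illustration; for NAL unit types without an extension header the flag is passed as None.

```python
def annex_i_decoder_keeps(nal_unit_type, svc_extension_flag):
    """False for NAL units that an Annex I decoder ignores under the rule above."""
    return not (nal_unit_type in (14, 20, 21) and svc_extension_flag == 1)


def filter_for_annex_i(nal_units):
    """Remove the ignored NAL units from a list of (type, flag) pairs."""
    return [(t, f) for (t, f) in nal_units if annex_i_decoder_keeps(t, f)]
```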

In 7.4.2.1.1, replace the following:
chroma_format_idc specifies the chroma sampling relative to the luma sampling as specified in clause 6.2. The value of
chroma_format_idc shall be in the range of 0 to 3, inclusive. When chroma_format_idc is not present, it shall be inferred
to be equal to 1 (4:2:0 chroma format).
with
chroma_format_idc specifies the chroma sampling relative to the luma sampling as specified in clause 6.2. The value of
chroma_format_idc shall be in the range of 0 to 3, inclusive. When chroma_format_idc is not present and profile_idc is
equal to 138, chroma_format_idc shall be inferred to be equal to 0 (4:0:0 chroma format). Otherwise, when
chroma_format_idc is not present, it shall be inferred to be equal to 1 (4:2:0 chroma format).
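
The inference rule in the replacement text can be sketched as a function. The name infer_chroma_format_idc and the use of None to mean "not present in the bitstream" are assumptions of this illustration.

```python
def infer_chroma_format_idc(profile_idc, parsed_value=None):
    """Return chroma_format_idc, inferring it when the syntax element is absent."""
    if parsed_value is not None:
        if not 0 <= parsed_value <= 3:
            raise ValueError("chroma_format_idc shall be in the range 0 to 3")
        return parsed_value
    # Absent: 4:0:0 for profile_idc 138, 4:2:0 for every other profile.
    return 0 if profile_idc == 138 else 1
```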

In 7.4.2.1.3, replace the following:
additional_extension2_data_flag may have any value. It shall not affect the conformance to profiles specified in
Annex A, G, or H.
with
additional_extension2_data_flag may have any value. It shall not affect the conformance to profiles specified in
Annex A, G, H, or I.

Replace Annex C with:
Annex C
Hypothetical reference decoder

(This annex forms an integral part of this Recommendation | International Standard)
This annex specifies the hypothetical reference decoder (HRD) and its use to check bitstream and decoder conformance.
Two types of bitstreams are subject to HRD conformance checking for this Recommendation | International Standard.
The first such type of bitstream, called a Type I bitstream, is a NAL unit stream containing only the VCL NAL units and
filler data NAL units for all access units in the bitstream. The second type of bitstream, called a Type II bitstream,
contains, in addition to the VCL NAL units and filler data NAL units for all access units in the bitstream, at least one of
the following:
– additional non-VCL NAL units other than filler data NAL units,
– all leading_zero_8bits, zero_byte, start_code_prefix_one_3bytes, and trailing_zero_8bits syntax elements that form
a byte stream from the NAL unit stream (as specified in Annex B).
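
The distinction between the two bitstream types can be sketched as follows. The function name hrd_bitstream_type is hypothetical, the default set of VCL types is the Annex A column of Table 7-1, and the byte_stream_format argument is an assumed flag indicating that the Annex B byte stream syntax elements are included.

```python
FILLER_DATA = 12  # nal_unit_type of filler data NAL units


def hrd_bitstream_type(nal_unit_types, vcl_types=frozenset({1, 2, 3, 4, 5}),
                       byte_stream_format=False):
    """Classify a bitstream as "Type I" or "Type II" for HRD conformance checking."""
    has_other_non_vcl = any(
        t not in vcl_types and t != FILLER_DATA for t in nal_unit_types
    )
    if has_other_non_vcl or byte_stream_format:
        return "Type II"
    return "Type I"
```

For a decoding process other than clauses 2-9, the vcl_types argument would need to reflect the corresponding column of Table 7-1.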
Figure C-1 shows the types of bitstream conformance points checked by the HRD.
[Diagram labels: VCL NAL units; filler data NAL units; non-VCL NAL units other than filler data NAL units; byte stream format encapsulation (see Annex B)]

Figure C-1 – Structure of byte streams and NAL unit streams for HRD conformance checks
The syntax elements of non-VCL NAL units (or their default values for some of the syntax elements), required for the
HRD, are specified in the semantics subclauses of clause 7, Annexes D and E, and subclauses G.7, G.13, G.14,
H.7, H.13, H.14, I.7, I.13, and I.14.
Two types of HRD parameter sets (NAL HRD parameters and VCL HRD parameters) are used. The HRD parameter sets
are signalled as follows:
– When the coded video sequence conforms to one or more of the profiles specified in Annex A and the decoding
process specified in clauses 2-9 is applied, the HRD parameter sets are signalled through video usability information
as specified in subclauses E.1 and E.2, which is part of the sequence parameter set syntax structure.
– When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding
process specified in Annex G is applied, the HRD parameter sets are signalled through the SVC video usability
information extension as specified in subclauses G.14.1 and G.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 1 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or more of
the profiles specified in Annex G, the signalling of the applicable HRD parameter sets depends on whether the decoding
process specified in clauses 2-9 or the decoding process specified in Annex G is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding
process specified in Annex H is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclauses H.14.1 and H.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 2 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or
more of the profiles specified in Annex H, the signalling of the applicable HRD parameter sets depends on whether the
decoding process specified in clauses 2-9 or the decoding process specified in Annex H is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex I and the decoding
process specified in Annex I is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclause I.14, which is part of the subset sequence parameter set syntax
structure.
NOTE 3 – For coded video sequences that conform to one or more of the profiles specified in Annex A, one or more of the
profiles specified in Annex H and one or more of the profiles specified in Annex I, the signalling of the applicable HRD
parameter sets depends on whether the decoding process specified in clauses 2-9, the decoding process specified in
Annex H or the decoding process specified in Annex I is applied.
All sequence parameter sets and picture parameter sets referred to in the VCL NAL units, and corresponding buffering
period and picture timing SEI messages shall be conveyed to the HRD, in a timely manner, either in the bitstream (by
non-VCL NAL units), or by other means not specified in this Recommendation | International Standard.
In Annexes C, D, and E and subclauses G.12, G.13, G.14, H.12, H.13, H.14, I.12, I.13, and I.14, the specification for
"presence" of non-VCL NAL units is also satisfied when those NAL units (or just some of them) are conveyed to
decoders (or to the HRD) by other means not specified by this Recommendation | International Standard. For the purpose
of counting bits, only the appropriate bits that are actually present in the bitstream are counted.
NOTE 4 – As an example, synchronization of a non-VCL NAL unit, conveyed by means other than presence in the bitstream, with
the NAL units that are present in the bitstream, can be achieved by indicating two points in the bitstream, between which the
non-VCL NAL unit would have been present in the bitstream, had the encoder decided to convey it in the bitstream.
When the content of a non-VCL NAL unit is conveyed for the application by some means other than presence within the
bitstream, the representation of the content of the non-VCL NAL unit is not required to use the same syntax specified in
this annex.
NOTE 5 – When HRD information is contained within the bitstream, it is possible to verify the conformance of a bitstream to the
requirements of this subclause based solely on information contained in the bitstream. When the HRD information is not present in
the bitstream, as is the case for all "stand-alone" Type I bitstreams, conformance can only be verified when the HRD data is
supplied by some other means not specified in this Recommendation | International Standard.
The HRD contains a coded picture buffer (CPB), an instantaneous decoding process, a decoded picture buffer (DPB), and
output cropping as shown in Figure C-2.
[Diagram labels: hypothetical stream scheduler (HSS) -> Type I or Type II bitstream -> coded picture buffer (CPB) -> access units -> decoding process (instantaneous) -> pictures -> decoded picture buffer (DPB), which returns reference pictures to the decoding process -> pictures -> output cropping -> output cropped pictures]

Figure C-2 – HRD buffer model
The CPB size (number of bits) is CpbSize[ SchedSelIdx ]. The DPB size (number of frame buffers) is
Max( 1, max_dec_frame_buffering ). When the coded video sequence conforms to one or more of the profiles specified
in Annex H and the decoding process specified in Annex H is applied, the DPB size is specified in units of view
components. When the coded video sequence conforms to one or more of the profiles specified in Annex I and the
decoding process specified in Annex I is applied, the DPB is operated separately for texture view components and depth
view components and the terms texture DPB and depth DPB are used, respectively. The texture DPB size is specified in
units of texture view components and the depth DPB size is specified in units of depth view components.
The HRD operates as follows. Data associated with access units that flow into the CPB according to a specified arrival
schedule are delivered by the HSS. The data associated with each access unit are removed and decoded instantaneously
by the instantaneous decoding process at CPB removal times. Each decoded picture is placed in the DPB at its CPB
removal time unless it is output at its CPB removal time and is a non-reference picture. When a picture is placed in the
DPB it is removed from the DPB at the later of the DPB output time or the time that it is marked as "unused for
reference".
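
The storage and removal rule in the paragraph above can be sketched as a single function. Several simplifications are assumed here and are not taken from the specification: times are plain numbers, a reference picture is modelled by a non-None unused_for_reference_time, a non-reference picture is treated as removable as soon as it has been output, and pictures that are never output are not modelled. The function name dpb_removal_time is hypothetical.

```python
def dpb_removal_time(cpb_removal_time, dpb_output_time,
                     unused_for_reference_time=None):
    """Return the time a picture leaves the DPB, or None if it is never stored."""
    is_reference = unused_for_reference_time is not None
    if not is_reference and dpb_output_time == cpb_removal_time:
        # Non-reference picture output at its CPB removal time: bypasses the DPB.
        return None
    if not is_reference:
        return dpb_output_time
    # Stored picture: removed at the later of output and "unused for reference".
    return max(dpb_output_time, unused_for_reference_time)
```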
For each picture in the bitstream, the variable OutputFlag for the decoded picture and, when applicable, the reference
base picture is set as follows:
– If the coded video sequence containing the picture conforms to one or more of the profiles specified in Annex A and
the decoding process specified in clauses 2-9 is applied, OutputFlag is set equal to 1.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex G and the decoding process specified in Annex G is applied, the following applies:
– For a reference base picture, OutputFlag is set equal to 0.
– For a decoded picture, OutputFlag is set equal to the value of the output_flag syntax element of the target layer
representation.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex H and the decoding process specified in Annex H is applied, the following applies:
– For the decoded view components of the target output views, OutputFlag is set equal to 1.
– For the decoded view components of other views, OutputFlag is set equal to 0.
– Otherwise (the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex I and the decoding process specified in Annex I is applied), the following applies:
– For the decoded texture view components and corresponding depth view components with same VOIdx of the
target output views, OutputFlag is set equal to 1.
– For the decoded texture view components and corresponding depth view components with same VOIdx of
other views, OutputFlag is set equal to 0.
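
The OutputFlag rules listed above can be sketched as follows. The decoding process is identified by an assumed string label, and the keyword arguments are hypothetical stand-ins for information the decoder would already hold: whether the picture is a reference base picture, the output_flag value of the target layer representation, and whether the view component belongs to a target output view.

```python
def output_flag(decoding_process, *, is_reference_base_picture=False,
                layer_output_flag=1, in_target_output_view=True):
    """Return OutputFlag for a picture under the given decoding process."""
    if decoding_process == "clauses 2-9":
        return 1
    if decoding_process == "Annex G":
        return 0 if is_reference_base_picture else layer_output_flag
    if decoding_process in ("Annex H", "Annex I"):
        # For Annex I the same value applies to the texture view component and
        # the corresponding depth view component with the same VOIdx.
        return 1 if in_target_output_view else 0
    raise ValueError("unknown decoding process")
```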
The operation of the CPB is specified in subclause C.1. The instantaneous decoder operation is specified in clauses 2-9
(for coded video sequences conforming to one or more of the profiles specified in Annex A) and in Annex G (for coded
video sequences conforming to one or more of the profiles specified in Annex G) and in Annex H (for coded video
sequences conforming to one or more of the profiles specified in Annex H) and in Annex I (for coded video sequences
conforming to one or more of the profiles specified in Annex I). The operation of the DPB is specified in subclause C.2.
The output cropping is specified in subclause C.2.2.
NOTE 6 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or more of the
profiles specified in Annex G, can be decoded either by the decoding process specified in clauses 2-9 or by the decoding process
specified in Annex G. The decoding result and the HRD operation may be dependent on which of the decoding processes is
applied.
NOTE 7 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and one or more of the
profiles specified in Annex H can be decoded either by the decoding process specified in clauses 2-9 or by the decoding process
specified in Annex H. The decoding result and the HRD operation may be dependent on which of the decoding processes is
applied.
NOTE 8 – Coded video sequences that conform to one or more of the profiles specified in Annex A, one or more of the profiles
specified in Annex H and one or more of the profiles specified in Annex I, can be decoded either by the decoding process specified
in clauses 2-9, by the decoding process specified in Annex H or by the decoding process specified in Annex I. The decoding result
and the HRD operation may be dependent on which of the decoding processes is applied.
HSS and HRD information concerning the number of enumerated delivery schedules and their associated bit rates and
buffer sizes is specified in subclauses E.1.1, E.1.2, E.2.1, E.2.2, G.14.1, G.14.2, H.14.1, H.14.2 and I.14. The HRD is
initialised as specified by the buffering period SEI message as specified in subclauses D.1.1 and D.2.1. The removal
timing of access units from the CPB and output timing from the DPB are specified in the picture timing SEI message as
specified in subclauses D.1.2 and D.2.2. All timing information relating to a specific access unit shall arrive prior to the
CPB removal time of the access unit.
When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding process
specified in Annex G is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in scalable nesting
SEI messages and are associated with values of DQId in the range of ( ( DQIdMax >> 4) << 4 ) to
( ( ( DQIdMax >> 4 ) << 4 ) + 15 ), inclusive, the last of these buffering period SEI messages in decoding order
is the buffering period SEI message that initialises the HRD. Let hrdDQId be the largest value of
16 * sei_dependency_id[ i ] + sei_quality_id[ i ] that is associated with the
...
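
The DQId arithmetic quoted in item (a) can be illustrated with a short sketch. Only the two expressions visible in the excerpt are covered; the remainder of the rule is elided in the source and is not reconstructed here. Both function names are hypothetical.

```python
def dqid(dependency_id, quality_id):
    """Combine the two identifiers as 16 * dependency_id + quality_id."""
    return 16 * dependency_id + quality_id


def buffering_period_dqid_range(dqid_max):
    """Inclusive DQId range ( ( DQIdMax >> 4 ) << 4 ) .. that value plus 15."""
    low = (dqid_max >> 4) << 4
    return low, low + 15
```

For DQIdMax equal to 37 (dependency 2, quality 5), the range is 32 to 47, inclusive, which covers every quality level of the same dependency representation.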

DRAFT AMENDMENT ISO/IEC 14496-10:2012/DAM 2
ISO/IEC JTC 1 Secretariat: ANSI

Voting begins on: 2012-10-09
Voting terminates on: 2013-01-09
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
INTERNATIONAL ELECTROTECHNICAL COMMISSION


Information technology — Coding of audio-visual objects —
Part 10:
Advanced Video Coding
AMENDMENT 2
Technologies de l'information — Codage des objets audiovisuels —
Partie 10: Codage visuel avancé
AMENDEMENT 2

ICS 35.040


To expedite distribution, this document is circulated as received from the committee
secretariat. ISO Central Secretariat work of editing and text composition will be undertaken at
publication stage.


THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENT AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE
REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES,
DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME
STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH
THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
© International Organization for Standardization, 2012
© International Electrotechnical Commission, 2012


Copyright notice
This ISO document is a Draft International Standard and is copyright-protected by ISO. Except as permitted
under the applicable laws of the user's country, neither this ISO draft nor any extract from it may be
reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic,
photocopying, recording or otherwise, without prior written permission being secured.
Requests for permission to reproduce should be addressed to either ISO at the address below or ISO's
member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Reproduction may be subject to royalty payments or a licensing agreement.
Violators may be prosecuted.
ii © ISO/IEC 2012 — All rights reserved

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 2 to ISO/IEC 14496-10:2012 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.


Information technology — Coding of audio-visual objects —
Part 10: Advanced Video Coding, AMENDMENT 2: MVC
extensions for inclusion of depth maps
AMENDMENT 2

In 0.6, add the following paragraph after the paragraph that starts with “Multiview video coding”:
An extension of multiview video coding that also supports the inclusion of depth maps is specified in Annex I allowing
the construction of bitstreams that represent multiple views with corresponding depth views. Similar to multiview video
coding, bitstreams that include multiple depth views may also contain sub-bitstreams that conform to this specification.
For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the
bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. For view bitstream
scalability, i.e. the presence of a sub-bitstream with fewer views than those included in the bitstream, NAL units are
removed from the bitstream when deriving the sub-bitstream. In this case, inter-view prediction, i.e., the prediction of one
view by data of another view signal, is typically used for efficient coding.

In 0.7, add the following paragraph after the paragraph that starts with “Annex H specifies”:
Annex I specifies MVC extensions for inclusion of depth maps, referred to as 3D Video Coding (3DVC). The reader is
referred to Annex I for the entire decoding process for 3DVC, which is specified there with references being made to
clauses 2-9 and Annexes A-E and Annex H. Subclause I.10 specifies one profile for 3DVC (Multiview and Depth).

In 7.3.1, replace the syntax table with:

nal_unit( NumBytesInNALunit ) { C Descriptor
forbidden_zero_bit All f(1)
nal_ref_idc All u(2)
nal_unit_type All u(5)
NumBytesInRBSP = 0
nalUnitHeaderBytes = 1
if( nal_unit_type = = 14 | | nal_unit_type = = 20  | |
 nal_unit_type = = 21) {
 svc_extension_flag All u(1)
 if( !svc_extension_flag | | nal_unit_type = = 21 )
  nal_unit_header_mvc_extension( ) /* specified in Annex H */ All
 else
  nal_unit_header_svc_extension( ) /* specified in Annex G */ All
 nalUnitHeaderBytes += 3
}
for( i = nalUnitHeaderBytes; i < NumBytesInNALunit; i++ ) {
 if( i + 2 < NumBytesInNALunit && next_bits( 24 ) = = 0x000003 ) {
  rbsp_byte[ NumBytesInRBSP++ ] All b(8)
  rbsp_byte[ NumBytesInRBSP++ ] All b(8)
  i += 2
  emulation_prevention_three_byte /* equal to 0x03 */ All f(8)
 } else
  rbsp_byte[ NumBytesInRBSP++ ] All b(8)
}
}
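
As an informal illustration of the nal_unit( ) syntax above, the following Python sketch splits one NAL unit into its header fields and RBSP bytes. The function name parse_nal_unit and the dictionary layout are assumptions made for this example and are not part of this Recommendation | International Standard. The three extension bytes are skipped rather than decoded.

```python
def parse_nal_unit(nal: bytes):
    """Split one NAL unit (without start code prefix) into header fields and RBSP bytes.

    Hypothetical helper for illustration only; the 3-byte header extension is
    skipped, not decoded.
    """
    if not nal:
        raise ValueError("empty NAL unit")
    first = nal[0]
    header = {
        "forbidden_zero_bit": first >> 7,   # f(1)
        "nal_ref_idc": (first >> 5) & 0x3,  # u(2)
        "nal_unit_type": first & 0x1F,      # u(5)
    }
    header_bytes = 1
    if header["nal_unit_type"] in (14, 20, 21):
        if len(nal) < 4:
            raise ValueError("truncated NAL unit header extension")
        # svc_extension_flag is the first bit after the one-byte header.
        header["svc_extension_flag"] = nal[1] >> 7
        header_bytes += 3
    rbsp = bytearray()
    i = header_bytes
    n = len(nal)
    while i < n:
        if i + 2 < n and nal[i:i + 3] == b"\x00\x00\x03":
            rbsp += nal[i:i + 2]  # keep the two zero bytes
            i += 3                # drop emulation_prevention_three_byte
        else:
            rbsp.append(nal[i])
            i += 1
    return header, bytes(rbsp)
```

For example, the byte sequence 65 00 00 03 01 (hexadecimal) yields nal_unit_type equal to 5 and the RBSP bytes 00 00 01.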

In 7.3.2.1.3, replace the syntax table with:

subset_seq_parameter_set_rbsp( ) { C Descriptor
seq_parameter_set_data( ) 0
if( profile_idc = = 83 | | profile_idc = = 86 ) {
 seq_parameter_set_svc_extension( ) /* specified in Annex G */ 0
 svc_vui_parameters_present_flag 0 u(1)
 if( svc_vui_parameters_present_flag = = 1 )
  svc_vui_parameters_extension( ) /* specified in Annex G */ 0
} else if( profile_idc = = 118 | | profile_idc = = 128 ) {
 bit_equal_to_one /* equal to 1 */ 0 f(1)
 seq_parameter_set_mvc_extension( ) /* specified in Annex H */ 0
 mvc_vui_parameters_present_flag 0 u(1)
 if( mvc_vui_parameters_present_flag = = 1 )
  mvc_vui_parameters_extension( ) /* specified in Annex H */ 0
}
if( profile_idc = = 138 ) {
 bit_equal_to_one /* equal to 1 */ 0 f(1)
 seq_parameter_set_mvc_extension( ) /* specified in Annex H */ 0
 seq_parameter_set_3dvc_extension( ) 0
}
additional_extension3_flag 0 u(1)
if( additional_extension3_flag = = 1 )
 while( more_rbsp_data( ) )
  additional_extension3_data_flag 0 u(1)
rbsp_trailing_bits( ) 0
}
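
The profile_idc branches of the table above can be summarised informally as follows. This Python sketch only lists which extension syntax structures follow seq_parameter_set_data( ) for a given profile_idc; the function name subset_sps_extensions is an assumption made for this example, and no bitstream parsing is performed.

```python
def subset_sps_extensions(profile_idc: int) -> list:
    """Return the names of the extension structures that follow
    seq_parameter_set_data( ) in subset_seq_parameter_set_rbsp( ).

    Hypothetical helper for illustration; it mirrors only the branch
    conditions of the syntax table, not the parsing itself.
    """
    if profile_idc in (83, 86):
        return ["seq_parameter_set_svc_extension"]      # Annex G
    if profile_idc in (118, 128):
        return ["seq_parameter_set_mvc_extension"]      # Annex H
    if profile_idc == 138:
        return [
            "seq_parameter_set_mvc_extension",          # Annex H
            "seq_parameter_set_3dvc_extension",         # added by this amendment
        ]
    return []
```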

In 7.3.3, replace the syntax table with:

slice_header( ) { C Descriptor
first_mb_in_slice 2 ue(v)
slice_type 2 ue(v)
pic_parameter_set_id 2 ue(v)
if( separate_colour_plane_flag = = 1 )
 colour_plane_id 2 u(2)
frame_num 2 u(v)
if( !frame_mbs_only_flag ) {
 field_pic_flag 2 u(1)
 if( field_pic_flag )
  bottom_field_flag 2 u(1)
}
if( IdrPicFlag )
 idr_pic_id 2 ue(v)
if( pic_order_cnt_type = = 0 ) {
 pic_order_cnt_lsb 2 u(v)
 if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
  delta_pic_order_cnt_bottom 2 se(v)
}
if( pic_order_cnt_type = = 1 && !delta_pic_order_always_zero_flag ) {
 delta_pic_order_cnt[ 0 ] 2 se(v)
 if( bottom_field_pic_order_in_frame_present_flag && !field_pic_flag )
  delta_pic_order_cnt[ 1 ] 2 se(v)
}
if( redundant_pic_cnt_present_flag )
 redundant_pic_cnt 2 ue(v)
if( slice_type = = B )
 direct_spatial_mv_pred_flag 2 u(1)
if( slice_type = = P | | slice_type = = SP | | slice_type = = B ) {
 num_ref_idx_active_override_flag 2 u(1)
 if( num_ref_idx_active_override_flag ) {
  num_ref_idx_l0_active_minus1 2 ue(v)
  if( slice_type = = B )
   num_ref_idx_l1_active_minus1 2 ue(v)
 }
}
if( nal_unit_type = = 20 | | nal_unit_type = = 21 )
 ref_pic_list_mvc_modification( ) /* specified in Annex H */ 2
else
 ref_pic_list_modification( ) 2
if( ( weighted_pred_flag && ( slice_type = = P | | slice_type = = SP ) ) | |
 ( weighted_bipred_idc = = 1 && slice_type = = B ) )
 pred_weight_table( ) 2
if( nal_ref_idc != 0 )
 dec_ref_pic_marking( ) 2
if( entropy_coding_mode_flag && slice_type != I && slice_type != SI )
 cabac_init_idc 2 ue(v)
slice_qp_delta 2 se(v)
if( slice_type = = SP | | slice_type = = SI ) {
 if( slice_type = = SP )
  sp_for_switch_flag 2 u(1)
 slice_qs_delta 2 se(v)
}
if( deblocking_filter_control_present_flag ) {
 disable_deblocking_filter_idc 2 ue(v)
 if( disable_deblocking_filter_idc != 1 ) {
  slice_alpha_c0_offset_div2 2 se(v)
  slice_beta_offset_div2 2 se(v)
 }
}

if( num_slice_groups_minus1 > 0 &&
 slice_group_map_type >= 3 && slice_group_map_type <= 5)
 slice_group_change_cycle 2 u(v)
}
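
Most slice header syntax elements above use the ue(v) and se(v) descriptors, i.e. unsigned and signed Exp-Golomb codes. The following Python sketch shows one way such values could be read. The class name BitReader and its method names are assumptions made for this example, and the code is an illustration rather than a conforming parser.

```python
class BitReader:
    """Minimal MSB-first bit reader (hypothetical helper for illustration)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n: int) -> int:
        """Read n bits as an unsigned integer, most significant bit first."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def ue(self) -> int:
        """Unsigned Exp-Golomb: count leading zero bits, then read that many bits."""
        leading_zeros = 0
        while self.u(1) == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.u(leading_zeros)

    def se(self) -> int:
        """Signed Exp-Golomb: code numbers 0, 1, 2, 3, 4 map to 0, 1, -1, 2, -2."""
        k = self.ue()
        return (k + 1) // 2 if k & 1 else -(k // 2)


# The first three slice header elements are all ue(v).
reader = BitReader(b"\xA6\x40")
first_mb_in_slice = reader.ue()
slice_type = reader.ue()
pic_parameter_set_id = reader.ue()
```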

Replace Table 7-1 with:

nal_unit_type | Content of NAL unit and RBSP syntax structure | C | Annex A NAL unit type class | Annex G and Annex H NAL unit type class | Annex I NAL unit type class
0 | Unspecified | | non-VCL | non-VCL | non-VCL
1 | Coded slice of a non-IDR picture; slice_layer_without_partitioning_rbsp( ) | 2, 3, 4 | VCL | VCL | VCL
2 | Coded slice data partition A; slice_data_partition_a_layer_rbsp( ) | 2 | VCL | not applicable | not applicable
3 | Coded slice data partition B; slice_data_partition_b_layer_rbsp( ) | 3 | VCL | not applicable | not applicable
4 | Coded slice data partition C; slice_data_partition_c_layer_rbsp( ) | 4 | VCL | not applicable | not applicable
5 | Coded slice of an IDR picture; slice_layer_without_partitioning_rbsp( ) | 2, 3 | VCL | VCL | VCL
6 | Supplemental enhancement information (SEI); sei_rbsp( ) | 5 | non-VCL | non-VCL | non-VCL
7 | Sequence parameter set; seq_parameter_set_rbsp( ) | 0 | non-VCL | non-VCL | non-VCL
8 | Picture parameter set; pic_parameter_set_rbsp( ) | 1 | non-VCL | non-VCL | non-VCL
9 | Access unit delimiter; access_unit_delimiter_rbsp( ) | 6 | non-VCL | non-VCL | non-VCL
10 | End of sequence; end_of_seq_rbsp( ) | 7 | non-VCL | non-VCL | non-VCL
11 | End of stream; end_of_stream_rbsp( ) | 8 | non-VCL | non-VCL | non-VCL
12 | Filler data; filler_data_rbsp( ) | 9 | non-VCL | non-VCL | non-VCL
13 | Sequence parameter set extension; seq_parameter_set_extension_rbsp( ) | 10 | non-VCL | non-VCL | non-VCL
14 | Prefix NAL unit; prefix_nal_unit_rbsp( ) | 2 | non-VCL | suffix dependent | suffix dependent
15 | Subset sequence parameter set; subset_seq_parameter_set_rbsp( ) | 0 | non-VCL | non-VCL | non-VCL
16..18 | Reserved | | non-VCL | non-VCL | non-VCL
19 | Coded slice of an auxiliary coded picture without partitioning; slice_layer_without_partitioning_rbsp( ) | 2, 3, 4 | non-VCL | non-VCL | non-VCL
20 | Coded slice extension; slice_layer_extension_rbsp( ) | 2, 3, 4 | non-VCL | VCL | VCL
21 | Coded slice extension for depth view components /* specified in Annex I */; slice_layer_extension_rbsp( ) /* specified in Annex I */ | 2, 3, 4 | non-VCL | VCL | VCL
22..23 | Reserved | | non-VCL | non-VCL | VCL
24..31 | Unspecified | | non-VCL | non-VCL | non-VCL

In 7.4.1, add the following paragraph in the semantics of svc_extension_flag just before the semantics of
rbsp_byte[ i ].
The value of svc_extension_flag shall be equal to 0 for coded video sequences conforming to one or more profiles
specified in Annex I. Decoders conforming to one or more profiles specified in Annex I shall ignore (remove from the
bitstream and discard) NAL units for which nal_unit_type is equal to 14, 20, or 21 and for which svc_extension_flag is
equal to 1.
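
The discard rule above can be sketched informally in Python as follows. The function name annex_i_decoder_ignores is an assumption made for this example. The sketch assumes that the input is one complete NAL unit without start code prefix, so that svc_extension_flag is the most significant bit of the second byte.

```python
def annex_i_decoder_ignores(nal: bytes) -> bool:
    """Return True when a decoder conforming to an Annex I profile would
    discard this NAL unit under the rule stated above.

    Hypothetical helper for illustration; nal is one NAL unit without
    start code prefix.
    """
    nal_unit_type = nal[0] & 0x1F
    if nal_unit_type not in (14, 20, 21):
        return False  # these types carry no svc_extension_flag
    svc_extension_flag = nal[1] >> 7  # first bit after the one-byte header
    return svc_extension_flag == 1
```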

In 7.4.2.1.3, make the following changes:
Replace the text following “inclusive.” in the semantics of chroma_format_idc with:
When chroma_format_idc is not present and profile_idc is equal to 138, it shall be inferred to be equal to 0 (4:0:0
chroma format); otherwise, when chroma_format_idc is not present, it shall be inferred to be equal to 1 (4:2:0 chroma format).
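
The inference rule above can be written informally as a small Python function. The function name infer_chroma_format_idc and the use of None to represent an absent syntax element are assumptions made for this example.

```python
def infer_chroma_format_idc(profile_idc: int, chroma_format_idc=None) -> int:
    """Return chroma_format_idc, inferring it when the syntax element is absent.

    Hypothetical helper for illustration; None stands for "not present".
    """
    if chroma_format_idc is not None:
        return chroma_format_idc      # present in the bitstream: use as coded
    if profile_idc == 138:
        return 0                      # 4:0:0 chroma format
    return 1                          # 4:2:0 chroma format
```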

Substitute each occurrence of “additional_extension2_flag” with “additional_extension3_flag”.
Replace the semantics of additional_extension2_data_flag with the following:
additional_extension3_data_flag may have any value. It shall not affect the conformance to profiles specified in
Annex A, G, H, or I.


Replace Annex C with:
Annex C
Hypothetical reference decoder

(This annex forms an integral part of this Recommendation | International Standard)
This annex specifies the hypothetical reference decoder (HRD) and its use to check bitstream and decoder conformance.
Two types of bitstreams are subject to HRD conformance checking for this Recommendation | International Standard.
The first such type of bitstream, called Type I bitstream, is a NAL unit stream containing only the VCL NAL units and
filler data NAL units for all access units in the bitstream. The second type of bitstream, called a Type II bitstream,
contains, in addition to the VCL NAL units and filler data NAL units for all access units in the bitstream, at least one of
the following:
– additional non-VCL NAL units other than filler data NAL units,
– all leading_zero_8bits, zero_byte, start_code_prefix_one_3bytes, and trailing_zero_8bits syntax elements that form
a byte stream from the NAL unit stream (as specified in Annex B).
Figure C-1 shows the types of bitstream conformance points checked by the HRD.
[Figure C-1 diagram labels: VCL NAL units; filler data NAL units; non-VCL NAL units other than filler data NAL units; byte stream format encapsulation (see Annex B)]

Figure C-1 – Structure of byte streams and NAL unit streams for HRD conformance checks
The syntax elements of non-VCL NAL units (or their default values for some of the syntax elements), required for the
HRD, are specified in the semantic subclauses of clause 7, Annexes D and E, and subclauses G.7, G.13, G.14, H.7, H.13,
H.14, I.7, I.13, and I.14.
Two types of HRD parameter sets (NAL HRD parameters and VCL HRD parameters) are used. The HRD parameter sets
are signalled as follows:
– When the coded video sequence conforms to one or more of the profiles specified in Annex A and the decoding
process specified in clauses 2-9 is applied, the HRD parameter sets are signalled through video usability information
as specified in subclauses E.1 and E.2, which is part of the sequence parameter set syntax structure.
– When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding
process specified in Annex G is applied, the HRD parameter sets are signalled through the SVC video usability
information extension as specified in subclauses G.14.1 and G.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 1 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one
or more of the profiles specified in Annex G, the signalling of the applicable HRD parameter sets depends on
whether the decoding process specified in clauses 2-9 or the decoding process specified in Annex G is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding
process specified in Annex H is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclauses H.14.1 and H.14.2, which is part of the subset sequence parameter
set syntax structure.
NOTE 2 – For coded video sequences that conform both to one or more of the profiles specified in Annex A and to one
or more of the profiles specified in Annex H, the signalling of the applicable HRD parameter sets depends on
whether the decoding process specified in clauses 2-9 or the decoding process specified in Annex H is applied.
– When the coded video sequence conforms to one or more of the profiles specified in Annex I and the decoding
process specified in Annex I is applied, the HRD parameter sets are signalled through the MVC video usability
information extension as specified in subclause I.14, which is part of the subset sequence parameter set syntax
structure.
NOTE 3 – For coded video sequences that conform to one or more of the profiles specified in Annex A, one or more
of the profiles specified in Annex H and one or more of the profiles specified in Annex I, the signalling of the
applicable HRD parameter sets depends on whether the decoding process specified in clauses 2-9, the decoding
process specified in Annex H or the decoding process specified in Annex I is applied.
All sequence parameter sets and picture parameter sets referred to in the VCL NAL units, and corresponding buffering
period and picture timing SEI messages shall be conveyed to the HRD, in a timely manner, either in the bitstream (by
non-VCL NAL units), or by other means not specified in this Recommendation | International Standard.
In Annexes C, D, and E and subclauses G.12, G.13, G.14, H.12, H.13, H.14, I.12, I.13, and I.14, the specification for
"presence" of non-VCL NAL units is also satisfied when those NAL units (or just some of them) are conveyed to
decoders (or to the HRD) by other means not specified by this Recommendation | International Standard. For the purpose
of counting bits, only the appropriate bits that are actually present in the bitstream are counted.
NOTE 3 – As an example, synchronization of a non-VCL NAL unit, conveyed by means other than presence in the
bitstream, with the NAL units that are present in the bitstream, can be achieved by indicating two points in the
bitstream, between which the non-VCL NAL unit would have been present in the bitstream, had the encoder decided
to convey it in the bitstream.
When the content of a non-VCL NAL unit is conveyed for the application by some means other than presence within the
bitstream, the representation of the content of the non-VCL NAL unit is not required to use the same syntax specified in
this annex.
NOTE 4 – When HRD information is contained within the bitstream, it is possible to verify the conformance of a
bitstream to the requirements of this subclause based solely on information contained in the bitstream. When the HRD
information is not present in the bitstream, as is the case for all "stand-alone" Type I bitstreams, conformance can
only be verified when the HRD data is supplied by some other means not specified in this Recommendation |
International Standard.
The HRD contains a coded picture buffer (CPB), an instantaneous decoding process, a decoded picture buffer (DPB), and
output cropping as shown in Figure C-2.
[Figure C-2 diagram labels, top to bottom: hypothetical stream scheduler (HSS); Type I or Type II bitstream; coded picture buffer (CPB); access units; decoding process (instantaneous); reference pictures; pictures; decoded picture buffer (DPB); pictures; output cropping; output cropped pictures]

Figure C-2 – HRD buffer model
The CPB size (number of bits) is CpbSize[ SchedSelIdx ]. The DPB size (number of frame buffers) is
Max( 1, max_dec_frame_buffering ). When the coded video sequence conforms to one or more of the profiles specified
in Annex H and the decoding process specified in Annex H is applied, the DPB size is specified in units of view
components. When the coded video sequence conforms to one or more of the profiles specified in Annex I and the
decoding process specified in Annex I is applied, the DPB size is specified in units of texture view components and depth
view components.
The HRD operates as follows. Data associated with access units that flow into the CPB according to a specified arrival
schedule are delivered by the HSS. The data associated with each access unit are removed and decoded instantaneously
by the instantaneous decoding process at CPB removal times. Each decoded picture is placed in the DPB at its CPB
removal time unless it is output at its CPB removal time and is a non-reference picture. When a picture is placed in the
DPB, it is removed from the DPB at the later of the DPB output time and the time that it is marked as "unused for
reference".
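
The two DPB rules in the paragraph above can be sketched informally in Python. The function names enters_dpb and dpb_removal_time, and the representation of times as plain numbers, are assumptions made for this example. The sketch does not model the CPB arrival schedule or the bumping behaviour specified elsewhere in this annex.

```python
def enters_dpb(cpb_removal_time: float, dpb_output_time: float,
               is_reference: bool) -> bool:
    """A decoded picture is stored in the DPB unless it is output at its
    CPB removal time and is a non-reference picture.

    Hypothetical helper for illustration.
    """
    output_immediately = dpb_output_time == cpb_removal_time
    return not (output_immediately and not is_reference)


def dpb_removal_time(dpb_output_time: float,
                     unused_for_reference_time: float) -> float:
    """A stored picture leaves the DPB at the later of its output time and the
    time it is marked as "unused for reference".

    Assumption of this sketch: for a non-reference picture the caller passes
    its CPB removal time as unused_for_reference_time.
    """
    return max(dpb_output_time, unused_for_reference_time)
```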
For each picture in the bitstream, the variable OutputFlag for the decoded picture and, when applicable, the reference
base picture is set as follows:
– If the coded video sequence containing the picture conforms to one or more of the profiles specified in Annex A and
the decoding process specified in clauses 2-9 is applied, OutputFlag is set equal to 1.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex G and the decoding process specified in Annex G is applied, the following applies:
– For a reference base picture, OutputFlag is set equal to 0.
– For a decoded picture, OutputFlag is set equal to the value of the output_flag syntax element of the target layer
representation.
– Otherwise, if the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex H and the decoding process specified in Annex H is applied, the following applies:
– For the decoded view components of the target output views, OutputFlag is set equal to 1.
– For the decoded view components of other views, OutputFlag is set equal to 0.
– Otherwise (the coded video sequence containing the picture conforms to one or more of the profiles specified in
Annex I and the decoding process specified in Annex I is applied), the following applies:
– For the decoded texture view components and depth view components with the same VOIdx of the target output
views, OutputFlag is set equal to 1.
– For the decoded texture view components and depth view components with the same VOIdx of other views,
OutputFlag is set equal to 0.
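
The derivation of OutputFlag listed above can be sketched informally in Python. The function name derive_output_flag, the string labels used for the decoding process, and the keyword parameters are assumptions made for this example and do not appear in this Recommendation | International Standard.

```python
def derive_output_flag(decoding_process: str, *,
                       is_reference_base_picture: bool = False,
                       layer_output_flag: int = 1,
                       in_target_output_view: bool = True) -> int:
    """Return OutputFlag for one decoded picture or view component.

    Hypothetical helper for illustration. decoding_process is one of
    "clauses 2-9", "Annex G", "Annex H" or "Annex I" (labels chosen for
    this sketch).
    """
    if decoding_process == "clauses 2-9":
        return 1
    if decoding_process == "Annex G":
        if is_reference_base_picture:
            return 0
        # output_flag syntax element of the target layer representation
        return layer_output_flag
    if decoding_process in ("Annex H", "Annex I"):
        # View components (Annex H) or texture and depth view components
        # (Annex I) are output only for target output views.
        return 1 if in_target_output_view else 0
    raise ValueError("unknown decoding process: " + decoding_process)
```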
The operation of the CPB is specified in subclause C.1. The instantaneous decoder operation is specified in clauses 2-9
(for coded video sequences conforming to one or more of the profiles specified in Annex A), in Annex G (for coded
video sequences conforming to one or more of the profiles specified in Annex G), in Annex H (for coded video
sequences conforming to one or more of the profiles specified in Annex H), and in Annex I (for coded video sequences
conforming to one or more of the profiles specified in Annex I). The operation of the DPB is specified in subclause C.2.
The output cropping is specified in subclause C.2.2.
NOTE 5 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or
more of the profiles specified in Annex G can be decoded either by the decoding process specified in clauses 2-9 or
by the decoding process specified in Annex G. The decoding result and the HRD operation may depend on
which of the decoding processes is applied.
NOTE 6 – Coded video sequences that conform both to one or more of the profiles specified in Annex A and to one or
more of the profiles specified in Annex H can be decoded either by the decoding process specified in clauses 2-9 or
by the decoding process specified in Annex H. The decoding result and the HRD operation may depend on
which of the decoding processes is applied.
NOTE 7 – Coded video sequences that conform to one or more of the profiles specified in Annex A, one or more of
the profiles specified in Annex H and one or more of the profiles specified in Annex I can be decoded by the
decoding process specified in clauses 2-9, by the decoding process specified in Annex H or by the decoding process
specified in Annex I. The decoding result and the HRD operation may depend on which of the decoding
processes is applied.
HSS and HRD information concerning the number of enumerated delivery schedules and their associated bit rates and
buffer sizes is specified in subclauses E.1.1, E.1.2, E.2.1, E.2.2, G.14.1, G.14.2, H.14.1, H.14.2 and I.14. The HRD is
initialised as specified by the buffering period SEI message as specified in subclauses D.1.1 and D.2.1. The removal
timing of access units from the CPB and output timing from the DPB are specified in the picture timing SEI message as
specified in subclauses D.1.2 and D.2.2. All timing information relating to a specific access unit shall arrive prior to the
CPB removal time of the access unit.
When the coded video sequence conforms to one or more of the profiles specified in Annex G and the decoding process
specified in Annex G is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in scalable nesting
SEI messages and are associated with values of DQId in the range of ( ( DQIdMax >> 4) << 4 ) to
( ( ( DQIdMax >> 4 ) << 4 ) + 15 ), inclusive, the last of these buffering period SEI messages in decoding order
is the buffering period SEI message that initialises the HRD. Let hrdDQId be the largest value of
16 * sei_dependency_id[ i ] + sei_quality_id[ i ] that is associated with the scalable nesting SEI message
containing the buffering period SEI message that initialises the HRD, let hrdDId and hrdQId be equal to
hrdDQId >> 4 and hrdDQId & 15, respectively, and let hrdTId be the value of sei_temporal_id that is
associated with the scalable nesting SEI message containing the buffering period SEI message that initialises
the HRD.
(b) The picture timing SEI messages that specify the removal timing of access units from the CPB and output
timing from the DPB are the picture timing SEI messages that are included in scalable nesting SEI messages
associated with values of sei_dependency_id[ i ], sei_quality_id[ i ], and sei_temporal_id equal to hrdDId,
hrdQId, and hrdTId, respectively.
(c) The HRD parameter sets that are used for conformance checking are the HRD parameter sets, included in the
SVC video usability information extension of the active SVC sequence parameter set, that are associated with
values of vui_ext_dependency_id[ i ], vui_ext_quality_id[ i ], and vui_ext_temporal_id[ i ] equal to hrdDId,
hrdQId, and hrdTId, respectively. For the specification in this annex, num_units_in_tick, time_scale,
fixed_frame_rate_flag, nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag,
low_delay_hrd_flag, and pic_struct_present_flag are substituted with the values of
vui_ext_num_units_in_tick[ i ], vui_ext_time_scale[ i ], vui_ext_fixed_frame_rate_flag[ i ],
vui_ext_nal_hrd_parameters_present_flag[ i ], vui_ext_vcl_hrd_parameters_present_flag[ i ],
vui_ext_low_delay_hrd_flag[ i ], and vui_ext_pic_struct_present_flag[ i ], respectively, with i being the value
for which vui_ext_dependency_id[ i ], vui_ext_quality_id[ i ], and vui_ext_temporal_id[ i ] are equal to hrdDId,
hrdQId, and hrdTId, respectively.
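
The DQId arithmetic used in items (a) to (c) above can be illustrated informally in Python. The function names dqid_window, select_hrd_dqid and split_hrd_dqid are assumptions made for this example. The sketch covers only the integer arithmetic and not the selection of SEI messages from a bitstream.

```python
def dqid_window(dqid_max: int):
    """Range of DQId values, inclusive, considered in item (a):
    ( ( DQIdMax >> 4 ) << 4 ) to ( ( ( DQIdMax >> 4 ) << 4 ) + 15 )."""
    low = (dqid_max >> 4) << 4
    return low, low + 15


def select_hrd_dqid(dependency_quality_pairs):
    """hrdDQId: largest value of 16 * sei_dependency_id[ i ] + sei_quality_id[ i ].

    dependency_quality_pairs is a non-empty iterable of
    (sei_dependency_id, sei_quality_id) tuples (layout assumed for this sketch).
    """
    return max(16 * d + q for d, q in dependency_quality_pairs)


def split_hrd_dqid(hrd_dqid: int):
    """Return (hrdDId, hrdQId) = (hrdDQId >> 4, hrdDQId & 15)."""
    return hrd_dqid >> 4, hrd_dqid & 15
```

For example, with DQIdMax equal to 37 the range is 32 to 47, inclusive, and hrdDQId equal to 37 gives hrdDId equal to 2 and hrdQId equal to 5.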
When the coded video sequence conforms to one or more of the profiles specified in Annex H and the decoding process
specified in Annex H is applied, the following is specified:
(a) When an access unit contains one or more buffering period SEI messages that are included in MVC scalable
nesting SEI messages, the buffering period SEI message that is associated with the operation point being
decoded is the buffering period SEI mes
...
