Information technology — Coding of audio-visual objects — Part 10: Advanced video coding

This document specifies advanced video coding for coding of audio-visual objects.

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé

General Information

Status: Published
Publication Date: 07-Nov-2022
Current Stage: 9092 - International Standard to be revised
Completion Date: 24-Jul-2023

Standard
ISO/IEC 14496-10:2022 - Information technology — Coding of audio-visual objects — Part 10: Advanced video coding Released:8. 11. 2022
English language
867 pages

Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC 14496-10
Tenth edition
2022-11
Information technology — Coding of audio-visual objects — Part 10: Advanced video coding
Technologies de l'information — Codage des objets audiovisuels —
Partie 10: Codage visuel avancé
Reference number
ISO/IEC 14496-10:2022(E)
© ISO/IEC 2022

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
0 Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Conventions
5.1 Arithmetic operators
5.2 Logical operators
5.3 Relational operators
5.4 Bit-wise operators
5.5 Assignment operators
5.6 Range notation
5.7 Mathematical functions
5.8 Order of operation precedence
5.9 Variables, syntax elements, and tables
5.10 Text description of logical operations
5.11 Processes
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships
6.1 Bitstream formats
6.2 Source, decoded, and output picture formats
6.3 Spatial subdivision of pictures and slices
6.4 Inverse scanning processes and derivation processes for neighbours
6.4.1 Inverse macroblock scanning process
6.4.2 Inverse macroblock partition and sub-macroblock partition scanning process
6.4.3 Inverse 4x4 luma block scanning process
6.4.4 Inverse 4x4 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.5 Inverse 8x8 luma block scanning process
6.4.6 Inverse 8x8 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.7 Inverse 4x4 chroma block scanning process
6.4.8 Derivation process of the availability for macroblock addresses
6.4.9 Derivation process for neighbouring macroblock addresses and their availability
6.4.10 Derivation process for neighbouring macroblock addresses and their availability in MBAFF frames
6.4.11 Derivation processes for neighbouring macroblocks, blocks, and partitions
6.4.12 Derivation process for neighbouring locations
6.4.13 Derivation processes for block and partition indices
7 Syntax and semantics
7.1 Method of specifying syntax in tabular form
7.2 Specification of syntax functions, categories, and descriptors
7.3 Syntax in tabular form
7.3.1 NAL unit syntax
7.3.2 Raw byte sequence payloads and RBSP trailing bits syntax
7.3.3 Slice header syntax
7.3.4 Slice data syntax
7.3.5 Macroblock layer syntax
7.4 Semantics
7.4.1 NAL unit semantics
7.4.2 Raw byte sequence payloads and RBSP trailing bits semantics
7.4.3 Slice header semantics
7.4.4 Slice data semantics
7.4.5 Macroblock layer semantics
8 Decoding process
8.1 NAL unit decoding process
8.2 Slice decoding process
8.2.1 Decoding process for picture order count
8.2.2 Decoding process for macroblock to slice group map
8.2.3 Decoding process for slice data partitions
8.2.4 Decoding process for reference picture lists construction
8.2.5 Decoded reference picture marking process
8.3 Intra prediction process
8.3.1 Intra_4x4 prediction process for luma samples
8.3.2 Intra_8x8 prediction process for luma samples
8.3.3 Intra_16x16 prediction process for luma samples
8.3.4 Intra prediction process for chroma samples
8.3.5 Sample construction process for I_PCM macroblocks
8.4 Inter prediction process
8.4.1 Derivation process for motion vector components and reference indices
8.4.2 Decoding process for Inter prediction samples
8.4.3 Derivation process for prediction weights
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process
8.5.1 Specification of transform decoding process for 4x4 luma residual blocks
8.5.2 Specification of transform decoding process for luma samples of Intra_16x16 macroblock prediction mode
8.5.3 Specification of transform decoding process for 8x8 luma residual blocks
8.5.4 Specification of transform decoding process for chroma samples
8.5.5 Specification of transform decoding process for chroma samples with ChromaArrayType equal to 3
8.5.6 Inverse scanning process for 4x4 transform coefficients and scaling lists
8.5.7 Inverse scanning process for 8x8 transform coefficients and scaling lists
8.5.8 Derivation process for chroma quantization parameters
8.5.9 Derivation process for scaling functions
8.5.10 Scaling and transformation process for DC transform coefficients for Intra_16x16 macroblock type
8.5.11 Scaling and transformation process for chroma DC transform coefficients
8.5.12 Scaling and transformation process for residual 4x4 blocks
8.5.13 Scaling and transformation process for residual 8x8 blocks
8.5.14 Picture construction process prior to deblocking filter process
8.5.15 Intra residual transform-bypass decoding process
8.6 Decoding process for P macroblocks in SP slices or SI macroblocks
8.6.1 SP decoding process for non-switching pictures
8.6.2 SP and SI slice decoding process for switching pictures
8.7 Deblocking filter process
8.7.1 Filtering process for block edges
8.7.2 Filtering process for a set of samples across a horizontal or vertical block edge
9 Parsing process
9.1 Parsing process for Exp-Golomb codes
9.1.1 Mapping process for signed Exp-Golomb codes
9.1.2 Mapping process for coded block pattern
9.2 CAVLC parsing process for transform coefficient levels
9.2.1 Parsing process for total number of non-zero transform coefficient levels and number of trailing ones
9.2.2 Parsing process for level information
9.2.3 Parsing process for run information
9.2.4 Combining level and run information
9.3 CABAC parsing process for slice data
9.3.1 Initialization process
9.3.2 Binarization process
9.3.3 Decoding process flow
9.3.4 Arithmetic encoding process
Annex A (normative) Profiles and levels
Annex B (normative) Byte stream format
Annex C (normative) Hypothetical reference decoder
Annex D (normative) Supplemental enhancement information
Annex E (normative) Video usability information
Annex F (normative) Scalable video coding
Annex G (normative) Multiview video coding
Annex H (normative) Multiview and depth video coding
Annex I (normative) Multiview and depth video with enhanced non-base view coding
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are described in
the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of
document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details of any
patent rights identified during the development of the document will be in the Introduction and/or on the ISO
list of patent declarations received (see www.iso.org/patents) or the IEC list of patent declarations received
(see http://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration
with ITU-T (as ITU-T H.264).
This tenth edition cancels and replaces the ninth edition (ISO/IEC 14496-10:2020), which has been
technically revised.
The main changes are as follows:
— addition of annotated regions and shutter interval information supplemental enhancement information
messages.
A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

0 Introduction
0.1 Prologue
As the costs for both processing power and memory have reduced, network support for coded video data has diversified,
and advances in video coding technology have progressed, the need has arisen for an industry standard for compressed
video representation with substantially increased coding efficiency and enhanced robustness to network environments.
Toward these ends the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group
(MPEG) formed a Joint Video Team (JVT) in 2001 for development of a new Recommendation | International Standard.
The standard has since been maintained and enhanced jointly by VCEG and MPEG.
0.2 Purpose
This Recommendation | International Standard was developed in response to the growing need for higher compression of
moving pictures for various applications such as videoconferencing, digital storage media, television broadcasting,
internet streaming, and communication. It is also designed to enable the use of the coded video representation in a
flexible manner for a wide variety of network environments. The use of this Recommendation | International Standard
allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted
and received over existing and future networks and distributed on existing and future broadcasting channels.
0.3 Applications
This Recommendation | International Standard is designed to cover a broad range of applications for video content
including but not limited to the following:
⎯ CATV: cable TV on optical networks, copper, etc.
⎯ DBS: direct broadcast satellite video services.
⎯ DSL: digital subscriber line video services.
⎯ DTTB: digital terrestrial television broadcasting.
⎯ ISM: interactive storage media (optical disks, etc.).
⎯ MMM: multimedia mailing.
⎯ MSPN: multimedia services over packet networks.
⎯ RTC: real-time conversational services (videoconferencing, videophone, etc.).
⎯ RVS: remote video surveillance.
⎯ SSM: serial storage media (digital VTR, etc.).
0.4 Publication and versions of this document
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 1 refers to the first approved version of this Recommendation |
International Standard.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 2 refers to the integrated text containing the corrections specified in the
first technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 3 refers to the integrated text containing both the first technical
corrigendum (2004) and the first amendment, which is referred to as the "Fidelity range extensions".
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 4 refers to the integrated text containing the first technical corrigendum
(2004), the first amendment (the "Fidelity range extensions"), and an additional technical corrigendum (2005).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 5 refers to the integrated version 4 text with its specification of the
High 4:4:4 profile removed.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 6 refers to the integrated version 5 text after its amendment to support
additional colour space indicators.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 7 refers to the integrated version 6 text after its amendment to define five
new profiles intended primarily for professional applications (the High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra,
CAVLC 4:4:4 Intra, and High 4:4:4 Predictive profiles) and two new types of supplemental enhancement information
(SEI) messages (the post-filter hint SEI message and the tone mapping information SEI message).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 8 refers to the integrated version 7 text after its amendment to specify
scalable video coding in three profiles (Scalable Baseline, Scalable High, and Scalable High Intra profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 9 refers to the integrated version 8 text after applying the corrections
specified in a third technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 10 refers to the integrated version 9 text after its amendment to specify a
profile for multiview video coding (the Multiview High profile) and to define additional SEI messages.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 11 refers to the integrated version 10 text after its amendment to define a
new profile (the Constrained Baseline profile) intended primarily to enable implementation of decoders supporting only
the common subset of capabilities supported in various previously-specified profiles.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 12 refers to the integrated version 11 text after its amendment to define a
new profile (the Stereo High profile) for two-view video coding with support of interlaced coding tools and to specify an
additional SEI message specified as the frame packing arrangement SEI message. The changes for versions 11 and 12
were processed as a single amendment in the ISO/IEC approval process.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 13 refers to the integrated version 12 text with various minor corrections
and clarifications as specified in a fourth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 14 refers to the integrated version 13 text after its amendment to define a
new level (Level 5.2) supporting higher processing rates in terms of maximum macroblocks per second and a new profile
(the Progressive High profile) to enable implementation of decoders supporting only the frame coding tools of the
previously-specified High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 15 refers to the integrated version 14 text with miscellaneous corrections
and clarifications as specified in a fifth technical corrigendum.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 16 refers to the integrated version 15 text after its amendment to define
three new profiles intended primarily for communication applications (the Constrained High, Scalable Constrained
Baseline, and Scalable Constrained High profiles).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 17 refers to the integrated version 16 text after its amendment to define
additional supplemental enhancement information (SEI) message data, including the multiview view position SEI
message, the display orientation SEI message, and two additional frame packing arrangement type indication values for
the frame packing arrangement SEI message (the 2D content and tiled arrangement type indication values).
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 18 refers to the integrated version 17 text after its amendment to specify
the coding of depth signals, including the specification of an additional profile, the Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 19 refers to the integrated version 18 text after incorporating a correction
to the sub-bitstream extraction process for multiview video coding.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 20 refers to the integrated version 19 text after its amendment to specify
the combined coding of video view and depth enhancement, including the specification of an additional profile, the
Enhanced Multiview Depth High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 21 refers to the integrated version 20 text after its amendment to specify
additional colorimetry identifiers and an additional model type in the tone mapping information SEI message.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 22 refers to the integrated version 21 text after its amendment to specify
multi-resolution frame-compatible (MFC) enhancement for stereoscopic video coding, including the specification of an
additional profile, the MFC High profile.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 23 refers to the integrated version 22 text after its amendment to specify
multi-resolution frame-compatible (MFC) stereoscopic video with depth maps, including the specification of an
additional profile, the MFC Depth High profile, and the mastering display colour volume SEI message, additional
colour-related video usability information codepoint identifiers, and miscellaneous minor corrections and clarifications.
ITU-T Rec. H.264 | ISO/IEC 14496-10 version 24 refers to the integrated version 23 text after its amendment to specify
additional levels of decoder capability supporting larger picture sizes (Levels 6, 6.1, and 6.2), the green metadata SEI
message, the alternative depth information SEI message, additional colour-related video usability information codepoint
identifiers, and miscellaneous minor corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 25 refers to the integrated version 24 text after its amendment to specify
the Progressive High 10 profile; support for additional colour-related indicators, including the hybrid log-gamma transfer
characteristics indication, the alternative transfer characteristics SEI message, the ICTCP colour matrix transformation,
chromaticity-derived constant luminance and non-constant luminance colour matrix coefficients, the colour remapping
information SEI message, and miscellaneous minor corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 26 refers to the integrated version 25 text after its amendment to
specify additional SEI messages for ambient viewing environment, content light level information, content colour
volume, equirectangular projection, cubemap projection, sphere rotation, region-wise packing, omnidirectional
viewport, SEI manifest, and SEI prefix indication, and miscellaneous minor corrections and clarifications.
Rec. ITU-T H.264 | ISO/IEC 14496-10 version 27 (the current document) refers to the integrated version 26 text
after its amendment to specify additional SEI messages for annotated regions (through referencing to Rec. ITU-T
H.274 | ISO/IEC 23002-7) and shutter interval information, and miscellaneous minor corrections and
clarifications.
This edition corresponds in technical content to the fourteenth edition in ITU-T (approved in August 2021).
0.5 Profiles and levels
This document is designed to be generic in the sense that it serves a wide range of applications, bit rates, resolutions,
qualities, and services. Applications sh
...
