Information technology — Coding of audio-visual objects — Part 10: Advanced video coding

This document specifies advanced video coding for coding of audio-visual objects.

Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé

General Information

Status: Published
Publication Date: 07-Nov-2022
Current Stage: 6060 - International Standard published
Due Date: 30-Jan-2023
Completion Date: 08-Nov-2022

Standard
ISO/IEC 14496-10:2022 - Information technology — Coding of audio-visual objects — Part 10: Advanced video coding. Released: 8.11.2022
English language
867 pages

Standards Content (sample)

INTERNATIONAL STANDARD
ISO/IEC 14496-10
Tenth edition
2022-11
Information technology — Coding of audio-visual objects — Part 10: Advanced video coding
Technologies de l'information — Codage des objets audiovisuels — Partie 10: Codage visuel avancé
Reference number
ISO/IEC 14496-10:2022(E)
© ISO/IEC 2022
COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2022

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2022 – All rights reserved
Contents
Foreword
0 Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Conventions
5.1 Arithmetic operators
5.2 Logical operators
5.3 Relational operators
5.4 Bit-wise operators
5.5 Assignment operators
5.6 Range notation
5.7 Mathematical functions
5.8 Order of operation precedence
5.9 Variables, syntax elements, and tables
5.10 Text description of logical operations
5.11 Processes
6 Source, coded, decoded and output data formats, scanning processes, and neighbouring relationships
6.1 Bitstream formats
6.2 Source, decoded, and output picture formats
6.3 Spatial subdivision of pictures and slices
6.4 Inverse scanning processes and derivation processes for neighbours
6.4.1 Inverse macroblock scanning process
6.4.2 Inverse macroblock partition and sub-macroblock partition scanning process
6.4.3 Inverse 4x4 luma block scanning process
6.4.4 Inverse 4x4 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.5 Inverse 8x8 luma block scanning process
6.4.6 Inverse 8x8 Cb or Cr block scanning process for ChromaArrayType equal to 3
6.4.7 Inverse 4x4 chroma block scanning process
6.4.8 Derivation process of the availability for macroblock addresses
6.4.9 Derivation process for neighbouring macroblock addresses and their availability
6.4.10 Derivation process for neighbouring macroblock addresses and their availability in MBAFF frames
6.4.11 Derivation processes for neighbouring macroblocks, blocks, and partitions
6.4.12 Derivation process for neighbouring locations
6.4.13 Derivation processes for block and partition indices
7 Syntax and semantics
7.1 Method of specifying syntax in tabular form
7.2 Specification of syntax functions, categories, and descriptors
7.3 Syntax in tabular form
7.3.1 NAL unit syntax
7.3.2 Raw byte sequence payloads and RBSP trailing bits syntax
7.3.3 Slice header syntax
7.3.4 Slice data syntax
7.3.5 Macroblock layer syntax
7.4 Semantics
7.4.1 NAL unit semantics
7.4.2 Raw byte sequence payloads and RBSP trailing bits semantics
7.4.3 Slice header semantics
7.4.4 Slice data semantics
7.4.5 Macroblock layer semantics
8 Decoding process
8.1 NAL unit decoding process
8.2 Slice decoding process
8.2.1 Decoding process for picture order count
8.2.2 Decoding process for macroblock to slice group map
8.2.3 Decoding process for slice data partitions
8.2.4 Decoding process for reference picture lists construction
8.2.5 Decoded reference picture marking process
8.3 Intra prediction process
8.3.1 Intra_4x4 prediction process for luma samples
8.3.2 Intra_8x8 prediction process for luma samples
8.3.3 Intra_16x16 prediction process for luma samples
8.3.4 Intra prediction process for chroma samples
8.3.5 Sample construction process for I_PCM macroblocks
8.4 Inter prediction process
8.4.1 Derivation process for motion vector components and reference indices
8.4.2 Decoding process for Inter prediction samples
8.4.3 Derivation process for prediction weights
8.5 Transform coefficient decoding process and picture construction process prior to deblocking filter process
8.5.1 Specification of transform decoding process for 4x4 luma residual blocks
8.5.2 Specification of transform decoding process for luma samples of Intra_16x16 macroblock prediction mode
8.5.3 Specification of transform decoding process for 8x8 luma residual blocks
8.5.4 Specification of transform decoding process for chroma samples
8.5.5 Specification of transform decoding process for chroma samples with ChromaArrayType equal to 3
8.5.6 Inverse scanning process for 4x4 transform coefficients and scaling lists
8.5.7 Inverse scanning process for 8x8 transform coefficients and scaling lists
8.5.8 Derivation process for chroma quantization parameters
8.5.9 Derivation process for scaling functions
8.5.10 Scaling and transformation process for DC transform coefficients for Intra_16x16 macroblock type
8.5.11 Scaling and transformation process for chroma DC transform coefficients
8.5.12 Scaling and transformation process for residual 4x4 blocks
8.5.13 Scaling and transformation process for residual 8x8 blocks
8.5.14 Picture construction process prior to deblocking filter process
8.5.15 Intra residual transform-bypass decoding process
8.6 Decoding process for P macroblocks in SP slices or SI macroblocks
8.6.1 SP decoding process for non-switching pictures
8.6.2 SP and SI slice decoding process for switching pictures
8.7 Deblocking filter process
8.7.1 Filtering process for block edges
8.7.2 Filtering process for a set of samples across a horizontal or vertical block edge
9 Parsing process
9.1 Parsing process for Exp-Golomb codes
9.1.1 Mapping process for signed Exp-Golomb codes
9.1.2 Mapping process for coded block pattern
9.2 CAVLC parsing process for transform coefficient levels
9.2.1 Parsing process for total number of non-zero transform coefficient levels and number of trailing ones
9.2.2 Parsing process for level information
9.2.3 Parsing process for run information
9.2.4 Combining level and run information
9.3 CABAC parsing process for slice data
9.3.1 Initialization process
9.3.2 Binarization process
9.3.3 Decoding process flow
9.3.4 Arithmetic encoding process
Annex A (normative) Profiles and levels
Annex B (normative) Byte stream format
Annex C (normative) Hypothetical reference decoder
Annex D (normative) Supplemental enhancement information
Annex E (normative) Video usability information
Annex F (normative) Scalable video coding
Annex G (normative) Multiview video coding
Annex H (normative) Multiview and depth video coding
Annex I (normative) Multiview and depth video with enhanced non-base view coding
Bibliography
Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of document should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or www.iec.ch/members_experts/refdocs).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC list of patent declarations received (see http://patents.iec.ch).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.

This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration with ITU-T (as ITU-T H.264).

This tenth edition cancels and replaces the ninth edition (ISO/IEC 14496-10:2020), which has been technically revised.

The main changes are as follows:

— addition of annotated regions and shutter interval information supplemental enhancement information messages.

A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.

Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html and www.iec.ch/national-committees.
0 Introduction
0.1 Prologue

As the costs for both processing power and memory have reduced, network support for coded video data has diversified, and advances in video coding technology have progressed, the need has arisen for an industry standard for compressed video representation with substantially increased coding efficiency and enhanced robustness to network environments. Toward these ends the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) formed a Joint Video Team (JVT) in 2001 for development of a new Recommendation | International Standard. The standard has since been maintained and enhanced jointly by VCEG and MPEG.
0.2 Purpose

This Recommendation | International Standard was developed in response to the growing need for higher compression of moving pictures for various applications such as videoconferencing, digital storage media, television broadcasting, internet streaming, and communication. It is also designed to enable the use of the coded video representation in a flexible manner for a wide variety of network environments. The use of this Recommendation | International Standard allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted and received over existing and future networks and distributed on existing and future broadcasting channels.

0.3 Applications

This Recommendation | International Standard is designed to cover a broad range of applications for video content including but not limited to the following:
⎯ CATV: cable TV on optical networks, copper, etc.
⎯ DBS: direct broadcast satellite video services.
⎯ DSL: digital subscriber line video services.
⎯ DTTB: digital terrestrial television broadcasting.
⎯ ISM: interactive storage media (optical disks, etc.).
⎯ MMM: multimedia mailing.
⎯ MSPN: multimedia services over packet networks.
⎯ RTC: real-time conversational services (videoconferencing, videophone, etc.).
⎯ RVS: remote video surveillance.
⎯ SSM: serial storage media (digital VTR, etc.).
0.4 Publication and versions of this document

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 1 refers to the first approved version of this Recommendation | International Standard.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 2 refers to the integrated text containing the corrections specified in the first technical corrigendum.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 3 refers to the integrated text containing both the first technical corrigendum (2004) and the first amendment, which is referred to as the "Fidelity range extensions".

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 4 refers to the integrated text containing the first technical corrigendum (2004), the first amendment (the "Fidelity range extensions"), and an additional technical corrigendum (2005).

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 5 refers to the integrated version 4 text with its specification of the High 4:4:4 profile removed.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 6 refers to the integrated version 5 text after its amendment to support additional colour space indicators.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 7 refers to the integrated version 6 text after its amendment to define five new profiles intended primarily for professional applications (the High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra, CAVLC 4:4:4 Intra, and High 4:4:4 Predictive profiles) and two new types of supplemental enhancement information (SEI) messages (the post-filter hint SEI message and the tone mapping information SEI message).

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 8 refers to the integrated version 7 text after its amendment to specify scalable video coding in three profiles (Scalable Baseline, Scalable High, and Scalable High Intra profiles).

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 9 refers to the integrated version 8 text after applying the corrections specified in a third technical corrigendum.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 10 refers to the integrated version 9 text after its amendment to specify a profile for multiview video coding (the Multiview High profile) and to define additional SEI messages.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 11 refers to the integrated version 10 text after its amendment to define a new profile (the Constrained Baseline profile) intended primarily to enable implementation of decoders supporting only the common subset of capabilities supported in various previously-specified profiles.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 12 refers to the integrated version 11 text after its amendment to define a new profile (the Stereo High profile) for two-view video coding with support of interlaced coding tools and to specify an additional SEI message specified as the frame packing arrangement SEI message. The changes for versions 11 and 12 were processed as a single amendment in the ISO/IEC approval process.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 13 refers to the integrated version 12 text with various minor corrections and clarifications as specified in a fourth technical corrigendum.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 14 refers to the integrated version 13 text after its amendment to define a new level (Level 5.2) supporting higher processing rates in terms of maximum macroblocks per second and a new profile (the Progressive High profile) to enable implementation of decoders supporting only the frame coding tools of the previously-specified High profile.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 15 refers to the integrated version 14 text with miscellaneous corrections and clarifications as specified in a fifth technical corrigendum.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 16 refers to the integrated version 15 text after its amendment to define three new profiles intended primarily for communication applications (the Constrained High, Scalable Constrained Baseline, and Scalable Constrained High profiles).

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 17 refers to the integrated version 16 text after its amendment to define additional supplemental enhancement information (SEI) message data, including the multiview view position SEI message, the display orientation SEI message, and two additional frame packing arrangement type indication values for the frame packing arrangement SEI message (the 2D content and tiled arrangement type indication values).

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 18 refers to the integrated version 17 text after its amendment to specify the coding of depth signals, including the specification of an additional profile, the Multiview Depth High profile.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 19 refers to the integrated version 18 text after incorporating a correction to the sub-bitstream extraction process for multiview video coding.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 20 refers to the integrated version 19 text after its amendment to specify the combined coding of video view and depth enhancement, including the specification of an additional profile, the Enhanced Multiview Depth High profile.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 21 refers to the integrated version 20 text after its amendment to specify additional colorimetry identifiers and an additional model type in the tone mapping information SEI message.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 22 refers to the integrated version 21 text after its amendment to specify multi-resolution frame-compatible (MFC) enhancement for stereoscopic video coding, including the specification of an additional profile, the MFC High profile.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 23 refers to the integrated version 22 text after its amendment to specify multi-resolution frame-compatible (MFC) stereoscopic video with depth maps, including the specification of an additional profile, the MFC Depth High profile, and the mastering display colour volume SEI message, additional colour-related video usability information codepoint identifiers, and miscellaneous minor corrections and clarifications.

ITU-T Rec. H.264 | ISO/IEC 14496-10 version 24 refers to the integrated version 23 text after its amendment to specify additional levels of decoder capability supporting larger picture sizes (Levels 6, 6.1, and 6.2), the green metadata SEI message, the alternative depth information SEI message, additional colour-related video usability information codepoint identifiers, and miscellaneous minor corrections and clarifications.

Rec. ITU-T H.264 | ISO/IEC 14496-10 version 25 refers to the integrated version 24 text after its amendment to specify the Progressive High 10 profile; support for additional colour-related indicators, including the hybrid log-gamma transfer characteristics indication, the alternative transfer characteristics SEI message, the ICTCP colour matrix transformation, chromaticity-derived constant luminance and non-constant luminance colour matrix coefficients, the colour remapping information SEI message, and miscellaneous minor corrections and clarifications.

Rec. ITU-T H.264 | ISO/IEC 14496-10 version 26 refers to the integrated version 25 text after its amendment to specify additional SEI messages for ambient viewing environment, content light level information, content colour volume, equirectangular projection, cubemap projection, sphere rotation, region-wise packing, omnidirectional viewport, SEI manifest, and SEI prefix indication, and miscellaneous minor corrections and clarifications.

Rec. ITU-T H.264 | ISO/IEC 14496-10 version 27 (the current document) refers to the integrated version 26 text after its amendment to specify additional SEI messages for annotated regions (through referencing to Rec. ITU-T H.274 | ISO/IEC 23002-7) and shutter interval information, and miscellaneous minor corrections and clarifications.

This edition corresponds in technical content to the fourteenth edition in ITU-T (approved in August 2021).

0.5 Profiles and levels

This document is designed to be generic in the sense that it serves a wide range of applications, bit rates, resolutions, qualities, and services. Applications sh
...
