Information technology -- Coded representation of immersive media

This document specifies a video coding technology known as versatile video coding (VVC), with a compression capability that is substantially beyond that of the prior generations of such standards and with sufficient versatility for effective use in a broad range of applications. Only the syntax format, semantics, and associated decoding process requirements are specified, while other matters such as pre-processing, the encoding process, system signalling and multiplexing, data loss recovery, post-processing, and video display are considered to be outside the scope of this document. Additionally, the internal processing steps performed within a decoder are also considered to be outside the scope of this document; only the externally observable output behaviour is required to conform to the specifications of this document. This document is designed to be generic in the sense that it serves a wide range of applications, bit rates, resolutions, qualities and services. Applications include, but are not limited to, video coding for digital storage media, television broadcasting and real-time communication. In the course of creating this document, various requirements from typical applications have been considered, necessary algorithmic elements have been developed, and these have been integrated into a single syntax. Hence, this document is designed to facilitate video data interchange among different applications.

Technologies de l'information -- Représentation codée de média immersifs

General Information

Status: Published
Publication Date: 25-Sep-2022
Current Stage: 4060 - Close of voting
Start Date: 29-Dec-2021
Completion Date: 28-Dec-2021

Standard
ISO/IEC 23090-3:2022 - Information technology -- Coded representation of immersive media. Released: 26.09.2022
English language
592 pages

Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC 23090-3
Second edition
2022-09
Information technology -- Coded representation of immersive media -- Part 3: Versatile video coding
Technologies de l'information -- Représentation codée de média immersifs -- Partie 3: Codage vidéo polyvalent
Reference number
ISO/IEC 23090-3:2022(E)
© ISO/IEC 2022

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2022 – All rights reserved

Contents
Foreword . vi
Introduction . vii
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
4 Abbreviated terms . 16
5 Conventions . 18
5.1 General . 18
5.2 Arithmetic operators . 19
5.3 Logical operators . 19
5.4 Relational operators . 19
5.5 Bit-wise operators . 20
5.6 Assignment operators . 20
5.7 Range notation . 20
5.8 Mathematical functions . 20
5.9 Order of operation precedence . 21
5.10 Variables, syntax elements and tables . 22
5.11 Text description of logical operations . 23
5.12 Processes . 24
6 Bitstream and picture formats, partitionings, scanning processes and neighbouring relationships . 25
6.1 Bitstream formats . 25
6.2 Source, decoded and output picture formats . 25
6.3 Partitioning of pictures, subpictures, slices, tiles, and CTUs . 27
6.3.1 Partitioning of pictures into subpictures, slices, and tiles . 27
6.3.2 Block, quadtree and multi-type tree structures . 30
6.3.3 Spatial or component-wise partitionings . 30
6.4 Availability processes . 31
6.4.1 Allowed quad split process . 31
6.4.2 Allowed binary split process . 32
6.4.3 Allowed ternary split process . 34
6.4.4 Derivation process for neighbouring block availability . 35
6.5 Scanning processes . 36
6.5.1 CTB raster scanning, tile scanning, and subpicture scanning processes . 36
6.5.2 Up-right diagonal scan order array initialization process . 40
6.5.3 Horizontal and vertical traverse scan order array initialization process . 40
7 Syntax and semantics . 41
7.1 Method of specifying syntax in tabular form . 41
7.2 Specification of syntax functions and descriptors . 42
7.3 Syntax in tabular form . 44
7.3.1 NAL unit syntax . 44
7.3.2 Raw byte sequence payloads, trailing bits and byte alignment syntax . 44
7.3.3 Profile, tier, and level syntax . 64
7.3.4 DPB parameters syntax . 67
7.3.5 Timing and HRD parameters syntax . 67
7.3.6 Supplemental enhancement information message syntax . 68
7.3.7 Slice header syntax . 68
7.3.8 Weighted prediction parameters syntax . 71
7.3.9 Reference picture lists syntax . 72
7.3.10 Reference picture list structure syntax . 72
7.3.11 Slice data syntax . 73
7.4 Semantics . 95
7.4.1 General . 95
7.4.2 NAL unit semantics . 95
7.4.3 Raw byte sequence payloads, trailing bits and byte alignment semantics . 104
7.4.4 Profile, tier, and level semantics . 163
7.4.5 DPB parameters semantics . 169
7.4.6 Timing and HRD parameters semantics . 170
7.4.7 Supplemental enhancement information message semantics . 175
7.4.8 Slice header semantics . 175
7.4.9 Weighted prediction parameters semantics . 185
7.4.10 Reference picture lists semantics . 187
7.4.11 Reference picture list structure semantics . 188
7.4.12 Slice data semantics . 189
8 Decoding process . 215
8.1 General decoding process . 215
8.2 NAL unit decoding process . 218
8.3 Slice decoding process . 218
8.3.1 Decoding process for picture order count . 218
8.3.2 Decoding process for reference picture lists construction . 220
8.3.3 Decoding process for reference picture marking . 225
8.3.4 Decoding process for generating unavailable reference pictures . 226
8.3.5 Decoding process for symmetric motion vector difference reference indices . 227
8.3.6 Decoding process for collocated picture and no backward prediction . 228
8.4 Decoding process for coding units coded in intra prediction mode . 228
8.4.1 General decoding process for coding units coded in intra prediction mode . 228
8.4.2 Derivation process for luma intra prediction mode . 230
8.4.3 Derivation process for chroma intra prediction mode . 233
8.4.4 Cross-component chroma intra prediction mode checking process . 235
8.4.5 Decoding process for intra blocks . 236
8.5 Decoding process for coding units coded in inter prediction mode . 271
8.5.1 General decoding process for coding units coded in inter prediction mode . 271
8.5.2 Derivation process for motion vector components and reference indices . 276
8.5.3 Decoder-side motion vector refinement process . 298
8.5.4 Derivation process for geometric partitioning mode motion vector components and reference indices . 304
8.5.5 Derivation process for subblock motion vector components and reference indices . 306
8.5.6 Decoding process for inter blocks . 336
8.5.7 Decoding process for geometric partitioning mode inter blocks . 360
8.5.8 Decoding process for the residual signal of coding blocks coded in inter prediction mode . 366
8.5.9 Decoding process for the reconstructed signal of chroma coding blocks coded in inter prediction mode . 368
8.6 Decoding process for coding units coded in IBC prediction mode . 370
8.6.1 General decoding process for coding units coded in IBC prediction mode . 370
8.6.2 Derivation process for block vector components for IBC blocks . 372
8.6.3 Decoding process for IBC blocks . 377
8.7 Scaling, transformation and array construction process . 378
8.7.1 Derivation process for quantization parameters . 378
8.7.2 Scaling and transformation process . 380
8.7.3 Scaling process for transform coefficients . 381
8.7.4 Transformation process for scaled transform coefficients . 384
8.7.5 Picture reconstruction process . 404
8.8 In-loop filter process . 408
8.8.1 General . 408
8.8.2 Picture inverse mapping process for luma samples . 409
8.8.3 Deblocking filter process . 410
8.8.4 Sample adaptive offset process . 440
8.8.5 Adaptive loop filter process . 442
9 Parsing process . 455
9.1 General . 455
9.2 Parsing process for k-th order Exp-Golomb codes . 455
9.2.1 General . 455
9.2.2 Mapping process for signed Exp-Golomb codes . 457
9.3 CABAC parsing process for slice data . 457
9.3.1 General . 457
9.3.2 Initialization process . 459
9.3.3 Binarization process . 482
9.3.4 Decoding process flow . 492
Annex A (normative) Profiles, tiers and levels . 510
Annex B (normative) Byte stream format . 530
Annex C (normative) Hypothetical reference decoder . 533
Annex D (normative) Supplemental enhancement information and use of SEI and VUI . 559
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the
IEC list of patent declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration
with ITU-T (as ITU-T H.266).
This second edition cancels and replaces the first edition (ISO/IEC 23090-3:2021), which has been
technically revised.
The main changes are as follows:
— the specification of operation range extensions,
— the addition of level 6.3,
— the addition of the SEI manifest SEI message, the SEI prefix indication SEI message, and the
constrained RASL encoding indication SEI message, and
— the specification of SEI payload type values and necessary variables for use of a number of SEI
messages added to the second edition of Rec. ITU-T H.274 | ISO/IEC 23002-7.
A list of all parts in the ISO/IEC 23090 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and www.iec.ch/national-
committees.
Introduction
Purpose
This document specifies a video coding technology known as versatile video coding. It has been
designed with two primary goals. The first of these is to specify a video coding technology with a
compression capability that is substantially beyond that of the prior generations of such standards, and
the second is for this technology to be highly versatile for effective use in a broader range of
applications than that addressed by prior standards. Some key application areas for the use of this
document particularly include ultra-high-definition video (e.g., with 3840×2160 or 7680×4320 picture
resolution and bit depth of 10 bits as specified in Rec. ITU-R BT.2100), video with a high dynamic range
and wide colour gamut (e.g., with the perceptual quantization or hybrid log-gamma transfer
characteristics specified in Rec. ITU-R BT.2100), and video for immersive media applications such as
360° omnidirectional video projected using a common projection format such as the equirectangular or
cubemap projection formats, in addition to the applications that have commonly been addressed by
prior video coding standards.
Profiles, tiers, and levels
This document is designed to be versatile in the sense that it serves a wide range of applications, bit
rates, resolutions, qualities, and services. Applications include, but are not limited to, video coding for
digital storage media, television broadcasting, video streaming services, and real-time communication. In
the course of creating this document, various requirements from typical applications have been
considered, necessary algorithmic elements have been developed, and these have been integrated into a
single syntax. Hence, this document is designed to facilitate video data interchange among different
applications.
Considering the practicality of implementing the full syntax of this document, however, a limited
number of subsets of the syntax are also stipulated by means of "profiles", "tiers", and "levels". These
and other related terms are formally defined in Clause 3.
A "profile" is a subset of the entire bitstream syntax that is specified in this document. Within the
bounds imposed by the syntax of a given profile it is still possible to require a very large variation in the
performance of encoders and decoders depending upon the values taken by syntax elements in the
bitstream, such as the specified size of the decoded pictures. In many applications, it is currently neither
practical nor economical to implement a decoder capable of dealing with all hypothetical uses of the
syntax within a particular profile.
In order to deal with this problem, "tiers" and "levels" are specified within each profile. A level of a tier
is a specified set of constraints imposed on values of the syntax elements in the bitstream. Some of these
constraints are expressed as simple limits on values, while others take the form of constraints on
arithmetic combinations of values (e.g. picture width multiplied by picture height multiplied by number
of pictures decoded per second). A level specified for a lower tier is more constrained than a level
specified for a higher tier.
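
The two kinds of level constraint described above (simple limits and limits on arithmetic combinations) can be illustrated with a minimal Python sketch. The class name, field names, and the numeric limits below are hypothetical and chosen only for illustration; they are not the values specified in Annex A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    """Hypothetical limits for one level of one tier (NOT values from Annex A)."""
    max_luma_picture_size: int  # simple limit: luma samples per picture
    max_luma_sample_rate: int   # arithmetic limit: luma samples per second


def conforms(width: int, height: int, pictures_per_second: float,
             limits: LevelLimits) -> bool:
    """Return True if the parameters satisfy both kinds of constraint."""
    picture_size = width * height                     # width x height
    sample_rate = picture_size * pictures_per_second  # width x height x rate
    return (picture_size <= limits.max_luma_picture_size
            and sample_rate <= limits.max_luma_sample_rate)


# Illustrative numbers only (assumption, not taken from the standard).
example_limits = LevelLimits(max_luma_picture_size=2_500_000,
                             max_luma_sample_rate=150_000_000)

print(conforms(1920, 1080, 60, example_limits))   # within both limits
print(conforms(1920, 1080, 120, example_limits))  # sample rate too high
print(conforms(3840, 2160, 30, example_limits))   # picture size too large
```

A decoder built for a given level could use a check of this shape to decide up front whether it is able to handle a bitstream, without implementing every hypothetical use of the syntax.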
Coded video content conforming to this document uses a common syntax. In order to achieve a subset
of the complete syntax, flags, parameters, and other syntax elements are included in the bitstream that
signal the presence or absence of syntactic elements that occur later in the bitstream.
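
As a rough illustration of how a flag can signal the presence or absence of later syntax elements, the following Python sketch parses a made-up header. The field names (`extension_present_flag`, `extension_value`, `mode`) and their bit widths are assumptions invented for this example and do not correspond to any syntax structure in this document.

```python
class BitReader:
    """Reads bits most-significant-bit first from a byte string."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_toy_header(data: bytes) -> dict:
    """Parse a hypothetical header: 1-bit flag, optional 4-bit field, 3-bit field."""
    reader = BitReader(data)
    header = {"extension_present_flag": reader.read_bits(1)}
    # The optional element is read only when the earlier flag says it is present.
    if header["extension_present_flag"]:
        header["extension_value"] = reader.read_bits(4)
    header["mode"] = reader.read_bits(3)
    return header


print(parse_toy_header(bytes([0b11010011])))  # flag set: optional field is parsed
print(parse_toy_header(bytes([0b01010000])))  # flag clear: optional field is skipped
```

Because the same parsing routine handles both cases, streams that use only a subset of the syntax and streams that use more of it can share one common syntax.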
Encoding process, decoding process, and use of VUI parameters and SEI messages
Any encoding process that produces bitstream data that conforms to the specified bitstream syntax
format requirements of this document is considered to be in conformance with the requirements of this
document. The decoding process is specified such that all decoders that conform to a specified
combination of capabilities known as the profile, tier, and level will produce numerically identical
cropped decoded output pictures when invoking the decoding process associated with that profile for a
bitstream conforming to that profile, tier and level. Any decoding process that produces identical
cropped decoded output pictures to those produced by the process described herein (with the correct
output order or output timing, as specified) is considered to be in conformance with the requirements
of this document.
Rec. ITU-T H.274 | ISO/IEC 23002-7 specifies the syntax and semantics of the video usability
information (VUI) parameters and supplemental enhancement information (SEI) messages that do not
affect the conformance specifications in Annex C. These VUI parameters and SEI messages may be used
together with this document.
Versions of this document
Rec. ITU-T H.266 | ISO/IEC 23090-3 version 1 refers to the first approved version of this document. The
first edition published by ISO/IEC as ISO/IEC 23090-3:2021 corresponded to the first version.
Rec. ITU-T H.266 | ISO/IEC 23090-3 version 2 (the current version) refers to the integrated text
additionally containing operation range extensions, a new level (level 6.3), additional supplemental
enhancement information, and corrections to various minor defects in the prior content of the
document. This document corresponds to the second version. At the time of publication of this
document, a corresponding second edition of Rec. ITU-T H.266 was in preparation for publication by
ITU-T.
Overview of the design characteristics
The coded representation specified in the syntax is designed to enable a high compression capability for
a desired image or video quality. The algorithm is typically not mathematically lossless, as the exact
source sample values are typically not preserved through the encoding and decoding processes,
although some modes are included that provide lossless coding capability. A number of techniques are
specified to enable highly efficient compression. Encoding algorithms (not specified within the scope of
this document) may select between inter, intra, intra block copy (IBC), and palette coding for block-
shaped regions of each picture. Inter coding uses motion vectors for block-based inter-picture
prediction to exploit temporal statistical dependencies between different pictures, intra coding uses
various spatial prediction modes to exploit spatial statistical dependencies in the source signal within
the same picture, and intra block copy coding uses block displacement vectors to reference previously
decoded regions of the same picture to exploit statistical similarities among different areas of the same
picture. Motion vectors, intra prediction modes, and IBC block vectors are specified for a variety of
block sizes in the picture. The prediction residual can then be further compressed using a spatial
transform to remove spatial correlation inside a block before it is quantized, producing a possibly
irreversible process that typically discards less important visual information while forming a close
approximation to the source samples. Finally, the motion vectors, intra prediction modes, and block
vectors can also be further compressed using a variety of prediction mechanisms, and, after prediction,
are combined with the quantized transform coefficient information and encoded using arithmetic
coding.
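
The reason quantizing the prediction residual makes the scheme "typically not mathematically lossless" can be seen in a toy Python sketch. This is a simplified illustration under stated assumptions, not the process specified in Clause 8: the spatial transform and the arithmetic coding stage are omitted, the function names are hypothetical, and the sample values are invented.

```python
def encode_block(source, prediction, step):
    """Quantize the prediction residual with a uniform step size (toy model)."""
    return [round((s - p) / step) for s, p in zip(source, prediction)]


def decode_block(levels, prediction, step):
    """Rescale the quantized levels and add them back to the prediction."""
    return [p + q * step for p, q in zip(prediction, levels)]


source = [53, 55, 61, 67]      # invented source samples
prediction = [50, 50, 60, 60]  # invented prediction (e.g. from intra or inter)

levels = encode_block(source, prediction, step=4)
reconstructed = decode_block(levels, prediction, step=4)
print(levels)         # small integers that would be entropy coded
print(reconstructed)  # close to, but not equal to, the source

# With a step of 1 this toy model round-trips integer samples exactly.
exact = decode_block(encode_block(source, prediction, step=1), prediction, step=1)
print(exact == source)
```

A larger step gives smaller levels to encode at the cost of a larger reconstruction error, which is the basic rate and quality trade-off that an encoder controls.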
How to read this document
It is suggested that the reader starts with Clause 1 and moves on to Clause 3. Clause 6 should be read for
the geometrical relationship of the source, input, and output of the decoder. Clause 7 specifies the order
to parse syntax elements from the bitstream. See subclauses 7.1 to 7.3 for syntactical order and
subclause 7.4 for semantics, i.e. the scope, restrictions, and conditions that are imposed on the syntax
elements. The actual parsing for most syntax elements is specified in Clause 9. Finally, Clause 8 specifies
how the syntax elements are mapped into decoded samples. Annexes A through D also form an integral
part of this document.
Annex A specifies profiles, each being tailored to certain application domains, and defines the so-called
tiers and levels of the profiles. Annex B specifies syntax and semantics of a byte stream format for
delivery of coded video as an ordered stream of bytes. Annex C specifies the hypothetical reference
decoder, bitstream conformance, decoder conformance, and the use of the hypothetical reference
decoder to check bitstream and decoder conformance. Annex D specifies syntax and semantics for
supplemental enhancement information (SEI) message payloads that affect the conformance
specifications in Annex C. Rec. ITU-T H.274 | ISO/IEC 23002-7 specifies the syntax and semantics of the
video usability information (VUI) parameters as well as SEI messages that do not affect the
conformance specifications in Annex C. These VUI parameters and SEI messages may be used together
with this document.
The term "this document" is used to refer to this Recommendation | International Standard.
In this document, the following verbal forms are used:
— “shall” indicates a requirement;
— “should” indicates a recommendation;
— “may” indi
...
