Information technology — Coding of audio-visual objects — Part 15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format

This document specifies the storage format for streams of video that is structured as NAL Units, such as AVC (ISO/IEC 14496-10) and HEVC (ISO/IEC 23008-2) video streams. In addition, Annex E specifies parameters and sub-parameters applying when sample entries specified in this document are used as the 'codecs' parameter of a MIME type, as specified in IETF RFC 6381.

Technologies de l'information — Codage des objets audiovisuels — Partie 15: Transport de vidéo structurée en unités NAL sur la couche réseau au format ISO de base pour les fichiers médias

General Information

Status
Published
Publication Date
10-Oct-2022
Current Stage
6060 - International Standard published
Start Date
11-Oct-2022
Due Date
24-Dec-2022
Completion Date
11-Oct-2022

Standard
ISO/IEC 14496-15:2022 - Information technology — Coding of audio-visual objects — Part 15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format Released:11. 10. 2022
English language
277 pages
Draft
REDLINE ISO/IEC FDIS 14496-15 - Information technology — Coding of audio-visual objects — Part 15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format Released:1. 07. 2022
English language
278 pages
Draft
ISO/IEC FDIS 14496-15 - Information technology — Coding of audio-visual objects — Part 15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format Released:1. 07. 2022
English language
278 pages

Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC 14496-15
Sixth edition
2022-10
Information technology — Coding of
audio-visual objects —
Part 15:
Carriage of network abstraction layer
(NAL) unit structured video in the ISO
base media file format
Technologies de l'information — Codage des objets audiovisuels —
Partie 15: Transport de vidéo structurée en unités NAL sur la couche
réseau au format ISO de base pour les fichiers médias
Reference number
ISO/IEC 14496-15:2022(E)
© ISO/IEC 2022

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions, abbreviated terms and conventions
3.1 Terms and definitions
3.2 Abbreviated terms
3.3 Conventions
4 General definitions
4.1 Overview
4.2 Sample and configuration definition
4.3 Video track structure
4.4 Template fields used
4.5 Visual width and height
4.6 Decoding time (DTS) and composition time (CTS)
4.7 Sample groups on random access recovery points 'roll' and random access points 'rap '
4.8 Hinting
4.9 On change of sample entry (informative)
4.10 SEI information box
4.11 Post-decoder requirements scheme for signalling of SEI
4.12 Alternative extraction source track grouping
4.13 NAL unit map entry
4.14 Rectangular region group entry
4.15 Layer information sample group
5 AVC elementary streams and sample definitions
5.1 Overview
5.2 Elementary stream structure
5.3 Sample and configuration definition
5.4 Derivation from ISO base media file format
6 SVC elementary stream and sample definitions
6.1 Overview
6.2 Elementary stream structure
6.3 Use of the plain AVC file format
6.4 Sample and configuration definition
6.5 Derivation from the ISO base media file format
7 MVC and MVD elementary stream and sample definitions
7.1 Overview
7.2 Overview of MVC or MVD Storage
7.3 MVC and MVD elementary stream structures
7.4 Use of the plain AVC file format
7.5 Sample and configuration definition
7.6 Derivation from the ISO base media file format
7.7 MVC specific information boxes
8 HEVC elementary streams and sample definitions
8.1 Overview
8.2 Elementary stream structure
8.3 Sample and configuration definition
8.4 Derivation from ISO base media file format
9 Layered HEVC elementary stream and sample definitions
9.1 Overview
9.2 Overview of L-HEVC storage
9.3 L-HEVC elementary stream structure
9.4 Sample and configuration definition
9.5 Derivation from the ISO base media file format and the HEVC file format (Clause 8)
9.6 L-HEVC specific structures
10 Storage of tiled HEVC and L-HEVC video streams
10.1 Overview
10.2 NAL unit map entry
10.3 Tile region group entry
10.4 Tile sub track definition
10.5 HEVC and L-HEVC tile track
10.6 HEVC slice segment data track
11 VVC elementary streams and sample definitions
11.1 Overview
11.2 Sample and configuration definition
11.3 Derivation from ISO base media file format
11.4 Sample groups
11.5 Entity groups
11.6 Data sharing and VVC bitstream reconstruction
12 EVC elementary streams and sample definitions
12.1 Overview
12.2 Elementary stream structure
12.3 Sample and configuration definition
12.4 Derivation from ISO base media file format
Annex A (normative) In-stream structures
Annex B (normative) SVC, MVC, and MVD sample group and sub-track definitions
Annex C (normative) Temporal metadata support
Annex D (normative) File format toolsets and brands
Annex E (normative) Sub-parameters for the MIME type 'codecs' parameter
Annex F (informative) Unspecified nal_unit_type value management for sample entry types of AVC and HEVC
Annex G (informative) Examples of VVC base and subpicture tracks

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the editorial
rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details
of any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC list of patent
declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the World
Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This sixth edition cancels and replaces the fifth edition (ISO/IEC 14496-15:2019), which has been
technically revised. It also incorporates the Amendment ISO/IEC 14496-15:2019/Amd 1:2020.
The main changes are as follows:
- Support for Versatile Video Coding (ISO/IEC 23090-3) and Essential Video Coding (ISO/IEC
23094-1)
- Addition of sample entry types 'hvc3', 'hev3', 'hvt2', and 'hvt3' targeted at tile-based
delivery and merging of High Efficiency Video Coding (ISO/IEC 23008-2) bitstreams
A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

Introduction
This document defines a storage format based on, and compatible with, the ISO Base Media File Format
(ISO/IEC 14496-12), which is used by the MP4 file format (ISO/IEC 14496-14) and the Motion JPEG 2000
file format (ISO/IEC 15444-3) among others. This document enables video streams formatted as Network
Abstraction Layer Units (NAL Units) to
a) be used in conjunction with other media streams, such as audio,
b) be used in an MPEG-4 systems environment, if desired,
c) be formatted for delivery by a streaming server, using hint tracks, and
d) inherit all the use cases and features of the ISO Base Media File Format on which MP4 and MJ2 are
based.
This document may be used as a standalone document; it specifies how NAL unit structured video content
shall be stored in an ISO Base Media File Format compliant format. However, it is normally used in the
context of a specification, such as the MP4 file format, derived from the ISO Base Media File Format, that
permits the use of NAL unit structured video such as AVC (ISO/IEC 14496-10) video and High Efficiency
Video Coding (HEVC, ISO/IEC 23008-2) video.
The ISO Base Media File Format is becoming increasingly common as a general-purpose media container
format for the exchange of digital media, and its use in this context should accelerate both adoption and
interoperability.
The International Organization for Standardization (ISO) and International Electrotechnical Commission
(IEC) draw attention to the fact that it is claimed that compliance with this document may involve the use
of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that he/she is willing to negotiate licences
under reasonable and non-discriminatory terms and conditions with applicants throughout the world.
In this respect, the statement of the holder of this patent right is registered with ISO and IEC.
Information may be obtained from the patent database available at www.iso.org/patents or
patents.iec.ch.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.
INTERNATIONAL STANDARD ISO/IEC 14496-15:2022(E)

Information technology — Coding of audio-visual
objects —
Part 15:
Carriage of network abstraction layer (NAL) unit
structured video in the ISO base media file format
1 Scope
This document specifies the storage format for streams of video that is structured as NAL Units, such as
AVC (ISO/IEC 14496-10) and HEVC (ISO/IEC 23008-2) video streams. In addition, Annex E specifies
parameters and sub-parameters applying when sample entries specified in this document are used as the
'codecs' parameter of a MIME type, as specified in IETF RFC 6381.
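As an illustration of the 'codecs' parameter mentioned in the Scope, the following minimal Python sketch splits a 'codecs' value into its comma-separated elements and decodes an AVC element of the form avc1.PPCCLL described in IETF RFC 6381 (profile_idc, constraint flags and level_idc as six hexadecimal digits). The function names and the sample string are assumptions made for this illustration. The normative sub-parameter syntax is specified in Annex E and is not reproduced here.

```python
def split_codecs(value):
    """Split a 'codecs' parameter value into its comma-separated elements."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_avc_element(element):
    """Decode 'avc1.PPCCLL' (hex digits) into profile, constraint flags and level.

    Only the 'avc1' prefix is handled in this sketch; other sample entry
    types are rejected rather than guessed at.
    """
    sample_entry, _, hex_part = element.partition(".")
    if sample_entry != "avc1" or len(hex_part) != 6:
        raise ValueError("not an avc1.PPCCLL element: %r" % element)
    raw = bytes.fromhex(hex_part)  # raises ValueError on non-hex input
    return {
        "sample_entry": sample_entry,
        "profile_idc": raw[0],
        "constraint_flags": raw[1],
        "level_idc": raw[2],
    }


if __name__ == "__main__":
    for item in split_codecs("avc1.42E01E, mp4a.40.2"):
        print(item)
    print(parse_avc_element("avc1.42E01E"))
```

In this sketch, a level_idc of 30 corresponds to AVC level 3.0, because ISO/IEC 14496-10 signals the level number multiplied by ten.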
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12:2020, Information technology — Coding of audio-visual objects — Part 12: ISO base
media file format
ISO/IEC 14496-10:2020, Information technology — Coding of audio-visual objects — Part 10: Advanced
Video Coding
ISO/IEC 23008-2:2020, Information technology — High efficiency coding and media delivery in
heterogeneous environments — Part 2: High efficiency video coding
ISO/IEC 23090-3:2021, Information technology — Coded representation of immersive media — Part 3:
Versatile video coding
ISO/IEC 23094-1:2020, Information technology — General video coding — Part 1: Essential video coding
IETF RFC 4648, The Base16, Base32, and Base64 data encodings
IETF RFC 6381, MIME codecs and profiles
3 Terms, definitions, abbreviated terms and conventions
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-10,
ISO/IEC 23008-2, ISO/IEC 23090-3 or ISO/IEC 23094-1, and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1.1
3D-AVC NAL unit
3D AVC VCL NAL unit
NAL unit with type 21 with avc_3d_extension_flag equal to 1
3.1.2
aggregator
in-stream structure using a NAL unit header for grouping of NAL units belonging to the same sample
3.1.3
alternate region set
set of rectangular regions that are alternatives to be used as a rectangular region when reconstructing a
VVC bitstream from a VVC extraction base track
3.1.4
applicable video coding standard
video coding standard for the data carried in the track
Note 1 to entry: The video coding standard can be ISO/IEC 14496-10, ISO/IEC 23008-2, ISO/IEC 23090-3, or
ISO/IEC 23094-1.
3.1.5
AU- or picture-level non-VCL NAL unit
non-VCL NAL unit that applies to one or more entire AUs or one or more entire pictures
Note 1 to entry: An AU-level non-VCL NAL unit applies to one or more entire AUs. A picture-level non-VCL NAL unit
applies to one or more entire pictures. In VVC, AU-level or picture-level non-VCL NAL units include: 1) all the DCI, OPI, VPS,
SPS, PPS, AUD, PH, EOS, and EOB NAL units; 2) APS NAL units that apply to one or more entire AUs or pictures; and 3) SEI
NAL units that only contain SEI messages that apply to one or more entire AUs or pictures.
3.1.6
AVC base layer
maximum subset of a bitstream that is AVC compatible
Note 1 to entry: The AVC base layer is represented by AVC VCL NAL units and associated non-VCL NAL units. The AVC
base layer is not using any of the functionality of ISO/IEC 14496-10:2020, Annex G, Annex H, Annex I, or Annex J.
Note 2 to entry: The AVC base layer itself can be a temporal scalable bitstream.
3.1.7
AVC parameter set sample
sample in a parameter set elementary stream that consists of those parameter set NAL units that are to
be considered as if present in the video elementary stream at the same instant in time
3.1.8
AVC sample
access unit as defined in ISO/IEC 14496-10
3.1.9
AVC NAL unit
AVC VCL NAL unit and its associated non-VCL NAL units in a bitstream
3.1.10
AVC VCL NAL unit
NAL unit with type 1 to 5 (inclusive)
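The type range in this definition can be checked directly from the first byte of an AVC NAL unit. In ISO/IEC 14496-10 that byte holds forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits). The following Python sketch is illustrative only; the function names are assumptions, and the input is assumed to be a NAL unit without any start code or length prefix.

```python
def avc_nal_unit_type(nal_unit):
    """Return nal_unit_type, the low 5 bits of the first NAL unit header byte."""
    if not nal_unit:
        raise ValueError("empty NAL unit")
    return nal_unit[0] & 0x1F


def is_avc_vcl_nal_unit(nal_unit):
    """True when nal_unit_type is in the range 1 to 5 (inclusive)."""
    return 1 <= avc_nal_unit_type(nal_unit) <= 5


if __name__ == "__main__":
    # 0x65 carries type 5 (a VCL type), 0x67 carries type 7 (not a VCL type).
    print(is_avc_vcl_nal_unit(bytes([0x65, 0x88])))
    print(is_avc_vcl_nal_unit(bytes([0x67, 0x42])))
```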
3.1.11
canonical order
order of NAL units that conforms to the applicable video standard
Note 1 to entry: When a single track carries a video bitstream, the NAL units are stored in the canonical order. When
multiple tracks are used to carry a video bitstream, an implicit or explicit video bitstream reconstruction process might
be applied to recover the canonical order.
3.1.12
canonical stream format
elementary stream that contains NAL units in the canonical order and conforms to the constraints
specified in this document for carrying an elementary stream of the applicable video standard in one or
more tracks
3.1.13
complete subset
minimal set of tracks that contain all the information in the original bitstream
3.1.14
cropped frame dimensions
width and height of the decoded frame after applying the output cropping parameters
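As a worked example for AVC, the cropped frame dimensions can be derived from sequence parameter set fields using the frame cropping rules of ISO/IEC 14496-10. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the SPS fields are assumed to be already parsed, and the chroma subsampling factors default to 4:2:0 (SubWidthC = 2, SubHeightC = 2).

```python
def avc_cropped_frame_dimensions(pic_width_in_mbs_minus1,
                                 pic_height_in_map_units_minus1,
                                 frame_mbs_only_flag,
                                 crop_left=0, crop_right=0,
                                 crop_top=0, crop_bottom=0,
                                 sub_width_c=2, sub_height_c=2):
    """Return (width, height) in luma samples after output cropping."""
    # A map unit covers one macroblock row when frame_mbs_only_flag is 1
    # and two macroblock rows of the frame otherwise.
    field_factor = 2 - frame_mbs_only_flag
    coded_width = 16 * (pic_width_in_mbs_minus1 + 1)
    coded_height = 16 * field_factor * (pic_height_in_map_units_minus1 + 1)
    # Crop offsets are expressed in chroma-aligned units.
    crop_unit_x = sub_width_c
    crop_unit_y = sub_height_c * field_factor
    width = coded_width - crop_unit_x * (crop_left + crop_right)
    height = coded_height - crop_unit_y * (crop_top + crop_bottom)
    return width, height


if __name__ == "__main__":
    # A 1920x1088 coded frame with 8 luma rows cropped at the bottom.
    print(avc_cropped_frame_dimensions(119, 67, 1, crop_bottom=4))
```

With these inputs the coded frame is 1920 by 1088 luma samples, and a bottom crop offset of 4 removes 8 luma rows.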
3.1.15
default sample group description index
default_group_description_index of SampleGroupDescriptionBox with version
greater than or equal to 2
3.1.16
elementary stream
sequence of one or more bitstreams of the applicable video standard
Note 1 to entry: The term elementary stream is not directly related to the terms video elementary stream, parameter
set elementary stream, and video and parameter set elementary stream.
Note 2 to entry: The applicable video standard can be included as a prefix to the term elementary stream. For example,
an AVC elementary stream refers to an elementary stream that is a sequence of one or more bitstreams conforming to
ISO/IEC 14496-10.
3.1.17
extractor
in-stream structure using a NAL unit header for extraction of data from other tracks
Note 1 to entry: Extractors contain instructions on how to extract data from other tracks. Logically an Extractor can be
seen as a pointer to data. While reading a track containing Extractors, the Extractor is replaced by the data it is pointing to.
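The pointer-like behaviour described in Note 1 can be pictured with the following conceptual Python sketch. It is not the extractor syntax of Annex A: the Extractor class, its fields, and the in-memory track layout are hypothetical stand-ins chosen only to show an extractor being replaced by the bytes it points to.

```python
from dataclasses import dataclass


@dataclass
class Extractor:
    """Hypothetical stand-in: a byte range of one sample in another track."""
    track_id: int
    sample_index: int
    offset: int
    length: int


def resolve_sample(units, tracks):
    """Return the sample's units with each Extractor replaced by its data.

    `units` is a list of bytes objects and Extractor instances;
    `tracks` maps a track id to a list of sample byte strings.
    """
    resolved = []
    for unit in units:
        if isinstance(unit, Extractor):
            data = tracks[unit.track_id][unit.sample_index]
            resolved.append(data[unit.offset:unit.offset + unit.length])
        else:
            resolved.append(unit)
    return resolved


if __name__ == "__main__":
    tracks = {1: [b"\x65base-layer-slice"]}
    sample = [b"\x06sei", Extractor(track_id=1, sample_index=0, offset=0, length=17)]
    print(resolve_sample(sample, tracks))
```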
3.1.18
HEVC sample
access unit as defined in ISO/IEC 23008-2
3.1.19
implicit reconstruction
reconstruction of a stream of access units from two or more tracks not using extractors
3.1.20
in-stream structure
structure residing within sample data
3.1.21
layer
scalable layer
set of VCL NAL units with the same values of dependency_id, quality_id, and
temporal_id, and the associated non-VCL NAL units
Note 1 to entry: A scalable layer with any of dependency_id, quality_id, and temporal_id not equal to 0 enhances the
video by one or more scalability levels in at least one direction (temporal, quality or spatial resolution).
Note 2 to entry: SVC uses a “layered” encoder design that results in a bitstream representing “coding layers”. In some
publications the ‘base layer’ is the first quality layer of a specific coding layer. In some publications the base layer is the
scalable layer with the lowest priority. The SVC file format uses “scalable layer” or “layer” in a general way for describing
nested bitstreams (using terms like AVC base layer or SVC enhancement layer).
3.1.22
layer
scalable layer
set of VCL NAL units with the same value of nuh_layer_id and the associated non-VCL
NAL units
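Assuming this definition applies to layered HEVC, nuh_layer_id can be read from the two-byte NAL unit header of ISO/IEC 23008-2: forbidden_zero_bit (1 bit), nal_unit_type (6 bits), nuh_layer_id (6 bits) and nuh_temporal_id_plus1 (3 bits). The Python sketch below groups NAL units by that value. The function names are assumptions, and grouping purely by header value is a simplification of the association between VCL and non-VCL NAL units.

```python
from collections import defaultdict


def hevc_nal_header(nal_unit):
    """Decode the two-byte HEVC NAL unit header into its fields."""
    if len(nal_unit) < 2:
        raise ValueError("HEVC NAL unit header needs two bytes")
    header = (nal_unit[0] << 8) | nal_unit[1]
    return {
        "nal_unit_type": (header >> 9) & 0x3F,
        "nuh_layer_id": (header >> 3) & 0x3F,
        "temporal_id": (header & 0x07) - 1,  # nuh_temporal_id_plus1 - 1
    }


def group_by_layer(nal_units):
    """Map each nuh_layer_id to the NAL units that carry it, in input order."""
    layers = defaultdict(list)
    for nal_unit in nal_units:
        layers[hevc_nal_header(nal_unit)["nuh_layer_id"]].append(nal_unit)
    return dict(layers)


if __name__ == "__main__":
    units = [b"\x40\x01", b"\x26\x01", b"\x26\x09"]
    for layer_id, members in sorted(group_by_layer(units).items()):
        print(layer_id, len(members))
```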
3.1.23
layer set
set of layers represented within a bitstream created from another bitstream by operation of the sub-
bitstream extraction process
3.1.24
L-HEVC sample
picture units that are within an access unit as specified in Annex F of ISO/IEC 23008-2:2020 and are
represented by the track
3.1.25
MVC NAL unit
MVC VCL NAL unit and its associated non-VCL NAL units in an MVC stream
Note 1 to entry: The association of non-VCL NAL units with MVC VCL NAL units is specified in ISO/IEC 14496-10:2020,
Annex H.
3.1.26
MVC sample
one or more view components as defined in Annex H of ISO/IEC 14496-10:2020 and the associated non-
VCL NAL units
3.1.27
MVC VCL NAL unit
NAL unit with type 20, and NAL units with type 14 when the immediately following NAL units are AVC
VCL NAL units
Note 1 to entry: MVC VCL NAL units do not affect the decoding process of a legacy AVC decoder.
3.1.28
MVC+D depth NAL unit
MVC+D depth VCL NAL unit
NAL unit with type 21 containing a coded slice extension for a depth view component
3.1.29
MVD NAL unit
MVD VCL NAL unit
NAL unit with type 21, containing a coded slice extension for a depth view component coded with MVC+D
or 3D-AVC, or a 3D-AVC texture view component
3.1.30
MVD sample
one or more view components as defined in Annex I or Annex J of ISO/IEC 14496-10:2020 and the
associated non-VCL NAL units, where each view component contains a texture view component, a depth
view component or both
3.1.31
NAL-unit-like structure
data structure that is similar to NAL units in the sense that it also has a NAL unit header and a payload,
with a difference that the payload might not follow the start code emulation prevention mechanism
required for the NAL unit syntax
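For context, the start code emulation prevention mechanism of AVC and HEVC inserts an emulation_prevention_three_byte (0x03) after two consecutive zero bytes inside a NAL unit, and a decoder removes it again. The following Python sketch shows the removal step for a regular NAL unit payload. The function name is an assumption, and the sketch does not apply to NAL-unit-like structures whose payload does not follow this mechanism.

```python
def strip_emulation_prevention(payload):
    """Remove each 0x03 byte that directly follows two consecutive 0x00 bytes."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just seen
    for byte in payload:
        if zeros >= 2 and byte == 0x03:
            zeros = 0  # drop the emulation prevention byte and restart counting
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0x00 else 0
    return bytes(out)


if __name__ == "__main__":
    print(strip_emulation_prevention(b"\x00\x00\x03\x01").hex())
```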
3.1.32
natively present
not included in an aggregator or an extractor
Note 1 to entry: Data referred to by (hence not included in) an aggregator is considered as natively present. Data
included in an aggregator is not considered as natively present.
3.1.33
operating point
independently decodable subset of a layered bitstream
Note 1 to entry: Each operating point consists of all the data needed to decode this particular bitstream subset.
Note 2 to entry: In an SVC stream an operating point represents a particular spatial resolution, temporal resolution,
and quality, and can be represented either by (i) specific values of DTQ (dependency_id, temporal_id and quality_id) or (ii)
specific values of P (priority_id) or (iii) combinations of them (e.g. PDTQ). Note that the usage of priority_id is defined by
the application. In an SVC file a track represents one or more operating points. Within a track tiers can be used to define
multiple operating points.
Note 3 to entry: The bitstream subset of an MVC or MVD operating point represents a particular set of target output
views at a particular temporal resoluti
...

ISO/IEC 14496-15:2022(E)
ISO/IEC JTC 1/SC 29/WG 03
Secretariat: JISC
Information technology — Coding of audio-visual objects —
Part 15: Carriage of network abstraction layer (NAL) unit
structured video in the ISO base media file format

FDIS stage

Warning for WDs and CDs
This document is not an ISO International Standard. It is distributed for review and comment. It is subject to
change without notice and may not be referred to as an International Standard.
Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of
which they are aware and to provide supporting documentation.

ISO/IEC FDIS 14496-15:2022(E)
Contents
1 Scope
2 Normative references
3 Terms, definitions, abbreviated terms and conventions
3.1 Terms and definitions
3.2 Abbreviated terms
3.3 Mathematical functions
4 General definitions
4.1 Overview
4.2 Sample and configuration definition
4.3 Video track structure
4.4 Template fields used
4.5 Visual width and height
4.6 Decoding time (DTS) and composition time (CTS)
4.7 Sample groups on random access recovery points 'roll' and random access points 'rap '
4.8 Hinting
4.9 On change of sample entry (informative)
4.10 SEI information box
4.11 Post-decoder requirements scheme for signalling of SEI
4.12 Alternative extraction source track grouping
4.13 NAL unit map entry
4.14 Rectangular region group entry
4.15 Layer information sample group
5 AVC elementary streams and sample definitions
5.1 Overview
5.2 Elementary stream structure
5.3 Sample and configuration definition
5.4 Derivation from ISO base media file format
6 SVC elementary stream and sample definitions
6.1 Overview
6.2 Elementary stream structure
6.3 Use of the plain AVC file format
6.4 Sample and configuration definition
6.5 Derivation from the ISO base media file format
7 MVC and MVD elementary stream and sample definitions
7.1 Overview
7.2 Overview of MVC or MVD Storage
7.3 MVC and MVD elementary stream structures
7.4 Use of the plain AVC file format
7.5 Sample and configuration definition
7.6 Derivation from the ISO base media file format
7.7 MVC specific information boxes
8 HEVC elementary streams and sample definitions
8.1 Overview
8.2 Elementary stream structure
8.3 Sample and configuration definition
8.4 Derivation from ISO base media file format
9 Layered HEVC elementary stream and sample definitions
9.1 Overview
9.2 Overview of L-HEVC storage
9.3 L-HEVC elementary stream structure
9.4 Sample and configuration definition
9.5 Derivation from the ISO base media file format and the HEVC file format (clause 8)
9.6 L-HEVC specific structures
10 Storage of tiled HEVC and L-HEVC video streams
10.1 Overview
10.2 NAL unit map entry
10.3 Tile region group entry
10.4 Tile sub track definition
10.5 HEVC and L-HEVC tile track
10.6 HEVC slice segment data track
11 VVC elementary streams and sample definitions
11.1 Overview
11.2 Sample and configuration definition
11.3 Derivation from ISO base media file format
11.4 Sample groups
11.5 Entity groups
11.6 Data sharing and VVC bitstream reconstruction
12 EVC elementary streams and sample definitions
12.1 Overview
12.2 Elementary stream structure
12.3 Sample and configuration definition
12.4 Derivation from ISO base media file format
Annex A (normative) In-stream structures
A.1 General
A.2 Aggregators
A.3 Extractors for SVC, MVC, and MVD tracks
A.4 NAL unit header values for SVC
A.5 NAL unit header values for MVC and MVC+D depth NAL units
A.6 NAL unit header values for 3D-AVC NAL units
A.7 Extractors for HEVC and L-HEVC tracks
A.7.6.1 Syntax
A.7.6.2 Semantics
A.7.7.1 Overview
A.7.7.2 Reference constructors
A.7.7.3 Default HEVC extractor constructor box
A.7.8.1 Definition
A.7.8.2 Syntax
A.7.8.3 Semantics
A.8 NAL unit header values for ISO/IEC 23008-2
A.9 Slice segment header information NAL-unit-like structure
Annex B (normative) SVC, MVC, and MVD sample group and sub-track definitions
B.1 General
B.2 Definition
B.3 Mapping NAL units to map groups and tiers
B.4 Decode re-timing groups
B.5 View priority sample grouping
B.6 Sub track definitions
Annex C (normative) Temporal metadata support
C.1 General
C.2 Connection to the video media data
C.3 SVC meta data sample entry
C.4 Helper functions
C.5 Statement types
Annex D (normative) File format toolsets and brands
D.1 General
D.2 SVC Toolsets
D.3 MVC and MVD toolsets
D.4 L-HEVC brands
D.5 No Leading Picture Sync Brand
Annex E (normative) Sub-parameters for the MIME type 'codecs' parameter
E.1 General
E.2 AVC family
E.3 HEVC
E.4 L-HEVC
E.5 HEVC and L-HEVC tile tracks
E.6 VVC
E.7 VVC non-VCL tracks
E.8 VVC subpicture tracks
E.9 EVC
Annex F (informative) Unspecified nal_unit_type value management for sample entry types of AVC and HEVC
Annex G (informative) Examples of VVC base and subpicture tracks
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the editorial
rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details
of any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the World
Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT),
see www.iso.org/iso/foreword.html. In the IEC, see
www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This sixth edition cancels and replaces the fifth edition (ISO/IEC 14496-15:2019), which has
been technically revised. It also incorporates the Amendment ISO/IEC 14496-15:2019/Amd 1:2020.
The main changes are as follows:
- Support for Versatile Video Coding (ISO/IEC 23090-3) and Essential Video Coding (ISO/IEC
23094-1)
- Addition of sample entry types 'hvc3', 'hev3', 'hvt2', and 'hvt3' targeted at tile-based
delivery and merging of High Efficiency Video Coding (ISO/IEC 23008-2) bitstreams
A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html
and www.iec.ch/national-committees.

Introduction
This document defines a storage format based on, and compatible with, the ISO
Base Media File Format (ISO/IEC 14496-12), which is used by the MP4 file format (ISO/IEC 14496-14)
and the Motion JPEG 2000 file format (ISO/IEC 15444-3) among others. This
document enables video streams formatted as Network Abstraction Layer Units (NAL
Units) to
a) be used in conjunction with other media streams, such as audio,
b) be used in an MPEG-4 systems environment, if desired,
c) be formatted for delivery by a streaming server, using hint tracks, and
d) inherit all the use cases and features of the ISO Base Media File Format on which MP4 and MJ2 are
based.
This document may be used as a standalone document; it specifies
how NAL unit structured video content shall be stored in an ISO Base Media File Format compliant format.
However, it is normally used in the context of a specification, such as the MP4 file format, derived from
the ISO Base Media File Format, that permits the use of NAL unit structured video such as AVC
(ISO/IEC 14496-10) video and High Efficiency Video Coding (HEVC, ISO/IEC 23008-2) video.
The ISO Base Media File Format is becoming increasingly common as a general-purpose media container
format for the exchange of digital media, and its use in this context should accelerate both adoption and
interoperability.
The International Organization for Standardization (ISO) and International Electrotechnical Commission
(IEC) draw attention to the fact that it is claimed that compliance with this document may involve the use
of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that they are willing to negotiate licences
under reasonable and non-discriminatory terms and conditions with applicants throughout the world. In
this respect, the statement of the holder of this patent right is registered with ISO and IEC.
Information may be obtained from the patent database available at www.iso.org/patents or
patents.iec.ch.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.
FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 14496-15:2022(E)

Information technology — Coding of audio-visual objects —
Part 15:
Carriage of network abstraction layer (NAL) unit
structured video in the ISO base media file format
1 Scope
This document specifies the storage format for streams of video that is structured
as NAL Units, such as AVC (ISO/IEC 14496-10) and HEVC (ISO/IEC 23008-2) video streams. In addition,
Annex E specifies parameters and sub-parameters applying when sample entries specified in this
document are used as the 'codecs' parameter of a MIME type, as specified in IETF RFC 6381.
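As an informal illustration of the 'codecs' parameter mentioned above, the following Python sketch splits a codecs value into sample entry names and their dot-separated sub-parameters, following the generic comma and dot structure of IETF RFC 6381. The helper name `split_codecs` and the example value are assumptions made for illustration; the sub-parameter syntax for each sample entry type is specified in Annex E and is not validated here.

```python
def split_codecs(value):
    """Split a 'codecs' value into (sample_entry, sub_parameters) pairs.

    Hypothetical helper: it performs only the generic comma/dot split and
    does not check the sub-parameters against Annex E.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # The first dot-separated element is the sample entry name.
        sample_entry, _, rest = item.partition(".")
        entries.append((sample_entry, rest.split(".") if rest else []))
    return entries


# Example value chosen for illustration only.
parsed = split_codecs("avc1.64001F, mp4a.40.2")
```

A reader would typically dispatch on the sample entry name and then interpret the remaining elements according to the matching clause of Annex E.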
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12:2020, Information technology — Coding of audio-visual objects — Part 12: ISO base
media file format
ISO/IEC 14496-10:2020, Information technology — Coding of audio-visual objects — Part 10: Advanced
Video Coding
ISO/IEC 23008-2:2020, Information technology — High efficiency coding and media delivery in
heterogeneous environments — Part 2: High efficiency video coding
ISO/IEC 23090-3:2021, Information technology — Coded representation of immersive media — Part 3:
Versatile video coding
ISO/IEC 23094-1:2020, Information technology — General video coding — Part 1: Essential video coding
IETF RFC 4648, The Base16, Base32, and Base64 data encodings
IETF RFC 6381, MIME codecs and profiles
3 Terms, definitions, abbreviated terms and conventions
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-10,
ISO/IEC 23008-2, ISO/IEC 23090-3 or ISO/IEC 23094-1, and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1.1
3D-AVC NAL unit
3D AVC VCL NAL unit
NAL unit with type 21 with avc_3d_extension_flag equal to 1
3.1.2
aggregator
in-stream structure using a NAL unit header for grouping of NAL units belonging to the same sample
3.1.3
alternate region set
set of rectangular regions that are alternatives to be used as a rectangular region when reconstructing a
VVC bitstream from a VVC extraction base track
3.1.4
applicable video coding standard
video coding standard for the data carried in the track
Note 1 to entry: The video coding standard can be ISO/IEC 14496-10, ISO/IEC 23008-2, ISO/IEC 23090-3, or
ISO/IEC 23094-1.
3.1.5
AU- or picture-level non-VCL NAL unit
non-VCL NAL unit that applies to one or more entire AUs or one or more entire pictures
Note 1 to entry: An AU-level non-VCL NAL unit applies to one or more entire AUs. A picture-level non-VCL NAL unit
applies to one or more entire pictures. In VVC, AU-level or picture-level non-VCL NAL units include: 1) all the DCI, OPI, VPS,
SPS, PPS, AUD, PH, EOS, and EOB NAL units; 2) APS NAL units that apply to one or more entire AUs or pictures; and 3) SEI
NAL units that only contain SEI messages that apply to one or more entire AUs or pictures.
3.1.6
AVC base layer
maximum subset of a bitstream that is AVC compatible
Note 1 to entry: The AVC base layer is represented by AVC VCL NAL units and associated non-VCL NAL units. The AVC
base layer is not using any of the functionality of ISO/IEC 14496-10:2020, Annex G, Annex H, Annex I, or Annex J.
Note 2 to entry: The AVC base layer itself can be a temporal scalable bitstream.
3.1.7
AVC parameter set sample
sample in a parameter set elementary stream that consists of those parameter set NAL units that are to
be considered as if present in the video elementary stream at the same instant in time
3.1.8
AVC sample
access unit as defined in ISO/IEC 14496-10:2020, subclause 7.4.1.2
3.1.9
AVC NAL unit
AVC VCL NAL unit and its associated non-VCL NAL units in a bitstream
3.1.10
AVC VCL NAL unit
NAL unit with type 1 to 5 (inclusive)
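The type test in this definition can be sketched in Python as follows. The sketch relies on the AVC NAL unit header layout of ISO/IEC 14496-10, in which nal_unit_type occupies the five least significant bits of the first header byte. The function names are assumptions for illustration, and the input is assumed to be a single NAL unit without any length prefix or start code.

```python
AVC_VCL_TYPES = range(1, 6)  # nal_unit_type 1 to 5 (inclusive)


def avc_nal_unit_type(nal_unit: bytes) -> int:
    """Return nal_unit_type, the low 5 bits of the first AVC NAL header byte."""
    if not nal_unit:
        raise ValueError("empty NAL unit")
    return nal_unit[0] & 0x1F


def is_avc_vcl_nal_unit(nal_unit: bytes) -> bool:
    """True when the NAL unit type falls in the AVC VCL range 1..5."""
    return avc_nal_unit_type(nal_unit) in AVC_VCL_TYPES
```

The same header read also identifies NAL units of type 14, 20 and 21, which later definitions in this clause refer to.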
3.1.11
canonical order
order of NAL units that conforms to the applicable video standard
Note 1 to entry: When a single track carries a video bitstream, the NAL units are stored in the canonical order. When
multiple tracks are used to carry a video bitstream, an implicit or explicit video bitstream reconstruction process might
be applied to recover the canonical order.
3.1.12
canonical stream format
elementary stream that contains NAL units in the canonical order and conforms to the constraints
specified in this document for carrying an elementary stream of the applicable video standard in one or
more tracks
3.1.13
complete subset
minimal set of tracks that contain all the information in the original bitstream
3.1.14
cropped frame dimensions
width and height of the decoded frame after applying the output cropping parameters
3.1.15
default sample group description index
default_group_description_index of SampleGroupDescriptionBox with version
greater than or equal to 2
3.1.16
elementary stream
sequence of one or more bitstreams of the applicable video standard
Note 1 to entry: The term elementary stream is not directly related to the terms video elementary stream, parameter
set elementary stream, and video and parameter set elementary stream.
Note 2 to entry: The applicable video standard can be included as a prefix to the term elementary stream. For example,
an AVC elementary stream refers to an elementary stream that is a sequence of one or more bitstreams conforming to
ISO/IEC 14496-10.
3.1.17
extractor
in-stream structure using a NAL unit header for extraction of data from other tracks
Note 1 to entry: Extractors contain instructions on how to extract data from other tracks. Logically an Extractor can be
seen as a pointer to data. While reading a track containing Extractors, the Extractor is replaced by the data it is pointing to.
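The pointer-like behaviour described in the note can be sketched in Python as follows. This is a deliberately simplified model and not the extractor syntax of this document (see Annex A for the in-stream structures): the `Extractor` class, its field names and the `resolve_sample` function are hypothetical, and tracks are modelled as plain lists of sample byte strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Extractor:
    """Hypothetical, simplified pointer to a byte range in another track."""
    track_id: int      # which track to read from
    sample_index: int  # which sample of that track
    data_offset: int   # first byte to copy
    data_length: int   # number of bytes to copy


SampleItem = Union[bytes, Extractor]


def resolve_sample(items: List[SampleItem],
                   tracks: Dict[int, List[bytes]]) -> bytes:
    """Replace each extractor by the data it points to; copy other data as is."""
    out = bytearray()
    for item in items:
        if isinstance(item, Extractor):
            source = tracks[item.track_id][item.sample_index]
            out += source[item.data_offset:item.data_offset + item.data_length]
        else:
            out += item
    return bytes(out)
```

For example, with `tracks = {1: [b"ABCDEFGH"]}`, resolving `[b"x", Extractor(1, 0, 2, 3), b"y"]` yields the bytes of the first item, then bytes 2 to 4 of the referenced sample, then the last item.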
3.1.18
HEVC sample
access unit as defined in ISO/IEC 23008-2:2020, subclause 3.1
3.1.19
implicit reconstruction
reconstruction of a stream of access units from two or more tracks not using extractors
3.1.20
in-stream structure
structure residing within sample data
3.1.21
layer
scalable layer
set of VCL NAL units with the same values of dependency_id, quality_id, and
temporal_id, and the associated non-VCL NAL units
Note 1 to entry: A scalable layer with any of dependency_id, quality_id, and temporal_id not equal to 0 enhances the
video by one or more scalability levels in at least one direction (temporal, quality or spatial resolution).
Note 2 to entry: SVC uses a “layered” encoder design that results in a bitstream representing “coding layers”. In some
publications the ‘base layer’ is the first quality layer of a specific coding layer. In some publications the base layer is the
scalable layer with the lowest priority. The SVC file format uses “scalable layer” or “layer” in a general way for describing
nested bitstreams (using terms like AVC base layer or SVC enhancement layer).
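The grouping implied by this definition can be sketched in Python as follows. The sketch assumes that dependency_id, quality_id and temporal_id have already been parsed for each VCL NAL unit; the `SvcNalInfo` type and the function names are assumptions for illustration, and the association of non-VCL NAL units with a layer is not modelled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class SvcNalInfo(NamedTuple):
    """Hypothetical record holding already-parsed SVC header values."""
    dependency_id: int
    quality_id: int
    temporal_id: int
    payload: bytes


LayerKey = Tuple[int, int, int]


def group_by_scalable_layer(
        nal_units: Iterable[SvcNalInfo]) -> Dict[LayerKey, List[SvcNalInfo]]:
    """Group VCL NAL units that share (dependency_id, quality_id, temporal_id)."""
    layers: Dict[LayerKey, List[SvcNalInfo]] = defaultdict(list)
    for nal in nal_units:
        key = (nal.dependency_id, nal.quality_id, nal.temporal_id)
        layers[key].append(nal)
    return dict(layers)


def enhances_video(key: LayerKey) -> bool:
    """Per Note 1: any identifier not equal to 0 marks an enhancing layer."""
    return any(value != 0 for value in key)
```

Keying on the full triple keeps each scalable layer distinct even when two layers differ in only one direction, for example temporal resolution.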
3.1.22
layer
scalable layer
set of VCL NAL units with the same value of nuh_layer_id and the associated non-VCL
NAL units
3.1.23
layer set
set of layers represented within a bitstream created from another bitstream by operation of the sub-
bitstream extraction process
3.1.24
L-HEVC sample
picture units that are within an access unit as specified in Annex F of ISO/IEC 23008-2:2020 and are
represented by the track
3.1.25
MVC NAL unit
MVC VCL NAL unit and its associated non-VCL NAL units in an MVC stream
Note 1 to entry: The association of non-VCL NAL units with MVC VCL NAL units is specified in ISO/IEC 14496-10:2020,
Annex H.
3.1.26
MVC sample
one or more view components as defined in Annex H of ISO/IEC 14496-10:2020 and the associated non-
VCL NAL units
3.1.27
MVC VCL NAL unit
NAL unit with type 20, and NAL units with type 14 when the immediately following NAL units are AVC
VCL NAL units
Note 1 to entry: MVC VCL NAL units do not affect the decoding process of a legacy AVC decoder.
3.1.28
MVC+D depth NAL unit
MVC+D depth VCL NAL unit
NAL unit with type 21 containing a coded slice extension for a depth view component
3.1.29
MVD NAL unit
MVD VCL NAL unit
NAL unit with type 21, containing a coded slice extension for a depth view component coded with MVC+D
or 3D-AVC, or a 3D-AVC texture view component
3.1.30
MVD sample
one or more view components as defined in Annex I or Annex J of ISO/IEC 14496-10:2020 and the
associated non-VCL NAL units, where each view component contains a texture view component, a depth
view component or both
3.1.31
NAL-unit-like structure
data structure that is similar to NAL units in the sense that it also has a NAL unit header and a payload,
with a difference that the payload might not follow the start code emulation prevention mechanism
required for the NAL unit syntax
3.1.32
natively present
not included in an aggregator or an extractor
Note 1 to entry: Data referred to by (hence not included in) an aggregator is considered as natively present. Data
included in an aggregator is not considered as natively present.
3.1.33
operating point
independently decodable subset of a layered bitstream
Note 1 to entry: Each operating point consists of all the data needed to decode this particular bitstream subset.
Note 2 to entry: In an SVC stream an operating point represents a particular spatial resolution, temporal resolution,
and quality, and can be represented either by (i) specific values of DTQ (dependency_id, temporal_id and quality_id) or (ii)
specific values of P (priority_id) or (iii) combinations of them (e.g. PDTQ). Note that the usage of priority_id is defined by
the application. In an SVC file a track represents one or more operating points. Within a track tiers can be used to define
multiple operating points.
Note 3 to entry: The bitstream subset of an MVC or MVD operating point represents a particular set of target output
views at a particular temporal resolution, and consists of all the data needed to decode this particular bitstream subset. In
MVD each target output view in the bitstream subset of an MVD operating point can contain a texture view, a depth view or
both.
Note 4 to entry: An operating point is referred to as an operation point in Annex H of ISO/IEC 14496-10.
3.1.34
operating point
independently decodable subset of a layered bitstream, where one or more layers in the set of
layers are indicated to be output layers
Note 1 to entry: Each operating point consists of all the data needed to decode this particular bitstream subset.
Note 2 to entry: An operating point is referred to as an output operation point in ISO/IEC 23008-2.
3.1.35
operating point
temporal subset of an output layer set (OLS), identified by an output layer set (OLS) index and a
highest value of TemporalId
Note 1 to entry: Each operating point consists of all the data needed to decode this particular bitstream subset.
Note 2 to entry: An operating point is referred to as an operation point in ISO/IEC 23090-3.
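As a rough illustration of how an OLS index and a highest TemporalId value pick out a bitstream subset, the following Python sketch filters VVC NAL units by layer and temporal sub-layer. It relies on the two-byte VVC NAL unit header of ISO/IEC 23090-3, where nuh_layer_id is the low 6 bits of the first byte and nuh_temporal_id_plus1 is the low 3 bits of the second byte. The function names are assumptions, the set of layers belonging to the OLS is assumed to be supplied by the caller (for example, derived from the VPS), and the sketch omits the further rules of the sub-bitstream extraction process specified there.

```python
from typing import Iterable, List, Set, Tuple


def vvc_header_fields(nal_unit: bytes) -> Tuple[int, int]:
    """Return (nuh_layer_id, TemporalId) from a two-byte VVC NAL unit header."""
    if len(nal_unit) < 2:
        raise ValueError("VVC NAL unit header needs two bytes")
    layer_id = nal_unit[0] & 0x3F           # low 6 bits of byte 0
    temporal_id = (nal_unit[1] & 0x07) - 1  # nuh_temporal_id_plus1 minus 1
    return layer_id, temporal_id


def select_operating_point(nal_units: Iterable[bytes],
                           ols_layer_ids: Set[int],
                           max_temporal_id: int) -> List[bytes]:
    """Keep NAL units in the OLS layers with TemporalId <= max_temporal_id."""
    selected = []
    for nal in nal_units:
        layer_id, temporal_id = vvc_header_fields(nal)
        if layer_id in ols_layer_ids and temporal_id <= max_temporal_id:
            selected.append(nal)
    return selected
```

Lowering `max_temporal_id` drops the higher temporal sub-layers, which is why an operating point is described as a temporal subset of the OLS.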
3.1.36
output layer set
set of layers consisting of the layers of one of the specified layer sets, where one or more layers in the set
of layers are indicated to be output layers, as specified in ISO/IEC 23008-2
3.1.37
parameter set
video parameter set, sequence parameter set, picture parameter set, or adaptation parameter set as
defined in the applicable video standard
Note 1 to entry: This term is used to refer to all types of parameter sets.
3.1.38
parameter set elementary stream
elementary stream containing samples made up of only sequence and picture parameter set NAL units
synchronized with the video elementary stream
3.1.39
picture unit
set of VCL NAL units and their associated non-VCL NAL units
Note 1 to entry: The association of VCL NAL units and non-VCL NAL units with picture units is specified in the
applicable video standard.
3.1.40
prefix NAL unit
NAL unit with type 14
Note 1 to entry: Prefix NAL units provide scalability information about AVC VCL NAL units and filler data NAL units.
Prefix NAL units do not affect the decoding process of a legacy AVC decoder. The behaviour of a legacy AVC file reader as a
response to prefix NAL units is undefined.
3.1.40
profile, tier, and level
profile, tier, and level, as defined in the applicable video standard
3.1.41
rectangular region
rectangle that does not contain holes and does not overlap with any other rectangular region of the same
picture
3.1.42
reference layer
layer that is indicated as possibly needed for decoding of another layer
Note 1 to entry: For layered HEVC, reference layers c
...

FINAL DRAFT INTERNATIONAL STANDARD
ISO/IEC FDIS 14496-15
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2022-07-15
Voting terminates on: 2022-09-09
Information technology — Coding of audio-visual objects —
Part 15:
Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format
Technologies de l'information — Codage des objets audiovisuels —
Partie 15: Transport de vidéo structurée en unités NAL sur la couche réseau au format ISO de base pour les fichiers médias
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number ISO/IEC FDIS 14496-15:2022(E)
© ISO/IEC 2022

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents Page
Foreword . vi
Introduction . vii
1 Scope . 1
2 Normative references . 1
3 Terms, definitions, abbreviated terms and conventions . 1
3.1 Terms and definitions . 1
3.2 Abbreviated terms . 10
3.3 Conventions . 11
4 General definitions . 12
4.1 Overview . 12
4.2 Sample and configuration definition . 12
4.3 Video track structure . 14
4.4 Template fields used . 14
4.5 Visual width and height . 14
4.6 Decoding time (DTS) and composition time (CTS) . 15
4.7 Sample groups on random access recovery points 'roll' and random access points
'rap ' . 15
4.8 Hinting . 16
4.9 On change of sample entry (informative) . 16
4.10 SEI information box . 18
4.11 Post-decoder requirements scheme for signalling of SEI . 18
4.12 Alternative extraction source track grouping . 19
4.13 NAL unit map entry . 19
4.14 Rectangular region group entry . 21
4.15 Layer information sample group . 23
5 AVC elementary streams and sample definitions . 25
5.1 Overview . 25
5.2 Elementary stream structure . 25
5.3 Sample and configuration definition . 28
5.4 Derivation from ISO base media file format . 32
6 SVC elementary stream and sample definitions. 44
6.1 Overview . 44
6.2 Elementary stream structure . 44
6.3 Use of the plain AVC file format . 45
6.4 Sample and configuration definition . 45
6.5 Derivation from the ISO base media file format . 48
7 MVC and MVD elementary stream and sample definitions . 54
7.1 Overview . 54
7.2 Overview of MVC or MVD Storage . 55
7.3 MVC and MVD elementary stream structures . 57
7.4 Use of the plain AVC file format . 58
7.5 Sample and configuration definition . 59
7.6 Derivation from the ISO base media file format . 62
7.7 MVC specific information boxes. 77
8 HEVC elementary streams and sample definitions . 87
8.1 Overview . 87
8.2 Elementary stream structure . 87
8.3 Sample and configuration definition . 88
8.4 Derivation from ISO base media file format . 93
9 Layered HEVC elementary stream and sample definitions . 102
9.1 Overview . 102
9.2 Overview of L-HEVC storage . 103
9.3 L-HEVC elementary stream structure . 104
9.4 Sample and configuration definition . 104
9.5 Derivation from the ISO base media file format and the HEVC file format (Clause 8)
. 106
9.6 L-HEVC specific structures . 117
10 Storage of tiled HEVC and L-HEVC video streams . 123
10.1 Overview . 123
10.2 NAL unit map entry . 124
10.3 Tile region group entry . 124
10.4 Tile sub track definition . 124
10.5 HEVC and L-HEVC tile track . 125
10.6 HEVC slice segment data track . 130
11 VVC elementary streams and sample definitions. 131
11.1 Overview . 131
11.2 Sample and configuration definition . 138
11.3 Derivation from ISO base media file format . 147
11.4 Sample groups . 161
11.5 Entity groups . 181
11.6 Data sharing and VVC bitstream reconstruction . 189
12 EVC elementary streams and sample definitions . 200
12.1 Overview . 200
12.2 Elementary stream structure . 200
12.3 Sample and configuration definition . 201
12.4 Derivation from ISO base media file format . 204
Annex A (normative) In-stream structures . 211
Annex B (normative) SVC, MVC, and MVD sample group and sub-track definitions . 229
Annex C (normative) Temporal metadata support . 252
Annex D (normative) File format toolsets and brands . 261
Annex E (normative) Sub-parameters for the MIME type ‘codecs’ parameter. 265
Annex F (informative) Unspecified nal_unit_type value management for sample entry types of
AVC and HEVC . 274
Annex G (informative) Examples of VVC base and subpicture tracks. 276

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the editorial
rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights. Details
of any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC list of patent
declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the World
Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This sixth edition cancels and replaces the fifth edition (ISO/IEC 14496-15:2019), which has been
technically revised. It also incorporates the Amendment ISO/IEC 14496-15:2019/Amd 1:2020.
The main changes are as follows:
- Support for Versatile Video Coding (ISO/IEC 23090-3) and Essential Video Coding (ISO/IEC
23094-1)
- Addition of sample entry types 'hvc3', 'hev3', 'hvt2', and 'hvt3' targeted at tile-based
delivery and merging of High Efficiency Video Coding (ISO/IEC 23008-2) bitstreams
A list of all parts in the ISO/IEC 14496 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and www.iec.ch/national-
committees.
Introduction
This document defines a storage format based on, and compatible with, the ISO Base Media File Format
(ISO/IEC 14496-12), which is used by the MP4 file format (ISO/IEC 14496-14) and the Motion JPEG 2000
file format (ISO/IEC 15444-3) among others. This document enables video streams formatted as Network
Abstraction Layer Units (NAL Units) to
a) be used in conjunction with other media streams, such as audio,
b) be used in an MPEG-4 systems environment, if desired,
c) be formatted for delivery by a streaming server, using hint tracks, and
d) inherit all the use cases and features of the ISO Base Media File Format on which MP4 and MJ2 are
based.
This document may be used as a standalone document; it specifies how NAL unit structured video content
shall be stored in an ISO Base Media File Format compliant format. However, it is normally used in the
context of a specification, such as the MP4 file format, derived from the ISO Base Media File Format, that
permits the use of NAL unit structured video such as AVC (ISO/IEC 14496-10) video and High Efficiency
Video Coding (HEVC, ISO/IEC 23008-2) video.
The ISO Base Media File Format is becoming increasingly common as a general-purpose media container
format for the exchange of digital media, and its use in this context should accelerate both adoption and
interoperability.
The International Organization for Standardization (ISO) and International Electrotechnical Commission
(IEC) draw attention to the fact that it is claimed that compliance with this document may involve the use
of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that they are willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this
respect, the statement of the holder of this patent right is registered with ISO and IEC. Information may
be obtained from the patent database available at www.iso.org/patents or patents.iec.ch.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.
FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 14496-15:2022(E)

Information technology — Coding of audio-visual
objects —
Part 15:
Carriage of network abstraction layer (NAL) unit
structured video in the ISO base media file format
1 Scope
This document specifies the storage format for streams of video that is structured as NAL Units, such as
AVC (ISO/IEC 14496-10) and HEVC (ISO/IEC 23008-2) video streams. In addition, Annex E specifies
parameters and sub-parameters applying when sample entries specified in this document are used as the
'codecs' parameter of a MIME type, as specified in IETF RFC 6381.
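As an illustration of the 'codecs' parameter mentioned above, the following minimal sketch builds a MIME type string for an AVC track. It assumes the widely used RFC 6381 form for the 'avc1' sample entry, in which the sample entry name is followed by the profile indication, the constraint flags byte and the level indication written as six hexadecimal digits. Annex E remains the normative definition. The function names and the 'video/mp4' media type are illustrative choices, not names taken from this document.

```python
def avc_codecs_parameter(profile_idc: int, constraint_flags: int, level_idc: int,
                         sample_entry: str = "avc1") -> str:
    """Return a codecs string such as 'avc1.42E01E' (assumed RFC 6381 form)."""
    for value in (profile_idc, constraint_flags, level_idc):
        if not 0 <= value <= 0xFF:
            raise ValueError("each field must fit in one byte")
    # Each of the three bytes is written as two upper-case hexadecimal digits.
    return f"{sample_entry}.{profile_idc:02X}{constraint_flags:02X}{level_idc:02X}"


def mime_type_with_codecs(codecs: str, media_type: str = "video/mp4") -> str:
    """Attach the codecs parameter to a media type (illustrative helper)."""
    return f'{media_type}; codecs="{codecs}"'


if __name__ == "__main__":
    # Profile 66 (0x42), constraint byte 0xE0, level 30 (0x1E).
    print(mime_type_with_codecs(avc_codecs_parameter(66, 0xE0, 30)))
```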
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12:2020, Information technology — Coding of audio-visual objects — Part 12: ISO base
media file format
ISO/IEC 14496-10:2020, Information technology — Coding of audio-visual objects — Part 10: Advanced
Video Coding
ISO/IEC 23008-2:2020, Information technology — High efficiency coding and media delivery in
heterogeneous environments — Part 2: High efficiency video coding
ISO/IEC 23090-3:2021, Information technology — Coded representation of immersive media — Part 3:
Versatile video coding
ISO/IEC 23094-1:2020, Information technology — General video coding — Part 1: Essential video coding
IETF RFC 4648, The Base16, Base32, and Base64 data encodings
IETF RFC 6381, MIME codecs and profiles
3 Terms, definitions, abbreviated terms and conventions
3.1 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-10,
ISO/IEC 23008-2, ISO/IEC 23090-3 or ISO/IEC 23094-1, and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1.1
3D-AVC NAL unit
3D AVC VCL NAL unit
NAL unit with type 21 with avc_3d_extension_flag equal to 1
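The test in this definition can be sketched as follows. The sketch assumes that the input is a single NAL unit without start code or length prefix, that nal_unit_type occupies the low five bits of the first byte, and that avc_3d_extension_flag is the first bit following that byte, as in the NAL unit header syntax of ISO/IEC 14496-10. The function name is hypothetical.

```python
def is_3d_avc_nal_unit(nal: bytes) -> bool:
    """Return True for a type 21 NAL unit whose avc_3d_extension_flag is 1.

    Assumption: 'nal' holds one NAL unit with no start code or length field.
    """
    if len(nal) < 2:
        return False
    nal_unit_type = nal[0] & 0x1F          # low 5 bits of the first header byte
    avc_3d_extension_flag = nal[1] >> 7    # assumed: most significant bit of byte 2
    return nal_unit_type == 21 and avc_3d_extension_flag == 1
```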
3.1.2
aggregator
in-stream structure using a NAL unit header for grouping of NAL units belonging to the same sample
3.1.3
alternate region set
set of rectangular regions that are alternatives to be used as a rectangular region when reconstructing a
VVC bitstream from a VVC extraction base track
3.1.4
applicable video coding standard
video coding standard for the data carried in the track
Note 1 to entry: The video coding standard can be ISO/IEC 14496-10, ISO/IEC 23008-2, ISO/IEC 23090-3, or
ISO/IEC 23094-1.
3.1.5
AU- or picture-level non-VCL NAL unit
non-VCL NAL unit that applies to one or more entire AUs or one or more entire pictures
Note 1 to entry: An AU-level non-VCL NAL unit applies to one or more entire AUs. A picture-level non-VCL NAL unit
applies to one or more entire pictures. In VVC, AU-level or picture-level non-VCL NAL units include: 1) all the DCI, OPI, VPS,
SPS, PPS, AUD, PH, EOS, and EOB NAL units; 2) APS NAL units that apply to one or more entire AUs or pictures; and 3) SEI
NAL units that only contain SEI messages that apply to one or more entire AUs or pictures.
3.1.6
AVC base layer
maximum subset of a bitstream that is AVC compatible
Note 1 to entry: The AVC base layer is represented by AVC VCL NAL units and associated non-VCL NAL units. The AVC
base layer is not using any of the functionality of ISO/IEC 14496-10:2020, Annex G, Annex H, Annex I, or Annex J.
Note 2 to entry: The AVC base layer itself can be a temporal scalable bitstream.
3.1.7
AVC parameter set sample
sample in a parameter set elementary stream that consists of those parameter set NAL units that are to
be considered as if present in the video elementary stream at the same instant in time
3.1.8
AVC sample
access unit as defined in ISO/IEC 14496-10
3.1.9
AVC NAL unit
AVC VCL NAL unit and its associated non-VCL NAL units in a bitstream
3.1.10
AVC VCL NAL unit
NAL unit with type 1 to 5 (inclusive)
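A minimal sketch of this classification is given below. It assumes the one-byte AVC NAL unit header of ISO/IEC 14496-10, in which nal_unit_type is carried in the low five bits, and an input without start code or length prefix. The function names are illustrative.

```python
def avc_nal_unit_type(nal: bytes) -> int:
    """Extract nal_unit_type from the first byte of an AVC NAL unit."""
    if not nal:
        raise ValueError("empty NAL unit")
    return nal[0] & 0x1F


def is_avc_vcl_nal_unit(nal: bytes) -> bool:
    """True when nal_unit_type is in the range 1 to 5 (inclusive)."""
    return 1 <= avc_nal_unit_type(nal) <= 5
```

For example, a NAL unit starting with byte 0x65 has type 5 and is an AVC VCL NAL unit, whereas one starting with 0x67 has type 7 and is not.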
3.1.11
canonical order
order of NAL units that conforms to the applicable video standard
Note 1 to entry: When a single track carries a video bitstream, the NAL units are stored in the canonical order. When
multiple tracks are used to carry a video bitstream, an implicit or explicit video bitstream reconstruction process might
be applied to recover the canonical order.
3.1.12
canonical stream format
elementary stream that contains NAL units in the canonical order and conforms to the constraints
specified in this document for carrying an elementary stream of the applicable video standard in one or
more tracks
3.1.13
complete subset
minimal set of tracks that contain all the information in the original bitstream
3.1.14
cropped frame dimensions
width and height of the decoded frame after applying the output cropping parameters
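The calculation can be sketched as below. This is a simplification: it assumes the cropping offsets have already been converted to luma samples, whereas the applicable video coding standard signals them in units that depend on the chroma format and frame or field coding. The function and parameter names are hypothetical.

```python
def cropped_frame_dimensions(decoded_width: int, decoded_height: int,
                             crop_left: int = 0, crop_right: int = 0,
                             crop_top: int = 0, crop_bottom: int = 0):
    """Return (width, height) after cropping; all offsets in luma samples."""
    width = decoded_width - crop_left - crop_right
    height = decoded_height - crop_top - crop_bottom
    if width <= 0 or height <= 0:
        raise ValueError("cropping removes the whole frame")
    return width, height
```

For example, a decoded frame of 1920 by 1088 luma samples with a bottom offset of 8 samples yields cropped frame dimensions of 1920 by 1080.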
3.1.15
default sample group description index
default_group_description_index of SampleGroupDescriptionBox with version
greater than or equal to 2
3.1.16
elementary stream
sequence of one or more bitstreams of the applicable video standard
Note 1 to entry: The term elementary stream is not directly related to the terms video elementary stream, parameter
set elementary stream, and video and parameter set elementary stream.
Note 2 to entry: The applicable video standard can be included as a prefix to the term elementary stream. For example,
an AVC elementary stream refers to an elementary stream that is a sequence of one or more bitstreams conforming to
ISO/IEC 14496-10.
3.1.17
extractor
in-stream structure using a NAL unit header for extraction of data from other tracks
Note 1 to entry: Extractors contain instructions on how to extract data from other tracks. Logically an Extractor can be
seen as a pointer to data. While reading a track containing Extractors, the Extractor is replaced by the data it is pointing to.
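The pointer analogy in the note can be illustrated with a toy model. The sketch below is not the extractor syntax of this document: the Extractor class, its field names and the in-memory track layout are assumptions made for illustration only. It shows the one idea stated in the note, that a reader replaces each extractor with the bytes it points to.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Extractor:
    """Hypothetical pointer to a byte range in a sample of another track."""
    track_id: int
    sample_index: int
    data_offset: int
    data_length: int


SampleItem = Union[bytes, Extractor]


def resolve_sample(items: List[SampleItem],
                   tracks: Dict[int, List[bytes]]) -> bytes:
    """Replace every extractor in a sample with the data it references."""
    out = bytearray()
    for item in items:
        if isinstance(item, Extractor):
            source = tracks[item.track_id][item.sample_index]
            end = item.data_offset + item.data_length
            if end > len(source):
                raise ValueError("extractor points past the end of the sample")
            out += source[item.data_offset:end]
        else:
            out += item  # ordinary in-track data is copied unchanged
    return bytes(out)
```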
3.1.18
HEVC sample
access unit as defined in ISO/IEC 23008-2
3.1.19
implicit reconstruction
reconstruction of a stream of access units from two or more tracks not using extractors
3.1.20
in-stream structure
structure residing within sample data
3.1.21
layer
scalable layer
set of VCL NAL units with the same values of dependency_id, quality_id, and
temporal_id, and the associated non-VCL NAL units
Note 1 to entry: A scalable layer with any of dependency_id, quality_id, and temporal_id not equal to 0 enhances the
video by one or more scalability levels in at least one direction (temporal, quality or spatial resolution).
Note 2 to entry: SVC uses a “layered” encoder design that results in a bitstream representing “coding layers”. In some
publications the ‘base layer’ is the first quality layer of a specific coding layer. In some publications the base layer is the
scalable layer with the lowest priority. The SVC file format uses “scalable layer” or “layer” in a general way for describing
nested bitstreams (using terms like AVC base layer or SVC enhancement layer).
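A short sketch of this grouping follows. It assumes the three identifiers have already been parsed from each VCL NAL unit. The record type and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class SvcVclNalUnit(NamedTuple):
    """Hypothetical record for a VCL NAL unit with its parsed identifiers."""
    dependency_id: int
    quality_id: int
    temporal_id: int
    payload: bytes


LayerKey = Tuple[int, int, int]


def group_by_scalable_layer(
        nal_units: Iterable[SvcVclNalUnit]) -> Dict[LayerKey, List[SvcVclNalUnit]]:
    """Group VCL NAL units sharing (dependency_id, quality_id, temporal_id)."""
    layers: Dict[LayerKey, List[SvcVclNalUnit]] = defaultdict(list)
    for nal in nal_units:
        layers[(nal.dependency_id, nal.quality_id, nal.temporal_id)].append(nal)
    return dict(layers)


def enhances_video(key: LayerKey) -> bool:
    """Per Note 1: any identifier not equal to 0 marks an enhancing layer."""
    return any(value != 0 for value in key)
```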
3.1.22
layer
scalable layer
set of VCL NAL units with the same value of nuh_layer_id and the associated non-VCL
NAL units
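For this definition, nuh_layer_id can be read from the NAL unit header. The sketch below assumes the two-byte HEVC NAL unit header of ISO/IEC 23008-2 (1 bit forbidden_zero_bit, 6 bits nal_unit_type, 6 bits nuh_layer_id, 3 bits nuh_temporal_id_plus1) and an input without start code or length prefix. Other coding standards that also define nuh_layer_id use a different header layout. The function name is illustrative.

```python
from typing import Tuple


def parse_hevc_nal_header(nal: bytes) -> Tuple[int, int, int]:
    """Return (nal_unit_type, nuh_layer_id, temporal_id) from an HEVC NAL unit."""
    if len(nal) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    nal_unit_type = (nal[0] >> 1) & 0x3F
    # nuh_layer_id straddles the two bytes: 1 bit from byte 0, 5 bits from byte 1.
    nuh_layer_id = ((nal[0] & 0x01) << 5) | (nal[1] >> 3)
    temporal_id = (nal[1] & 0x07) - 1      # nuh_temporal_id_plus1 minus 1
    return nal_unit_type, nuh_layer_id, temporal_id
```

VCL NAL units for which this function returns the same nuh_layer_id belong to the same layer in the sense of this entry.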
3.1.23
layer set
set of layers represented within a bitstream created from another bitstream by operation of the
sub-bitstream extraction process
3.1.24
L-HEVC sample
picture units that are within an access unit as specified in Annex F of ISO/IEC 23008-2:2020 and are
represented by the track
3.1.25
MVC NAL unit
MVC VCL NAL unit and its associated non-VCL NAL units in an MVC stream
Note 1 to entry: The association of non-VCL NAL units with MVC VCL NAL units is specified in ISO/IEC 14496-10:2020,
Annex H.
3.1.26
MVC sample
one or more view components as defined in Annex H of ISO/IEC 14496-10:2020 and the associated
non-VCL NAL units
3.1.27
MVC VCL NAL unit
NAL unit with type 20, and NAL units with type 14 when the immediately following NAL units are AVC
VCL NAL units
Note 1 to entry: MVC VCL NAL units do not affect the decoding process of a legacy AVC decoder.
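The rule in this entry depends on context, which the following sketch makes explicit. It takes the sequence of nal_unit_type values within a sample and interprets "immediately following" as the next NAL unit in that sequence, which is an assumption of this sketch. The function names are hypothetical.

```python
from typing import List


def _is_avc_vcl_type(nal_unit_type: int) -> bool:
    """AVC VCL NAL units have type 1 to 5 (inclusive)."""
    return 1 <= nal_unit_type <= 5


def mvc_vcl_flags(nal_unit_types: List[int]) -> List[bool]:
    """Flag each NAL unit that counts as an MVC VCL NAL unit.

    Type 20 always qualifies. Type 14 qualifies only when the next NAL unit
    in the sequence is an AVC VCL NAL unit (assumed interpretation).
    """
    flags = []
    for index, nal_unit_type in enumerate(nal_unit_types):
        if nal_unit_type == 20:
            flags.append(True)
        elif nal_unit_type == 14:
            has_next = index + 1 < len(nal_unit_types)
            flags.append(has_next and _is_avc_vcl_type(nal_unit_types[index + 1]))
        else:
            flags.append(False)
    return flags
```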
3.1.28
MVC+D depth NAL unit
MVC+D depth VCL NAL unit
NAL unit with type 21 containing a coded slice extension for a depth view component
3.1.29
MVD NAL unit
MVD VCL NAL unit
NAL unit with type 21, containing a coded slice extension for a depth view component coded with MVC+D
or 3D
...
