Information technology — MPEG systems technologies — Part 16: Derived visual tracks in the ISO base media file format

This document defines a storage format for derived visual tracks and an initial base set of related transformation operations. The format defined in this document enables the interchange, editing, and display of timed sequences of images that result from transformation operations applied to input still images or samples of timed sequences of images in the same presentation. This format defines normative structures used to contain the description of transformation operations, how to link those transformation operations to their inputs, and how to process those transformation operations to obtain a timed sequence of video frames.

Technologies de l'information — Technologies des systèmes MPEG — Partie 16: Pistes visuelles dérivées au format ISO de base pour les fichiers médias

General Information

Status: Published
Publication Date: 18-Nov-2021
Current Stage: 6060 - International Standard published
Start Date: 19-Nov-2021
Due Date: 18-Jul-2022
Completion Date: 19-Nov-2021

Standard: ISO/IEC 23001-16:2021 - Information technology -- MPEG systems technologies (English language, 18 pages)
Draft: ISO/IEC FDIS 23001-16:Version 28-Aug-2021 - Information technology -- MPEG systems technologies (English language, 18 pages)

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 23001-16
First edition
2021-11
Information technology — MPEG
systems technologies —
Part 16:
Derived visual tracks in the ISO base
media file format
Technologies de l'information — Technologies des systèmes MPEG —
Partie 16: Pistes visuelles dérivées au format ISO de base pour les
fichiers médias
Reference number
ISO/IEC 23001-16:2021(E)
© ISO/IEC 2021

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2021
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Derived visual tracks, design principles
5 Derivation operation
5.1 Definition
5.2 Syntax
5.3 Semantics
6 Sample entry and configuration definition
6.1 Sample entry definition
6.2 Derived visual track configuration record
6.2.1 Definition
6.2.2 Syntax
6.2.3 Semantics
7 Sample format
7.1 General
7.2 Syntax
8 Derivation transformations
8.1 Overview
8.2 Identity
8.2.1 Definition
8.2.2 Syntax
8.3 sRGB Fill
8.3.1 Definition
8.3.2 Syntax
8.3.3 Semantics
8.4 Dissolve
8.4.1 Definition
8.4.2 Syntax
8.4.3 Semantics
8.5 Crop
8.5.1 Definition
8.5.2 Syntax
8.5.3 Semantics
8.6 Rotation
8.6.1 Definition
8.6.2 Syntax
8.6.3 Semantics
8.7 Mirror
8.7.1 Definition
8.7.2 Syntax
8.7.3 Semantics
8.8 Scaling
8.8.1 Definition
8.8.2 Syntax
8.8.3 Semantics
8.9 Region of interest (ROI) selection
8.9.1 Definition
8.9.2 Syntax
8.10 Grid composition
8.10.1 Definition
8.10.2 Syntax
8.10.3 Semantics
8.11 Overlay composition
8.11.1 Definition
8.11.2 Syntax
8.11.3 Semantics
Annex A (informative) Examples of derivation operations usage
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance
are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria
needed for the different types of document should be noted. This document was drafted in
accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 23001 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

Introduction
Derived visual tracks make it possible to define a timed sequence of visual transformation
operations to be applied to input still images and/or samples of timed sequences of images in the same
presentation. They are built using tools defined in the ISO base media file format (ISO/IEC 14496-12). This
document specifies the core design and an initial base set of transformation operations.
The International Organization for Standardization (ISO) and International Electrotechnical
Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may
involve the use of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that he/she is willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this
respect, the statement of the holder of this patent right is registered with ISO and IEC. Information may
be obtained from the patent database available at www.iso.org/patents.
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.

INTERNATIONAL STANDARD ISO/IEC 23001-16:2021(E)
Information technology — MPEG systems technologies —
Part 16:
Derived visual tracks in the ISO base media file format
1 Scope
This document defines a storage format for derived visual tracks and an initial base set of related
transformation operations. The format defined in this document enables the interchange, editing, and
display of timed sequences of images that result from transformation operations applied to input still
images or samples of timed sequences of images in the same presentation.
This format defines normative structures used to contain the description of transformation
operations, how to link those transformation operations to their inputs, and how to process those
transformation operations to obtain a timed sequence of video frames.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12, Information technology — Coding of audio-visual objects — Part 12: ISO Base Media
file format
ISO/IEC 23001-10, Information technology — MPEG systems technologies — Part 10: Carriage of timed
metadata metrics of media in ISO base media file format
ISO/IEC 23008-12, Information technology — High efficiency coding and media delivery in heterogeneous
environments — Part 12: Image File Format
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-12 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
derivation operation
container box representing an operation applying a derivation transformation (3.2) on an ordered list of
inputs (3.5)
3.2
derivation transformation
visual transformation operation identified by a 32-bit value and a set of parameters that transforms
inputs (3.5) into visual outputs (3.8)
Note 1 to entry: The 32-bit value is also known as a four-character code in ISO/IEC 14496-12.
3.3
derived sample
sample containing an ordered list of derivation operations (3.1)
3.4
derived visual track
video or picture track that contains a timed sequence of derived samples (3.3)
3.5
input
parameter input (3.6) or visual input (3.7)
3.6
parameter input
metadata from an input item or track that is used as input for a derivation transformation (3.2) of a
derivation operation (3.1)
Note 1 to entry: The parameter input is either an input metadata item from file-level MetaBox or an interval of an
input metadata track (possibly spanning multiple samples).
3.7
visual input
video or still image that is used as input for a derivation transformation (3.2) of a derivation operation
(3.1)
Note 1 to entry: The visual input is either an input image item from file-level MetaBox, an interval of an input
track (possibly spanning multiple samples), the visual output of a preceding derivation operation (3.1) or the
default input fill picture signalled in the configuration record of the derived visual track (3.4).
3.8
visual output
one video frame or a sequence of video frames that is output from a derivation transformation (3.2) of a
derivation operation (3.1)
4 Derived visual tracks, design principles
A derived visual track describes a timed sequence of derived samples composed of an ordered list of
derivation operations, each derivation operation applying a derivation transformation for the duration
of the derived sample on an ordered list of inputs represented in the same presentation.
A derived visual track shall be either a video track (with the 'vide' handler type in the HandlerBox
of the MediaBox as defined in ISO/IEC 14496-12) or a picture track (with the 'pict' handler type in
the HandlerBox of the MediaBox as defined in ISO/IEC 23008-12). A derived visual track is identified by
its containing sample entry of type 'dtrk' DerivedVisualSampleEntry. Each sample described by a
DerivedVisualSampleEntry is a derived sample.
A derived visual track shall include a TrackReferenceTypeBox with reference_type equal to 'dtrk'
referring to all the inputs. Each reference shall be one of:
a) the track_ID of a track used by derived samples in the track, or, if unified IDs are in use as defined
by ISO/IEC 14496-12, a track_group_id;
b) the item_ID of an image item, in the file-level MetaBox, used by derived samples in the track.
An ID value in the track references is resolved to a track_ID whenever the file contains a track with
such ID, is resolved to a track_group_id whenever unified IDs are in use and the file contains a track
group with such ID, and is resolved to an item_ID otherwise.
NOTE 1 A track_ID can be an ID of a derived visual track.
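The resolution order described above can be sketched in a few lines of Python. This is an informal illustration, not part of the specification: the function name `resolve_reference` and the set-based model of the file (`track_ids`, `track_group_ids`, `unified_ids`) are assumptions made for the example.

```python
from typing import Set, Tuple


def resolve_reference(
    ref_id: int,
    track_ids: Set[int],
    track_group_ids: Set[int],
    unified_ids: bool,
) -> Tuple[str, int]:
    """Classify one ID found in the 'dtrk' track reference.

    Hypothetical helper: the sets stand in for whatever a real parser
    would extract from the file.
    """
    # A track with this ID takes precedence.
    if ref_id in track_ids:
        return ("track_ID", ref_id)
    # Track groups are only considered when unified IDs are in use.
    if unified_ids and ref_id in track_group_ids:
        return ("track_group_id", ref_id)
    # Anything else is treated as an item in the file-level MetaBox.
    return ("item_ID", ref_id)
```

Note that an ID matching a track group is resolved as an item_ID when unified IDs are not in use, because the track group branch is then never taken.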
If a referenced track is a member of an alternate group or switch group, or if the reference is to a track
group, then the reader should pick a track from the group as the input to the derived visual track.
NOTE 2 The TrackSelectionBox can be used to provide guidance on the selection between members of an
alternate group or switch group.
Similarly, if a referenced image item is a member of an alternate group (which may contain both tracks
and images), then the reader should pick one member of the group as the input to the derived visual
track.
A derived sample contains an ordered list of the derivation operations to be performed, each derivation
operation applying a derivation transformation on an ordered list of inputs. The layer syntax element
in TrackHeaderBox has no impact on ordering the inputs for derived samples.
The four-character codes of the derivation transformations of all derivation operations used by the
derived samples in the track are listed in the DerivedVisualSampleEntry; default inputs and
parameter values can also be supplied there. A derived sample in the track may use all or some of the
derivation operations listed in the linked DerivedVisualSampleEntry, but derived samples shall not use
a derivation operation that is not listed in the sample entry.
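A reader could check this constraint with something like the following Python sketch. The function name and the use of plain strings for four-character codes are assumptions, and the codes in the usage note are placeholders rather than codes defined by this document.

```python
from typing import Iterable, List


def unlisted_transformations(
    sample_codes: Iterable[str], entry_codes: Iterable[str]
) -> List[str]:
    """Return the codes used by a derived sample that its sample entry does not list."""
    allowed = set(entry_codes)
    return [code for code in sample_codes if code not in allowed]
```

With the placeholder codes 'aaaa' and 'bbbb' listed in the sample entry, a derived sample using only 'aaaa' yields an empty list, while a derived sample using 'cccc' is reported as non-conforming.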
The derived sample durations document the time over which the derivation represented by the ordered
list of derivation operations is active. Therefore, the number of samples defined in a derived visual track
does not necessarily match 1:1 with the number of input image items or samples of input tracks that
are being transformed. A single derivation duration may span multiple samples in the source track(s),
and also derivation transformations in derived samples may have 'internal time structure' (e.g. a cross-
fade) so the picture may change during the sample duration. This is in contrast to 'classic video'.
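To illustrate why derived samples and input samples do not map 1:1, the following Python sketch lists the input samples that a single derived sample spans. It is a simplification under stated assumptions: each input sample is modelled as a (composition_time, duration) pair, all times share one timescale, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def overlapping_input_samples(
    derived_start: int,
    derived_duration: int,
    input_samples: Sequence[Tuple[int, int]],
) -> List[int]:
    """Indices of input samples whose interval intersects the derived sample.

    Each input sample is a (composition_time, duration) pair; all times are
    assumed to be expressed in one common timescale.
    """
    derived_end = derived_start + derived_duration
    hits = []
    for index, (start, duration) in enumerate(input_samples):
        # Half-open intervals overlap when each starts before the other ends.
        if start < derived_end and derived_start < start + duration:
            hits.append(index)
    return hits
```

For four input samples of duration 10 starting at time 0, a derived sample starting at 5 with duration 20 spans the first three input samples.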
Derived visual tracks do not respect edit lists on inputs. They operate on the composition timeline (i.e.
before the application of edit lists) of their input tracks (including on derived visual tracks when used
as visual inputs). However, the input tracks shall not have edit lists. Any edit lists of the input tracks
shall be ignored if present.
NOTE 3 When time-alignment adjustment between input tracks is needed, signed composition offsets in input
tracks can be used.
NOTE 4 A derived visual track can have an edit list; thus, a derived visual track using the identity transform,
and with an edit-list, can provide a visual output that is a temporal re-mapping of the input track.
The inputs for a derivation operation in a derived sample can be either input image items from file-level
MetaBox or intervals (possibly spanning multiple samples) of input video tracks, image sequence tracks,
metadata items or metadata tracks, the visual output of a preceding derivation operation or a default
input fill picture.
Transformative item properties or transformations (e.g. clean aperture or track matrix) associated
with input image items or samples of input tracks are always applied before performing the derivation
operation.
NOTE 5 If a derived sample needs to refer to one explicit sample value in a referred track (other than the time-
aligned sample value), an item can be created and referred to that has the same data as the desired sample value.
The visual inputs in a derived sample shall have consistent pixel aspect ratio and bit depth. The input
image items, samples of input tracks or derived samples may have different widths and heights. When
differences in width and height leave pixels that are never ‘painted’ by a derivation operation, those
empty pixels are filled according to the value of the default_derivation_input parameter signalled in
DerivedVisualTrackConfigRecord (black, white or grey pixels). When differences in width and height
cause a derivation operation to place pixels outside the visual output size, those pixels are
cropped. This default behaviour may be overridden by derivation operation specifications.
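The default fill and crop behaviour can be pictured with the following Python sketch. It is illustrative only: a single integer stands in for a pixel, the function name is hypothetical, and anchoring the input at the top-left corner of the output is an assumption of the sketch, since placement depends on the derivation transformation in use.

```python
from typing import List

Pixel = int  # a single grey level stands in for a full pixel in this sketch


def fit_to_output(
    picture: List[List[Pixel]], out_width: int, out_height: int, fill: Pixel
) -> List[List[Pixel]]:
    """Copy `picture` onto an out_width x out_height canvas.

    Pixels never painted by the input keep the fill value; input pixels
    that fall outside the canvas are dropped (cropped).
    """
    canvas = [[fill] * out_width for _ in range(out_height)]
    for y, row in enumerate(picture[:out_height]):
        for x, value in enumerate(row[:out_width]):
            canvas[y][x] = value
    return canvas
```

A 2x2 input placed on a 3x3 output leaves the last row and column at the fill value, while the same input placed on a 1x2 output loses its second column.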
A derived sample is reconstructed by performing the specified derivation operations in sequence. Some
derivation operations can be marked as non-essential which indicates that the derivation operation
may be skipped by the reader. However, the operations marked as essential shall be used in order to
obtain a valid derived sample.
When more than one derivation operation is listed in a derived sample, a derivation operation that is
not first in the list may take as input the result (e.g. the visual output) of any of the preceding derivation
operations, only new inputs, or a combination of both.
In many cases the source tracks pointed to by the 'dtrk' track reference are not intended for display.
When a track is not intended for display, track_in_movie shall be equal to 0 for that track.
The visual output of a derived sample is the output from the last derivation operation in the sample.
If there is no derivation operation, an empty derived sample (i.e. sample size of 0) is equivalent to an
empty edit, i.e., there is no visual output from the derived visual track at that time.
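The reconstruction rules above (operations applied in sequence, non-essential operations optionally skipped, output taken from the last operation, no output for an empty derived sample) can be sketched in Python as follows. This is a deliberately reduced model: an integer stands in for a picture, each operation only sees the previous output rather than an ordered list of inputs, and the `Operation` class, its `essential` field and the `supported` set are hypothetical names, not the signalling defined by this document.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Operation:
    code: str        # four-character code of the transformation (placeholder values)
    essential: bool  # hypothetical flag standing in for the real signalling
    apply: Callable[[Optional[int]], int]  # maps the previous output to a new output


def reconstruct(operations: List[Operation], supported: Set[str]) -> Optional[int]:
    """Run the operations in order and return the last visual output.

    Returns None for an empty derived sample (no visual output).
    Raises ValueError when an essential operation is not supported.
    """
    output: Optional[int] = None
    for op in operations:
        if op.code not in supported:
            if op.essential:
                raise ValueError(f"essential operation {op.code!r} is not supported")
            continue  # non-essential operations may be skipped
        output = op.apply(output)
    return output
```

In this model a reader that does not support a non-essential operation still produces a valid result, whereas an unsupported essential operation makes the derived sample unusable.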
Using derived visual tracks, it is possible to build either a chain of derivation operations on one single
derived visual track or a hierarchy of multiple derived visual tracks when they are used as a visual
input to another derived visual track. The latter should only be used when each derived visual track in
the hierarchy is also needed on its own.
5 Derivation operation
5.1 Definition
Box Type: 'dimg'
Container: derived sample or DerivedVisualTrackConfigRecord in a DerivedVisualSampleEntry
Mandatory: Yes in a DerivedVisualTrackConfigRecord; No in a derived sample
Quantity: At least one in a DerivedVisualTrackConfigRecord; zero or more in a derived sample
A derivation operation in either a derived sample entry or derived sample is represented by a
container box of type 'dimg' that always carries a derivation transformation box inherited from
VisualDerivationBase, and can carry a VisualDerivationInputs providing the inputs for the derivation
transformation.
A derivation transformation in a derivation operation is identified by a 32-bit value, also known as a
four-character code in ISO/IEC 14496-12, unless that code is 'uuid', whereupon a UUID identifies a
vendor-specific derivation transformation.
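As an informal illustration of how a four-character code corresponds to a 32-bit value, the Python sketch below packs the four ASCII characters in big-endian order, which is the byte order used by the ISO base media file format. The function names and the `extended_type` parameter are assumptions made for the example.

```python
import uuid
from typing import Optional, Union


def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into its 32-bit big-endian integer value."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a four-character code must be exactly 4 ASCII characters")
    return int.from_bytes(raw, "big")


def transformation_identity(
    code: str, extended_type: Optional[uuid.UUID] = None
) -> Union[int, uuid.UUID]:
    """Return what identifies the transformation: the 32-bit code, or the UUID for 'uuid'."""
    if code == "uuid":
        if extended_type is None:
            raise ValueError("'uuid' requires a UUID")
        return extended_type
    return fourcc_to_int(code)
```

For example, the box type 'dimg' of the derivation operation container corresponds to the value 0x64696D67.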
A derivation transformation's parameters shall
a) be single, countable
b) have defined default values in the specification
For both inputs and parameters, there is a bit-m
...

FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 23001-16
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2021-09-03
Voting terminates on: 2021-10-29
Information technology — MPEG systems technologies —
Part 16: Derived visual tracks in the ISO base media file format
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number: ISO/IEC FDIS 23001-16:2021(E)
© ISO/IEC 2021

---------------------- Page: 1 ----------------------
ISO/IEC FDIS 23001-16:2021(E)

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2021
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO/IEC 2021 – All rights reserved

---------------------- Page: 2 ----------------------
ISO/IEC FDIS 23001-16:2021(E)

Contents Page
Foreword .v
Introduction .vi
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
4 Derived visual tracks, design principles . 2
5 Derivation operation . 4
5.1 Definition . 4
5.2 Syntax . 5
5.3 Semantics . 6
6 Sample entry and configuration definition . 6
6.1 Sample entry definition . 6
6.2 Derived visual track configuration record . 7
6.2.1 Definition . 7
6.2.2 Syntax . 7
6.2.3 Semantics . 7
7 Sample format . 8
7.1 General . 8
7.2 Syntax . 8
8 Derivation transformations . 8
8.1 Overview . 8
8.2 Identity. 9
8.2.1 Definition . 9
8.2.2 Syntax . 9
8.3 sRGB Fill . 9
8.3.1 Definition . 9
8.3.2 Syntax . 9
8.3.3 Semantics .10
8.4 Dissolve.10
8.4.1 Definition .10
8.4.2 Syntax .10
8.4.3 Semantics .10
8.5 Crop .11
8.5.1 Definition .11
8.5.2 Syntax .11
8.5.3 Semantics .11
8.6 Rotation .11
8.6.1 Definition .11
8.6.2 Syntax .11
8.6.3 Semantics .11
8.7 Mirror .12
8.7.1 Definition .12
8.7.2 Syntax .12
8.7.3 Semantics .12
8.8 Scaling .12
8.8.1 Definition .12
8.8.2 Syntax .12
8.8.3 Semantics .12
8.9 Region of interest (ROI) selection .12
8.9.1 Definition .12
8.9.2 Syntax .13
© ISO/IEC 2021 – All rights reserved iii

---------------------- Page: 3 ----------------------
ISO/IEC FDIS 23001-16:2021(E)

8.10 Grid composition .13
8.10.1 Definition .13
8.10.2 Syntax .13
8.10.3 Semantics .13
8.11 Overlay composition .14
8.11.1 Definition .14
8.11.2 Syntax .14
8.11.3 Semantics .14
Annex A (informative) Examples of derivation operations usage .15
Bibliography .18
iv © ISO/IEC 2021 – All rights reserved

---------------------- Page: 4 ----------------------
ISO/IEC FDIS 23001-16:2021(E)

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www .iso .org/ directives or www .iec .ch/ members
_experts/ refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www .iso .org/ patents) or the IEC
list of patent declarations received (see patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see www .iso .org/
iso/ foreword .html. In the IEC, see www .iec .ch/ understanding -standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 23001 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www .iso .org/ members .html and www .iec .ch/ national
-committees.
© ISO/IEC 2021 – All rights reserved v

---------------------- Page: 5 ----------------------
ISO/IEC FDIS 23001-16:2021(E)

Introduction
Derived visual tracks are designed to enable defining a timed sequence of visual transformation
operations to be applied to input still images and/or samples of timed sequences of images in the same
presentation. It is built using tools defined in the ISO base media file format (ISO/IEC 14496-12). This
document specifies the core design and an initial base set of transformation operations.
The International Organization for Standardization (ISO) and International Electrotechnical
Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may
involve the use of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that he/she is willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this
respect, the statement of the holder of this patent right is registered with ISO and IEC. Information may
be obtained from the patent database available at www .iso .org/ patents.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights other than those in the patent database. ISO [and/or] IEC shall not be held responsible for
identifying any or all such patent rights.
vi © ISO/IEC 2021 – All rights reserved

---------------------- Page: 6 ----------------------
FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 23001-16:2021(E)
Information technology — MPEG systems technologies —
Part 16:
Derived visual tracks in the ISO base media file format
1 Scope
This document defines a storage format for derived visual tracks and an initial base set of related
transformation operations. The format defined in this document enables the interchange, editing, and
display of timed sequences of images that result from transformation operations applied to input still
images or samples of timed sequences of images in the same presentation.
This format defines normative structures used to contain the description of transformation
operations, how to link that transformation operations to the inputs, and defines how to process those
transformation operations to obtain a timed sequence of video frames.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12, Information technology — Coding of audio-visual objects — Part 12: ISO Base Media file
format
ISO/IEC 23001-10, Information technology — MPEG systems technologies — Part 10: Carriage of timed
metadata metrics of media in ISO base media file format
ISO/IEC 23008-12, Information technology — High efficiency coding and media delivery in heterogeneous
environments — Part 12: Image File Format
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 14496-12 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https:// www .iso .org/ obp
— IEC Electropedia: available at http:// www .electropedia .org/
3.1
derivation operation
container box representing an operation applying a derivation transformation (3.2) on an ordered list of
inputs (3.5)
3.2
derivation transformation
visual transformation operation identified by a 32-bit value and a set of parameters that transforms
inputs (3.5) into visual outputs (3.8)
Note 1 to entry: The 32-bit value is also known as a four-character code in ISO/IEC 14496-12.
© ISO/IEC 2021 – All rights reserved 1

---------------------- Page: 7 ----------------------
ISO/IEC FDIS 23001-16:2021(E)

3.3
derived sample
sample containing an ordered list of derivation operations (3.1)
3.4
derived visual track
video or picture track that contains a timed sequence of derived samples (3.3)
3.5
input
parameter input (3.6) or visual input (3.7)
3.6
parameter input
metadata from an input item or track that is used as input for a derivation transformation (3.2) of a
derivation operation (3.1)
Note 1 to entry: The parameter input is either an input metadata item from file-level MetaBox or an interval of an
input metadata track (possibly spanning multiple samples).
3.7
visual input
video or still image that is used as input for a derivation transformation (3.2) of a derivation operation
(3.1)
Note 1 to entry: The visual input is either an input image item from file-level MetaBox, an interval of an input
track (possibly spanning multiple samples), the visual output of a preceding derivation operation (3.1) or the
default input fill picture signalled in the configuration record of the derived visual track (3.4).
3.8
visual output
one video frame or a sequence of video frames that is output from a derivation transformation (3.2) of a
derivation operation (3.1)
4 Derived visual tracks, design principles
A derived visual track describes a timed sequence of derived samples composed of an ordered list of
derivation operations, each derivation operation applying a derivation transformation for the duration
of the derived sample on an ordered list of inputs represented in the same presentation.
A derived visual track shall be either a video track (with the 'vide' handler type in the HandlerBox
of the MediaBox as defined in ISO/IEC 14496-12) or a picture track (with the 'pict' handler type in
the HandlerBox of the MediaBox as defined in ISO/IEC 23008-12). A derived visual track is identified by
its containing sample entry of type 'dtrk' DerivedVisualSampleEntry. Each sample described by a
DerivedVisualSampleEntry is a derived sample.
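The identification rule above can be illustrated with a minimal, non-normative Python sketch. It assumes the file has already been parsed; the TrackInfo class and its field names are hypothetical and are not structures defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class TrackInfo:
    """Hypothetical summary of an already-parsed track."""
    handler_type: str                                        # from the HandlerBox
    sample_entry_types: list = field(default_factory=list)   # four-character codes


def is_derived_visual_track(track: TrackInfo) -> bool:
    """True for a video or picture track that carries a 'dtrk' sample entry."""
    is_visual = track.handler_type in ("vide", "pict")
    return is_visual and "dtrk" in track.sample_entry_types
```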
A derived visual track shall include a TrackReferenceTypeBox with reference_type equal to 'dtrk'
referring to all the inputs. Each reference shall be one of:
a) the track_ID of a track used by derived samples in the track, or, if unified IDs are in use as defined
by ISO/IEC 14496-12, a track_group_id;
b) the item_ID of an image item, in the file-level MetaBox, used by derived samples in the track.
An ID value in the track references is resolved to a track_ID whenever the file contains a track with
such ID, is resolved to a track_group_id whenever unified IDs are in use and the file contains a track
group with such ID, and is resolved to an item_ID otherwise.
NOTE 1 A track_ID can be an ID of a derived visual track.
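The resolution order for an ID value in the track references can be sketched as follows. This is a non-normative illustration; the function name, its parameters and the returned labels are assumptions made for the example.

```python
def resolve_reference(ref_id, track_ids, track_group_ids, unified_ids_in_use):
    """Classify an ID found in a 'dtrk' track reference.

    track_ids and track_group_ids are sets of the IDs present in the file;
    unified_ids_in_use tells whether unified IDs are in use.
    """
    if ref_id in track_ids:
        return "track_ID"
    if unified_ids_in_use and ref_id in track_group_ids:
        return "track_group_id"
    # Anything else is treated as an item in the file-level MetaBox.
    return "item_ID"
```

Checking tracks first, then track groups, then falling back to items mirrors the order in which the text states the rule.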
If a referenced track is a member of an alternate group or switch group, or if the reference is to a track
group, then the reader should pick a track from the group as the input to the derived visual track.
NOTE 2 The TrackSelectionBox can be used to provide guidance on the selection between members of an
alternate group or switch group.
Similarly, if a referenced image item is a member of an alternate group (which may contain both tracks
and images), then the reader should pick one member of the group as the input to the derived visual
track.
A derived sample contains an ordered list of the derivation operations to be performed, each derivation
operation applying a derivation transformation on an ordered list of inputs. The layer syntax element
in TrackHeaderBox has no impact on ordering the inputs for derived samples.
The four-character codes of the derivation transformations of all derivation operations used by the
derived samples in the track are listed in the DerivedVisualSampleEntry; default inputs and
parameter values can also be supplied there. A derived sample in the track may use all or some of the
derivation operations listed in the linked DerivedVisualSampleEntry, but derived samples shall not use
a derivation operation not listed in the sample entry.
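A reader or validator could check this constraint along the following lines. The sketch is non-normative; the function name is hypothetical, and the four-character codes used in the usage note are placeholders rather than codes defined by this document.

```python
def unlisted_operations(sample_codes, entry_codes):
    """Return the codes used by a derived sample that its sample entry does not list.

    An empty result means the sample satisfies the constraint.
    """
    listed = set(entry_codes)
    return [code for code in sample_codes if code not in listed]
```

For instance, with placeholder codes, unlisted_operations(["opA1", "opC3"], ["opA1", "opB2"]) reports ["opC3"], while a sample that uses only a subset of the listed codes yields an empty list.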
The duration of a derived sample is the time over which the derivation represented by its ordered
list of derivation operations is active. Therefore, the number of samples in a derived visual track
does not necessarily match 1:1 the number of input image items or samples of input tracks that
are being transformed. A single derived sample duration may span multiple samples in the source track(s).
In addition, derivation transformations in derived samples may have an 'internal time structure'
(e.g. a cross-fade), so the picture may change during the sample duration, in contrast to 'classic video'.
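The relationship between one derived sample and the input samples it spans can be sketched as an interval overlap test. This non-normative example assumes that all times are expressed in the same timescale on the input track's composition timeline and that sample intervals are half-open; the function and parameter names are hypothetical.

```python
def overlapping_input_samples(derived_start, derived_duration, input_samples):
    """Return indices of input samples that overlap a derived sample's interval.

    input_samples is a list of (start, duration) pairs; every interval is
    treated as half-open, i.e. [start, start + duration).
    """
    derived_end = derived_start + derived_duration
    return [
        index
        for index, (start, duration) in enumerate(input_samples)
        if start < derived_end and start + duration > derived_start
    ]
```

A derived sample starting at 5 with duration 20 therefore spans three input samples when the input track has samples of duration 10 starting at 0, 10, 20 and 30.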
Derived visual tracks do not respect edit lists on inputs. They operate on the composition timeline (i.e.
before the application of edit lists) of their input tracks (including derived visual tracks when used
as visual inputs). Consequently, the input tracks shall not have edit lists, and readers shall ignore any
edit list that is nevertheless present on an input track.
NOTE 3 When time-alignment adjustment between input tracks is needed, signed composition offsets in input
tracks can be used.
NOTE 4 A derived visual track can have an edit list; thus, a derived visual track using the identity transform,
and with an edit-list, can provide a visual output that is a temporal re-mapping of the input track.
The inputs for a derivation operation in a derived sample can be either input image items from file-level
MetaBox or intervals (possibly spanning multiple samples) of input video tracks, image sequence tracks,
metadata items or metadata tracks, the visual output of a preceding derivation operation or a default
input fill picture.
Transformative item properties or transformations (e.g. clean aperture or track matrix) associated
with input image items or samples of input tracks are always applied before performing the derivation
operation.
NOTE 5 If a derived sample needs to refer to one explicit sample value in a referred track (other than the time-
aligned sample value), an item can be created and referred to that has the same data as the desired sample value.
The visual inputs in a derived sample shall have consistent pixel aspect ratio and bit depth. The input
image items, samples of input tracks or derived samples may differ in width and height. When
differences in width and height result in pixels that are never 'painted' by a derivation operation, those
empty pixels are filled according to the value of the default_derivation_input parameter signalled in
DerivedVisualTrackConfigRecord (black, white or grey pixels). When differences in width and height
result in pixels that fall outside the visual output size of a derivation operation, those pixels are
cropped. This default behaviour may be overridden by derivation operation specifications.
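The default fill-and-crop behaviour can be illustrated with a small, non-normative sketch. Pictures are modelled as lists of rows of single pixel values, the fill value stands in for the colour selected by default_derivation_input, and the placement offsets x and y are hypothetical parameters introduced only for this example.

```python
def place_on_canvas(picture, out_width, out_height, x, y, fill):
    """Paint picture onto an out_width x out_height canvas at offset (x, y).

    Pixels never painted keep the fill value; pixels that fall outside the
    canvas are cropped.
    """
    canvas = [[fill] * out_width for _ in range(out_height)]
    for row_index, row in enumerate(picture):
        canvas_y = y + row_index
        if not 0 <= canvas_y < out_height:
            continue  # whole row lies outside the output: cropped
        for col_index, pixel in enumerate(row):
            canvas_x = x + col_index
            if 0 <= canvas_x < out_width:
                canvas[canvas_y][canvas_x] = pixel
    return canvas
```

A derivation operation that overrides the default behaviour would replace this step with its own rule.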
A derived sample is reconstructed by performing the specified derivation operations in sequence. Some
derivation operations can be marked as non-essential, which indicates that the reader may skip
them. Operations marked as essential, however, shall be performed in order to
obtain a valid derived sample.
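One possible reader policy for essential and non-essential operations is sketched below. This is a non-normative illustration: the Operation class, its fields and the choice to skip a non-essential operation only when it is unsupported are assumptions of the example, and the codes in the usage note are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """Hypothetical in-memory form of one derivation operation."""
    code: str        # four-character code of the derivation transformation
    essential: bool  # True if the operation must be performed


def plan_operations(operations, supported_codes):
    """Return the operations to perform, in order.

    Unsupported non-essential operations are skipped; an unsupported
    essential operation makes the derived sample undecodable.
    """
    plan = []
    for operation in operations:
        if operation.code in supported_codes:
            plan.append(operation)
        elif operation.essential:
            raise ValueError(f"essential operation {operation.code!r} is not supported")
    return plan
```

With placeholder codes, a reader supporting only "opA1" can still process a sample whose "opB2" operation is non-essential, but has to reject the sample if "opB2" is essential.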
When more than one derivation operation is listed in a derived sample, a derivation operation that is
not first in the list may take as input the output (e.g. the visual output) of any of the preceding
derivation operations, only new inputs, or a combination of both.
In many cases the source tracks pointed to by the 'dtrk' track reference are not intended for display.
When a track is not intended for display, track_in_movie shall be equal to 0 for that track.
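A reader could test this condition on the flags of the TrackHeaderBox. The sketch below is non-normative and assumes that track_in_movie corresponds to flag value 0x000002, as in the TrackHeaderBox flags of ISO/IEC 14496-12; that value should be verified against that document. The function name is hypothetical.

```python
# Assumed value of the track_in_movie flag in TrackHeaderBox (ISO/IEC 14496-12).
TRACK_IN_MOVIE = 0x000002


def is_intended_for_display(tkhd_flags: int) -> bool:
    """True if the track_in_movie bit is set in the TrackHeaderBox flags."""
    return (tkhd_flags & TRACK_IN_MOVIE) != 0
```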
The visual output of a derived sample is the output from the last derivation operation in the sample.
A derived sample that contains no derivation operation, i.e. an empty derived sample (sample size of 0),
is equivalent to an empty edit: there is no visual output from the derived visual track at that time.
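The chaining of derivation operations within one derived sample can be sketched as follows. This non-normative example models each operation as a Python callable together with a list of input references; the reference forms ("ext", name) and ("op", index) and the function name are hypothetical and do not reflect the syntax defined by this document.

```python
def reconstruct(operations, external_inputs):
    """Evaluate a derived sample and return its visual output.

    operations is a list of (transform, refs) pairs. Each ref is either
    ("ext", name) for a new input or ("op", index) for the output of an
    earlier operation in the same list. An empty list yields None, which
    models the absence of visual output.
    """
    if not operations:
        return None
    outputs = []
    for position, (transform, refs) in enumerate(operations):
        arguments = []
        for kind, key in refs:
            if kind == "op":
                if key >= position:
                    raise ValueError("an operation may only use preceding outputs")
                arguments.append(outputs[key])
            else:
                arguments.append(external_inputs[key])
        outputs.append(transform(*arguments))
    return outputs[-1]  # output of the last operation
```

Numbers stand in for pictures here; a second operation that combines the output of the first with a new input shows the "combination of both" case.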
Using derived visual tracks, it is possible to build either a chain of derivation operations on one single
derived visual track or a hierarchy of multiple derived visual tracks in which derived visual tracks are
used as visual inputs to other derived visual tracks. The latter should only be used when each derived
visual track in the hierarchy is also needed on its own.
5 Derivation operation
5.1 Definition
Box Type:  'dimg'
Container: derived sample, or DerivedVisualTrackConfigRecord in a DerivedVisualSampleEntry
Mandatory: Yes in a DerivedVisualTrackConfigRecord; No in a derived sample
Quantity:  At least one in a DerivedVisualTrackConfigRecord; zero or more in a derived sample
A derivation operation in either a derived sample entry or derived sample is represented by a
container box of type 'dimg' that always carries a derivation transformation box inherited from
VisualDerivationBase, and can carry a VisualDerivationInputs providing the inputs for the deri
...
