Information technology — Coded representation of immersive media — Part 10: Carriage of visual volumetric video-based coding data

This document specifies carriage of coded media representations which comply with visual volumetric video-based coding and video-based point cloud compression (specified in ISO/IEC 23090-5).

Technologies de l'information — Représentation codée de média immersifs — Partie 10: Transport de données de codage basé sur la vidéo volumétrique

General Information

Status: Published
Publication Date: 17-May-2022
Current Stage: 9092 - International Standard to be revised
Start Date: 21-Jul-2024
Completion Date: 30-Oct-2025
Ref Project

Relations

Standard
ISO/IEC 23090-10:2022 - Information technology — Coded representation of immersive media — Part 10: Carriage of visual volumetric video-based coding data Released:5/18/2022
English language
85 pages

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 23090-10
First edition
2022-05
Information technology — Coded
representation of immersive media —
Part 10:
Carriage of visual volumetric video-based coding data
Technologies de l'information — Représentation codée de média
immersifs —
Partie 10: Transport de données de codage basé sur la vidéo
volumétrique
Reference number
© ISO/IEC 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
© ISO/IEC 2022 – All rights reserved

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Overview
5.1 General
5.2 Overall architecture for carriage of V3C data
5.3 Summary of referenceable code points
5.3.1 Brands
5.3.2 Uniform resource names
5.3.3 Restricted scheme types
5.3.4 Sample entry types
5.3.5 Box types
5.3.6 Track reference types
5.3.7 Track grouping types
5.3.8 Entity grouping types
5.3.9 Sample grouping types
6 Volumetric media
6.1 General
6.2 Volumetric visual media
6.3 Volumetric visual media header
6.3.1 Definition
6.3.2 Syntax
6.3.3 Semantics
6.4 Volumetric visual sample entry
6.4.1 Definition
6.4.2 Syntax
6.4.3 Semantics
6.5 Volumetric visual sample group entry
6.6 Volumetric visual samples
7 Carriage of visual volumetric video-based coding data
7.1 General
7.2 Common boxes and data structures
7.2.1 V3C decoder configuration record
7.2.2 V3C decoder configuration box
7.2.3 V3C unit header box
7.2.4 V3C atlas parameter set sample group
7.2.5 Object switch alternatives box
7.3 Single track encapsulation of V3C data
7.3.1 General
7.3.2 V3C bitstream sample entry
7.3.3 V3C bitstream track sample format
7.4 Multi-track encapsulation of V3C data
7.4.1 General
7.4.2 V3C atlas sample entry
7.4.3 V3C atlas tile sample entry
7.4.4 V3C atlas sample format
7.4.5 V3C video component track
7.4.6 Track references
7.4.7 Track alternatives and track grouping
7.4.8 Playout groups
7.4.9 Summary
8 Carriage of non-timed visual volumetric video-based coding data
8.1 General
8.2 V3C atlas item
8.3 V3C atlas tile item
8.4 V3C component item
8.5 V3C-related item properties
8.5.1 General
8.5.2 V3C configuration item property
8.5.3 V3C unit header item property
8.5.4 V3C atlas tile configuration item property
8.5.5 Playout groups
9 Partial access of volumetric visual data
9.1 General
9.2 Common data structures
9.2.1 3D vector
9.2.2 Spatial region bounding box
9.2.3 Tile mapping
9.2.4 Object collection
9.3 Spatial region information structure
9.3.1 Definition
9.3.2 Syntax
9.3.3 Semantics
9.4 V3C tile video component track grouping
9.4.1 Definition
9.4.2 Syntax
9.4.3 Semantics
9.5 Volumetric media bounding box
9.5.1 Definition
9.5.2 Syntax
9.6 Static spatial region collection box
9.6.1 Definition
9.6.2 Syntax
9.6.3 Semantics
9.7 Dynamic spatial region information
9.7.1 General
9.7.2 Sample entry
9.7.3 Sample format
9.7.4 Sync samples
9.8 Storage of atlas tiles using NALUMapEntry
10 Viewport information
10.1 General
10.2 Structures
10.2.1 Extrinsic camera information
10.2.2 Intrinsic camera information
10.2.3 Viewport information
10.3 Viewport information timed-metadata track
10.3.1 General
10.3.2 Viewport information sample entry
10.3.3 Viewport information sample format
11 Encapsulation and signalling in MPEG-DASH
11.1 Single track mode
11.2 Multi-track mode
11.2.1 General
11.2.2 V3C preselections
11.2.3 V3C atlas tile preselections
11.3 DASH MPD descriptors for V3C content
11.3.1 XML namespace and schema
11.3.2 V3C video component descriptor
11.3.3 V3C descriptor
11.4 Supporting multiple versions of a V3C media
11.5 Switching codecs for V3C video components
11.6 Signalling spatial regions for partial access
11.6.1 Static spatial regions
11.6.2 Dynamic spatial regions
11.7 Signalling recommended viewports
11.7.1 Static viewports
11.7.2 Dynamic viewports
12 Encapsulation and signalling MMT
12.1 Introduction
12.2 MMT signalling descriptors for V3C content
12.2.1 Asset reference descriptor
12.2.2 V3C Asset descriptor
12.3 MMT signalling messages for V3C Content
12.3.1 General
12.3.2 V3C Asset Group message
12.3.3 V3C Selection message
12.3.4 V3C View Change Feedback message
Annex A (normative) File format toolsets and brands
Annex B (normative) V3C DASH schema
Annex C (normative) MIME types and sub-parameters
Annex D (informative) DASH MPD examples
Annex E (informative) Partial access utilizing V3C volumetric annotation SEI message family
Annex F (informative) Partial access using volumetric information timed-metadata tracks
Annex G (informative) Partial access for overlapping spatial subdivisions
Annex H (informative) Examples of using alternate groups
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work.
The procedures used to develop this document and those intended for its further maintenance
are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria
needed for the different types of document should be noted. This document was drafted in
accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see https://patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 23090 series can be found on the ISO and IEC websites.
Any feedback or questions on this document should be directed to the user’s national standards
body. A complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

Introduction
This document addresses the storage of visual volumetric video-based coding data in files based on
ISO/IEC 14496-12, reusing existing tools for storage of video-coded components. Another important
aspect considered by this document is supporting flexible extraction of component streams at delivery
or decoding time, or both.
The International Organization for Standardization (ISO) and International Electrotechnical
Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may
involve the use of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that they are willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this
respect, the statement of the holder of this patent right is registered with ISO and IEC. Information may
be obtained from the patent database available at www.iso.org/patents or patents.iec.ch.
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.

INTERNATIONAL STANDARD ISO/IEC 23090-10:2022(E)
Information technology — Coded representation of
immersive media —
Part 10:
Carriage of visual volumetric video-based coding data
1 Scope
This document specifies carriage of coded media representations which comply with visual volumetric
video-based coding and video-based point cloud compression (specified in ISO/IEC 23090-5).
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
IEEE 754-2019, IEEE Standard for Floating-Point Arithmetic
IETF RFC 6381, The ‘Codecs’ and ‘Profiles’ Parameters for “Bucket” Media Types
ISO/IEC 14496-12, Information technology — Coding of audio-visual objects — Part 12: ISO base media file
format
ISO/IEC 14496-15, Information technology — Coding of audio-visual objects — Part 15: Carriage of
network abstraction layer (NAL) unit structured video in the ISO base media file format
ISO/IEC 23008-1:2017, Information technology — High efficiency coding and media delivery in
heterogeneous environments — Part 1: MPEG media transport (MMT)
ISO/IEC 23009-1:2019, Information technology — Dynamic adaptive streaming over HTTP (DASH) —
Part 1: Media presentation description and segment formats
ISO/IEC 23090-5:2021, Information technology — Coded representation of immersive media — Part 5:
Visual Volumetric Video-based Coding (V3C) and Video-based Point Cloud Compression (V-PCC)
W3C Recommendation, XML schema part 1: Structures
W3C Recommendation, XML schema part 2: Datatypes
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 23090-5 and the following
apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
atlas parameter sets
non-ACL NAL units that have nal_unit_type equal to NAL_ASPS, NAL_AAPS, or NAL_AFPS.

3.2
V3C content
volumetric media that is encoded
Note 1 to entry: For the purposes of this document, the media shall be encoded using ISO/IEC 23090-5.
3.3
volumetric visual track
track with a handler type reserved for describing volumetric visual tracks
3.4
V3C track
V3C bitstream track, V3C atlas track or V3C atlas tile track
3.5
V3C bitstream track
volumetric visual track containing a V3C bitstream, in the case of a single-track container
3.6
V3C atlas track
volumetric visual track containing a V3C atlas bitstream, in the case of a multi-track container
3.7
V3C atlas tile track
volumetric visual track containing a portion of a V3C atlas bitstream corresponding to one or more tiles,
in the case of a multi-track container
3.8
V3C video component track
video track which carries 2D video encoded data for any of the occupancy, geometry, or attribute
component video bitstreams of the V3C bitstream
4 Abbreviated terms
2D two-dimensional
3D three-dimensional
CVS coded V3C sequence
DASH dynamic adaptive streaming over HTTP
HTTP hypertext transfer protocol
IRAP intra random access point
ISOBMFF ISO base media file format
LoD level of detail
PCC point cloud compression
SEI supplemental enhancement information
V3C visual volumetric video-based coding
VPS V3C parameter set
V-PCC video-based point cloud compression

5 Overview
5.1 General
Visual volumetric video-based coding (V3C) provides a mechanism for coding visual volumetric frames.
Visual volumetric frames are coded by converting the 3D volumetric information into a collection of 2D
images and associated data. The converted 2D images are coded using widely available video and image
coding specifications and the associated data, i.e., atlas data, is coded according to ISO/IEC 23090-5.
The coded images and the coded atlas data are multiplexed and form a V3C bitstream.
A V3C bitstream consists of one or more CVSs. A CVS starts with a VPS, included in at least one V3C unit
or provided through external means, and contains one or more V3C units carrying V3C sub-bitstreams,
with each V3C sub-bitstream associated with a V3C component, e.g., atlas, occupancy, geometry, or
attribute.
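The containment rules in the paragraph above can be pictured with a small data model. This is only an illustrative sketch: the class names, field names and the "vps" tag are hypothetical and are not defined by this document or by ISO/IEC 23090-5. It checks only the two conditions stated above, namely that a CVS has a VPS (in a V3C unit or supplied externally) and carries at least one V3C unit for a component.

```python
from dataclasses import dataclass, field
from typing import List

# Component names are taken from the text above; everything else in this
# sketch (class names, the "vps" tag) is a hypothetical illustration.
COMPONENTS = ("atlas", "occupancy", "geometry", "attribute")


@dataclass
class V3CUnit:
    component: str          # one of COMPONENTS, or "vps" for a parameter set unit
    payload: bytes = b""


@dataclass
class CodedV3CSequence:
    units: List[V3CUnit] = field(default_factory=list)
    external_vps: bool = False   # True when the VPS is provided by external means

    def is_well_formed(self) -> bool:
        # A CVS needs a VPS, either in a V3C unit or supplied externally.
        has_vps = self.external_vps or any(u.component == "vps" for u in self.units)
        # It also carries one or more units belonging to a V3C component.
        carries_data = any(u.component in COMPONENTS for u in self.units)
        return has_vps and carries_data


def bitstream_is_well_formed(sequences: List[CodedV3CSequence]) -> bool:
    # A V3C bitstream consists of one or more CVSs.
    return len(sequences) >= 1 and all(s.is_well_formed() for s in sequences)
```

The model deliberately ignores unit ordering and the actual V3C unit syntax, which are specified in ISO/IEC 23090-5.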
5.2 Overall architecture for carriage of V3C data
Figure 1 shows a typical content flow process for V3C media.
Figure 1 — Content flow process for V3C media
A real-world or synthetic visual scene (A) is captured by a set of cameras, a camera device with multiple
lenses and sensors, or by virtual cameras. The acquisition results in source volumetric data (B). One
or multiple volumetric frames are encoded as a coded V3C bitstream including an atlas bitstream, at
most one occupancy bitstream, a geometry bitstream, and zero or more attribute bitstreams (E_v). One
or more coded bitstreams are then packaged into a media file for local playback (F) or a sequence of an
initialization segment and media segments for streaming (F_s), according to a particular media container
file format. In this document, the media container file format is the ISO Base Media File Format specified
in ISO/IEC 14496-12. The file encapsulator may also include metadata into the file or the segments. The
segments F_s are delivered using a delivery mechanism to a player.
The file that the file encapsulator outputs (F) is identical to the file that the file decapsulator takes as
input (F'). A file decapsulator processes the file (F') or the received segments (F'_s) and extracts the
coded bitstreams (E'_v) and parses the metadata. The V3C bitstream is then decoded into a decoded
signal (D'). The decoded volumetric data (D') are reconstructed, rendered, and displayed onto the
screen of a head-mounted display or any other display device based on the current viewing orientation
or viewport. The current viewing orientation is determined by the head tracking and possibly also eye
tracking functionality. In viewport-dependent delivery, the current viewing orientation is also passed
to the strategy module, which determines the tracks to be received based on the viewing orientation.
The process described above is applicable to both live and on-demand use cases.

The following interfaces are normatively specified in this document:
— F/F': media file including the specification of the track formats, which may contain constraints on
the elementary streams contained within the samples of the tracks; see Clause 7 for timed V3C
content and Clause 8 for non-timed V3C data.
— Clause 11 specifies the delivery related interfaces for DASH delivery.
— Clause 12 specifies the delivery related interfaces for MMT delivery.
5.3 Summary of referenceable code points
5.3.1 Brands
ISO/IEC 14496-12 defines the concept of brands, which may be indicated in the FileTypeBox. Brands are
used in this document to indicate conformance to an encapsulation mode and a specific set of tools, as
well as requirements on other specifications (e.g., ISO/IEC 14496-12).
The brands specified in this document are listed in Table 1 and defined in Annex A.
Table 1 — Brands specified in this document
Brand  Clause  Informative description
v3st   A.2     Single track encapsulation mode
v3mt   A.3     Multi-track encapsulation mode
v3mp   A.3     Multi-track encapsulation mode with partial access support
v3nt   A.4     Non-timed V3C data
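As an illustration of how a reader might detect these brands, the sketch below parses a FileTypeBox (`ftyp`) using the layout defined in ISO/IEC 14496-12: a 32-bit size, the four-character type, a major brand, a 32-bit minor version, and a list of compatible brands. It assumes the `ftyp` box is the first box in the supplied bytes and uses the compact 32-bit size form; the function names are hypothetical.

```python
import struct

# Brand codes and descriptions as listed in Table 1.
V3C_BRANDS = {
    "v3st": "Single track encapsulation mode",
    "v3mt": "Multi-track encapsulation mode",
    "v3mp": "Multi-track encapsulation mode with partial access support",
    "v3nt": "Non-timed V3C data",
}


def read_ftyp_brands(data: bytes):
    """Return (major_brand, compatible_brands) from a leading ftyp box."""
    if len(data) < 16:
        raise ValueError("data too short for an ftyp box")
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("first box is not ftyp (assumption of this sketch)")
    if size < 16 or size > len(data):
        raise ValueError("unsupported or truncated ftyp box")
    major = data[8:12].decode("ascii")
    # Bytes 12..15 hold minor_version; compatible brands follow, 4 bytes each.
    compatible = [data[i:i + 4].decode("ascii") for i in range(16, size - 3, 4)]
    return major, compatible


def v3c_brands_in(data: bytes):
    """List the Table 1 brands signalled by the file, in order of appearance."""
    major, compatible = read_ftyp_brands(data)
    seen = []
    for brand in [major] + compatible:
        if brand in V3C_BRANDS and brand not in seen:
            seen.append(brand)
    return seen
```

Finding a brand only indicates claimed conformance; the requirements attached to each brand are defined in Annex A.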
5.3.2 Uniform resource names
The URNs specified in this document are listed in Table 2.
Table 2 — URNs specified in this document
URN                                Clause  Informative description
urn:mpeg:mpegI:v3c:2020            11.3.1  Namespace for the XML elements and attributes specified in this document
urn:mpeg:mpegI:v3c:2020:component  11.3.2  Scheme identifier for the V3C component DASH MPD descriptor
urn:mpeg:mpegI:v3c:2020:v3c        11.3.3  Scheme identifier for the V3C content DASH MPD descriptor
urn:mpeg:mpegI:v3c:2020:v3sr       11.6.1  Scheme identifier for the V3C static spatial region DASH MPD descriptor
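A client could locate these descriptors in an MPD by looking for elements whose `schemeIdUri` attribute matches one of the scheme identifiers in Table 2. The sketch below does this with Python's standard XML parser. Which MPD element carries each descriptor is specified in Clause 11 and is not assumed here, so the code scans every element; the function name and the short labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# Scheme identifiers from Table 2 (the bare namespace URN is not a scheme).
V3C_SCHEMES = {
    "urn:mpeg:mpegI:v3c:2020:component": "V3C component descriptor",
    "urn:mpeg:mpegI:v3c:2020:v3c": "V3C content descriptor",
    "urn:mpeg:mpegI:v3c:2020:v3sr": "V3C static spatial region descriptor",
}


def find_v3c_descriptors(mpd_xml: str):
    """Return (element_name, schemeIdUri) pairs for V3C descriptors in an MPD."""
    root = ET.fromstring(mpd_xml)
    found = []
    for elem in root.iter():
        scheme = elem.get("schemeIdUri")
        if scheme in V3C_SCHEMES:
            # Strip the "{namespace}" prefix that ElementTree adds to tag names.
            local_name = elem.tag.rsplit("}", 1)[-1]
            found.append((local_name, scheme))
    return found
```

Descriptors with other scheme identifiers, such as DASH role descriptors, are ignored by this scan.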
5.3.3 Restricted scheme types
The restricted scheme types specified in this document are listed in Table 3.
Table 3 — Restricted scheme types specified in this document
Restricted scheme type  Clause   Informative description
vvvc                    7.4.5.1  V3C component video
5.3.4 Sample entry types
The sample entry types specified in this document are listed in Table 4.

Table 4 — Sample entry types specified in this document
Sample entry type  Clause   Informative description
v3e1               7.3.2.2  For use with the single-track mode with all atlas parameter sets and SEI messages carried in decoder configuration record
v3eg               7.3.2.2  For use with the single-track mode with atlas parameter sets and SEI messages carried in decoder configuration record and in track samples
v3c1               7.4.2    For use with the multi-track mode with a single atlas and all atlas parameter sets and SEI messages carried in decoder configuration record
v3cg               7.4.2    For use with the multi-track mode with a single atlas and atlas parameter sets and SEI messages carried in decoder configuration record and in track samples
v3cb               7.4.2    For use with a base track in the multi-track mode with multiple atlases
v3a1               7.4.2    For use with an atlas track in the multi-track mode with multiple atlases and all atlas parameter sets and SEI messages carried in decoder configuration record
v3ag               7.4.2    For use with an atlas track in multi-track mode with multiple atlases and atlas parameter sets and SEI messages carried in decoder configuration record and in track samples
v3t1               7.4.3    For use with an atlas tile track in the multi-track mode
dyvm               9.7.1    For use with a timed metadata track indicating the dynamic spatial regions that are dynamically changing over time
6vpt               10.3.2   For use with a timed metadata track indicating viewport information that is dynamically changing over time
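The distinctions drawn in Table 4 can be captured in a small lookup that a file reader might use to decide, from the sample entry type alone, which encapsulation mode applies and whether atlas parameter sets and SEI messages may also appear in track samples. This is a hedged sketch: the `EntryInfo` type, its field names and the mode labels are hypothetical, and `None` marks rows where Table 4 does not state where parameter sets are carried.

```python
from typing import NamedTuple, Optional


class EntryInfo(NamedTuple):
    mode: str
    # True: parameter sets and SEI may also be in track samples.
    # False: all are carried in the decoder configuration record.
    # None: not stated in Table 4 for this sample entry type.
    params_in_samples: Optional[bool]


SAMPLE_ENTRIES = {
    "v3e1": EntryInfo("single-track", False),
    "v3eg": EntryInfo("single-track", True),
    "v3c1": EntryInfo("multi-track, single atlas", False),
    "v3cg": EntryInfo("multi-track, single atlas", True),
    "v3cb": EntryInfo("multi-track, base track for multiple atlases", None),
    "v3a1": EntryInfo("multi-track, atlas track for multiple atlases", False),
    "v3ag": EntryInfo("multi-track, atlas track for multiple atlases", True),
    "v3t1": EntryInfo("multi-track, atlas tile track", None),
    "dyvm": EntryInfo("timed metadata, dynamic spatial regions", None),
    "6vpt": EntryInfo("timed metadata, viewport information", None),
}


def describe_sample_entry(fourcc: str) -> Optional[EntryInfo]:
    """Return the Table 4 classification, or None for an unlisted type."""
    return SAMPLE_ENTRIES.get(fourcc)
```

A reader that sees `params_in_samples` equal to `True` would need to watch for parameter set updates while parsing samples, whereas `False` means the decoder configuration record is sufficient.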
5.3.5 Box types
The box types specified in this document are listed in bold in Table 5. Mandatory boxes are marked
with an asterisk. Box types without a four-character code are marked with '-' in the structure.
Table 5 — Box types specified in this document
Box types, structure, and cross-reference (Informative)
moov  *  ISOBMFF  container for all the metadata
  trak  *  ISOBMFF  container for an individual track or stream
    trgr  ISOBMFF  track grouping indication
      potg  7.4.8.2  playout track group box
      vtcg  9.4  atlas tile components track group box
    mdia  *  ISOBMFF  container for the media information in a track
      minf  *  ISOBMFF  media information container
        stbl  *  ISOBMFF  sample table box, container for the time/space map
          stsd  *  ISOBMFF  sample descriptions (codec types, initialization etc.)
            -  ISOBMFF  sample entry or restricted sample entry
              rinf  ISOBMFF  restricted scheme info box
                frma  ISOBMFF  original format box
                schm  ISOBMFF  scheme type box
                schi  ISOBMFF  scheme information box
                  vunt  7.2.3  V3C unit header box
                  mmvi  7.4.5.2  Multimap video box
            dyvm  9.7.2  dynamic volumetric metadata sample entry
            6vpt  10.3.2  viewport information sample entry
              6vpC  10.3.2  viewport information configuration box
            -  ISOBMFF  visual sample entry
            -  6.4  volumetric visual sample entry
              v3cC  7.2.2  V3C decoder configuration box
              vunt  7.2.3  V3C unit header box
              v3tC  7.4.3  V3C atlas tile configuration box
              vpbb  9.5
              v3sc  9.6  Static spatial region collection box
meta  ISOBMFF  Metadata
  grpl  ISOBMFF  group list box
    eply  8.5.5.2  playout entity group box
    swpc  7.2.5  object switch alternatives box
  iprp  ISOBMFF  item properties box
    ipco  ISOBMFF  item property container box
      v3cp  8.5.2  V3C configuration item property
      vutp  8.5.3  V3C unit header item property
      v3tp  8.5.4  V3C atlas tile configuration item property
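To make the nesting in Table 5 concrete, the following sketch walks the box structure of an ISOBMFF byte buffer and yields the path of each box it finds. It relies on the generic box header from ISO/IEC 14496-12 (32-bit size, four-character type, optional 64-bit largesize when size is 1, and size 0 meaning the box runs to the end). Only plain container boxes are descended into; boxes such as `stsd` and `meta` carry additional fields before their children and are intentionally not expanded here. The function name and the `CONTAINERS` set are choices made for this illustration.

```python
import struct

# Plain containers from Table 5 whose payload is simply a list of child boxes.
CONTAINERS = {b"moov", b"trak", b"trgr", b"mdia", b"minf", b"stbl"}


def walk_boxes(data: bytes, offset: int = 0, end: int = None, path: str = ""):
    """Yield slash-separated paths such as '/moov/trak/mdia' for each box."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit largesize follows the type field.
            if offset + 16 > end:
                raise ValueError("truncated largesize at offset %d" % offset)
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing range.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box at offset %d" % offset)
        name = path + "/" + box_type.decode("latin-1")
        yield name
        if box_type in CONTAINERS:
            yield from walk_boxes(data, offset + header, offset + size, name)
        offset += size
```

Applied to a multi-track V3C file, a walker like this would report `potg` or `vtcg` under `/moov/trak/trgr` when track groups are present; reaching the sample entries would additionally require parsing the `stsd` header.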
5.3.6 Track reference types
The track reference types specified in this document are listed in Table 6.
Table 6 — Track reference types specified in this document
Track reference type  Clause   Informative description
v3cs                  7.4.6.1  Referenced track is a V3C atlas track
v3ct                  7.4.6.2  Referenced track is a V3C atlas tile track
v3vo                  7.4.6.3  Referenced track is a V3C video component track carrying occupancy data
v3vg                  7.4.6.3  Referenced track is a V3C video component track carrying geometry data
v3va                  7.4.6.3  Referenced track is a V3C video component track carrying attribute data
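In ISO/IEC 14496-12, track references are stored in a TrackReferenceBox (`tref`) whose payload is a sequence of boxes, each typed with the reference type and containing a list of 32-bit track IDs. The sketch below parses such a payload and keeps only the reference types from Table 6. It assumes the caller has already located the `tref` box and passes in its payload; the function names and the short labels are hypothetical.

```python
import struct

# Reference types from Table 6 with abbreviated descriptions.
V3C_REF_TYPES = {
    "v3cs": "V3C atlas track",
    "v3ct": "V3C atlas tile track",
    "v3vo": "V3C video component track (occupancy)",
    "v3vg": "V3C video component track (geometry)",
    "v3va": "V3C video component track (attribute)",
}


def parse_tref_payload(payload: bytes):
    """Map each reference type in a tref payload to its list of track IDs."""
    refs = {}
    offset = 0
    while offset + 8 <= len(payload):
        size, ref_type = struct.unpack_from(">I4s", payload, offset)
        if size < 8 or offset + size > len(payload):
            raise ValueError("malformed track reference box at %d" % offset)
        count = (size - 8) // 4          # each track ID is a 32-bit integer
        ids = struct.unpack_from(">%dI" % count, payload, offset + 8)
        refs.setdefault(ref_type.decode("latin-1"), []).extend(ids)
        offset += size
    return refs


def v3c_references(payload: bytes):
    """Keep only the reference types that Table 6 defines."""
    return {k: v for k, v in parse_tref_payload(payload).items()
            if k in V3C_REF_TYPES}
```

For a V3C atlas track, the resulting dictionary would give the IDs of the occupancy, geometry and attribute video component tracks that a player needs to fetch alongside the atlas data.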
5.3.7 Track grouping types
The track grouping types specified in this document are listed in Table 7.
Table 7 — Track grouping types specified in this document
Track grouping type  Clause   Informative description
potg                 7.4.8.2  Playout track grouping
vtcg                 9.4      V3C tile components track grouping
5.3.8 Entity grouping types
The entity grouping types s
...
