ISO/IEC 23000-11:2009/Amd 3:2014
(Amendment)Information technology — Multimedia application format (MPEG-A) — Part 11: Stereoscopic video application format — Amendment 3: Support movie fragment for Stereoscopic Video AF
Technologies de l'information — Format pour application multimédia (MPEG-A) — Partie 11: Format pour application vidéo stéréoscopique — Amendement 3: Prise en charge de fragments de film pour format d'application vidéo stéréoscopique
STANDARD 23000-11
First edition
Information technology — Multimedia application format
(MPEG-A) —
Part 11:
Stereoscopic video application format
AMENDMENT 3: Support movie fragment for Stereoscopic
Video AF
In Clause 2, add the following normative reference:
ISO/IEC 23008-2, Information technology — High efficiency coding and media delivery in heterogeneous
environments — Part 2: High efficiency video coding
In 3.8, replace definition with:
maximum disparity value within successive stereoscopic samples
In 3.9, replace definition with:
set of samples that represents only a monoscopic sequence
In 3.10, replace definition with:
minimum disparity value within a group of successive stereoscopic samples
In 3.18, replace text with:
stereoscopic samples
In 3.19, replace text with:
stereoscopic left samples
In 3.21, replace text with:
stereoscopic right samples
In Clause 4, add the following abbreviation:
HEVC High Efficiency Video Coding
In 6.1, replace text with:
Table 1 summarizes the supported components of the Stereoscopic Video AF, which consist of ISO/IEC
Standards and non-ISO/IEC Standards.
The Stereoscopic Video AF includes ISO/IEC 14496-2 Simple Profile at Level 3, ISO/IEC 14496-10 Baseline
Profile at Level 1.3, and ISO/IEC 23008-2 Main/Main10 Profile for visual; ISO/IEC 14496-3 AAC and
HE-AAC Profile for audio; 3GPP TS 26.071 AMR and TIA/EIA/IS-127 EVRC for voice; ISO/IEC 14496-20
LASeR for scene description; and image formats such as ISO/IEC 10918-1 JPEG and ISO/IEC 15948
PNG. For this specification, the ISO/IEC 14496-12 ISO base media file format is used as the base file format.
Table 1 — Supported components of Stereoscopic Video AF
Standard           Type         Component Name              Specification
ISO/IEC Standards  File format  ISO base media file format  ISO/IEC 14496-12
                                MPEG-4 Video                ISO/IEC 14496-2 Simple Profile Level 3,
                                                            ISO/IEC 14496-2 Advanced Simple Profile Level
                                                            ISO/IEC 14496-10 Baseline Profile Level 1.3,
                                                            ISO/IEC 14496-10 High Profile Level 4.1
                                MPEG-H HEVC                 ISO/IEC 23008-2 Main Profile,
                                                            ISO/IEC 23008-2 Main10 Profile,
                                                            ISO/IEC 23008-2 Main Still Picture Profile
                                MPEG-4 Audio AAC            ISO/IEC 14496-3
                                MPEG-4 Audio HE-AAC         ISO/IEC 14496-3
                                MPEG-4 LASeR                ISO/IEC 14496-20
                   Data         JPEG Image                  ISO/IEC 10918-1
                                PNG Image                   ISO/IEC 15948
                                AMR                         3GPP TS 26.071
In 7.1, replace:
The ‘mdia’ box contains a ‘svmi’ box for the stereoscopic visual type and fragment information of the
stereoscopic contents in the track.
The ‘iloc’ box describes the absolute offset in bytes (‘extent_offset’) and the size (‘extent_length’)
of stereoscopic fragments. An item_ID is assigned to each fragment of the stereoscopic
sequence for resource referencing.
with:
The ‘mdia’ box contains a ‘svmi’ box for the stereoscopic visual type and sample information of the
stereoscopic contents in the track.
The ‘iloc’ box describes the absolute offset in bytes (‘extent_offset’) and the size (‘extent_length’)
of stereoscopic samples. An item_ID is assigned to successive samples of the stereoscopic
sequence for resource referencing.
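As a minimal sketch of how a reader might use the two ‘iloc’ fields named above, the following Python returns the bytes of one item from an in-memory file image. The names ItemExtent and read_extent are hypothetical and introduced only for illustration; the sketch assumes one extent per item and an offset counted from the start of the file.

```python
from dataclasses import dataclass


@dataclass
class ItemExtent:
    """One 'iloc' entry, reduced to the fields discussed in the text (hypothetical type)."""
    item_id: int
    extent_offset: int  # absolute offset in bytes
    extent_length: int  # size in bytes


def read_extent(file_bytes: bytes, extent: ItemExtent) -> bytes:
    """Return the bytes covered by the extent, or raise if it falls outside the file."""
    if extent.extent_offset < 0 or extent.extent_length < 0:
        raise ValueError("offset and length must be non-negative")
    end = extent.extent_offset + extent.extent_length
    if end > len(file_bytes):
        raise ValueError("extent lies outside the file")
    return file_bytes[extent.extent_offset:end]
```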
In 7.2, add the following text before 7.2.1:
For stereoscopic content of the Left/Right view sequence type, the ‘stss’ box in the track carrying
the primary view sequence is used for random access.
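A minimal sketch of the lookup this implies, assuming the sync sample numbers have already been read from the ‘stss’ box of the primary view track as an increasing list of 1-based sample numbers. The function name nearest_sync_sample is hypothetical, and how the secondary view track is aligned afterwards is outside the scope of the sketch.

```python
from bisect import bisect_right
from typing import List, Optional


def nearest_sync_sample(sync_samples: List[int], target_sample: int) -> Optional[int]:
    """Return the last sync sample number that is <= target_sample, or None.

    sync_samples: increasing 1-based sample numbers taken from the 'stss' box
    of the primary view track (assumed to be parsed elsewhere).
    """
    idx = bisect_right(sync_samples, target_sample)
    return sync_samples[idx - 1] if idx > 0 else None
```

For example, with sync samples 1, 31 and 61, a seek to sample 45 would start decoding at sample 31.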
In 7.2.2, replace text with:
This subclause describes the file structures for stereo-monoscopic mixed content, which is a video
sequence consisting of both stereoscopic and monoscopic samples in a single track. The stereoscopic
and monoscopic samples should be stored sequentially.
Figure 9 shows an example of a file structure containing a single track for stereo-monoscopic mixed
content, based on the file format structure shown in Figure 7. The item_ID under the ‘iloc’ box is
assigned to each group of stereoscopic samples sequentially. For example, when stereoscopic content
is composed as illustrated in the figure below (S-M-S), the item_ID of the first group of samples in
the track, which is the first group of stereoscopic samples, is set to 1, and the item_ID of the third
group (the second group of stereoscopic samples) in the track is set to 2.
Figure 9 — Example of a file structure for stereoscopic and monoscopic samples in a single stereoscopic track
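The numbering rule for the S-M-S example can be sketched as follows. This is an illustration only: the function name assign_item_ids and the representation of a track as a list of booleans (one per group of successive samples) are assumptions, not part of the specification.

```python
from typing import List, Optional


def assign_item_ids(groups_are_stereo: List[bool]) -> List[Optional[int]]:
    """Give each stereoscopic group the next item_ID, starting at 1.

    groups_are_stereo[i] is True when the i-th group of successive samples
    is stereoscopic. Monoscopic groups get None (no item_ID in this sketch).
    """
    item_ids: List[Optional[int]] = []
    next_id = 1
    for is_stereo in groups_are_stereo:
        if is_stereo:
            item_ids.append(next_id)
            next_id += 1
        else:
            item_ids.append(None)
    return item_ids
```

For the S-M-S layout, assign_item_ids([True, False, True]) yields item_ID 1 for the first group and item_ID 2 for the third group, matching the example in the text.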
Figure 10 describes the file structure of stereoscopic content as specified in 5.3.4, the composition
type that stores the left and right view sequences of stereoscopic content in two separate tracks.
The stereoscopic samples of each track carry one view sequence, based on the file format structure
shown in Figure 8. The item_ID is assigned sequentially to the stereoscopic samples of only one track.
Figure 10 — Example of a file structure for stereoscopic and monoscopic samples in Left/Right
view sequence type
In the case of stereo-monoscopic mixed content as shown in Figure 10, monoscopic samples in the
individual tracks can have the same time stamp. This presentation ambiguity can be resolved
as follows:
a) Check which track carries the primary view sequence by means of the ‘reference_type’ and
‘track_ID’ of the ‘tref’ box in the track.
b) Display each monoscopic sample of the primary view sequence.
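Step a) can be sketched as below. The sketch rests on two assumptions that this amendment text does not state: that the track carrying the ‘tref’ entry is the secondary view track and the track_ID it references is the primary view track, and that the reference type is 'svdp'. The Track class and find_primary_track_id are hypothetical names, and the reference type is left as a parameter so a different value can be supplied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Track:
    track_id: int
    # reference_type -> referenced track_IDs, as read from the track's 'tref' box
    tref: Dict[str, List[int]] = field(default_factory=dict)


def find_primary_track_id(tracks: List[Track],
                          reference_type: str = "svdp") -> Optional[int]:
    """Return the track_ID referenced through 'tref', assumed to be the primary view.

    Returns None when no track carries a matching reference.
    """
    known_ids = {t.track_id for t in tracks}
    for track in tracks:
        for referenced_id in track.tref.get(reference_type, []):
            if referenced_id in known_ids and referenced_id != track.track_id:
                return referenced_id
    return None
```

A player following step b) would then render monoscopic samples only from the track whose track_ID is returned.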
Insert following clauses after 7.2.2:
7.3 File format brands
7.3.1 The ‘ss01’ and ‘ss02’ brands
The brands ‘ss01’ and ‘ss02’ shall be used to indicate that the file conforms to the ‘stereoscopic
video application format’ as specified in subclauses 7.1 and 7.2 and in Clause 8. If all the samples in the
content are stereoscopic samples, ‘ss01’ is used. If the content is a mixture of stereoscopic samples and
monoscopic samples, ‘ss02’ is used.
The ‘ss01’ and ‘ss02’ brands require support of the boxes in Table 2.
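The choice between the two brands can be sketched as a small helper. The function name select_brand and the per-sample boolean input are assumptions made for illustration; content with no stereoscopic samples is not covered by either brand in the text above, so the sketch rejects it.

```python
from typing import Iterable


def select_brand(stereo_flags: Iterable[bool]) -> str:
    """Return 'ss01' for all-stereoscopic content and 'ss02' for mixed content.

    stereo_flags holds one boolean per sample (True = stereoscopic).
    """
    flags = list(stereo_flags)
    if not any(flags):
        raise ValueError("content has no stereoscopic samples")
    return "ss01" if all(flags) else "ss02"
```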
7.3.2 The ‘ss03’ brand
The brand ‘ss03’ shall be used if the grouping_types for stereoscopic composition type and camera display
information in Clause 8 are used.
The ‘ss03’ brand requires support of the ‘iso2’ brand. In addition, support of the following boxes is
required:
sbgp sample-to-group
sgpd sample group description
Remove text from 8.1.
Remove text from 8.2.
Remove text from 8.3.
In 8.4, replace the whole clause with:
8.1 Stereoscopic Video Media Information Box
8.1.1 Definition
Box Type : ‘svmi’
Container: Sample Table Box (‘stbl’)
Mandatory: Yes
Quantity: Exactly one
The ‘svmi’ box provides stereoscopic video media information regarding the stereoscopic visual
type and also, in the case of mixed content, information on the stereoscopic or monoscopic samples.
The visual type information signals the composition type of the stereoscopic video sequence and the
structure of samples. The stereoscopic samples or monoscopic samples information represents the
number of successive samples, the number of consecutive samples, and whether the current sample is
stereoscopic or not.
8.1.2 Syntax
aligned(8) class StereoscopicVideoMediaInformationBox extends FullBox(‘svmi’, version = 0, 0){
    // stereoscopic visual type information
    unsigned int(8) stereoscopic_composition_type;
    unsigned int(7) reserved = 0;
    unsigned int(1) is_left_first;
    // stereo_mono_change information
    unsigned int(32) stereo_mono_change_count;
    for(i=0; i<=stereo_mono_change_count; i++){
        unsigned int(32) sample_count;
        unsigned int(7) reserved = 0;
        unsigned int(1) stereo_flag;
    }
}
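The layout above can be read with a short parser such as the following sketch. It assumes the input starts at the FullBox version and flags (that is, after the box size and type), and it follows the loop bound exactly as printed (i <= stereo_mono_change_count), so it reads stereo_mono_change_count + 1 entries. The names SvmiBox and parse_svmi_payload are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SvmiBox:
    stereoscopic_composition_type: int
    is_left_first: bool
    runs: List[Tuple[int, bool]]  # (sample_count, stereo_flag) per entry


def parse_svmi_payload(payload: bytes) -> SvmiBox:
    """Parse an 'svmi' payload that begins at the FullBox version/flags field."""
    if len(payload) < 10:
        raise ValueError("payload too short for the fixed 'svmi' fields")
    if payload[0] != 0:
        raise ValueError("only version 0 is described by the syntax")
    # Skip 4 bytes of version + flags, then read type, packed flag byte, count.
    composition_type, packed, change_count = struct.unpack_from(">BBI", payload, 4)
    entry_count = change_count + 1  # loop runs while i <= stereo_mono_change_count
    if len(payload) < 10 + 5 * entry_count:
        raise ValueError("payload too short for the declared entries")
    runs = []
    offset = 10
    for _ in range(entry_count):
        sample_count, flag_byte = struct.unpack_from(">IB", payload, offset)
        runs.append((sample_count, bool(flag_byte & 0x01)))  # low bit = stereo_flag
        offset += 5
    return SvmiBox(composition_type, bool(packed & 0x01), runs)
```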
8.1.3 Semantics
stereoscopic_composition_type – the type of stereoscopic content, as specified in Table 4.
Table 4 — Stereoscopic composition type
Value Stereoscopic_composition_type
0x00 Side-by-side (half) type
0x01 Vertical line interleaved type
0x02 Frame sequential type
0x03 Left/Right view sequence type
0x04 Top-Bottom (half) type
0x05 Side-by-side (full) type
0x06 Top-Bottom (full) type
0x07-0xFF Reserved
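For implementers, the values in Table 4 map naturally onto an enumeration. The sketch below is one possible representation; the class and member names are illustrative choices, and reserved values (0x07 to 0xFF) are reported as None rather than raising.

```python
from enum import IntEnum
from typing import Optional


class StereoscopicCompositionType(IntEnum):
    SIDE_BY_SIDE_HALF = 0x00
    VERTICAL_LINE_INTERLEAVED = 0x01
    FRAME_SEQUENTIAL = 0x02
    LEFT_RIGHT_VIEW_SEQUENCE = 0x03
    TOP_BOTTOM_HALF = 0x04
    SIDE_BY_SIDE_FULL = 0x05
    TOP_BOTTOM_FULL = 0x06


def decode_composition_type(value: int) -> Optional[StereoscopicCompositionType]:
    """Map an 8-bit field value to a Table 4 entry; reserved values give None."""
    if not 0 <= value <= 0xFF:
        raise ValueError("stereoscopic_composition_type is an 8-bit field")
    try:
        return StereoscopicCompositionType(value)
    except ValueError:
        return None  # 0x07-0xFF: reserved
```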
is_left_first – represents the positions of the left and right view sequences for 3D mobile devices, as
specified in Table 5. When is_left_first is ‘1’ and the current stereoscopic video is of the
side-by-side type, the left side and right side of the image are the left view and right view, respectively.
When is_left_first is ‘0’, the left side and right side are the right view and left view, respectively.
When is_left_first is ‘1’ and the current stereoscopic video is of the vertical line interleaved type,
the odd lines and even lines of the image are the left view and right view, respectively. When
is_left_first is ‘0’, the odd lines and even lines are the right view and left view, respectively.
When is_left_first is ‘1’ and the current stereoscopic video is of the frame sequential type, the odd
frames and even frames of the sequence are the left view and right view, respectively. When
is_left_first is ‘0’, the odd frames and even frames are the right view and left view, respectively.
When is_left_first is ‘1’ and the current stereoscopic video is of the Left/Right view sequence type,
primary view sequence and secondary view se
