Information technology — Coding of audio-visual objects — Part 12: ISO base media file format — Amendment 2: Carriage of timed text and other visual overlays

Technologies de l'information — Codage des objets audiovisuels — Partie 12: Format ISO de base pour les fichiers médias — Amendement 2: Transport de texte temporisé et autres recouvrements visuels

General Information

Status: Withdrawn
Publication Date: 12-Jan-2014
Withdrawal Date: 12-Jan-2014
Current Stage: 9599 - Withdrawal of International Standard
Completion Date: 25-Nov-2015
Standard
ISO/IEC 14496-12:2012/Amd 2:2014 - Carriage of timed text and other visual overlays
English language
7 pages

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 14496-12
Fourth edition 2012-07-15
Corrected version 2012-09-15
AMENDMENT 2 2014-01-15
Information technology — Coding of audio-visual objects —
Part 12: ISO base media file format
AMENDMENT 2: Carriage of timed text and other visual overlays
Technologies de l’information — Codage des objets audiovisuels —
Partie 12: Format ISO de base pour les fichiers médias
AMENDEMENT 2: Transport de texte temporisé et autres recouvrements visuels
Reference number ISO/IEC 14496-12:2012/Amd.2:2014(E)
© ISO/IEC 2014


COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2014
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting.
Publication as an International Standard requires approval by at least 75 % of the national bodies
casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 2 to ISO/IEC 14496-12:2012 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Information technology — Coding of audio-visual
objects —
Part 12:
ISO base media file format
AMENDMENT 2: Carriage of timed text and other visual
overlays
In subclause 6.2.3, Table 1, add a new row for sthd as follows (the other rows of the table are provided here to show the position but are unchanged):
minf   *             media information container
vmhd                 video media header, overall information (video track only)
smhd                 sound media header, overall information (sound track only)
sthd       8.4.5.6   subtitle media header, overall information (subtitle track only)
hmhd                 hint media header, overall information (hint track only)
nmhd                 Null media header, overall information (some tracks only)
In section 8.4.3.1, replace
This box within a Media Box declares the process by which the media-data in the track is presented, and
thus, the nature of the media in a track. For example, a video track would be handled by a video handler.
with
This box within a Media Box declares the media type of the track, and thus the process by which the
media-data in the track is presented. For example, a format for which the decoder delivers video would be
stored in a video track, identified by being handled by a video handler. The documentation of the storage
of a media format identifies the media type which that format uses.
In section 8.4.3.1, replace
There is a general handler for metadata streams of any type; the specific format is identified by the
sample entry, as for video or audio, for example. If they are in text, then a MIME format is supplied to
document their format; if in XML, each sample is a complete XML document, and the namespace of the
XML is also supplied.
with
There is a general handler for metadata streams of any type; the specific format is identified by the
sample entry, as for video or audio, for example.
and add the following before the final Notes
The timed text media type indicates that the associated decoder will process only text data. The subtitle
media type indicates that the associated decoder will process text data and possibly images.
In 8.4.3.3, add the following lines to the list of handler_types:

‘text’ Timed text track
‘subt’ Subtitle track
Add to 8.4.5.1, before the end
Which type of media header is used is determined by the media handler:
— video track            VideoMediaHeaderBox
— audio track            SoundMediaHeaderBox
— timed metadata track   NullMediaHeaderBox
— timed text track       NullMediaHeaderBox
— subtitle track         SubtitleMediaHeaderBox
— hint tracks            HintMediaHeaderBox
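The handler-to-header rule above can be sketched as a simple lookup. This is only an illustration, not part of the amendment: the table name and function name are hypothetical, the handler codes 'vide', 'soun', 'meta' and 'hint' are assumed from the base standard, and 'text' and 'subt' are the two codes this amendment adds. The fallback to the null header is an assumption based on the wording of 8.4.5.5.

```python
# Hypothetical lookup table: handler_type code -> media header box class name.
# 'vide', 'soun', 'meta' and 'hint' are assumed from the base standard;
# 'text' and 'subt' are the handler types added by this amendment.
MEDIA_HEADER_BY_HANDLER = {
    "vide": "VideoMediaHeaderBox",
    "soun": "SoundMediaHeaderBox",
    "meta": "NullMediaHeaderBox",
    "text": "NullMediaHeaderBox",
    "subt": "SubtitleMediaHeaderBox",
    "hint": "HintMediaHeaderBox",
}


def media_header_for(handler_type: str) -> str:
    """Return the media header class name expected for a handler_type.

    Unknown handler types fall back to NullMediaHeaderBox, on the assumption
    that streams with no specific media header use the null Media Header Box.
    """
    return MEDIA_HEADER_BY_HANDLER.get(handler_type, "NullMediaHeaderBox")
```

A reader or validator could call `media_header_for("subt")` when checking that the media information container of a subtitle track carries the expected header.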
Change subclause 8.4.5.5 as follows:
Streams for which no specific media header is identified use a null Media Header Box, as defined here.
Add a new subclause 8.4.5.6 as follows:
8.4.5.6 Subtitle Media Header Box
The subtitle media header contains general presentation information, independent of the coding, for
subtitle media. This header is used for all tracks containing subtitles.
8.4.5.6.1 Syntax
aligned(8) class SubtitleMediaHeaderBox
  extends FullBox (‘sthd’, version = 0, flags = 0){
}

8.4.5.6.2 Semantics
version is an integer that specifies the version of this box.
flags is a 24-bit integer with flags for future use (currently all zero).
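Because the box has no fields of its own, a serialized 'sthd' box consists only of the box header and the FullBox version and flags. The sketch below assumes the compact header layout of the base standard (32-bit big-endian size, four-character type, 8-bit version, 24-bit flags); the function names are hypothetical and the code is illustrative only.

```python
import struct


def build_sthd() -> bytes:
    """Serialize an empty SubtitleMediaHeaderBox (version 0, flags 0)."""
    version = 0
    flags = 0
    # FullBox payload: 8-bit version followed by 24-bit flags, packed as one 32-bit word.
    payload = struct.pack(">I", (version << 24) | flags)
    # Box header: 32-bit size (covering the header itself) and the four-character type.
    size = 8 + len(payload)
    return struct.pack(">I4s", size, b"sthd") + payload


def parse_full_box_header(data: bytes):
    """Return (size, type, version, flags) from the first 12 bytes of a FullBox."""
    size, box_type = struct.unpack(">I4s", data[:8])
    (version_flags,) = struct.unpack(">I", data[8:12])
    return size, box_type, version_flags >> 24, version_flags & 0xFFFFFF
```

Under these assumptions the whole box is 12 bytes long, and `parse_full_box_header(build_sthd())` gives back the size, the type `b"sthd"`, and zero for both version and flags.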
In 8.5.2.1, replace the paragraph
For video tracks, a VisualSampleEntry is used, for audio tracks, an AudioSampleEntry and for metadata
tracks, a MetaDataSampleEntry. Hint tracks use an entry format specific to their protocol, with an
appropriate name.
with
Which type of sample entry form is used is determined by the media handler:
— video track            VisualSampleEntry
— audio track            AudioSampleEntry
— timed metadata track   MetaDataSampleEntry
— timed text track       PlainTextSampleEntry
— subtitle track         SubtitleSampleEntry
— hint tracks            an entry format specific to their protocol, with an appropriate name.
In 8.5.2.1 replace the paragraph
The samplerate, samplesize and channelcount fields document the default audio output playback format
for this media. The timescale for an audio track should be chosen to match the sampling rate, or be an
integer multiple of it, to enable sample-accurate timing. ChannelCount is a value greater than zero
that indicates the maximum number of channels that the audio could deliver. A ChannelCount of 1
indicates mono audio, and 2 indicates stereo (left/right). When values greater than 2 are used, the codec
configuration should identify the channel assignment.
with
The samplerate, samplesize and channelcount fields document the default audio output playback format
for this media. The timescale for an audio track should be chosen to match the sampling rate, or be an
integer multiple of it, to enable sample-accurate timing. ChannelCount is a value greater than zero
that indicates the maximum number of channels that the audio could deliver. A ChannelCount of 1
indicates mono audio, and 2 indicates stereo (left/right). When values greater than 2 are used, the codec
configuration should identify the channel assignment.
When it is desired to indicate an audio sampling rate greater than the value that can be represented in
the samplerate field, the following may be used:
— an AudioSampleEntryV1 is used, which requires that the enclosing Sample Description Box also
take the version 1;
— a Sampling Rate box may be present only in an AudioSampleEntryV1, and when present, it
overrides the samplerate field and documents the actual sampling rate;
— when the Sampling Rate box is present, the media timescale should be the same as the sampling
rate, or an integer division or multiple of it;
— the samplerate field in the sample entry should contain a value left-shifted 16 bits (as for
AudioSampleEntry) that matches the media timescale, or be an integer division or multiple of it.
An AudioSampleEntryV1 should only be used when needed; otherwise, for maximum compatibility, an
AudioSampleEntry should be used. An AudioSampleEntryV1 must not occur in a SampleDescriptionBox
with version set to 0.
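As a worked illustration of the rules above, the following sketch picks the sample entry kind, the samplerate field value and the media timescale for a given sampling rate. It assumes the samplerate field is 32 bits wide, so that the left-shifted value can carry at most 65535 Hz. The function name and the choice of the smallest integer divisor are hypothetical; other divisors or multiples would equally satisfy the text.

```python
# Largest whole number of Hz that fits in the upper 16 bits of an
# (assumed) 32-bit samplerate field.
MAX_FIELD_RATE = 0xFFFF


def plan_sample_rate(actual_rate: int):
    """Return (entry_kind, samplerate_field, media_timescale) for an audio track."""
    if actual_rate <= 0:
        raise ValueError("sampling rate must be positive")
    if actual_rate <= MAX_FIELD_RATE:
        # The rate fits, so a plain AudioSampleEntry is preferred for compatibility.
        return ("AudioSampleEntry", actual_rate << 16, actual_rate)
    # The rate does not fit: use AudioSampleEntryV1 (with a Sampling Rate box
    # carrying actual_rate) and store an integer division of the rate in the field.
    for divisor in range(2, actual_rate + 1):
        reduced = actual_rate // divisor
        if actual_rate % divisor == 0 and reduced <= MAX_FIELD_RATE:
            return ("AudioSampleEntryV1", reduced << 16, actual_rate)
```

For example, a 48000 Hz track stays with AudioSampleEntry, while a 96000 Hz track would use AudioSampleEntryV1 with a media timescale of 96000 and 48000 left-shifted 16 bits in the samplerate field.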
A TextSubtitleSampleEntry, TextMetaDataSampleEntry, or SimpleTextSampleEntry, all of which contain
a MIME type, may be used to identify the format of streams for which a MIME type applies. A MIME
type applies if the contents of a set of samples, starting with a sync sample and ending at the sample
immediately preceding a sync sample, are concatenated in their entirety, and the result meets the
decoding requirements for documents of that MIME type. Non-sync samples should be used only if that
format specifies the behaviour of ‘progressive decoding’, and then the sample times indicate when the
results of such progressive decoding should be presented (according to the media type).
NOTE The samples in a track that is all sync samples are therefore each a valid document for that MIME type.
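The concatenation rule for MIME-typed streams can be sketched as follows. The sample representation, a sequence of (is_sync, data) pairs, and the function name are hypothetical and chosen only for illustration; validating each resulting document against its MIME type is outside the scope of the sketch.

```python
from typing import Iterable, List, Tuple


def documents_between_sync_samples(
    samples: Iterable[Tuple[bool, bytes]]
) -> List[bytes]:
    """Concatenate each sync sample with the non-sync samples that follow it.

    Each returned item is the byte string that would have to meet the
    decoding requirements for documents of the declared MIME type.
    """
    documents: List[bytes] = []
    current = None  # stays None until the first sync sample is seen
    for is_sync, data in samples:
        if is_sync:
            if current is not None:
                documents.append(bytes(current))
            current = bytearray(data)
        elif current is not None:
            current.extend(data)
        # Non-sync samples before the first sync sample are skipped here,
        # since no document starts with them (an assumption of this sketch).
    if current is not None:
        documents.append(bytes(current))
    return documents
```

When every sample is a sync sample, the function returns one document per sample, which matches the NOTE above.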
In 8.5.2.2 add the subt
...
