ISO/IEC 14496-1:2001
INTERNATIONAL STANDARD ISO/IEC 14496-1
Second edition
2001-10-01
Information technology — Coding of audio-visual objects — Part 1: Systems
Technologies de l'information — Codage des objets audiovisuels — Partie 1: Systèmes
Reference number ISO/IEC 14496-1:2001(E)
© ISO/IEC 2001
© ISO/IEC 2001
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body
in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland
Contents
0 Introduction
0.1 Overview
0.2 Architecture
0.3 Terminal Model: Systems Decoder Model
0.4 Multiplexing of Streams: The Delivery Layer
0.5 Synchronization of Streams: The Sync Layer
0.6 The Compression Layer
0.7 Application Engine
1 Scope
2 Normative references
3 Additional reference
4 Terms and definitions
5 Abbreviations and Symbols
6 Conventions
7 Systems Decoder Model
7.1 Introduction
7.2 Concepts of the systems decoder model
7.3 Timing Model Specification
7.4 Buffer Model Specification
8 Object Description Framework
8.1 Introduction
8.2 Common data structures
8.3 Intellectual Property Management and Protection (IPMP)
8.4 Object Content Information (OCI)
8.5 Object Descriptor Stream
8.6 Object Descriptor Components
8.7 Rules for Usage of the Object Description Framework
8.8 Usage of the IPMP System interface
9 Scene Description
9.1 Introduction
9.2 Concepts
9.3 BIFS Syntax
9.4 Node Semantics
10 Synchronization of Elementary Streams
10.1 Introduction
10.2 Sync Layer
10.3 DMIF Application Interface
11 MPEG-J
11.1 Introduction
11.2 Architecture
11.3 MPEG-J Session
11.4 Delivery of MPEG-J Data
11.5 MPEG-J API List
12 Multiplexing of Elementary Streams
12.1 Introduction
12.2 FlexMux Tool
13 File Format
13.1 Introduction
13.2 File organization
13.3 Extensibility
14 Syntactic Description Language
14.1 Introduction
14.2 Elementary Data Types
14.3 Composite Data Types
14.4 Arithmetic and Logical Expressions
14.5 Non-Parsable Variables
14.6 Syntactic Flow Control
14.7 Built-In Operators
14.8 Scoping Rules
15 Profiles
15.1 Introduction
15.2 OD Profile Definitions
15.3 Scene Graph Profile Definitions
15.4 Graphics Profile Definitions
15.5 MPEG-J Profile Definitions
15.6 MPEG-J Profiles Tools
15.7 MPEG-J Profiles
15.8 MPEG-J Profiles@Levels
Annex A (informative) Bibliography
Annex B (informative) Time Base Reconstruction
B.1 Time Base Reconstruction
B.2 Temporal aliasing and audio resampling
B.3 Reconstruction of a Synchronised Audio-visual Scene: A Walkthrough
Annex C (normative) View Dependent Object Scalability
C.1 Introduction
C.2 Bitstream Syntax
C.3 Bitstream Semantics
Annex D (informative) Registration procedure
D.1 Procedure for the request of a Registration ID (RID)
D.2 Responsibilities of the Registration Authority
D.3 Contact information for the Registration Authority
D.4 Responsibilities of Parties Requesting a RID
D.5 Appeal Procedure for Denied Applications
D.6 Registration Application Form
Annex E (informative) The QoS Management Model for ISO/IEC 14496 Content
Annex F (informative) Conversion Between Time and Date Conventions
Annex G (normative) Adaptive Arithmetic Decoder for BIFS-Anim
Annex H (normative) Node coding tables
H.1 Node Tables
H.2 Node Definition Type Tables
H.3 Node Tables for Extended Nodes
H.4 Node Definition Type Tables for extended node types
Annex I (informative) MPEG-4 Audio TTS application with Facial Animation
Annex J (informative) Graphical representation of object descriptor and sync layer syntax
J.1 Length encoding of descriptors and commands
J.2 Object Descriptor Stream and OD commands
J.3 IPMP stream
J.4 OCI stream
J.5 Object descriptor and its components
J.6 OCI Descriptors
J.7 Sync layer configuration and syntax
Annex K (informative) Patent statements
K.1 Patent Statements for Version 1
K.2 Patent Statements for Version 2
Annex L (informative) Elementary Stream Interface
Annex M (Informative) Definition of bodySceneGraph nodes
M.1 Introduction
M.2 Detailed Semantics
M.3 Overview
M.4 The Nodes
Annex N (Informative) Implementation of MaterialKey node
Annex O (Informative) Example implementation of spatial audio processing (perceptual approach)
O.1 Example algorithm implementation
O.2 Elementary spectral corrector
O.3 Input Filter
O.4 Direct path
O.5 Directional early reflections
O.6 Diffuse late reverberation
O.7 Setting the delays
O.8 Scalability
Annex P (informative) Upstream walkthrough
P.1 Introduction
P.2 Configuration
P.3 Content access procedure with DAI
P.4 Example
Annex Q (Informative) Layout of Media Data
Annex R (Informative) Random Access
Annex S (Informative) Starting the Java Virtual Machine
Annex T (Informative) Examples of MPEG-J API usage
T.1 Scene APIs
T.2 Resource and Decoder APIs
T.3 Net APIs
T.4 Section Filtering APIs
Annex U (Normative) MPEG-J APIs Listing (HTML)
Annex V (Normative) MPEG-J APIs Listing
V.1 package org.iso.mpeg.mpegj
V.2 package org.iso.mpeg.mpegj.resource
V.3 package org.iso.mpeg.mpegj.decoder
V.4 package org.iso.mpeg.mpegj.net
V.5 package org.iso.mpeg.mpegj.scene
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission)
form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical committees established by the
respective organization to deal with particular fields of technical activity. ISO and IEC technical committees
collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in
liaison with ISO and IEC, also take part in the work.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting.
Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this part of ISO/IEC 14496 may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
International Standard ISO/IEC 14496-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information
technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This second edition cancels and replaces the first edition (ISO/IEC 14496-1:1999), which has been technically
revised.
ISO/IEC 14496 consists of the following parts, under the general title Information technology — Coding of audio-
visual objects:
— Part 1: Systems
— Part 2: Visual
— Part 3: Audio
— Part 4: Conformance testing
— Part 5: Reference software
— Part 6: Delivery Multimedia Integration Framework (DMIF)
— Part 7: Optimized software for MPEG-4 visual tools
Annexes C, G, H, U and V form a normative part of this part of ISO/IEC 14496. Annexes A, B, D, E, F and I to T are
for information only.
0 Introduction
0.1 Overview
ISO/IEC 14496 specifies a system for the communication of interactive audio-visual scenes. This specification
includes the following elements:
1. the coded representation of natural or synthetic, two-dimensional (2D) or three-dimensional (3D) objects that
can be manifested audibly and/or visually (audio-visual objects) (specified in parts 1, 2 and 3 of ISO/IEC 14496);
2. the coded representation of the spatio-temporal positioning of audio-visual objects as well as their behavior in
response to interaction (scene description, specified in this part of ISO/IEC 14496);
3. the coded representation of information related to the management of data streams (synchronization,
identification, description and association of stream content, specified in this part of ISO/IEC 14496);
4. a generic interface to the data stream delivery layer functionality (specified in part 6 of ISO/IEC 14496);
5. an application engine for programmatic control of the player, covering the format and delivery of downloadable
Java byte code as well as its execution lifecycle and behavior through APIs (specified in this part of ISO/IEC 14496); and
6. a file format to contain the media information of an ISO/IEC 14496 presentation in a flexible, extensible format to
facilitate interchange, management, editing, and presentation of the media.
The overall operation of a system communicating audio-visual scenes can be paraphrased as follows:
At the sending terminal, the audio-visual scene information is compressed, supplemented with synchronization
information and passed to a delivery layer that multiplexes it into one or more coded binary streams that are
transmitted or stored. At the receiving terminal, these streams are demultiplexed and decompressed. The audio-
visual objects are composed according to the scene description and synchronization information and presented to
the end user. The end user may have the option to interact with this presentation. Interaction information can be
processed locally or transmitted back to the sending terminal. ISO/IEC 14496 defines the syntax and semantics of
the bitstreams that convey such scene information, as well as the details of their decoding processes.
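The send and receive path paraphrased above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the packet header layout, the use of zlib as a stand-in for media compression, and the function names `send`, `receive` and `compose` are hypothetical and do not reflect the sync layer or FlexMux syntax defined in ISO/IEC 14496.

```python
import struct
import zlib

# Toy packet header: stream id (1 byte), timestamp (4 bytes), payload length (4 bytes).
# This layout is invented for illustration; it is NOT the SL or FlexMux syntax.
HEADER = struct.Struct(">BII")


def send(streams):
    """Compress each unit, attach its timestamp, and interleave into one byte stream."""
    units = []
    for stream_id, timed_units in streams.items():
        for timestamp, data in timed_units:
            units.append((timestamp, stream_id, zlib.compress(data)))
    units.sort()  # interleave by timestamp, then by stream id
    out = bytearray()
    for timestamp, stream_id, payload in units:
        out += HEADER.pack(stream_id, timestamp, len(payload)) + payload
    return bytes(out)


def receive(muxed):
    """Demultiplex and decompress, returning units grouped by stream id."""
    streams = {}
    offset = 0
    while offset < len(muxed):
        stream_id, timestamp, length = HEADER.unpack_from(muxed, offset)
        offset += HEADER.size
        payload = muxed[offset:offset + length]
        offset += length
        streams.setdefault(stream_id, []).append((timestamp, zlib.decompress(payload)))
    return streams


def compose(streams):
    """Order all decoded units by timestamp, as a stand-in for scene composition."""
    timeline = [(ts, sid, data) for sid, units in streams.items() for ts, data in units]
    return sorted(timeline)


if __name__ == "__main__":
    source = {1: [(0, b"scene"), (40, b"scene-update")], 2: [(0, b"audio-0"), (20, b"audio-1")]}
    for timestamp, stream_id, data in compose(receive(send(source))):
        print(timestamp, stream_id, data)
```

The sketch keeps the three concerns separate (compression, timing, multiplexing) only to mirror the paragraph; a real terminal follows the normative decoding processes rather than this toy framing.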
This part of ISO/IEC 14496 specifies the following tools:
• a terminal model for time and buffer management;
• a coded representation of interactive audio-visual scene description information (Binary Format for Scenes –
BIFS);
• a coded representation of metadata for the identification, description and logical dependencies of the
elementary streams (object descriptors and other descriptors);
• a coded representation of descriptive audio-visual content information (object content information – OCI);
• an interface to intellectual property management and protection (IPMP) systems;
• a coded representation of synchronization information (sync layer – SL);
• a multiplexed representation of individual elementary streams in a single stream (FlexMux); and
• an application engine (MPEG-Java – MPEG-J).
These various elements are described functionally in this subclause and specified in the normative clauses that
follow.
0.2 Architecture
The information representation specified in ISO/IEC 14496-1 describes the means to create an interactive audio-
visual scene in terms of coded audio-visual information and associated scene description information. The entity
that composes and sends, or receives and presents such a coded representation of an interactive audio-visual
scene is generically referred to as an "audio-visual terminal" or just "terminal". This terminal may correspond to a
standalone application or be part of an application system.
[Figure: layered terminal diagram. Layers from top to bottom: Display and User Interaction; Interactive Audiovisual Scene; Composition and Rendering; Compression Layer (Scene Description Information, Object Descriptor, AV Object data, Upstream Information); Elementary Streams, exposed at the Elementary Stream Interface; Sync Layer (SL), producing SL-Packetized Streams; DMIF Application Interface; Delivery Layer (FlexMux boxes and transport labels MPEG-2 TS (PES), UDP/IP (RTP), AAL2/ATM, H223/PSTN, DAB Mux); Multiplexed Streams; Transmission/Storage Medium.]
Figure 1 - The ISO/IEC 14496 terminal architecture
The basic operations performed by such a receiver terminal are as follows. Information that allows access to
content complying with ISO/IEC 14496 is provided to the terminal as initial session set up information. Part 6 of
ISO/IEC 14496 defines the procedures for establishing such session contexts as well as the interface to the
delivery layer that generically abstracts the storage or transport medium. The initial set up information allows the
terminal to locate, in a recursive manner, one or more elementary streams that are part of the coded content
representation. Some of these elementary streams may be grouped together using the multiplexing tool described
in ISO/IEC 14496-1.
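One way to picture locating elementary streams "in a recursive manner" is a traversal in which each description may name streams directly and may also point to further descriptions. The Python sketch below is only an interpretation under that assumption: the dictionary layout, the keys `streams` and `descriptors`, and the function `locate_streams` are hypothetical and are not the object descriptor syntax specified in this part of ISO/IEC 14496.

```python
def locate_streams(descriptor_id, descriptors, seen=None):
    """Return the ids of all elementary streams reachable from one descriptor.

    `descriptors` is a hypothetical table mapping a descriptor id to a dict with
    optional "streams" (stream ids) and "descriptors" (ids of further descriptors).
    """
    if seen is None:
        seen = set()
    if descriptor_id in seen:  # guard against cyclic references
        return []
    seen.add(descriptor_id)
    entry = descriptors[descriptor_id]
    found = list(entry.get("streams", []))
    for child_id in entry.get("descriptors", []):
        found.extend(locate_streams(child_id, descriptors, seen))
    return found


if __name__ == "__main__":
    table = {
        "initial": {"streams": ["scene", "od"], "descriptors": ["video_obj", "audio_obj"]},
        "video_obj": {"streams": ["video_base"], "descriptors": ["video_enh"]},
        "video_enh": {"streams": ["video_enhancement"]},
        "audio_obj": {"streams": ["audio"]},
    }
    print(locate_streams("initial", table))
```

The `seen` set is a defensive choice of this sketch so that a malformed table with circular references still terminates.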
...