Information technology - Coding of audio-visual objects - Part 1: Systems

Technologies de l'information — Codage des objets audiovisuels — Partie 1: Systèmes

General Information

Status
Withdrawn
Publication Date
10-Oct-2001
Withdrawal Date
10-Oct-2001
Current Stage
9599 - Withdrawal of International Standard
Start Date
05-Dec-2005
Completion Date
30-Oct-2025

Relations

Standard
ISO/IEC 14496-1:2001 - Information technology -- Coding of audio-visual objects
English language
670 pages

Frequently Asked Questions

ISO/IEC 14496-1:2001 is a standard published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its full title is "Information technology - Coding of audio-visual objects - Part 1: Systems".

ISO/IEC 14496-1:2001 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 14496-1:2001 has the following relationships with other standards: it has inter-standard links to ISO 11979-4:2008/Amd 1:2012, ISO/IEC 14496-1:2001/FDAM 2, ISO/IEC 14496-1:2001/Amd 3:2004, ISO/IEC 14496-1:2001/Amd 4:2003, ISO/IEC 14496-1:2001/Amd 8:2004, ISO/IEC 14496-1:2001/Amd 1:2001, ISO/IEC 14496-1:2001/Amd 7:2004, ISO/IEC 14496-1:2004, ISO/IEC 14496-1:1999 and ISO/IEC 14496-11:2005; and it is amended by ISO/IEC 14496-1:2001/Amd 1:2001, ISO/IEC 14496-1:2001/Amd 3:2004, ISO/IEC 14496-1:2001/Amd 8:2004, ISO/IEC 14496-1:2001/Amd 4:2003 and ISO/IEC 14496-1:2001/FDAM 2. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

You can purchase ISO/IEC 14496-1:2001 directly from iTeh Standards. The document is available in PDF format and is delivered instantly after payment. Add the standard to your cart and complete the secure checkout process. iTeh Standards is an authorized distributor of ISO standards.

Standards Content (Sample)


INTERNATIONAL STANDARD ISO/IEC 14496-1
Second edition
2001-10-01
Information technology — Coding of
audio-visual objects —
Part 1:
Systems
Technologies de l'information — Codage des objets audiovisuels —
Partie 1: Systèmes
Reference number
ISO/IEC 14496-1:2001(E)
© ISO/IEC 2001
PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not
be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this
file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this
area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters
were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event
that a problem relating to it is found, please inform the Central Secretariat at the address given below.
© ISO/IEC 2001
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body
in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland

Contents

0 Introduction
0.1 Overview
0.2 Architecture
0.3 Terminal Model: Systems Decoder Model
0.4 Multiplexing of Streams: The Delivery Layer
0.5 Synchronization of Streams: The Sync Layer
0.6 The Compression Layer
0.7 Application Engine
1 Scope
2 Normative references
3 Additional reference
4 Terms and definitions
5 Abbreviations and Symbols
6 Conventions
7 Systems Decoder Model
7.1 Introduction
7.2 Concepts of the systems decoder model
7.3 Timing Model Specification
7.4 Buffer Model Specification
8 Object Description Framework
8.1 Introduction
8.2 Common data structures
8.3 Intellectual Property Management and Protection (IPMP)
8.4 Object Content Information (OCI)
8.5 Object Descriptor Stream
8.6 Object Descriptor Components
8.7 Rules for Usage of the Object Description Framework
8.8 Usage of the IPMP System interface
9 Scene Description
9.1 Introduction
9.2 Concepts
9.3 BIFS Syntax
9.4 Node Semantics
10 Synchronization of Elementary Streams
10.1 Introduction
10.2 Sync Layer
10.3 DMIF Application Interface
11 MPEG-J
11.1 Introduction
11.2 Architecture
11.3 MPEG-J Session
11.4 Delivery of MPEG-J Data
11.5 MPEG-J API List
12 Multiplexing of Elementary Streams
12.1 Introduction
12.2 FlexMux Tool
13 File Format
13.1 Introduction
13.2 File organization
13.3 Extensibility
14 Syntactic Description Language
14.1 Introduction
14.2 Elementary Data Types
14.3 Composite Data Types
14.4 Arithmetic and Logical Expressions
14.5 Non-Parsable Variables
14.6 Syntactic Flow Control
14.7 Built-In Operators
14.8 Scoping Rules
15 Profiles
15.1 Introduction
15.2 OD Profile Definitions
15.3 Scene Graph Profile Definitions
15.4 Graphics Profile Definitions
15.5 MPEG-J Profile Definitions
15.6 MPEG-J Profiles Tools
15.7 MPEG-J Profiles
15.8 MPEG-J Profiles@Levels
Annex A (informative) Bibliography
Annex B (informative) Time Base Reconstruction
B.1 Time Base Reconstruction
B.2 Temporal aliasing and audio resampling
B.3 Reconstruction of a Synchronised Audio-visual Scene: A Walkthrough
Annex C (normative) View Dependent Object Scalability
C.1 Introduction
C.2 Bitstream Syntax
C.3 Bitstream Semantics
Annex D (informative) Registration procedure
D.1 Procedure for the request of a Registration ID (RID)
D.2 Responsibilities of the Registration Authority
D.3 Contact information for the Registration Authority
D.4 Responsibilities of Parties Requesting a RID
D.5 Appeal Procedure for Denied Applications
D.6 Registration Application Form
Annex E (informative) The QoS Management Model for ISO/IEC 14496 Content
Annex F (informative) Conversion Between Time and Date Conventions
Annex G (normative) Adaptive Arithmetic Decoder for BIFS-Anim
Annex H (normative) Node coding tables
H.1 Node Tables
H.2 Node Definition Type Tables
H.3 Node Tables for Extended Nodes
H.4 Node Definition Type Tables for extended node types
Annex I (informative) MPEG-4 Audio TTS application with Facial Animation
Annex J (informative) Graphical representation of object descriptor and sync layer syntax
J.1 Length encoding of descriptors and commands
J.2 Object Descriptor Stream and OD commands
J.3 IPMP stream
J.4 OCI stream
J.5 Object descriptor and its components
J.6 OCI Descriptors
J.7 Sync layer configuration and syntax
Annex K (informative) Patent statements
K.1 Patent Statements for Version 1
K.2 Patent Statements for Version 2
Annex L (informative) Elementary Stream Interface
Annex M (informative) Definition of bodySceneGraph nodes
M.1 Introduction
M.2 Detailed Semantics
M.3 Overview
M.4 The Nodes
Annex N (informative) Implementation of MaterialKey node
Annex O (informative) Example implementation of spatial audio processing (perceptual approach)
O.1 Example algorithm implementation
O.2 Elementary spectral corrector
O.3 Input Filter
O.4 Direct path
O.5 Directional early reflections
O.6 Diffuse late reverberation
O.7 Setting the delays
O.8 Scalability
Annex P (informative) Upstream walkthrough
P.1 Introduction
P.2 Configuration
P.3 Content access procedure with DAI
P.4 Example
Annex Q (informative) Layout of Media Data
Annex R (informative) Random Access
Annex S (informative) Starting the Java Virtual Machine
Annex T (informative) Examples of MPEG-J API usage
T.1 Scene APIs
T.2 Resource and Decoder APIs
T.3 Net APIs
T.4 Section Filtering APIs
Annex U (normative) MPEG-J APIs Listing (HTML)
Annex V (normative) MPEG-J APIs Listing
V.1 package org.iso.mpeg.mpegj
V.2 package org.iso.mpeg.mpegj.resource
V.3 package org.iso.mpeg.mpegj.decoder
V.4 package org.iso.mpeg.mpegj.net
V.5 package org.iso.mpeg.mpegj.scene

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission)
form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical committees established by the
respective organization to deal with particular fields of technical activity. ISO and IEC technical committees
collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in
liaison with ISO and IEC, also take part in the work.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting.
Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this part of ISO/IEC 14496 may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
International Standard ISO/IEC 14496-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information
technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This second edition cancels and replaces the first edition (ISO/IEC 14496-1:1999), which has been technically
revised.
ISO/IEC 14496 consists of the following parts, under the general title Information technology — Coding of audio-
visual objects:
— Part 1: Systems
— Part 2: Visual
— Part 3: Audio
— Part 4: Conformance testing
— Part 5: Reference software
— Part 6: Delivery Multimedia Integration Framework (DMIF)
— Part 7: Optimized software for MPEG-4 visual tools
Annexes C, G, H, U and V form a normative part of this part of ISO/IEC 14496. Annexes A, B, D, E, F and I to T are
for information only.

0 Introduction
0.1 Overview
ISO/IEC 14496 specifies a system for the communication of interactive audio-visual scenes. This specification
includes the following elements:
1. the coded representation of natural or synthetic, two-dimensional (2D) or three-dimensional (3D) objects that
can be manifested audibly and/or visually (audio-visual objects) (specified in parts 1, 2 and 3 of ISO/IEC 14496);
2. the coded representation of the spatio-temporal positioning of audio-visual objects as well as their behavior in
response to interaction (scene description, specified in this part of ISO/IEC 14496);
3. the coded representation of information related to the management of data streams (synchronization,
identification, description and association of stream content, specified in this part of ISO/IEC 14496);
4. a generic interface to the data stream delivery layer functionality (specified in part 6 of ISO/IEC 14496);
5. an application engine for programmatic control of the player: the format and delivery of downloadable Java byte code as
well as its execution lifecycle and behavior through APIs (specified in this part of ISO/IEC 14496); and
6. a file format to contain the media information of an ISO/IEC 14496 presentation in a flexible, extensible format to
facilitate interchange, management, editing, and presentation of the media.
The overall operation of a system communicating audio-visual scenes can be paraphrased as follows:
At the sending terminal, the audio-visual scene information is compressed, supplemented with synchronization
information and passed to a delivery layer that multiplexes it into one or more coded binary streams that are
transmitted or stored. At the receiving terminal, these streams are demultiplexed and decompressed. The audio-
visual objects are composed according to the scene description and synchronization information and presented to
the end user. The end user may have the option to interact with this presentation. Interaction information can be
processed locally or transmitted back to the sending terminal. ISO/IEC 14496 defines the syntax and semantics of
the bitstreams that convey such scene information, as well as the details of their decoding processes.
This part of ISO/IEC 14496 specifies the following tools:
— a terminal model for time and buffer management;
— a coded representation of interactive audio-visual scene description information (Binary Format for Scenes –
BIFS);
— a coded representation of metadata for the identification, description and logical dependencies of the
elementary streams (object descriptors and other descriptors);
— a coded representation of descriptive audio-visual content information (object content information – OCI);
— an interface to intellectual property management and protection (IPMP) systems;
— a coded representation of synchronization information (sync layer – SL);
— a multiplexed representation of individual elementary streams in a single stream (FlexMux); and
— an application engine (MPEG-Java – MPEG-J).
These various elements are described functionally in this subclause and specified in the normative clauses that
follow.
0.2 Architecture
The information representation specified in ISO/IEC 14496-1 describes the means to create an interactive audio-
visual scene in terms of coded audio-visual information and associated scene description information. The entity
that composes and sends, or receives and presents such a coded representation of an interactive audio-visual
scene is generically referred to as an "audio-visual terminal" or just "terminal". This terminal may correspond to a
standalone application or be part of an application system.
[Figure 1 is a diagram of the terminal architecture. It shows, from top to bottom: display and user interaction with the interactive audio-visual scene; composition and rendering; the compression layer, where scene description, AV object data and object descriptor streams cross the elementary stream interface; the sync layer, producing SL-packetized streams across the DMIF Application Interface; and the delivery layer, where FlexMux and transport stacks such as MPEG-2 TS/PES, RTP/UDP/IP, AAL2/ATM, H223/PSTN and DAB carry the multiplexed streams over the transmission/storage medium.]
Figure 1 - The ISO/IEC 14496 terminal architecture
The basic operations performed by such a receiver terminal are as follows. Information that allows access to
content complying with ISO/IEC 14496 is provided as initial session set up information to the terminal. Part 6 of
ISO/IEC 14496 defines the procedures for establishing such session contexts as well as the interface to the
delivery layer that generically abstracts the storage or transport medium. The initial set up information makes it
possible, in a recursive manner, to locate one or more elementary streams that are part of the coded content representation.
Some of these elementary streams may be grouped together using the multiplexing tool described in ISO/IEC
14496-1.

Elementary streams contain the coded representation of either audio or visual data or scene description
information. Elementary streams may as well themselves convey information to identify streams, to describe logical
dependencies between streams, or to describe information related to the content of the streams. Each elementary
stream contains only one type of data.
Elementary streams are decoded using their respective stream-specific decoders. The audio-visual objects are
composed according to the scene description information and presented by the terminal’s presentation device(s).
All these processes are synchronized according to the systems decoder model (SDM) using the synchronization
information provided at the synchronization layer.
These basic operations are depicted in Figure 1, and are described in more detail below.
0.3 Terminal Model: Systems Decoder Model
The systems decoder model provides an abstract view of the behavior of a terminal complying with
ISO/IEC 14496-1. Its purpose is to enable a sending terminal to predict how the receiving terminal will behave in
terms of buffer management and synchronization when reconstructing the audio-visual information that comprises
the presentation. The systems decoder model includes a systems timing model and a systems buffer model which
are described briefly in the following subclauses.
0.3.1 Timing Model
The timing model defines the mechanisms through which a receiving terminal establishes a notion of time that
enables it to process time-dependent events. This model also allows the receiving terminal to establish
mechanisms to maintain synchronization both across and within particular audio-visual objects as well as with user
interaction events. In order to facilitate these functions at the receiving terminal, the timing model requires that the
transmitted data streams contain implicit or explicit timing information. Two sets of timing information are defined in
ISO/IEC 14496-1: clock references and time stamps. The former convey the sending terminal's time base to the
receiving terminal, while the latter convey a notion of relative time for specific events such as the desired decoding
or composition time for portions of the encoded audio-visual information.
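As an illustration only, the following Java sketch shows one way a receiving terminal might recover the sending terminal's time base from clock references and test whether the decoding time stamp of an access unit has been reached. All class and method names here are illustrative, not part of this specification, and a real terminal would additionally smooth clock drift (see Annex B).

    // Illustrative only: recover the sender's time base from clock references
    // and check decoding time stamps against it. Names are not from the standard.
    public final class TimeBase {
        private final long resolutionHz;   // ticks per second of the sender's clock
        private long lastOcrTicks;         // most recent clock reference value
        private long lastArrivalNanos;     // local wall-clock time of its arrival

        public TimeBase(long resolutionHz) { this.resolutionHz = resolutionHz; }

        // Record a clock reference carried in the incoming stream.
        public void onClockReference(long ocrTicks, long arrivalNanos) {
            lastOcrTicks = ocrTicks;
            lastArrivalNanos = arrivalNanos;
        }

        // Estimate the sender's current time base value, in ticks.
        public long nowTicks(long localNanos) {
            long elapsedNanos = localNanos - lastArrivalNanos;
            return lastOcrTicks + elapsedNanos * resolutionHz / 1_000_000_000L;
        }

        // True once an access unit with this decoding time stamp is due.
        public boolean isDue(long dtsTicks, long localNanos) {
            return nowTicks(localNanos) >= dtsTicks;
        }
    }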
0.3.2 Buffer Model
The buffer model enables the sending terminal to monitor and control the buffer resources that are needed to
decode each elementary stream in a presentation. The required buffer resources are conveyed to the receiving
terminal by means of descriptors at the beginning of the presentation. The terminal can then decide whether or not
it is capable of handling this particular presentation. The buffer model allows the sending terminal to specify when
information may be removed from these buffers and enables it to schedule data transmission so that the
appropriate buffers at the receiving terminal do not overflow or underflow.
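A minimal sketch of the bookkeeping this model implies is given below, assuming a single fixed-size decoding buffer per elementary stream whose capacity was declared in a descriptor at setup; the Java names are illustrative only.

    // Illustrative only: occupancy bookkeeping for one decoding buffer whose
    // capacity was declared by the sender in a descriptor at setup time.
    public final class DecodingBuffer {
        private final int capacityBytes;
        private int occupancyBytes;

        public DecodingBuffer(int capacityBytes) { this.capacityBytes = capacityBytes; }

        // Stream data arrives from the delivery layer; false signals overflow,
        // i.e. data was sent faster than the declared buffer can absorb.
        public boolean enqueue(int bytes) {
            if (occupancyBytes + bytes > capacityBytes) return false;
            occupancyBytes += bytes;
            return true;
        }

        // An access unit is removed at its decoding time; false signals underflow,
        // i.e. the data needed at this instant has not yet arrived.
        public boolean removeAccessUnit(int bytes) {
            if (bytes > occupancyBytes) return false;
            occupancyBytes -= bytes;
            return true;
        }
    }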
0.4 Multiplexing of Streams: The Delivery Layer
The term delivery layer is used as a generic abstraction of any existing transport protocol stack that may be used to
transmit and/or store content complying with ISO/IEC 14496. The functionality of this layer is not within the scope of
ISO/IEC 14496-1, and only the interface to this layer is considered. This interface is the DMIF Application Interface
(DAI) specified in ISO/IEC 14496-6. The DAI defines not only an interface for the delivery of streaming data, but
also for signaling information required for session and channel set up as well as tear down. A wide variety of
delivery mechanisms exist below this interface, with some of them indicated in Figure 1. These mechanisms serve
for transmission as well as storage of streaming data, i.e., a file is considered to be a particular instance of a
delivery layer. For applications where the desired transport facility does not fully address the needs of a service
according to the specifications in ISO/IEC 14496, a simple multiplexing tool (FlexMux) with low delay and low
overhead is defined in ISO/IEC 14496-1.
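For illustration, the sketch below parses FlexMux packets in the tool's simple mode, in which each packet carries a one-byte channel index followed by a one-byte payload length; MuxCode mode and error handling are omitted, and the names are illustrative (the normative definition is in clause 12).

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    // Illustrative only: parse FlexMux packets in simple mode, where each packet
    // is a one-byte channel index, a one-byte payload length, then the payload.
    public final class FlexMuxDemo {
        public static void main(String[] args) throws IOException {
            byte[] stream = {0x02, 0x03, 'a', 'b', 'c', 0x05, 0x01, 'z'};
            InputStream in = new ByteArrayInputStream(stream);
            int index;
            while ((index = in.read()) != -1) {        // channel index (8 bits)
                int length = in.read();                // payload length (8 bits)
                byte[] payload = in.readNBytes(length);
                System.out.printf("channel=%d, %d payload bytes%n", index, payload.length);
            }
        }
    }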
0.5 Synchronization of Streams: The Sync Layer
Elementary streams are the basic abstraction for any streaming data source. Elementary streams are conveyed as
sync layer-packetized (SL-packetized) streams at the DMIF Application Interface. This packetized representation
additionally provides timing and synchronization information, as well as fragmentation and random access
information. The sync layer (SL) extracts this timing information to enable synchronized decoding and,
subsequently, composition of the elementary stream data.
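The SL packet header layout is configurable per elementary stream and is defined normatively in clause 10, so no fixed parser is shown here. Purely as an illustration of the kinds of information the sync layer conveys, an SL packet can be modeled as follows (a hypothetical Java shape, not the normative syntax):

    import java.util.OptionalLong;

    // Illustrative only: the kinds of information an SL packet can convey.
    // The real header layout is configured per stream (see clause 10).
    public record SLPacket(
            int sequenceNumber,                 // continuity check / loss detection
            boolean accessUnitStart,            // fragmentation: first fragment of an AU
            boolean randomAccessPoint,          // random access information
            OptionalLong decodingTimeStamp,     // when the access unit is to be decoded
            OptionalLong compositionTimeStamp,  // when the decoded unit is composed
            byte[] payload) {}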
x © ISO/IEC 2001 – All rights reserved

0.6 The Compression Layer
The compression layer receives data in its encoded format and performs the necessary operations to decode this
data. The decoded information is then used by the terminal’s composition, rendering and presentation subsystems.
0.6.1 Object Description Framework
The purpose of the object description framework is to identify and describe elementary streams and to associate
them appropriately to an audio-visual scene description. Object descriptors serve to gain access to ISO/IEC 14496
content. Object content information and the interface to intellectual property management and protection systems
are also part of this framework.
An object descriptor is a collection of one or more elementary stream descriptors that provide the configuration and
other information for the streams that relate to either an audio-visual object or a scene description. Object
descriptors are themselves conveyed in elementary streams. Each object descriptor is assigned an identifier
(object descriptor ID), which is unique within a defined name scope. This identifier is used to associate audio-visual
objects in the scene description with a particular object descriptor, and thus the elementary streams related to that
particular object.
Elementary stream descriptors include information about the source of the stream data, in the form of a unique numeric
identifier (the elementary stream ID) or a URL pointing to a remote source for the stream. Elementary stream
descriptors also include information about the encoding format, configuration information for the decoding process
and the sync layer packetization, as well as quality of service requirements for the transmission of the stream and
intellectual property identification. Dependencies between streams can also be signaled within the elementary
stream descriptors. This functionality may be used, for example, in scalable audio or visual object representations
to indicate the logical dependency of a stream containing enhancement information on a stream containing the
base information. It can also be used to describe alternative representations for the same content (e.g. the same
speech content in various languages).
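To make the relationship concrete, the sketch below models it as plain Java records: an object descriptor, identified by its object descriptor ID, aggregates elementary stream descriptors, each of which identifies its stream by a numeric ID or a URL and can point at a stream it depends on. The field names are illustrative; the normative syntax is specified in clause 8.

    import java.util.List;
    import java.util.Optional;
    import java.util.OptionalInt;

    // Illustrative only: the data model sketched by the prose above.
    record ESDescriptor(
            int esId,                        // unique numeric stream identifier
            Optional<String> url,            // alternative: remote source of the stream
            OptionalInt dependsOnEsId,       // logical dependency, e.g. an enhancement
                                             // stream depending on a base stream
            byte[] decoderConfig) {}         // encoding format and decoder setup

    record ObjectDescriptor(
            int objectDescriptorId,          // unique within a name scope; the scene
                                             // description refers to this identifier
            List<ESDescriptor> esDescriptors) {}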
0.6.1.1 Intellectual Property Management and Protection
The intellectual property management and protection (IPMP) framework for ISO/IEC 14496 content consists of a
normative interface that permits an ISO/IEC 14496 terminal to host one or more IPMP Systems. The IPMP
interface consists of IPMP elementary streams and IPMP descriptors. IPMP descriptors are carried as part of an
object descriptor stream. IPMP elementary streams carry time-variant IPMP information that can be associated with
multiple object descriptors.
The IPMP System itself is a non-normative component that provides intellectual property management and
protection functions for the terminal. The IPMP System uses the information carried by the IPMP elementary
streams and descriptors to make protected ISO/IEC 14496 content available to the terminal. An application may
choose not to use an IPMP System, thereby offering no management and protection features.
0.6.1.2 Object Content Information
Object content information (OCI) descriptors convey descriptive information about audio-visual objects. The main
content descriptors are: content classification descriptors, keyword descriptors, rating descriptors, language
descriptors, textual descriptors, and descriptors about the creation of the content. OCI descriptors can be included
directly in the related object descriptor or elementary stream descriptor or, if the information is time-variant, carried in
an elementary stream of its own. An OCI stream is organized as a sequence of small, synchronized entities called
events that contain a set of OCI descriptors. OCI streams can be associated with multiple object descriptors.
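As a rough illustration of this organization (a hypothetical Java shape, not the normative syntax of 8.4), an OCI event might be modeled as follows:

    import java.util.List;

    // Illustrative only: an OCI stream as a sequence of timed events, each
    // bundling a set of OCI descriptors. Field names are assumptions.
    record OciEvent(int eventId, long startingTime, long duration, List<byte[]> ociDescriptors) {}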
0.6.2 Scene Description Streams
Scene description addresses the organization of audio-visual objects in a scene, in terms of both spatial and
temporal attributes. This information allows the composition and rendering of individual audio-visual objects after
the respective decoders have reconstructed the streaming data for them. For visual data, ISO/IEC 14496-1 does
not mandate particular composition algorithms. Hence, visual composition is implementation dependent. For audio
data, the composition process is defined in a normative manner in 9.2.2.13 and ISO/IEC 14496-3.
The scene description is represented using a parametric approach (BIFS - Binary Format for Scenes). The
description consists of an encoded hierarchy (tree) of nodes with attributes and other information (including event
sources and targets). Leaf nodes in this tree correspond to elementary audio-visual data, whereas intermediate
nodes group this material to form audio-visual objects, and perform grouping, transformation, and other such
operations on audio-visual objects (scene description nodes). The scene description can evolve over time by using
scene description updates.
In order to facilitate active user involvement with the presented audio-visual information, ISO/IEC 14496-1 provides
support for user and object interactions. Interactivity mechanisms are integrated with the scene description
information, in the form of linked event sources and targets (routes) as well as sensors (special nodes that can
trigger events based on specific conditions). These event sources and targets are part of scene description nodes,
and thus allow close coupling of dynamic and interactive behavior with the specific scene at hand. ISO/IEC 14496-
1, however, does not specify a particular user interface or a mechanism that maps user actions (e.g., keyboard key
presses or mouse movements) to such events.
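To make the structure concrete, the sketch below models a scene as a tree of typed nodes and a route that links an event source field of one node to a target field of another, which is how the interactivity mechanism couples sensors to scene behavior. This is only an illustration of the concepts; the normative BIFS representation is binary and is specified in clause 9.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative only: a scene as a tree of typed nodes, plus a route that
    // links an event source field of one node to a target field of another.
    class SceneNode {
        final String type;                             // e.g. "Transform2D", "TouchSensor"
        final List<SceneNode> children = new ArrayList<>();
        SceneNode(String type) { this.type = type; }
    }

    record Route(SceneNode source, String eventOut, SceneNode target, String eventIn) {}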
Such an interactive environment may not need an upstream channel, but ISO/IEC 14496 also provides means for
client-server interactive sessions with the ability to set up upstream elementary streams and associate them to
specific downstream elementary streams.
0.6.3 Audio-visual Streams
The coded representations of audio and visual information are described in ISO/IEC 14496-3 and ISO/IEC 14496-
2, respectively. The reconstructed audio-visual data are made available to the composition process for potential
use during the scene rendering.
0.6.4 Upchannel Streams
Downchannel elementary streams may require upchannel information to be transmitted from the receiving terminal
to the sending terminal (e.g., to allow for client-server interactivity). Figure 1 indicates the flowpath for an
elementary stream from the receiving terminal to the sending terminal. The content of upchannel streams is
specified in the same part of the specification that defines the content of the downstream data. For example,
upchannel control streams for video downchannel elementary streams are defined in ISO/IEC 14496-2.
0.7 Application Engine
MPEG-J is a programmatic system (as opposed to a conventional parametric system) that specifies APIs for the
interoperation of MPEG-4 media players with Java code. By combining MPEG-4 media and safe executable code,
content creators may embed complex control and data processing mechanisms with their media data to intelligently
manage the operation of the audio-visual session. The parametric MPEG-4 System forms the Presentation Engine
while the MPEG-J subsystem controlling the Presentation Engine forms the Application Engine.
The Java application is delivered as a separate elementary stream to the MPEG-4 terminal. There it is directed
to the MPEG-J runtime environment, from which the MPEG-J program has access to the various components
and required data of the MPEG-4 player in order to control it.
In addition to the basic packages of the language (java.lang, java.io, java.util), a few categories of APIs have been
defined for different scopes. For the Scene graph API, the objective is to provide access to the scene graph: to inspect
the graph, to alter nodes and their fields, and to add and remove nodes within the graph. The Resource API is used
for regulation of performance: it provides a centralized facility for managing resources. It is used when
program execution is contingent upon the terminal configuration and its capabilities, both static (those that do not change
during execution) and dynamic. The Decoder API allows the control of the decoders that are present in the terminal.
The Net API provides a way to interact with the network, in compliance with the MPEG-4 DMIF Application
Interface. Complex applications and enhanced interactivity are possible with these basic packages. The
architecture of MPEG-J is presented in more detail in clause 11.
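The normative MPEG-J interfaces are those listed in Annex V (package org.iso.mpeg.mpegj and its subpackages). The hypothetical Java interface below is not one of them; it merely illustrates the lifecycle idea: byte code delivered in its own elementary stream is handed to the MPEG-J runtime, initialized, run with access to the player's scene, resource, decoder and network facilities, and eventually stopped.

    // Hypothetical lifecycle interface, for illustration only; the normative
    // MPEG-J entry points and APIs are those listed in Annex V.
    interface DownloadedApplication {
        void init();   // byte code has been loaded from its elementary stream
        void run();    // may inspect/alter the scene, query resources,
                       // control decoders, and use the network facilities
        void stop();   // the terminal reclaims the application's resources
    }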

INTERNATIONAL STANDARD ISO/IEC 14496-1:2001(E)
Information technology — Coding of audio-visual objects —
Part 1:
Systems
1 Scope
This part of ISO/IEC 14496 specifies system level functionalities for the communication of interactive audio-visual
scenes. More specifically:
1. system level description of the coded representation of natural or synthetic, two-dimensional (2D) or three-
dimensional (3D) objects that can be manifested audibly and/or visually (audio-visual objects);
2. the coded representation of the spatio-temporal positioning of audio-visual objects as well as their behavior in
response to interaction (scene description);
3. the coded representation of information related to the management of data streams (synchronization,
identification, description and association of stream content);
4. a system level description of an application engine (format, delivery, lifecycle, and behavior of downloadable
Java byte code applications); and
5. a system level interchange and storage format for interactive audio-visual scenes.
2 Normative references
The following normative documents contain provisions which, through reference in this text, constitute provisions of
this part of ISO/IEC 14496. For dated references, subsequent amendments to, or revisions of, any of these
publications do not apply. However, parties to agreements based on this part of ISO/IEC 14496 are encouraged to
investigate the poss
...

Questions, Comments and Discussion

Ask a question and the Technical Secretary will try to provide an answer. You can also use this space to discuss the standard.
