Information technology - MPEG systems technologies - Part 1: Binary MPEG format for XML

ISO/IEC 23001-1:2006 provides a standardized set of generic technologies for encoding XML documents. It addresses a broad spectrum of applications and requirements by providing generic methods for transmitting and compressing XML documents. It gives rules for the preparation of XML documents for efficient transport and storage, and enables the development of ISO/IEC 23001-1 terminals that receive, decode and assemble possibly partitioned and compressed XML documents.

The binary MPEG format for XML relies on schema knowledge shared between encoder and decoder in order to reach high compression efficiency, while providing fragmentation mechanisms that ensure transmission and processing flexibility. ISO/IEC 23001-1:2006 also defines means to compile and transmit schema knowledge information, so that compressed XML documents can be decoded without a priori schema knowledge at the receiving terminal.

The binary MPEG format for XML is described in four main sections:

  • System Architecture presents the architecture of an ISO/IEC 23001-1-compliant terminal and general characteristics of an ISO/IEC 23001-1 decoder, such as decoder behaviour.
  • Binary Format specifies the binary syntax and associated semantics of the structural elements. In particular, this section describes the structure of a binary access unit.
  • Binary Fragment Update Payload specifies the binary syntax and associated semantics of the payload content. In particular, this section describes the decoding process of complexType content using finite state automaton decoders.
  • Advanced Optimised Decoders describes the mechanisms for decoding simple types of an XML document using advanced optimised decoders.

The binary format for XML described in this specification can be used for encoding MPEG-7 and MPEG-21 descriptions, as specified in ISO/IEC 15938-1 (MPEG-7 Systems) and ISO/IEC 21000-16 (MPEG-21 Binary Format), respectively.

Technologies de l'information — Technologies des systèmes MPEG — Partie 1: Format binaire de MPEG pour XML

General Information

Status: Published
Publication Date: 23-Mar-2006
Current Stage: 9060 - Close of review
Completion Date: 02-Sep-2027

Overview

ISO/IEC 23001-1:2006 - Binary MPEG format for XML (BiM) - specifies a standardized set of technologies for encoding, compressing and transporting XML documents in an efficient binary form. The standard defines system-level behavior for ISO/IEC 23001-1 compliant terminals (decoders), rules for preparing XML for optimized transport and storage, fragmentation and reassembly mechanisms, and methods to share or compile schema knowledge so compressed XML can be decoded even when the receiver lacks the original schema.

Key topics and technical requirements

  • System architecture and decoder behavior: normative definitions of an ISO/IEC 23001-1 terminal, initialization sequence, and decoder processing expectations.
  • Binary format (BiM): binary syntax and semantics for structural elements, including the concept of a Binary Access Unit and units for fragment, schema and command handling.
  • Fragmentation and transport: mechanisms for partitioning XML documents into fragment/update units to support flexible transmission and partial processing.
  • Schema-based compression: leveraging shared schema knowledge between encoder and decoder to achieve high compression efficiency; also provisions for transmitting schema information when receivers lack a priori schema.
  • Fragment update payloads and decoding: binary payload syntax and semantics for element content and complex type decoding (finite-state automaton approach).
  • Advanced optimized decoders: optional decoder mechanisms for efficient decoding of simple types (e.g., quantizers, zlib-based approaches listed in the standard) to improve performance and compression.
  • Normative references: aligns with XML, XML Schema, XPath, RFCs (URI syntax), ZLIB format and character-set standards.
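To illustrate why shared schema knowledge helps compression, the following is a minimal sketch and not the coding defined in Clauses 6 and 7 of the standard. It assumes that encoder and decoder both know, from the schema, the ordered list of elements allowed at some point in the document, so the encoder can send a short index instead of a textual tag. The function names and the element list are hypothetical.

```python
def bits_for_choice(num_alternatives: int) -> int:
    """Number of bits needed to index one of `num_alternatives` known options."""
    if num_alternatives < 1:
        raise ValueError("need at least one alternative")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float error.
    return (num_alternatives - 1).bit_length()


def encode_choice(name: str, alternatives: list) -> str:
    """Encode `name` as a fixed-width binary index into the shared list."""
    width = bits_for_choice(len(alternatives))
    index = alternatives.index(name)
    return format(index, "0{}b".format(width)) if width else ""


def decode_choice(bits: str, alternatives: list) -> str:
    """Recover the element name from its binary index."""
    return alternatives[int(bits, 2)] if bits else alternatives[0]


# Hypothetical list of elements the schema allows at this position.
ALLOWED = ["Title", "Creator", "Date", "Rights", "Keyword"]

code = encode_choice("Creator", ALLOWED)
print(code, decode_choice(code, ALLOWED))
```

With five alternatives the index takes 3 bits, whereas the textual start tag `<Creator>` alone takes 9 bytes. A receiver without the same list cannot interpret the index, which is why the standard also provides a way to transmit schema information.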

Applications and who uses it

ISO/IEC 23001-1:2006 is targeted at implementers and architects who need compact, transport-friendly representations of XML in multimedia and metadata systems:

  • Multimedia systems and streaming services using XML metadata (e.g., MPEG-7) to reduce bandwidth and storage.
  • Device and terminal manufacturers building decoders that receive compressed XML over networks or broadcast.
  • Developers of metadata frameworks (MPEG-7, MPEG-21) needing standardized binary XML encodings.
  • Broadcast, mobile and low-bandwidth applications where efficient encoding, fragmentation and partial updates of XML are required.
  • Standards implementers and integrators requiring deterministic decoder behavior and interoperable binary XML formats.

Related standards

  • ISO/IEC 15938-1 (MPEG-7 Systems) - for MPEG-7 metadata usage.
  • ISO/IEC 21000-16 (MPEG-21 Binary Format) - for MPEG-21 usage.
  • W3C XML and XML Schema (normative references), RFC 2396 (URI), RFC 1950 (ZLIB).

Note: Implementers should review Annex C of the standard for patent statements and licensing considerations when deploying BiM-based solutions.

Standard
ISO/IEC 23001-1:2006 - Information technology -- MPEG systems technologies
English language
133 pages

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 23001-1
First edition
2006-04-01
Information technology — MPEG systems
technologies —
Part 1:
Binary MPEG format for XML
Technologies de l'information — Technologies des systèmes MPEG —
Partie 1: Format binaire de MPEG pour XML

Reference number: ISO/IEC 23001-1:2006(E)
© ISO/IEC 2006

©  ISO/IEC 2006
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Conventions
3.2 Definitions
4 Symbols and abbreviated terms
4.1 Abbreviations
4.2 Mathematical operators
4.3 Mnemonics
5 System architecture
5.1 Terminal architecture
5.2 General characteristics of the decoder
5.3 Sequence of events during decoder initialisation
5.4 Decoder behaviour
5.5 Issues in encoding documents
5.6 Characteristics of the delivery layer
5.7 Decoding of Fragment References
6 Binary format - BiM
6.1 Overview
6.2 Binary DecoderInit
6.3 Binary Access Unit
6.4 Binary Fragment Update Unit
6.5 Binary Fragment Update Command
6.6 Binary Fragment Update Context
6.7 Binary Schema Update Unit
7 Binary Fragment Update Payload
7.1 Overview
7.2 Definitions
7.3 Fragment Update Payload syntax and semantics
7.4 Element syntax and semantics
7.5 Element Content decoding process
8 Advanced optimised decoders
8.1 Overview
8.2 Decoder behaviour
8.3 Advanced Optimised Decoder Initialization
8.4 Advanced Optimised Decoder Classification scheme
8.5 UniformQuantizer advanced optimised decoder
8.6 NonUniformQuantizer optimized decoder
8.7 Zlib advanced optimised decoder
Annex A (normative) MPEG-7 Specific Simple Type Codecs
Annex B (informative) Informative Examples
Annex C (informative) Patent Statements
Bibliography


Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
ISO/IEC 23001-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
ISO/IEC 23001 consists of the following parts, under the general title Information technology — MPEG
systems technologies:
⎯ Part 1: Binary MPEG format for XML

Introduction
This International Standard provides a standardized set of generic technologies for encoding XML documents.
It addresses a broad spectrum of applications and requirements by providing generic methods for transmitting
and compressing XML documents.
Part 1 – Binary Format for XML: specifies the tools for preparing XML documents for efficient transport and
storage and for compressing XML documents.
The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC)
draw attention to the fact that it is claimed that compliance with this document may involve the use of a patent.
The ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that he is willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect,
the statement of the holder of this patent right is registered with the ISO and IEC. Information may be obtained
from the companies listed in Annex C.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights other than those identified in Annex C. ISO and IEC shall not be held responsible for identifying any or
all such patent rights.

INTERNATIONAL STANDARD ISO/IEC 23001-1:2006(E)

Information technology — MPEG systems technologies —
Part 1:
Binary MPEG format for XML
1 Scope
This part of ISO/IEC 23001 provides a standardized set of technologies for encoding XML documents. It
addresses a broad spectrum of applications and requirements by providing a generic method for transmitting
and compressing XML documents.
This part of ISO/IEC 23001 specifies system level functionalities for the communication of XML documents. It
provides a specification which will:
⎯ enable the development of ISO/IEC 23001-1 receiving sub-systems, called ISO/IEC 23001-1 Terminal, or
Terminal in short, to receive and assemble possibly partitioned and compressed XML documents
⎯ provide rules for the preparation of XML documents for efficient transport and storage.
The decoding process within the ISO/IEC 23001-1 Terminal is normative. The rules mentioned provide
guidance for the preparation and encoding of XML documents without leading to a unique encoded
representation of such documents.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
• ISO/IEC 10646:2003, Information technology — Universal Multiple-Octet Coded Character Set (UCS)
Note: The UTF-8 encoding scheme is described in Annex R of ISO/IEC 10646-1:1993, published as
Amendment 2 of ISO/IEC 10646-1:1993.
• XML, Extensible Markup Language (XML) 1.0, October 2000.
• XML Schema, W3C Recommendation, 2 May 2001.
• XML Schema Part 0: Primer, W3C Recommendation, 2 May 2001.
• XML Schema Part 1: Structures, W3C Recommendation, 2 May 2001.
• XML Schema Part 2: Datatypes, W3C Recommendation, 2 May 2001.
• XPath, XML Path Language, W3C Recommendation, 16 November 1999.
• Namespaces in XML, W3C Recommendation, 14 January 1999.

Note: These documents are maintained by the W3C (http://www.w3.org). The relevant documents can be obtained as follows:
• Extensible Markup Language (XML) 1.0 (Second Edition), 6 October 2000, http://www.w3.org/TR/2000/REC-xml-20001006
• XML Schema, W3C Recommendation, 2 May 2001, http://www.w3.org/XML/Schema
    o XML Schema Part 0: Primer, W3C Recommendation, 2 May 2001, http://www.w3.org/TR/xmlschema-0/
    o XML Schema Part 1: Structures, W3C Recommendation, 2 May 2001, http://www.w3.org/TR/xmlschema-1/
    o XML Schema Part 2: Datatypes, W3C Recommendation, 2 May 2001, http://www.w3.org/TR/xmlschema-2/
• XPath, XML Path Language, W3C Recommendation, 16 November 1999, http://www.w3.org/TR/1999/REC-xpath-19991116
• Namespaces in XML, W3C Recommendation, 14 January 1999, http://www.w3.org/TR/1999/REC-xml-names-19990114
• RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax.
• RFC 1950, ZLIB Compressed Data Format Specification version 3.3.
• IEEE Standard for Binary Floating-Point Arithmetic, Std 754-1985, Reaffirmed 1990,
http://standards.ieee.org/reading/ieee/std_public/description/busarch/754-1985_desc.html
3 Terms and definitions
3.1 Conventions
3.1.1 Naming convention
In order to specify data types and document models, this part of ISO/IEC 23001 uses constructs specified in
XML Schema, such as “element”, “attribute”, “simpleType” and “complexType”. The names associated with
these constructs are created on the basis of the following conventions:
If the name is composed of various words, the first letter of each word is capitalized. The rule for the
capitalization of the first word depends on the type of construct and is described below.
⎯ Element naming: the first letter of the first word is capitalized (e.g. TimePoint element of TimeType).
⎯ Attribute naming: the first letter of the first word is not capitalized (e.g. timeUnit attribute of
IncrDurationType).
⎯ complexType naming: the first letter of the first word is capitalized, the suffix “Type” is used at the end of
the name.
⎯ simpleType naming: the first letter of the first word is not capitalized, the suffix “Type” may be used at
the end of the name.

3.1.2 Documentation convention
3.1.2.1 Textual syntax
The syntax of each XML schema item is specified using the constructs specified in XML Schema. It is
depicted in this document using a specific font and background, as shown in the example below:






Non-normative XML examples are included in separate subclauses. They are depicted in this document using
a different font and background from the normative syntax specifications, as shown in the example below:

example element content

3.1.2.2 Binary syntax
3.1.2.2.1 Overview
The binary document stream retrieved by the decoder is specified in Clause 6, Clause 7, and Clause 8. Each
data item in the binary document stream is printed in bold type. It is described by its name, its length in bits,
and by a mnemonic for its type and order of transmission. The construct “N+” in the length field indicates that
the length of the element is an integer multiple of N.
The action caused by a decoded data element in a bitstream depends on the value of the data element and
on data elements that have been previously decoded. The following constructs are used to express the
conditions when data elements are present:
    while ( condition ) {
        data_element
        . . .
    }
If the condition is true, then the group of data elements occurs next in the data stream. This repeats until the condition is not true.

    do {
        data_element
        . . .
    } while ( condition )
The data element always occurs at least once. The data element is repeated until the condition is not true.

    if ( condition ) {
        data_element
        . . .
    } else {
        data_element
        . . .
    }
If the condition is true, then the first group of data elements occurs next in the data stream. If the condition is not true, then the second group of data elements occurs next in the data stream.

    for ( i = m; i < n; i++) {
        data_element
        . . .
    }
The group of data elements occurs (n-m) times. Conditional constructs within the group of data elements may depend on the value of the loop control variable i, which is set to m for the first occurrence, incremented by one for the second occurrence, and so forth.

    /* comment */
Explanatory comment that may be deleted entirely without in any way altering the syntax.

This syntax uses the 'C-code' convention that a variable or expression evaluating to a non-zero value is
equivalent to a condition that is true and a variable or expression evaluating to a zero value is equivalent to a
condition that is false.
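As a reading aid, the sketch below shows how a parser might follow such conditional constructs. It is not taken from the standard: the syntax layout (a 1-bit flag, a 4-bit count and 8-bit values) is invented for illustration, and the reader assumes that bits are consumed most significant bit first within each byte.

```python
class BitReader:
    """Reads unsigned integers bit by bit from a byte string (MSB first, assumed)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits from the start of the stream

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_example(reader: BitReader) -> list:
    """Hypothetical syntax:
        flag                          1 bit
        if ( flag ) {
            count                     4 bits
            for ( i = 0; i < count; i++ )
                value[i]              8 bits
        }
    """
    values = []
    if reader.read(1):  # a non-zero value is treated as a true condition
        count = reader.read(4)
        for _ in range(count):
            values.append(reader.read(8))
    return values


# flag=1, count=2, values 0xAB and 0xCD, then three trailing 1 bits.
stream = bytes([0x95, 0x5E, 0x6F])
print(parse_example(BitReader(stream)))
```

Like the syntax tables themselves, the sketch assumes an error-free stream; reading past the end of the data raises an IndexError instead of being handled.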
Use of function-like constructs in syntax tables
In some syntax tables, function-like constructs are used in order to pass the value of a certain syntax element
or decoding parameter down to a further syntax table. In that table, the syntax part is then defined like a
function in e.g. C program language, specifying in brackets the type and name of the passed syntax element
or decoding parameter, and the returned syntax element type, as shown in the following example:
Syntax                                            Number of bits    Mnemonic
datatype Function(datatype parameter_name) {
    if (parameter_name == .) {
        OtherFunction(parameter_name)
    } else if .
        .....
    } else {
        .....
    }
    Return return_value
}
Here, the syntax table describing the syntax part called “Function” receives the parameter “parameter_name”
which is of datatype “datatype”. The parameter “parameter_name” is used within this syntax part, and it can
also be passed further to other syntax parts, in the table above e.g. to the syntax part “OtherFunction”.
The parsing of the binary syntax is expressed in procedural terms. However, it should not be assumed that
Clauses 6, 7 and 8 implement a complete decoding procedure. In particular, the binary syntax parsing in this
specification assumes a correct and error-free binary document stream. Handling of erroneous binary
document streams is left to individual implementations.
Syntax elements and data elements are depicted in this document using a specific font such as the following
example: FragmentUpdatePayload.
boolean
In some syntax tables, the “true” and “false” constructs are used. If present in the stream “true” shall be
represented with a single bit of value “1” and “false” shall be represented with a single bit of value “0”.
3.1.2.2.2 Arrays
Arrays of data elements are represented according to the C-syntax as described below. It should be noted
that each index of an array starts with the value “0”.

data_element[n]        is the (n+1)th element of an array of data.
data_element[m][n]     is the (m+1, n+1)th element of a two-dimensional array of data.
data_element[l][m][n]  is the (l+1, m+1, n+1)th element of a three-dimensional array of data.
3.1.2.2.3 Functions
3.1.2.2.3.1 nextByteBoundary()
The function “nextByteBoundary()” reads and consumes bits from the binary document stream until but not
including the next byte-aligned position in the binary document stream.
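A minimal sketch of this behaviour is given below. It tracks only a bit position, and it assumes that no bits are consumed when the position is already byte-aligned; the text above does not state that case explicitly. The class and method names are hypothetical.

```python
class BitCursor:
    """Tracks a bit position inside a binary document stream."""

    def __init__(self, pos: int = 0):
        self.pos = pos  # number of bits already consumed

    def is_byte_aligned(self) -> bool:
        return self.pos % 8 == 0

    def next_byte_boundary(self) -> int:
        """Skip to the next byte-aligned position; return how many bits were skipped."""
        # Assumption: an already aligned cursor skips zero bits.
        skipped = -self.pos % 8
        self.pos += skipped
        return skipped


cursor = BitCursor(pos=21)
print(cursor.next_byte_boundary(), cursor.pos, cursor.is_byte_aligned())
```

Starting from bit position 21, three bits are consumed and the cursor lands on position 24, the first bit of the fourth byte.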
3.1.2.2.4 Reserved values and forbidden values
The terms “reserved” and “forbidden” are used in the description of some values of several code and index
tables.
The term “reserved” indicates that the value shall not occur in a binary document stream. It may be used in
the future for ISO/IEC defined extensions.
The term “forbidden” indicates a value that shall not occur in a binary document stream.
3.1.2.2.5 Reserved bits and stuffing bits
ReservedBits: a binary syntax element whose length is indicated in the syntax table. The value of each bit of
this element shall be “1”. These bits may be used in the future for ISO/IEC defined extensions.
Stuffing bits: bits inserted to align the binary document stream, for example to a byte boundary. The value of
each of these bits in the binary document stream shall be “1”.
ReservedBitsZero: a binary syntax element whose length is indicated in the syntax table. The value of each
bit of this element shall be “0”. These bits may be used in the future for ISO/IEC defined extensions.
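The stuffing-bit rule can be sketched as follows. This is an illustration under simplifying assumptions, not code from the standard: bits are held as a Python string of "0" and "1" characters, and the helper names are hypothetical.

```python
def pad_with_stuffing(bits: str) -> str:
    """Append '1' stuffing bits until the length is a multiple of 8."""
    missing = -len(bits) % 8
    return bits + "1" * missing


def to_bytes(bits: str) -> bytes:
    """Pack a byte-aligned bit string into bytes, first bit as most significant."""
    if len(bits) % 8:
        raise ValueError("bit string is not byte-aligned")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def stuffing_is_valid(bits: str, payload_len: int) -> bool:
    """Check that every bit after the first `payload_len` bits is '1'."""
    return all(b == "1" for b in bits[payload_len:])


payload = "10010"                     # five payload bits
aligned = pad_with_stuffing(payload)  # three stuffing bits are appended
print(aligned, to_bytes(aligned).hex(), stuffing_is_valid(aligned, len(payload)))
```

A ReservedBitsZero field would be checked the same way, comparing against "0" instead of "1".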
3.1.2.3 Textual and binary semantics
The semantics of each schema or binary syntax component, is specified using a table format, where each row
contains the name and a definition of that schema or binary syntax component:
Name          Definition
ExampleType   Specifies an .
element1      Describes the …
attribute1    Describes the …
3.2 Definitions
3.2.1
access unit
An entity within an XML document that is atomic in time, i.e., to which a composition time can be attached. An
access unit is composed of one or more fragment update units.

3.2.2
additional schema
A schema that can be updated after the start of the decoding process.
3.2.3
advanced optimised decoder
An optimised decoder used to decode a simple type. Advanced optimised decoders parameters and their
mappings to types can be modified during binary document stream lifetime.
3.2.4
advanced optimised decoder instance
An advanced optimised decoder initialised and ready to be used for the decoding of some data types.
Note - There can be several instances of the same advanced optimised decoder with different or identical parameters.
3.2.5
advanced optimised decoder instances table
A table of all the advanced optimised decoders available at a certain instant in time.
3.2.6
advanced optimised decoder parameters
The parameters of an advanced optimised decoder.
3.2.7
advanced optimised decoder type
The type, identified by a URI, of an advanced optimised decoder.
3.2.8
application
An abstraction of any entity that makes use of the decoded document stream.
3.2.9
binary access unit
An access unit in binary format as specified in Clauses 6 and 7.
3.2.10
binary document stream
A concatenation of binary access units as specified in Clauses 6 and 7.
3.2.11
binary format document tree
The internal binary decoder model.
3.2.12
byte-aligned
A bit in a binary document stream is byte-aligned if its position is a multiple of 8 bits from the first bit in the
binary document stream.
3.2.13
composition time
The point in time when a specific access unit becomes known to the application.
3.2.14
content particle
A particle is a term in the XML Schema grammar for element content, consisting of an element declaration, a
wildcard or a model group, together with occurrence constraints. Refer to XML Schema.

3.2.15
context mode
Information in the fragment update context specifying how to interpret the subsequent context path information.
3.2.16
context node
The context node is specified by the context path of the current fragment update context. It is the parent of the
operand node.
3.2.17
context path
Information that identifies and locates the context node and the operand node in the current document tree.
3.2.18
contextual optimised decoder
An optimised decoder whose behaviour is dependent on the current context of the decoding.
Note - For instance, the ZLib optimised decoder (see Clause 8) is a contextual optimised decoder.
Note - Upon certain events, the context must be reset. Upon a certain command or event, contextual optimised decoders
are flushed to release their contents. Only contextual optimised decoders are flushable.
3.2.19
contextual optimised decoder reset
An operation that resets the optimised decoder to put it in a defined initial state. All contextual information is
discarded.
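The idea of a contextual decoder and its reset can be illustrated with Python's standard zlib module. The sketch below is not the Zlib advanced optimised decoder specified in Clause 8; it only shows, with a hypothetical class, that a shared compression context makes a repeated value cheaper to code and that a reset returns both sides to a defined initial state.

```python
import zlib


class ContextualZlibCodec:
    """Toy encoder/decoder pair that keeps a zlib context across values."""

    def __init__(self):
        self.reset()

    def reset(self) -> None:
        # Discard all contextual information: start again from fresh zlib state.
        self._comp = zlib.compressobj()
        self._decomp = zlib.decompressobj()

    def encode(self, text: str) -> bytes:
        # Z_SYNC_FLUSH releases all pending output but keeps the history window.
        data = self._comp.compress(text.encode("utf-8"))
        return data + self._comp.flush(zlib.Z_SYNC_FLUSH)

    def decode(self, chunk: bytes) -> str:
        return self._decomp.decompress(chunk).decode("utf-8")


codec = ContextualZlibCodec()
value = "the same metadata string"
first = codec.encode(value)
second = codec.encode(value)  # benefits from the context built by `first`
print(len(first), len(second), codec.decode(first), codec.decode(second))

codec.reset()
print(codec.encode(value) == first)  # a fresh context reproduces the first coding
```

The second chunk is shorter than the first because it can refer back to the earlier occurrence, and for the same reason it can only be decoded by a decoder that has already processed the first chunk.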
3.2.20
current context node
The starting node for the context path in case of relative addressing.
3.2.21
current document
The document that is conveyed by the initial document and all access units up to a given composition time.
3.2.22
current document tree
The XML document tree that represents the current document.
3.2.23
deferred fragment reference
A fragment reference that can be resolved at any time by the application using the terminal.
3.2.24
deferred node
A node which is present in the document tree at encoder side and for which the following is true: No part of
that node has been sent to the decoder but the existence of that node has been signalled to the decoder.
3.2.25
delivery layer
An abstraction of any underlying transport or storage functionality.
3.2.26
derived type
A type defined by the derivation of another type.
3.2.28
document
Short term for a structured XML document.

3.2.29
document composer
An entity that reconstitutes the current document tree from the fragment update units.
3.2.30
document fragment
A contiguous part of a document attached at a single node. Using the representation model of a document
tree, the document fragment is represented by a sub-tree of the document tree.
3.2.31
document fragment reference
A reference to a document fragment.
Note - For instance, a fragment reference can be a URI which serves to locate the fragment on the world wide web.
3.2.32
document stream
The ordered concatenation of either binary or textual access units conveying a single, possibly time-variant,
document.
3.2.33
document tree
A model that is used throughout this specification in order to represent documents. A document tree consists
of nodes, which represent elements or attributes of an XML document. Each node may have zero, one or
more child nodes. Simple content is treated as child nodes in Clause 6 of the specification.
3.2.34
effective content particle
The particle of a complexType used for the validation process.
3.2.35
fixed optimised decoder
An optimised decoder used to decode either a complex type or a simple type. Fixed optimised decoders are
set up at decoder initialisation phase and their mapping to types can't be modified during binary document
stream lifetime.
3.2.36
fragment reference
Short term for document fragment reference.
3.2.37
fragment reference format
An encoding format of fragment references.
3.2.38
fragment reference marker
Specific information used to describe a deferred fragment reference, which is present within the current
document tree. It consists of a fragment reference and the name and type of the topmost element of the
referenced fragment.
3.2.39
fragment reference resolver
An entity that is capable of resolving the fragment reference provided in the fragment update payload.
3.2.40
fragment update command
A command within a fragment update unit expressing the type of modification to be applied to the part of the
current document tree that is identified by the associated fragment update context.

3.2.41
fragment update component extractor
An entity that de-multiplexes a fragment update unit, resulting in the unit’s components: fragment update
command, fragment update context, and fragment update payload.
3.2.42
fragment update context
Information in a fragment update unit that specifies on which node in the current document tree the fragment
update command shall be executed. Additionally, the fragment update context specifies the data type of the
element encoded in the subsequent fragment update payload.
3.2.43
fragment update decoder parameters
Configuration parameters conveyed in the DecoderInit (see 6.2) that are required to specify the decoding
process of the fragment update decoder.
3.2.44
fragment update payload
Information in a fragment update unit that conveys the information which is added to the current document or
which replaces a part of the current document.
3.2.45
fragment update payload decoder
The entity that decodes the fragment update payload information of the fragment update.
3.2.46
fragment update unit
Information in an access unit, conveying a document or a portion thereof. Fragment update units provide the
means to modify the current document. They are nominally composed of a fragment update command, a
fragment update context and a fragment update payload.
3.2.47
initial document
A document that initialises the current document tree without conveying it to the application (see 5.3). The
initial document is part of the DecoderInit (see 6.2).
3.2.48
initial schema
The schema that is known by the decoder before the decoding process starts.
3.2.49
initialisation extractor
An entity that de-multiplexes the DecoderInit (see 6.2), resulting in its components initial document, fragment
update decoder parameters and schema URI.
3.2.51
non-deferred fragment reference
A fragment reference that shall be resolved by the terminal at the composition time of the access unit
containing the fragment reference.
3.2.52
operand node
The node in the binary format document tree that is either added, deleted or replaced according to the current
fragment update command and fragment update payload. The operand node is always a child node of the
context node.
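The relationship between fragment update command, context, payload and operand node can be sketched as a small data model. Everything below is hypothetical: the command names, the use of nested dictionaries as a document tree and the path representation are assumptions made for illustration, and the normative syntax is the one given in Clauses 6 and 7.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Command(Enum):
    ADD = "add"
    REPLACE = "replace"
    DELETE = "delete"


@dataclass
class FragmentUpdateUnit:
    command: Command                # type of modification to apply
    context_path: List[str]         # names leading from the root to the context node
    operand: str                    # name of the operand node, a child of the context node
    payload: Optional[dict] = None  # new content; unused for DELETE


def apply_update(tree: dict, fuu: FragmentUpdateUnit) -> None:
    """Apply one fragment update unit to a document tree held as nested dicts."""
    node = tree
    for name in fuu.context_path:  # walk down to the context node
        node = node[name]
    if fuu.command is Command.DELETE:
        del node[fuu.operand]
    else:  # ADD and REPLACE both attach the payload as the operand node
        node[fuu.operand] = fuu.payload


document = {"Description": {}}
apply_update(document, FragmentUpdateUnit(Command.ADD, ["Description"], "Title", {"text": "A"}))
apply_update(document, FragmentUpdateUnit(Command.REPLACE, ["Description"], "Title", {"text": "B"}))
print(document)
```

A document composer in this model would apply the fragment update units of each access unit in order, so the tree after the last unit corresponds to the current document tree at that composition time.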

3.2.53
optimised decoder
A decoder associated to a type and dedicated to certain encoding methods better suited than the generic
ones.
3.2.54
optimised decoder mapping
An association between a type and a set of optimised decoders.
3.2.55
schema
A schema is represented in XML by one or more “schema documents”, that is, one or more “<schema>”
element information items. A “schema document” contains representations for a collection of schema
components, e.g. type definitions and element declarations, which have a common target namespace. A
schema document which has one or more “<import>” element information items corresponds to a schema with
components with more than one target namespace. Refer also to XML Schema.
3.2.56
schema resolver
An entity that is capable of resolving the schema identification provided in the DecoderInit (see 6.2), and to
possibly retrieve the specified schemas.
3.2.57
schema update unit
Information in an access unit, conveying a schema or a portion thereof. Schema update units provide the
means to modify the current decoder schema knowledge.
3.2.58
schema URI
A URI that uniquely identifies a schema.
3.2.59
schema valid
A document that is schema valid satisfies the constraints embodied in the Schema to which it should conform.
3.2.60
selector node
The parent node of the topmost node of a document tree. It artificially extends the document tree to allow the
addressing of the topmost node.
3.2.61
skippable subtree
A subtree of an XML document that the decoder is permitted not to decode.
3.2.62
super type
The parent of a type in its type hierarchy.
3.2.63
systems layer
An abstraction of the tools and processes specified in this specification.
3.2.64
terminal
The entity that makes use of a coded representation of a document.
3.2.67
topmost node
The node specified by the first element in the document, instantiating one of the global elements declared in
the schema.
3.2.68
type codec
Synonym to optimised decoder.
3.2.69
type hierarchy
The hierarchy of type derivations.
3.2.70
validation
The process of parsing an XML document to determine whether it satisfies the constraints embodied in the
Schema to which it should conform.
3.2.71
XML Schema parser
An application that is capable of validating document schemes (content and structure) and descriptor data
types against their schema definition.
4 Symbols and abbreviated terms
4.1 Abbreviations
AU Access Unit
BiM Binary format for document streams
D Descriptor
DL Delivery Layer
FU Fragment Update
FUU Fragment Update Unit
FSAD Finite State Automaton Decoder
MPC Multiple element Position Code
SBC Schema Branch Code
SPC Single element Position Code
TBC Tree Branch Code
URI Uniform Resource Identifier
URL Uniform Resource Locator
UTF Universal Character Set Transformation Formats
XML Extensible Markup Language
XPath XML Path Language
MSB Most Significant Bit
SU Schema Update
SUU Schema Update Unit

4.2 Mathematical operators
The mathematical operators used to describe this specification are similar to those used in the C programming
language. However, integer divisions with truncation and rounding are specifically defined. Numbering and
counting loops generally begin from zero.
4.2.1 Arithmetic operators
+ Addition.
- Subtraction (as a binary operator) or negation (as a unary operator).
++ Increment, i.e. x++ is equivalent to x = x + 1
-- Decrement, i.e. x-- is equivalent to x = x - 1
* Multiplication.
^ Power.
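sign( ) $\operatorname{sign}(x) = \begin{cases} 1 & x \geq 0 \\ -1 & x < 0 \end{cases}$
abs( ) $\operatorname{abs}(x) = x \cdot \operatorname{sign}(x)$
log2(.) $\operatorname{log2}(x) = \log_2(x)$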
Ceil (x) denotes the smallest integer larger than or equal to x.
int(.) truncation of the argument to its integer value, e.g. 1.3 is truncated to 1 and –3.7 is truncated
to –3.
$\sum_{i=a}^{i<b} f(i)$ the summation of the f(i) with i taking integral values from a up to, but not including b.
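The following Python sketch illustrates how the arithmetic definitions above behave at their edge cases. It is informative only; the function names `spec_sign`, `spec_abs`, `spec_int`, `spec_ceil` and `summation` are hypothetical and not part of this specification.

```python
import math


def spec_sign(x):
    """sign(x): 1 for x >= 0, otherwise -1 (so sign(0) is 1, not 0)."""
    return 1 if x >= 0 else -1


def spec_abs(x):
    """abs(x) = x * sign(x)."""
    return x * spec_sign(x)


def spec_int(x):
    """int(.): truncation toward zero, e.g. 1.3 -> 1 and -3.7 -> -3."""
    return math.trunc(x)


def spec_ceil(x):
    """Ceil(x): the smallest integer larger than or equal to x."""
    return math.ceil(x)


def summation(f, a, b):
    """Sum of f(i) for i = a, a + 1, ..., b - 1 (b itself is excluded)."""
    return sum(f(i) for i in range(a, b))
```

Note that truncation differs from rounding down for negative arguments: `spec_int(-3.7)` yields -3, whereas `math.floor(-3.7)` yields -4.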
4.2.2 Logical operators
|| Logical OR.
&& Logical AND.
! Logical NOT.
4.2.3 Relational operators
> Greater than.
>= Greater than or equal to.
< Less than.
<= Less than or equal to.
== Equal to.
!= Not equal to.
max(, ..., ) the maximum value in the argument list.
min(, ..., ) the minimum value in the argument list.

4.2.4 Assignment
= Assignment operator.
4.2.5 Character string comparison
Many phases of the fragment encoding rely on a string comparison method. This method is based on the
Unicode value of each character in the strings. The following defines the notion of lexicographic ordering:
Two strings are different if they have different characters at some index that is a valid index for both strings, or
if their lengths are different, or both.
If they have different characters at one or more index positions, let k be the smallest such index; then the
string whose character at position k has the smaller value, as determined by using the < operator,
lexicographically precedes the other string.
If there is no index position at which they differ, then the shorter string lexicographically precedes the longer
string.
Any method that is functionally equivalent to the following procedure implements this string comparison:
compare_strings(string1, string2) {
    len1 = length(string1);
    len2 = length(string2);
    n = min(len1, len2);
    i = 0;
    j = 0;
    while (n-- != 0) {
        c1 = string1[i++];
        c2 = string2[j++];
        if (c1 != c2) {
            return c1 - c2;
        }
    }
    return len1 - len2;
}
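As an informative illustration, the procedure above can be sketched in Python as follows. The sketch assumes that the "Unicode value of each character" is the Unicode code point, which is what Python's `ord()` returns; this clause does not say whether code points or another code unit is meant. The sample names in the usage lines are invented for the example.

```python
from functools import cmp_to_key


def compare_strings(string1, string2):
    """Return a negative, zero or positive value if string1 lexicographically
    precedes, equals or follows string2."""
    # Walk both strings in parallel up to the length of the shorter one.
    for c1, c2 in zip(string1, string2):
        if c1 != c2:
            # First differing position decides the order.
            return ord(c1) - ord(c2)
    # No difference in the common prefix: the shorter string precedes.
    return len(string1) - len(string2)


# Usage: order a list of (hypothetical) names with the comparison above.
names = ["title", "Title", "tit", "creator"]
ordered = sorted(names, key=cmp_to_key(compare_strings))
```

With this ordering, upper-case Latin letters precede lower-case ones because their Unicode values are smaller, and a string precedes any longer string of which it is a prefix.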

4.3 Mnemonics
The following mnemonics are defined to describe the different data types used in the coded bitstream.
Name Definition
bslbf Bit string, left bit first, where “left” is the order in which bit strings are written in this
specification. Bit strings are generally written as a string of 1s and 0s within single quote
marks, e.g. '1000 0001'. Blanks within a bit string are for ease of reading and have no
significance. For convenience, large strings are occasionally written in hexadecimal; in
this case, conversion to binary in the conventional manner will yield the value of the bit
string. Thus the left most hexadecimal digit is first and in each hexadecimal digit the most
significant of the four bits is first.
uimsbf Unsigned integer, most significant bit first.
vlclbf Variable length code, left bit first, where “left” refers to the order in which the VLC codes
are written. The byte order of multibyte words is most significant byte first.
vluimsbf8 Variable length code unsigned integer, most significant bit first. The size of vluimsbf8 is a
multiple of one byte. The first bit (Ext) of each byte, if set to 1, specifies that another byte
is present for this vluimsbf8 code word. The unsigned integer is encoded by the
concatenation of the seven least significant bits of each byte belonging to this vluimsbf8
code word.
An example for this type is shown in Figure 1.
vluimsbf5 Variable length code unsigned integer, most significant bit first. The first n bits (Ext),
which are 1 except for the n-th bit which is 0, indicate that the integer is encoded by n
times 4 bits.
An example for this type is shown in Figure 2.
vlurmsbf5 Variable length code unsigned rational number, most significant bit first. The first n bits
(Ext), which are ‘1’ except for the n-th bit which is ‘0’, indicate that the rational number R in
the interval 0≤R<1 is encoded by n times 4 bits. The i-th bit of the n times 4 bits
representing the rational number corresponds to a value of 2^-i. Thus the (n+1)st bit of
the vlurmsbf5 code word (which corresponds to the MSB of the rational number)
represents a value of ½, the (n+2)nd bit of the vlurmsbf5 code word represents a value of
¼, and so forth.
An example for this type is shown in Figure 3.
Note - Comparing two rational numbers A and B represented by a vlurmsbf5 code word
can be done by comparing bit by bit the rational numbers starting from their respective
MSBs. Then the rational number A is bigger if there is a ‘1’ bit at a position at which there
is a ‘0’ for B. A is also bigger if there is a ‘1’ bit at a position which is not present for B
and when A is longer than B.
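The bitwise comparison described in the note can be sketched as follows. This is an informative Python illustration, not a normative procedure. It assumes that each rational number is given as the string of its '0'/'1' value bits (MSB first, Ext bits already removed) and that positions missing from the shorter number count as '0'; the function name `compare_rational_bits` is hypothetical.

```python
def compare_rational_bits(a_bits, b_bits):
    """Compare two rational numbers given as value-bit strings, MSB first.

    Returns 1 if A is bigger, -1 if B is bigger and 0 if they are equal.
    """
    width = max(len(a_bits), len(b_bits))
    # Assumption: a position that is not present contributes nothing,
    # so the shorter number is padded with '0' bits on the right.
    a_padded = a_bits.ljust(width, "0")
    b_padded = b_bits.ljust(width, "0")
    for bit_a, bit_b in zip(a_padded, b_padded):
        if bit_a != bit_b:
            # The first differing position, starting from the MSB, decides.
            return 1 if bit_a == "1" else -1
    return 0
```

For example, '1000' (1/2) is bigger than '0111' (7/16), while '10000001' (1/2 + 1/256) is bigger than '1000' because it has a '1' bit at a position that is not present in the shorter number.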
Byte 1: Ext = 1, followed by the 7 most significant bits of a 14 bit integer (MSB, 2.Bit, ..., 7.Bit)
Byte 2: Ext = 0, followed by the 7 least significant bits of a 14 bit integer (8.Bit, ..., 14.Bit)
Figure 1 — Informative example for the vluimsbf8 data type
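The vluimsbf8 layout can be illustrated with the following Python sketch. It is informative only and rests on assumptions that the definition does not state: the code word is read from a byte-aligned buffer, and the encoder emits the shortest possible code word. The function names `encode_vluimsbf8` and `decode_vluimsbf8` are hypothetical.

```python
def encode_vluimsbf8(value):
    """Encode a non-negative integer as vluimsbf8 bytes (shortest form)."""
    if value < 0:
        raise ValueError("vluimsbf8 encodes unsigned integers only")
    # Split the value into 7-bit groups, least significant group first.
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()  # most significant group first
    # Ext = 1 on every byte except the last one.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_vluimsbf8(data, offset=0):
    """Decode one vluimsbf8 code word; return (value, next_offset)."""
    value = 0
    while True:
        byte = data[offset]
        offset += 1
        # Concatenate the seven least significant bits of each byte.
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # Ext = 0: this was the last byte
            return value, offset
```

For instance, the integer 300 needs two 7-bit groups and is encoded as the bytes 0x82 0x2C: the first byte carries Ext = 1 and the upper group, the second byte carries Ext = 0 and the lower group.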

Ext bits: 1 1 0, followed by an unsigned integer represented by 12 bits (MSB, Bit 2, ..., Bit 12)
Figure 2 — Informative example for the vluimsbf5 data type
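The vluimsbf5 layout can be illustrated with the following Python sketch, which is informative only. For readability it works on strings of '0'/'1' characters rather than on a packed bitstream, and it assumes that the encoder picks the smallest n that fits the value; the function names `encode_vluimsbf5` and `decode_vluimsbf5` are hypothetical.

```python
def encode_vluimsbf5(value):
    """Encode a non-negative integer as a vluimsbf5 bit string."""
    if value < 0:
        raise ValueError("vluimsbf5 encodes unsigned integers only")
    # n = number of 4-bit groups needed (at least one, even for zero).
    n = max(1, (value.bit_length() + 3) // 4)
    ext = "1" * (n - 1) + "0"  # n Ext bits: all 1 except the n-th, which is 0
    return ext + format(value, "0{}b".format(4 * n))


def decode_vluimsbf5(bits, pos=0):
    """Decode one vluimsbf5 code word; return (value, next_position)."""
    n = 1
    while bits[pos] == "1":  # count the leading 1s of the Ext field
        n += 1
        pos += 1
    pos += 1  # skip the terminating 0 of the Ext field
    payload = bits[pos:pos + 4 * n]
    if len(payload) != 4 * n:
        raise ValueError("truncated vluimsbf5 code word")
    return int(payload, 2), pos + 4 * n
```

A 12 bit value such as 0xABC is therefore written as the Ext bits '110' followed by '101010111100', matching the three-group case shown in Figure 2.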

Ext bits: 1 1 0, followed by a rational number represented by n × 4 = 12 bits (MSB, Bit 2, ..., Bit 12)
Value of bits: 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, 1/256, 1/512, 1/1024, 1/2048, 1/4096

Figure 3 — Informative example for the vlurmsbf5 data type
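Decoding a vlurmsbf5 code word can be sketched in Python as follows. The sketch is informative only; it works on a string of '0'/'1' characters and returns an exact fraction, and the function name `decode_vlurmsbf5` is hypothetical.

```python
from fractions import Fraction


def decode_vlurmsbf5(bits, pos=0):
    """Decode one vlurmsbf5 code word; return (rational, next_position)."""
    n = 1
    while bits[pos] == "1":  # count the leading 1s of the Ext field
        n += 1
        pos += 1
    pos += 1  # skip the terminating 0 of the Ext field
    payload = bits[pos:pos + 4 * n]
    if len(payload) != 4 * n:
        raise ValueError("truncated vlurmsbf5 code word")
    # The i-th value bit (counting from 1) contributes 2^-i.
    value = sum(
        (Fraction(1, 2 ** i) for i, bit in enumerate(payload, start=1) if bit == "1"),
        Fraction(0),
    )
    return value, pos + 4 * n
```

The result equals the value bits read as an unsigned integer divided by 2^(4n), so it always lies in the interval 0≤R<1.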
5 System architecture
5.1 Terminal architecture
ISO/IEC 23001-1 provides the means to represent coded XML documents. The entity that makes use of such
coded representations of XML documents is generically referred to as the “ISO/IEC 23001-1 terminal” or just
“terminal” in short. This terminal may correspond to a standalone application or be part of an application
system.
This and the following three subclauses provide the description of an ISO/IEC 23001-1 terminal, its
components, and their operation. The architecture of such a terminal is depicted in Figure 4. The following
subclauses introduce the tools specified in this specification.
In Figure 4, there are three main layers outlined: the application, the normative systems layer, and the delivery
layer. ISO/IEC 23001-1 is not concerned with any storage and/or transmission media (whose behaviours and
characteristics are abstracted by the delivery layer) or the way the application processes the current document.
This specification does make specific assumptions about the delivery layer, and those assumptions are
outlined in subclause 5.5.4. The systems layer defines a decoder whose architecture is described here to
provide an overview and to establish common terms of reference. A compliant decoder need not implement
the constituent parts as visualised in Figure 4, but shall implement the normative decoding process specified
in Clauses 6 through 8.
5.2 General characteristics of the decoder
5.2.1 General characteristics of document streams
An ISO/IEC 23001-1 terminal consumes document streams and outputs a – potentially dynamic –
representation of the document called the current document tree. Document streams shall consist of a
sequence of one or more individually accessible portions of data named access units. An Access Unit (AU) is
the smallest data entity to which “terminal-oriented” (as opposed to “described-media oriented”) timing
information can be attributed. This timing information is called the “composition” time, meaning the point in
time when the resulting current document tree corresponding to a specific access unit becomes known to the
application. The timing information shall be carried by the delivery layer (see subclause 5.6). The current
document tree shall be schema-valid after processing each access unit.
A document stream consisting of binary access units is termed a binary document stream and is processed by a
binary decoder (see subclause 5.2.2 and Clauses 6 and 7).

5.2.2 Principles of the binary decoder
Using the ISO/IEC 23001-1 generic method for binary encoding, called BiM, a document (nominally in a
textual XML form) can be compressed, partitioned, streamed, and reconstructed at the terminal side. The
reconstructed XML document will not be byte-equivalent to the original document. Namely, the binary
encoding method does not preserve processing instructions, attribute order, comments, or non-significant
whitespace. However, the encoding process ensures that XML element order is preserved.
To achieve its compression efficiency, BiM relies on a schema analysis phase. During this phase,
internal tables are computed to associate binary codes with schema components (XML elements, types and
attributes). BiM defines two methods to address schema components.
The first method allows the decoder to resolve a schema, possibly including schema components originat
...


ISO/IEC 23001-1:2006 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 23001-1:2006 is linked to the following amendments: ISO/IEC 23001-1:2006/Amd 1:2007 and ISO/IEC 23001-1:2006/Amd 2:2008.
