ISO/IEC TR 15938-8:2002
Information technology — Multimedia content description interface — Part 8: Extraction and use of MPEG-7 descriptions
ISO/IEC TR 15938-8:2002 forms an informative part of ISO/IEC 15938 on extraction and use of metadata descriptions for multimedia content. ISO/IEC TR 15938-8:2002 provides two types of information: informative examples that illustrate the instantiation of description tools in creating descriptions conforming to ISO/IEC 15938, and detailed technical information on extracting descriptions automatically from multimedia content and using them in multimedia applications. ISO/IEC TR 15938-8:2002 is a companion for ISO/IEC 15938-3 (Visual) and ISO/IEC 15938-5 (Multimedia Description Schemes), which provide normative definitions of the description tools. Effort has been made in this Technical Report to preserve the subclause numbering of ISO/IEC 15938-3 and ISO/IEC 15938-5 to allow easy mapping of the information on extraction and use with those technical specifications.
Technologies de l'information — Interface de description du contenu multimédia — Partie 8: Extraction et utilisation des descriptions MPEG-7
General Information
- Status: Published
- Publication Date: 12-Dec-2002
- Current Stage: 9093 - International Standard confirmed
- Start Date: 12-Oct-2019
- Completion Date: 14-Feb-2026
Overview
ISO/IEC TR 15938-8:2002 - "Information technology - Multimedia content description interface - Part 8: Extraction and use of MPEG-7 descriptions" is an informative Technical Report in the ISO/IEC 15938 (MPEG‑7) family. It complements the normative parts (notably ISO/IEC 15938-3 Visual and ISO/IEC 15938-5 Multimedia Description Schemes) by providing illustrative examples and detailed technical guidance on how to instantiate, extract and apply MPEG‑7 metadata in real multimedia systems. The report preserves subclause numbering from the normative parts to simplify mapping between extraction/use guidance and the formal description tools.
Key topics
- Instantiation of description tools: examples demonstrating how MPEG‑7 description schemes (MDS) are used to create conformant descriptions.
- Automatic extraction guidance: technical information on extracting metadata from audio, visual and audiovisual content.
- Core MDS areas covered (from the table of contents): basic datatypes, schema tools, linking/identification, time description, media locators, basic and media description tools, structure and semantics description, navigation and access, content organization and user interaction.
- Visual and audiovisual features: color, texture, shape, motion, localization and summarization descriptors and how they map to MPEG‑7 tools.
- Mapping and integration: guidance to align extraction outputs with the normative element definitions in ISO/IEC 15938-3 and -5.
- Practical examples: illustrative instantiations to show how descriptions look and how they are used in applications.
Practical applications
ISO/IEC TR 15938-8 is valuable for practical multimedia tasks that rely on standardized metadata:
- Content-based retrieval & search - video and image indexing using MPEG‑7 descriptions.
- Media asset management - tagging, locating and organizing large multimedia collections.
- Broadcasting and streaming - metadata extraction for program guides, chaptering and thumbnails.
- Digital libraries & archives - standardized descriptions for preservation and discovery.
- Computer vision / audio analysis - mapping extracted features (color, motion, audio events) to MPEG‑7 descriptors.
- Recommendation & personalization - using usage, user-preference and summarization tools.
Who should use this standard
- Multimedia developers and engineers implementing MPEG‑7 metadata pipelines
- Metadata architects and librarians working on media catalogs
- Researchers in multimedia indexing, retrieval and computer vision/audio analysis
- System integrators building media-management, search and streaming platforms
Related standards
- ISO/IEC 15938-1..-5 (MPEG‑7 core and description schemes) - normative definitions.
- ISO/IEC 15938-3 (Visual) and ISO/IEC 15938-5 (MDS) - primary companions for technical mappings.
Keywords: MPEG‑7, ISO/IEC 15938, multimedia metadata, metadata extraction, content description, media locators, visual descriptors, audio descriptors, multimedia indexing.
Frequently Asked Questions
ISO/IEC TR 15938-8:2002 is a technical report published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its full title is "Information technology - Multimedia content description interface - Part 8: Extraction and use of MPEG-7 descriptions". It is an informative part of ISO/IEC 15938 that provides examples of instantiating description tools, together with technical information on extracting descriptions automatically from multimedia content and using them in multimedia applications.
ISO/IEC TR 15938-8:2002 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC TR 15938-8:2002 has the following relationships with other standards: it is linked to its amendments ISO/IEC TR 15938-8:2002/Amd 1:2004, ISO/IEC TR 15938-8:2002/Amd 2:2006, ISO/IEC TR 15938-8:2002/Amd 3:2007, ISO/IEC TR 15938-8:2002/Amd 4:2009, ISO/IEC TR 15938-8:2002/Amd 5:2010 and ISO/IEC TR 15938-8:2002/Amd 6:2011. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
TECHNICAL REPORT
ISO/IEC TR 15938-8
First edition 2002-12-15
Information technology — Multimedia
content description interface —
Part 8:
Extraction and use of MPEG-7
descriptions
Technologies de l'information — Interface de description du contenu
multimédia —
Partie 8: Extraction et utilisation des descriptions MPEG-7
© ISO/IEC 2002
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO/IEC 2002 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Terms and definitions
2.1 Conventions
2.1.1 Description tools
2.1.2 Naming convention
2.2 Terminology
2.2.1 Schema-related terminology
2.2.2 Content-related terminology
2.3 Symbols and abbreviated terms
2.3.1 Generic
2.3.2 Arithmetic operators
2.3.3 Logical operators
2.3.4 Relational operators
2.3.5 Bitwise operators
2.3.6 Conditional operators
2.3.7 Assignment
2.3.8 Constants
2.3.9 Functions
2.4 Default reference axis
3 MDS tools
3.1 Introduction
3.2 Schema tools
3.2.1 Introduction
3.2.2 Base types
3.2.3 Root element
3.2.4 Top-level types
3.2.5 Description metadata tools
3.3 Basic datatypes
3.3.1 Introduction
3.3.2 Integer datatypes
3.3.3 Real datatypes
3.3.4 Vectors and matrices
3.3.5 Probability datatypes
3.3.6 String datatypes
3.4 Linking, identification and localization tools
3.4.1 Introduction
3.4.2 References to Ds and DSs
3.4.3 Unique Identifier
3.4.4 Time description tools
3.4.5 Media Locators
3.5 Basic description tools
3.5.1 Introduction
3.5.2 Language identification
3.5.3 Textual annotation
3.5.4 Classification Schemes and Terms
3.5.5 Description of agents
3.5.6 Description of places
3.5.7 Graphs and relations
3.5.8 Ordering Tools
3.5.9 Affective description
3.5.10 Phonetic description
3.6 Media description tools
3.6.1 Introduction
3.6.2 Media information tools
3.7 Creation and production description tools
3.7.1 Introduction
3.7.2 Creation information tools
3.8 Usage description tools
3.8.1 Introduction
3.8.2 Usage information tools
3.9 Structure description tools
3.9.1 Introduction
3.9.2 Base segment description tools
3.9.3 Segment attribute description tools
3.9.4 Visual segment description tools
3.9.5 Audio segment description tools
3.9.6 Audio-visual segment description tools
3.9.7 Multimedia segment description tools
3.9.8 Ink segment description tools
3.9.9 Video editing segment description tools
3.9.10 Structural relation classification schemes
3.10 Semantics description tools
3.10.1 Introduction
3.10.2 Abstraction model
3.10.3 Semantic entity description tools
3.10.4 Semantic attribute description tools
3.10.5 Semantic relation classification schemes
3.11 Navigation and access tools
3.11.1 Introduction
3.11.2 Summarization
3.11.3 Views, partitions and decompositions
3.11.4 Variations of the content
3.12 Content organization tools
3.12.1 Introduction
3.12.2 Collections
3.12.3 Models
3.12.4 Probability models
3.12.5 Analytic models
3.12.6 Cluster models
3.12.7 Classification models
3.13 User interaction tools
3.13.1 Introduction
3.13.2 User preferences
3.13.3 Usage History
4 Visual tools
4.1 Basic visual tools
4.1.1 Grid layout
4.1.2 Visual time series
4.1.3 2D-3D multiple view
4.1.4 Spatial 2D coordinates
4.1.5 Temporal interpolation
4.2 Color description tools
4.2.1 Color space
4.2.2 Color quantization
4.2.3 Dominant color
4.2.4 Scalable color
4.2.5 Color layout
4.2.6 Color structure
4.2.7 GoF/GoP color
4.3 Texture description tools
4.3.1 Homogeneous texture
4.3.2 Texture browsing
4.3.3 Edge histogram
4.4 Shape description tools
4.4.1 Region-based shape
4.4.2 Contour-based shape
4.4.3 Shape 3D
4.5 Motion description tools
4.5.1 Camera motion
4.5.2 Motion trajectory
4.5.3 Parametric motion
4.5.4 Motion activity
4.6 Localization tools
4.6.1 Region locator
4.6.2 Spatio-temporal locator
4.7 Other visual tools
4.7.1 Face recognition
Annex A Patent statements
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
In exceptional circumstances, the joint technical committee may propose the publication of a Technical Report
of one of the following types:
— type 1, when the required support cannot be obtained for the publication of an International Standard,
despite repeated efforts;
— type 2, when the subject is still under technical development or where for any other reason there is the
future but not immediate possibility of an agreement on an International Standard;
— type 3, when the joint technical committee has collected data of a different kind from that which is
normally published as an International Standard (“state of the art”, for example).
Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether
they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to
be reviewed until the data they provide are considered to be no longer valid or useful.
ISO/IEC TR 15938-8, which is a Technical Report of type 3, was prepared by Joint Technical Committee
ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and
hypermedia information.
ISO/IEC 15938 consists of the following parts, under the general title Information technology — Multimedia
content description interface:
— Part 1: Systems
— Part 2: Description definition language
— Part 3: Visual
— Part 4: Audio
— Part 5: Multimedia description schemes
— Part 6: Reference software
— Part 7: Conformance testing
— Part 8: Extraction and use of MPEG-7 descriptions
Introduction
This standard, also known as "Multimedia Content Description Interface," provides a standardized set of
technologies for describing multimedia content. The standard addresses a broad spectrum of multimedia
applications and requirements by providing a metadata system for describing the features of multimedia
content.
The following are specified in this standard:
- Description Schemes (DS) describe entities or relationships pertaining to multimedia content.
Description Schemes specify the structure and semantics of their components, which may be
Description Schemes, Descriptors, or datatypes.
- Descriptors (D) describe features, attributes, or groups of attributes of multimedia content.
- Datatypes are the basic reusable datatypes employed by Description Schemes and Descriptors.
- Systems tools support delivery of descriptions, multiplexing of descriptions with multimedia content,
synchronization, file format, and so forth.
This standard is subdivided into eight parts:
Part 1 – Systems: specifies the tools for preparing descriptions for efficient transport and storage,
compressing descriptions, and allowing synchronization between content and descriptions.
Part 2 – Description definition language: specifies the language for defining the standard set of
description tools (DSs, Ds, and datatypes) and for defining new description tools.
Part 3 – Visual: specifies the description tools pertaining to visual content.
Part 4 – Audio: specifies the description tools pertaining to audio content.
Part 5 – Multimedia description schemes: specifies the generic description tools pertaining to multimedia
including audio and visual content.
Part 6 – Reference software: provides a software implementation of the standard.
Part 7 – Conformance testing: specifies the guidelines and procedures for testing conformance of
implementations of the standard.
Part 8 – Extraction and use of MPEG-7 descriptions: provides guidelines and examples of the extraction
and use of descriptions.
TECHNICAL REPORT ISO/IEC TR 15938-8:2002(E)
Information technology — Multimedia content description
interface —
Part 8:
Extraction and use of MPEG-7 descriptions
1 Scope
This International Standard specifies a metadata system for describing multimedia content. This document
gives examples of extraction and use of descriptions using Description Schemes, Descriptors, and datatypes
specified in ISO/IEC 15938. The following set of subclauses are provided for each description tool, where
optional subclauses are indicated as (optional):
- Informative examples (optional): provides informative examples that illustrate the instantiation of the
description tool in creating descriptions.
- Extraction (optional): provides informative examples that illustrate the extraction of descriptions from
multimedia content.
- Use (optional): provides informative examples that illustrate the use of descriptions.
This document is meant to be a companion technical report for Part 5 (Multimedia Description Schemes) and
Part 3 (Visual) of ISO/IEC 15938. As such, the content of this technical report is not easily understood
without the technical specifications. In this technical report, effort has been made to preserve the specific
subclause numbering of ISO/IEC 15938-5 and ISO/IEC 15938-3 to allow easy correlation of the content on
extraction and use in the technical report with the technical specifications.
2 Terms and definitions
2.1 Conventions
2.1.1 Description tools
This part of ISO/IEC 15938 specifies the multimedia description tools as follows:
- Description Scheme (DS) – a description tool that describes entities or relationships pertaining to
multimedia content. DSs specify the structure and semantics of their components, which may be
Description Schemes, Descriptors, or datatypes.
- Descriptor (D) – a description tool that describes a feature, attribute, or group of attributes of
multimedia content.
- Datatype – a basic reusable datatype employed by Description Schemes and Descriptors.
- Description Tool (or tool) – refers to a Description Scheme, Descriptor, or Datatype.
2.1.2 Naming convention
In order to specify the multimedia description tools, this part of ISO/IEC 15938 uses constructs provided by
the Description Definition Language (DDL) specified in ISO/IEC 15938-2, such as "element", "attribute",
"simpleType" and "complexType". The names associated with these constructs are created on the basis of the
following conventions:
- If the name is composed of multiple words, the first letter of each word is capitalized, with the exception
that the capitalization of the first word depends on the type of construct as follows:
- Element naming: the first letter of the first word is capitalized (e.g. TimePoint element of TimeType).
- Attribute naming: the first letter of the first word is not capitalized (e.g. timeUnit attribute of
IncrDurationType).
- complexType naming: the first letter of the first word is capitalized, and the suffix "Type" is used at the
end of the name (e.g. PersonType).
- simpleType naming: the first letter of the first word is not capitalized, the suffix "Type" may be used at the
end of the name (e.g. timePointType).
Note that when referencing a complexType or simpleType in the definition of a description tool, the "Type"
suffix is not used. For instance, the text refers to the "Time datatype" (instead of "TimeType datatype"),
to the "MediaLocator D" (instead of "MediaLocatorType D") and to the "Person DS" (instead of
"PersonType DS").
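The naming rules above can be sketched as a small checker. This is only an illustration under stated assumptions: the function names `follows_convention` and `reference_name` are hypothetical and are not part of ISO/IEC 15938, and the check covers only the first-letter and "Type"-suffix rules, not the capitalization of interior words.

```python
import re


def follows_convention(name: str, construct: str) -> bool:
    """Check the first-letter and "Type"-suffix rules for one DDL construct.

    `construct` is one of "element", "attribute", "complexType", "simpleType".
    Hypothetical helper for illustration; not defined by the standard.
    """
    # Restrict the sketch to plain alphanumeric names starting with a letter.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", name):
        return False
    first_upper = name[0].isupper()
    has_type_suffix = name.endswith("Type")
    if construct == "element":
        return first_upper
    if construct == "attribute":
        return not first_upper
    if construct == "complexType":
        return first_upper and has_type_suffix
    if construct == "simpleType":
        # The "Type" suffix is optional for simpleType names.
        return not first_upper
    raise ValueError(f"unknown construct: {construct}")


def reference_name(type_name: str) -> str:
    """Drop the "Type" suffix, as the text does when it refers to a tool."""
    if type_name.endswith("Type") and len(type_name) > len("Type"):
        return type_name[: -len("Type")]
    return type_name


if __name__ == "__main__":
    print(follows_convention("TimePoint", "element"))
    print(follows_convention("timeUnit", "attribute"))
    print(reference_name("MediaLocatorType"))
```

With the examples given in the text, `TimePoint`, `timeUnit`, `PersonType` and `timePointType` all pass for their respective constructs, and `reference_name("PersonType")` yields the short form used in phrases such as "Person DS".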
2.2 Terminology
For the purposes of this part of ISO/IEC 15938, the following terms and definitions apply.
2.2.1 Schema-related terminology
2.2.1.1
Attribute
A field of a description tool which is of simple type.
2.2.1.2
Base type
A type that serves as the root type of a derivation hierarchy for other types.
2.2.1.3
Datatype
A primitive reusable type employed by Description Schemes and Descriptors.
2.2.1.4
Derived type
A type that is defined in terms of extension or restriction of other types.
2.2.1.5
Description
An instantiation of one or more description tools.
2.2.1.6
Description Scheme
A description tool that describes entities or relationships pertaining to multimedia content. Description
Schemes specify the structure and semantics of their components, which may be Description Schemes,
Descriptors, or datatypes.
2.2.1.7
Description Tool
A Description Scheme, Descriptor, or datatype.
2.2.1.8
Descriptor
A description tool that describes a feature, attribute, or group of attributes of multimedia content.
2.2.1.9
Instantiation
Assignment of values to the fields (elements, attributes) of one or more description tools.
2.2.1.10
Element
A field of a description tool which is of complex type.
2.2.1.11
Schema
The set of related description tools, for example, those specified in ISO/IEC 15938.
2.2.1.12
Type
The format used for a collection of letters, digits, and/or symbols to depict values of an element or attribute of
a description tool. A type consists of a set of distinct values, a set of lexical representations, and a set of
facets that characterize properties of the value space, individual values, or lexical items.
2.2.2 Content-related terminology
2.2.2.1
Abstraction
A secondary representation that is created from or is related to the content. For example, a summary of a
video or a model of a feature.
2.2.2.2
AC coefficient
Any DCT coefficient for which the frequency in one or both dimensions is non-zero.
2.2.2.3
Acquisition
The process of acquiring audio or visual data from a source.
2.2.2.4
Action
A semantically identifiable behavior of an object or group of objects, for example, a soccer player kicking
a ball.
2.2.2.5
Agent
A person, organization, or group of persons.
2.2.2.6
Audio
Time-varying data or signal intended for listening or hearing. Also, related to the aural modality.
2.2.2.7
Audio-visual
Content consisting of both audio and video data.
2.2.2.8
Automatic
Processing of multimedia data, content, or metadata by means of a computer or other hardware or software
device.
2.2.2.9
Classification Scheme
A list of defined terms and their meanings.
2.2.2.10
Content
Multimedia content
A representation of the information contained in or related to multimedia data in a formalized manner
suitable for interpretation by human means. Content refers to the data and the metadata.
2.2.2.11
Copyright
A right that establishes the ownership of data, content, or metadata.
2.2.2.12
Data
Essence
Multimedia Data
A representation of multimedia in a formalized manner suitable for communication, interpretation, or
processing by automatic means.
2.2.2.13
DC coefficient
The DCT coefficient for which the frequency in both dimensions is zero.
2.2.2.14
DCT coefficient
The signed amplitude of a specific cosine basis function.
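As an illustration of the DC, AC and DCT coefficient definitions, the following sketch computes a 2D DCT of a small block in plain Python and separates the DC coefficient from the AC coefficients. The orthonormal scaling, the 4x4 block size and the function names `dct2` and `split_dc_ac` are assumptions made for this example; the definitions above do not prescribe them.

```python
import math


def dct2(block):
    """2D DCT-II of a square block (list of rows), with orthonormal scaling.

    Each output value is the signed amplitude of one cosine basis function.
    The scaling is an assumption of this sketch.
    """
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # frequency index along the first dimension
        for v in range(n):      # frequency index along the second dimension
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def split_dc_ac(coeffs):
    """Return (DC coefficient, list of AC coefficients).

    DC: frequency zero in both dimensions. AC: every other coefficient.
    """
    dc = coeffs[0][0]
    ac = [
        coeffs[u][v]
        for u in range(len(coeffs))
        for v in range(len(coeffs))
        if (u, v) != (0, 0)
    ]
    return dc, ac


if __name__ == "__main__":
    flat = [[10.0] * 4 for _ in range(4)]   # constant block: no spatial variation
    dc, ac = split_dc_ac(dct2(flat))
    print(dc, max(abs(a) for a in ac))
```

For a constant block all the energy ends up in the DC coefficient and the AC coefficients vanish up to floating-point error, which matches the intuition that AC coefficients capture spatial variation.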
2.2.2.15
Editing
The process of combining, extracting, and refining multimedia data.
2.2.2.16
Eigenface
An eigenvector obtained from the principal component analysis of facial images.
2.2.2.17
Entity
Any concrete or abstract thing of interest related to the multimedia content.
2.2.2.18
Event
A noteworthy occurrence that happens at a point in time or during a temporal interval. Alternatively used as a
change in state.
2.2.2.19
Feature
A distinctive characteristic of multimedia content that signifies something to a human observer, such as the
"color" or "texture" of an image.
2.2.2.20
Filtering
A process for selecting multimedia content that satisfies certain criteria. This process may include ranking
the content according to the extent that it satisfies the criteria.
2.2.2.21
Format
The characteristics of the stored or physical representation of the data.
2.2.2.22
Frame
A single image from a video.
2.2.2.23
Image
2D spatially-varying visual data acquired from a visual source.
2.2.2.24
Key frame
A representative frame of a video or a segment.
2.2.2.25
Locator
Specifies the location or address of multimedia data or a segment.
2.2.2.26
Model
A parametric or statistical representation of multimedia content or features.
2.2.2.27
Manual
Processing of multimedia data, content, or metadata by human means.
2.2.2.28
Metadata
The information and documentation which makes multimedia data understandable and shareable to users
over time.
2.2.2.29
Multimedia
Data comprising one or more modalities, such as images, audio, video, 3D models, ink content, and so forth.
2.2.2.30
Navigation
A process by which a user accesses multimedia content and steers a course through the content in a
controlled manner.
2.2.2.31
Object
An object with a physical representation in the natural world.
2.2.2.32
Region
A spatial unit of multimedia, for example, a 2D spatial region of an image, or a moving region of video.
2.2.2.33
Relation
Any association among entities.
2.2.2.34
Rights
Information that determines the ownership and terms of use of multimedia data, content, or metadata.
Refers to Intellectual Property Rights, Copyrights, and the Access Rights.
2.2.2.35
Scene
An episode or sequence of events representing continuous action in one location.
2.2.2.36
Search
A process for searching multimedia content that satisfies certain criteria. This process may include ranking
the content according to the extent that it satisfies the criteria.
2.2.2.37
Segment
A spatial or temporal unit of multimedia, for example, a temporal segment of video, or a segment of an
image.
2.2.2.38
Semantics
Information relating to the underlying meaning or understanding of multimedia content. Alternatively, refers
to the specification of the meaning of description tools.
2.2.2.39
Summary
An abstraction of multimedia content that summarizes the content.
2.2.2.40
User
An end-user or consumer of multimedia content.
2.2.2.41
User Preferences
The preferences of a user pertaining to multimedia content. This includes the user's tastes, likes and
dislikes with respect to the content and its properties, as well as preferences with respect to the consumption
process.
2.2.2.42
Usage History
A history of actions that a user of multimedia content has carried out over a certain period of time, such as
recording a specific piece of content, or playing back recorded content at a specific time.
2.2.2.43
Variation
An alternative version of multimedia content, which may be derived through transcoding, summarization,
translation, reduction, and so forth.
2.2.2.44
Video
A space- and time-varying visual data or signal intended for viewing; commonly represented as a discrete
sequence of images or frames.
2.2.2.45
View
A portion of an image, video or audio signal, defined in terms of a partition. A partition is a multi-
dimensional region defined in the space, time and/or frequency plane.
2.2.2.46
Visual
Related to the visual modality.
2.2.2.47
View Decomposition
An organized set of views that provides a structured decomposition of an image, video or audio signal in
multi-dimensional space, time and/or frequency.
2.2.2.48
3D mesh model
Representation model of the surface of 3D objects using a set of faces and nodes (i.e. polygonal meshes).
2.3 Symbols and abbreviated terms
2.3.1 Generic
For the purposes of this part of ISO/IEC 15938, the symbols and abbreviated terms given in the following
apply:
ART: Angular-Radial Transform
AV: Audio-visual
CSS: Curvature Scale Space
CIE: International Commission on Illumination
CIF: Common Intermediate Format
CS: Classification Scheme
D: Descriptor
Ds: Descriptors
DCT: Discrete Cosine Transform
DDL: Description Definition Language
DS: Description Scheme
DSs: Description Schemes
FOC: Focus of Contraction
FOE: Focus of Expansion
GLA: Generalized Lloyd Algorithm
GoF: Group of Frames
GoP: Group of Pictures
HMMD: Hue-Min-Max-Difference
HSV: Hue-Saturation-Value
IANA: Internet Assigned Numbers Authority
IETF: Internet Engineering Task Force
IPMP: Intellectual Property Management and Protection
ISO: International Organization for Standardization
JPEG: Joint Photographic Experts Group
MDS: Multimedia Description Scheme
MNV: Mean Normal Vector
MPEG: Moving Picture Experts Group
MPEG-2: Generic coding of moving pictures and associated audio information (see ISO/IEC 13818)
MPEG-4: Coding of audio-visual objects (see ISO/IEC 14496)
MPEG-7: Multimedia Content Description Interface Standard (see ISO/IEC 15938)
MP3: MPEG-2 layer 3 audio coding
NAC: Normalized Auto-Correlation
QCIF: Quarter Common Intermediate Format
PWM: Pseudo Weighted Measure
RGB: Red-Green-Blue
SMPTE: Society of Motion Picture and Television Engineers
SSD: Shape Spectrum Descriptor
TZ: Time Zone
TZD: Time Zone Difference
URI: Uniform Resource Identifier (see RFC 2396)
URL: Uniform Resource Locator (see RFC 2396)
W3C: World Wide Web Consortium
XML: Extensible Markup Language
XOR: eXclusive-OR
2.3.2 Arithmetic operators
+ Addition
- Subtraction (as a binary operator) or negation (as a unary operator)
++ Increment, i.e. x++ is equivalent to x=x+1
-- Decrement, i.e. x-- is equivalent to x=x-1
+= Accumulation, i.e. x+=2 is equivalent to x=x+2
/= Divide and substitute, i.e. x/=2 is equivalent to x=x/2
* Multiplication
x Multiplication
^ Power
/ Integer division with truncation of the result towards zero. For example, 7/4 and -7/-4 are truncated to 1,
-7/4 and 7/-4 are truncated to -1.
// Integer division with rounding to the nearest integer. Half-integer values are rounded away from zero
unless otherwise specified. For example, 3//2 is rounded to 2, and -3//2 is rounded to -2.
÷ Used to indicate division in mathematical equations where no rounding is intended
% Modulus operator, defined only for positive numbers
ld Logarithm base 2
ceil Minimum integer number greater than or equal to the given floating point number
Sign() $\mathrm{Sign}(x) = \begin{cases} 1 & x \ge 0 \\ -1 & x < 0 \end{cases}$
Abs() $\mathrm{Abs}(x) = \begin{cases} x & x \ge 0 \\ -x & x < 0 \end{cases}$
$\sum_{i=a}^{i<b} f(i)$ Summation of f(i) with i taking integer values from a up to, but not including b.
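The two integer divisions defined in this subclause do not coincide with the native integer division of every programming language (Python's built-in `//`, for instance, rounds towards negative infinity). The following is a minimal sketch of the `/`, `//` and summation conventions; the function names `div_trunc`, `div_round` and `summation` are illustrative and are not defined in ISO/IEC 15938.

```python
def div_trunc(a: int, b: int) -> int:
    """The '/' operator of this subclause: truncate the quotient towards zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def div_round(a: int, b: int) -> int:
    """The '//' operator of this subclause: nearest integer, halves away from zero."""
    q = (2 * abs(a) + abs(b)) // (2 * abs(b))
    return q if (a >= 0) == (b >= 0) else -q


def summation(f, a: int, b: int):
    """Sum of f(i) for integer i from a up to, but not including, b."""
    return sum(f(i) for i in range(a, b))
```

With these helpers, `div_trunc(-7, 4)` gives -1 and `div_round(-3, 2)` gives -2, matching the worked values in the operator definitions.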
2.3.3 Logical operators
|| Logical OR
&& Logical AND
! Logical NOT
2.3.4 Relational operators
> Greater than
>= Greater than or equal to
≥ Greater than or equal to
< Less than
<= Less than or equal to
≤ Less than or equal to
== Equal to
!= Not equal to
max[] Maximum value in argument list
min[] Minimum value in argument list
median[] Median value in argument list
2.3.5 Bitwise operators
| OR
& AND
>> Shift right with sign extension
<< Shift left with zero fill
2.3.6 Conditional operators
?: $\text{condition} \; ? \; a : b = \begin{cases} a & \text{if (condition)} \\ b & \text{otherwise} \end{cases}$
2.3.7 Assignment
= Assignment operator
2.3.8 Constants
π 3.141 592 653 58…
e 2.718 281 828 45…
2.3.9 Functions
max() Maximum value in argument list
min() Minimum value in argument list
Sign() $\mathrm{Sign}(x) = \begin{cases} 1 & x \ge 0 \\ -1 & x < 0 \end{cases}$
Abs() $\mathrm{Abs}(x) = \begin{cases} x & x \ge 0 \\ -x & x < 0 \end{cases}$
$\sum_{i=a}^{i<b} f(i)$ Summation of f(i) with i taking integer values from a up to, but not including b.
Distances between N-dimensional vectors x and y
L1 norm $L1(x,y) = \sum_{i}^{N} |x_i - y_i|$
L2 norm $L2(x,y) = \sum_{i} (x_i - y_i)^2$
Euclidean distance $D(x,y) = \sqrt{\sum_{i} (x_i - y_i)^2}$
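As an illustration of the vector distances listed in this subclause, the following minimal sketch computes the L1 norm and the Euclidean distance between two equal-length vectors. It assumes the L1 norm is the sum of absolute component differences and the Euclidean distance is the square root of the sum of squared component differences; the function names are illustrative only.

```python
import math


def l1_distance(x, y):
    """Sum of absolute component differences between x and y."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


def euclidean_distance(x, y):
    """Square root of the sum of squared component differences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```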
2.4 Default reference axis
The default reference axis for angle calculation is the positive x (horizontal) axis. Positive angle is calculated
anti-clockwise.
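The angle convention can be sketched as follows. This is an illustrative helper, not part of ISO/IEC 15938, and it assumes a coordinate system whose y axis points upwards; for image coordinates with y pointing downwards, the sign of `dy` would have to be flipped first.

```python
import math


def angle_from_x_axis(dx: float, dy: float) -> float:
    """Angle in degrees, measured anti-clockwise from the positive x axis."""
    # atan2 returns a value in (-pi, pi]; the modulo maps it to [0, 360).
    return math.degrees(math.atan2(dy, dx)) % 360.0
```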
3 MDS tools
3.1 Introduction
Clause 3.1 provides guidelines and examples of the extraction and use of descriptions for tools defined in
ISO/IEC 15938-5.
3.2 Schema tools
3.2.1 Introduction
This clause specifies the schema tools that facilitate the making of descriptions. The following description
tools are specified: (1) the base type hierarchy of the description tools defined in ISO/IEC 15938, (2) the root
element, (3) the top-level tools, (4) the multimedia content entity tools, (5) package tool, and (6) description
metadata tool. The functionality of these tools is given as follows:
Table 1 - Overview of Schema Tools.
Tool | Functionality
Base types | Form the type hierarchy for description tools.
Root element | The initial wrapper or root element of descriptions.
Top-level tools | The elements that follow the root element in descriptions.
Multimedia content entity tools | Tools for describing different types of multimedia content such as images, video, audio, mixed multimedia, collections, and so forth.
Package | Tool for organizing or packaging description tools.
Description metadata | Tool for describing metadata about descriptions.
3.2.2 Base types
No additional informative material for extraction and use is provided.
3.2.3 Root element
3.2.3.1 Root element examples
The following example shows the use of the root element for describing an instance of the ScalableColor
D (defined in ISO/IEC 15938-3) using DescriptionUnit.
1.0
descriptionUnitExample
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
The following example shows the use of the root element for describing an image using Description.
1.0
1.1
2001-09-20T03:20:25+09:00
ISO
International Organization of Standardization
098f2470-bae0-11cd-b579-08002b30bfeb
completeDescriptionExample
Creator
Yoshiaki
Shibata
jp
Tokyo
2000-10-10T19:45:00+09:00
Wizzo Extracto ver. 2
RID#
3.2.4 Top-level types
3.2.4.1 Complete description types
3.2.4.1.1 Content description types examples
The following example shows the use of the content description type ContentEntityType for describing a
photographic image depicting a sunset.
...
3.12.4 Probability models
3.12.4.1 ProbabilityModel DS
Information on extraction and use is not provided.
3.12.4.2 ProbabilityDistribution DS
3.12.4.2.1 ProbabilityDistribution DS examples
The following example shows the use of the ProbabilityDistribution DS for describing a probability
distribution in terms of its statistics such as mean, variance, min, max, mode, median, and moment.
dim="2">
0.25 0.5
0.1 0.9
0.0 0.0
1.0 1.0
0.5 0.5
0.5 0.5
0.2 0.3
3.12.4.3 Discrete distribution description tools
3.12.4.3.1 Discrete distribution description tools examples
The following example illustrates the use of the HistogramProbability DS for describing an eight-
dimensional histogram.
dim="8">
0.125 0.0 0.0 0.25 0.125 0.25 0.25 0.0
The following example illustrates the use of the BinomialDistribution DS for describing a one-
dimensional binomial distribution.
0.5
16
The following example illustrates the use of the HyperGeometricDistribution DS for describing a two-
dimensional hypergeometric distribution.
5 5
8 8
16 16
The following example illustrates the use of the PoissonDistribution DS for describing a three-
dimensional Poisson distribution.
0.4 0.3 0.75
The following example illustrates the use of the GeometricDistribution DS for describing a one-
dimensional geometric distribution.
0.75
The following example illustrates the use of the DiscreteUniformDistribution DS for describing a
one-dimensional discrete uniform distribution.
0.125
2.0
8.0
3.12.4.4 Continuous distribution description tools
3.12.4.4.1 Continuous distribution description tools examples
The following example illustrates the use of the GaussianDistribution DS for describing a one-
dimensional Gaussian distribution.
0.5
0.25
The following example illustrates the use of the GeneralizedGaussianDistribution DS for describing a two-
dimensional generalized Gaussian distribution.
0.5 0.35
0.25 0.75
2 2
The following example illustrates the use of the ExponentialDistribution DS for describing a two-
dimensional exponential distribution.
0.5 0.25
The following example illustrates the use of the GammaDistribution DS for describing a three-
dimensional Gamma distribution.
0.4 0.3 0.75
8 4 16
The following example illustrates the use of the ContinuousUniformDistribution DS for describing a
one-dimensional continuous uniform distribution.
1.0
0.0
The following example illustrates the use of the LognormalDistribution DS for describing a one-
dimensional lognormal distribution.
0.5
0.35
3.12.4.5 Finite state model description tools
3.12.4.5.1 Finite state model description tools examples
The following example shows the use of the StateTransitionModel DS for describing a state-transition
model that has three states. In this example, assume that the observed weather has one of the following
states: "precipitation", "cloudy", or "sunny". Furthermore, the state-transition probabilities indicate the
probability of moving from one state to another. The initial probabilities specify the initial probability of each
state. The state-transition model allows questions to be answered such as, what is the probability that the
weather for eight consecutive days is "sun-sun-sun-rain-rain-sun-cloudy-sun", or given that the system is in a
known state, i.e., "sunny", what is the expected number of consecutive days the system will remain in that
state.
0.5 0.25 0.25
0.4 0.3 0.3 0.2 0.6 0.2 0.1 0.1 0.8
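The two questions posed for the weather model can be answered directly from the example values. The following minimal sketch assumes that the states are ordered "precipitation", "cloudy", "sunny", that the nine transition values are stored row by row with each row holding the probabilities of leaving one state, and that "rain" and "sun" in the quoted sequence stand for "precipitation" and "sunny". These orderings are assumptions about the example, not statements from ISO/IEC 15938-5.

```python
STATES = ["precipitation", "cloudy", "sunny"]   # assumed state order
INITIAL = [0.5, 0.25, 0.25]                     # initial state probabilities
TRANSITIONS = [                                 # assumed row = "from" state
    [0.4, 0.3, 0.3],
    [0.2, 0.6, 0.2],
    [0.1, 0.1, 0.8],
]


def sequence_probability(sequence):
    """Probability of observing the given sequence of state names."""
    idx = [STATES.index(name) for name in sequence]
    p = INITIAL[idx[0]]
    for prev, cur in zip(idx, idx[1:]):
        p *= TRANSITIONS[prev][cur]
    return p


def expected_duration(state):
    """Expected number of consecutive steps spent in a state once it is entered."""
    i = STATES.index(state)
    # The stay length is geometric with self-transition probability a_ii.
    return 1.0 / (1.0 - TRANSITIONS[i][i])


week = ["sunny", "sunny", "sunny", "precipitation",
        "precipitation", "sunny", "cloudy", "sunny"]
p_week = sequence_probability(week)
sunny_days = expected_duration("sunny")
```

Under these assumptions the eight-day sequence has a probability of roughly 3.84e-5, and a sunny spell is expected to last about 5 days.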
The following example shows the use of the StateTransitionModel DS for describing a state-transition
model of the events depicted in a video of a soccer game. The model describes three states: "Pass", "Shot on
goal" and "Goal score". The StateTransitionModel DS describes the states, the initial state
probabilities, and the transition probabilities between the states.
0.25 0.5 0.25
0.2 0.2 0.6 0.1 0.8 0.1 0.3 0.3 0.4
The following example shows the use of DiscreteHiddenMarkovModelType for describing a discrete
hidden Markov model of events in a soccer game, such as "passes" and "goal scores".
0.5 0.5
0.4 0.6 0.3 0.7
2
Team A pass
Team B pass
momentNormalized="1" dim="2">
0.2 0.8
2
Team A goal
Team B goal
momentNormalized="1" dim="2">
0.4 0.6
The following example shows the use of the ContinuousHiddenMarkovModel DS for describing a
continuous hidden Markov model of audio sound effects. Each continuous hidden Markov model has 5
states and represents a sound effect class. The parameters of the continuous density state model can be
estimated via training, for example, using the Baum-Welch algorithm. After training, the continuous HMM
model consists of a 5x5 state transition matrix, a 5x1 initial state density matrix, and 5 multi-dimensional
Gaussian distributions defined in terms of the mean and variance parameters. Each multi-dimensional
Gaussian distribution has six dimensions corresponding to audio features comprised of 5 channels of
Independent Component Analysis (ICA) data and 1 channel of spectral envelope data.
0.1 0.2 0.1 0.4 0.2
0.2 0.2 0.6 0.0 0.0
0.1 0.2 0.1 0.3 0.3
0.4 0.2 0.1 0.1 0.2
0.2 0.1 0.4 0.2 0.1
0.0 0.2 0.1 0.3 0.4
1 2 3 4 5 6
BeatData
0.5 0.5 0.25 0.3 0.5 0.3
0.25 0.75 0.5 0.45 0.75 0.3
0.25 0.4 0.25 0.3 0.2 0.1
0.5 0.25 0.5 0.45 0.5 0.2
0.2 0.5 0.35 0.3 0.5 0.5
0.5 0.5 0.5 0.5 0.75 0.5
0.5 0.3 0.25 0.2 0.5 0.6
0.5 0.1 0.5 0.25 0.75 0.4
0.5 0.15 0.25 0.3 0.5 0.35
0.5 0.75 0.5 0.5 0.75 0.35
3.12.4.5.2 FiniteStateModel DS use
State-transition models can be extracted from the events along the temporal dimension of a video sequence.
For example, a temporal sequence of scenes in video can be characterized by a state-transition model that
describes the probabilities of transitions between scenes.
The state-transition models can be matched using the following matching metrics to determine the similarity
of the underlying multimedia content being modeled:
Euclidean distance: calculates the sum of squared differences between transition probabilities.
$$\text{dissimilar score} = \sum_{i}\sum_{j} \left(A_{ij} - B_{ij}\right)^2$$
Quadratic distance: calculates the sum of weighted quadratic distance between transition probabilities.
$$\text{dissimilar score} = \sum_{i}\sum_{j}\sum_{k}\sum_{l} \frac{\left(A_{ij} - B_{ij}\right)\left(A_{kl} - B_{kl}\right)}{1 + \mathrm{abs}(i - k) + \mathrm{abs}(j - l)}$$
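Weighted transition frequency: calculates the weighted sum of ratios of transition probabilities.
$$fA_{ij} = A_{ij} + A_{ji}, \qquad rA_{ij} = \frac{A_{ij}}{A_{ji}}$$
$$\text{match score} = \sum_{i}\sum_{j \ge i} fA_{ij} \cdot fB_{ij} \cdot \min\left(\frac{fA_{ij}}{fB_{ij}}, \frac{fB_{ij}}{fA_{ij}}\right) \cdot \min\left(\frac{rA_{ij}}{rB_{ij}}, \frac{rB_{ij}}{rA_{ij}}\right)$$
where $fB_{ij}$ and $rB_{ij}$ are defined for B in the same way as $fA_{ij}$ and $rA_{ij}$ are for A.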
Euclidean distance of aggregated state transitions: calculates the sum of the squared differences of
aggregated transitions.
$$\text{dissimilar score} = \sum_{i}\left(\sum_{j} A_{ij} - \sum_{j} B_{ij}\right)^2 + \sum_{j}\left(\sum_{i} A_{ij} - \sum_{i} B_{ij}\right)^2$$
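To make the matching metrics concrete, the following minimal sketch implements three of them for two square transition matrices A and B given as lists of rows: the Euclidean distance, the quadratic distance and the Euclidean distance of aggregated state transitions. It assumes that differences are squared, as the textual descriptions state, and that the quadratic weight is 1 / (1 + abs(i - k) + abs(j - l)). The function names are illustrative and do not come from ISO/IEC 15938.

```python
def euclidean_score(A, B):
    """Sum of squared differences between corresponding transition probabilities."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))


def quadratic_score(A, B):
    """Weighted sum of products of differences over all pairs of matrix entries."""
    n = len(A)
    D = [[A[i][j] - B[i][j] for j in range(n)] for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    total += D[i][j] * D[k][l] / (1 + abs(i - k) + abs(j - l))
    return total


def aggregated_score(A, B):
    """Squared differences of row sums plus squared differences of column sums."""
    n = len(A)
    rows = sum((sum(A[i]) - sum(B[i])) ** 2 for i in range(n))
    cols = sum((sum(A[i][j] for i in range(n))
                - sum(B[i][j] for i in range(n))) ** 2
               for j in range(n))
    return rows + cols
```

All three scores are zero when the two models are identical. The aggregated score can also be zero for models that differ entry by entry, as long as their row and column totals agree, so it is the coarsest of the three.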
3.12.5 Analytic models
3.12.5.1.1 AnalyticModel DS examples
The example below illustrates the use of the ModelState DS to describe an analytic model that
represents a state in a finite state model. The description gives three labels for the state and gives
information about the confidence and reliability of the state analytic model.
3.12.5.2 CollectionModel DS
3.12.5.2.1 CollectionModel DS examples
The following example illustrates the use of the CollectionModel DS for describing a collection model
consisting of a content collection of four images. In this example, the semantic concept is being described by
the collection of images. For example, the collection of images describes the concept of "soccer shots on
goal". In this example, the description gives a confidence of 0.75 and a reliability of 0.5.
function="described">
soccer1.jpg
soccer2.jpg
soccer3.jpg
soccer4.jpg
The following example illustrates the use of the CollectionModel DS for describing a model of a
collection of two color descriptions. In this example, the semantic concept "Sunsets" has the function of
describing the collection of descriptors.
function="describing">
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
The following example illustrates the use of the CollectionModel DS for describing a collection of four
concepts. The concept collection has a semantic label, "soccer stadium objects"; that is, the semantic
concept has the function of describing the collection of concepts. In this example, the confidence is 0.75 and
the reliability is 0.5.
function="describing">
3.12.5.3 DescriptorModel DS
3.12.5.3.1 DescriptorModel DS examples
The following example illustrates the use of the DescriptorModel DS for describing a descriptor
model of the ScalableColor descriptor. The DescriptorModel DS specifies that the descriptor model is
formed from the Coefficients element of the ScalableColor D (defined in ISO/IEC 15938-3). For
example, in the DescriptorModel for ScalableColorType, the value of numOfCoefficients="16" is
constant in the descriptor model.
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
Coeff
The following example illustrates the use of the DescriptorModel DS for describing a descriptor
model of the ContourShape descriptor. The DescriptorModel DS specifies that the descriptor model is
formed from the concatenation of six elements of the ContourShape D (defined in ISO/IEC 15938-3). All
other elements and attributes not mentioned in the field statements are assumed to be constant in the
descriptor models, taking the values specified in the example Descriptors.
3 5
4 7
2
GlobalCurvature
PrototypeCurvature
HighestPeakY
Peak
3.12.5.4 ProbabilityModelClass DS
3.12.5.4.1 ProbabilityModelClass DS examples
The following example illustrates the use of the ProbabilityModelClass DS for describing a probability
model class that characterizes a class of "Nature scene" images by representing the statistics associated
with the color features of the images of that class. For example, the specification below indicates that the
Coefficients element of the ScalableColor D (defined in ISO/IEC 15938-3) for the Nature scene
images has a centroid or mean value of (0.5, 0.5, …) and a variance of (0.25, 0.75, …).
reliability="0.5">
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
Coeff
confidence="1.0"
dim="16">
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 5
The following example illustrates the use of the ProbabilityModelClass DS for describing a probability
model that characterizes checker patterns using a probability model of EdgeHistogram D (see ISO/IEC
15938-3).
reliability="0.5">
3 5 2 5 . . . 6
BinCounts
confidence="0.75"
dim="80">
5 3 9 4 . . . 5
4.5 5.0 8.0 7.5 . . . 5.2
The following example illustrates the use of the ProbabilityModelClass DS for describing a probability
model that characterizes oval shapes using a probability model of RegionShape D (see ISO/IEC 15938-3).
reliability="0.5">
3 5 2 5 . . . 6
MagnitudeOfART
confidence="1.0"
dim="35">
4 8 6 9 . . . 5
1.3 2.5 5.0 4.5 . . . 3.2
The following example illustrates the use of the ProbabilityModelClass DS for describing a probability
model that characterizes silhouettes using a probability model of ContourShape D (see ISO/IEC 15938-3).
reliability="0.5">
3 5
4 7
2
GlobalCurvature
PrototypeCurvature
HighestPeakY
Peak
confidence="1.0"
dim="11">
3 9 7 4 6 9 1 2 6 8 9
3.4 5.4 9.1 1.8 2.3 6.5 7.9 1.2 3.4 3.3
1.0
3.12.6 Cluster models
3.12.6.1.1 ClusterModel DS examples
The following example illustrates the use of the ClusterModel DS for describing a cluster model consisting
of a set of examples of two scalable color descriptions. The cluster model has a semantic label: "Nature
scenes". The example also describes the probability model of the ScalableColor D (see ISO/IEC 15938-3).
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
Coeff
confidence="1.0"
dim="16">
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 5
3.12.7 Classification models
Information on extraction and use is not provided.
3.12.7.1 ClusterClassificationModel DS
3.12.7.1.1 ClusterClassificationModel examples
The following example illustrates the use of the ClusterClassificationModel DS for describing a
cluster classification model related to scenes from a soccer game. In this example, the
ClusterClassificationModel is comprised of two ClusterModels. The first ClusterModel
describes a cluster of two images that form the class with label "Soccer shots on goal". The second
ClusterModel describes a cluster of two images that form the class with label "Goal scores".
reliability="0.8" complete="true" redundant="false">
soccer1.jpg
soccer2.jpg
soccer3.jpg
soccer4.jpg
The following example illustrates the use of the ClusterClassificationModel DS for describing a
cluster classification model related to images depicting nature scenes. The
ClusterClassificationModel is comprised of two ClusterModels. The first ClusterModel
describes a collection of two scalable color descriptions that forms the class with label "Sunsets". The
second ClusterModel describes a collection of two scalable color descriptions that forms the class with
label "Nature scenes".
reliability="0.75" complete="true" redundant="false">
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
3.12.7.2 ProbabilityClassificationModel DS
3.12.7.2.1 ProbabilityClassificationModel DS examples
The following example illustrates the use of the ProbabilityClassificationModel DS for describing a
probability classification model for sunset and nature images. The ProbabilityClassificationModel
DS is comprised of two instances of ProbabilityModelClass DS. The first ProbabilityModelClass
DS instance specifies a class of scalable color descriptors with label "Sunsets". The second
ProbabilityModelClass DS instance specifies a class of scalable color descriptors with label "Nature
scenes".
reliability="0.75" complete="true" redundant="false">
numOfBitplanesDiscarded="0">
4 5 6 7 8 9 0 1 2 3 4 5 6 1 2 3
Coeff
confidence="1.0" dim="16">
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
5 6 7 8 9 0 1 2 3 4 5 6 5 2 3 4
numOfBitplanesDiscarded="0">
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
Coeff
confidence="1.0" dim="16">
3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2
5 6 5 2 3 4 5 6 7 8 9 0 1 2 3 4
3.13 User interaction tools
3.13.1 Introduction
This clause specifies the following description tools related to user interaction with multimedia content.
Table 15 - Overview of User Interaction Tools.
Tool | Functionality
User Preferences | Tools for describing user preferences pertaining to multimedia content, enabling effective user interaction and personalization of content access and consumption. See clause 3.13.2.
Usage History | Tools for describing usage history of users of multimedia content, enabling effective user interaction and personalization of content access and consumption. See clause 3.13.3.
3.13.2 User preferences
The User Preferences tools are used to specify a user's preferences pertaining to filtering, searching and
browsing of multimedia content and can be used for personalized viewing and listening. Filtering and search
preferences describe, for example, favorite titles, genres, actors and sources of content. This information can
be used to find preferred multimedia content by matching it with information in multimedia content
descriptions. Browsing preferences, for example, describe preferred views of favorite content, where
preferences may be dependent on usage conditions such as available bandwidth or the time the user has to
consume the information. This allows the user to navigate and access different views of the content in a
personalized manner.
3.13.2.1 UserPreferences DS
3.13.2.1.1 UserPreferences DS examples
An example UserPreferences description containing several preferences pertaining to filtering and
searching of content is as follows. Note that XML comments indicate description elements that are discussed
in more detail in later subclauses.
Harrison
Ford
1995-01-01
P1825D
News
Sports
Documentary
3.13.2.1.2 UserPreferences DS extraction
In general, UserPreferences descriptions can be constructed manually or automatically. A
UserPreferences description may be constructed based on explicit input from the multimedia content
user. Alternatively, a UserPreferences description may be constructed automatically based on the user's
content usage history.
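As one possible illustration of automatic construction, the sketch below derives a list of preferred genres from a usage history by counting how often each genre was consumed. The input format (pairs of title and genre) and the function name are hypothetical. A real system would read the genres from usage history and content descriptions and would write the result into a UserPreferences description.

```python
from collections import Counter


def genre_preferences_from_history(history, top_n=3):
    """Return the top_n most frequently consumed genres, most frequent first.

    history is an iterable of (title, genre) pairs, one per consumed item.
    """
    counts = Counter(genre for _title, genre in history)
    return [genre for genre, _count in counts.most_common(top_n)]
```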
3.13.2.1.3 UserPreferences DS use
In general, the descriptions of the user's preferences can be used to automatically find preferred multimedia
content and to present preferred views of the content to the user.
In the following, a method is described for automatic filtering of multimedia content using content
descriptions and user preference descriptions. Each individual component of the
FilteringAndSearchPreferences DS in the UserPreferences DS corresponds to a test or matching operation
against components of descriptions of multimedia content. Examples of individual components are the Role,
GivenName, FamilyName, DatePeriod elements and the Genre elements in the example user preference
description shown in subclause 3.13.2.1.1.
For example, the name values of each Genre element in this user preference description can be matched
against the name values of the Genre element in the following partial content description. Likewise, the date
values of the DatePeriod element can be matched against the date value of the Release element of the
content description below.
Afternoon news
News
es
After individual preference components are tested, the results of these tests must be combined into a single
test result that reflects the user’s overall preferences. The individual test results can be combined according
to the hierarchy of the UserPreferences description. Each parent test becomes a combination of its children
tests, and this continues up the hierarchy, yielding one composite test result. The combination of children
tests within a single parent may depend on the types of the elements being tested. For example, the partial
user preferences description above specifies a preference for content with an actor having first name
"Harrison" and having last name "Ford". Therefore, in this case, an intersection operator can be used to
combine the individual test results. As another example, the user preferences description above specifies
preferences for multiple genres: "News", "Sports" and "Documentary". Different multimedia programs, each
with a different genre, may each satisfy the individual tests against these genre preferences. Therefore, in
this case, a union operator can be used to combine the individual test results.
The combination of the user's preferences and the overall matching result against multimedia content
descriptions can be used to rank-order the corresponding multimedia content items and/or to automatically
filter the content items.
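The combination of individual tests described above can be sketched as follows. The data classes are simplified, hypothetical stand-ins for content and user preference descriptions, not the MPEG-7 types themselves. The actor test uses an intersection (given name and family name must match the same person) and the genre test uses a union. Adding the two group results into a score for ranking is an assumption of this sketch, since the combination rule at the top of the hierarchy is left to the application.

```python
from dataclasses import dataclass, field


@dataclass
class ContentDescription:
    title: str
    genres: set = field(default_factory=set)
    actors: set = field(default_factory=set)   # (given_name, family_name) pairs


@dataclass
class Preferences:
    actor: tuple    # preferred (given_name, family_name)
    genres: set     # preferred genre names


def actor_test(content, prefs):
    """Intersection: both name parts must match for the same actor."""
    given, family = prefs.actor
    return any(g == given and f == family for g, f in content.actors)


def genre_test(content, prefs):
    """Union: any one of the preferred genres is sufficient."""
    return any(genre in content.genres for genre in prefs.genres)


def match_score(content, prefs):
    """Number of preference groups satisfied (assumed combination rule)."""
    return int(actor_test(content, prefs)) + int(genre_test(content, prefs))


def filter_and_rank(items, prefs):
    """Drop items that match nothing, then order by descending score."""
    matched = [c for c in items if match_score(c, prefs) > 0]
    return sorted(matched, key=lambda c: match_score(c, prefs), reverse=True)
```

Because the actor test requires both name parts on the same person, a program featuring "Harrison Smith" and "John Ford" does not satisfy a preference for "Harrison Ford".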
In general, individual preference components are the elements and attributes of the
CreationPreferences DS, the ClassificationPreferences DS, the SourcePreferences DS and
the SummaryPreferences DS. The following tables provide a more complete mapping from individual
elements (and attributes) of a user preference description to individual elements (and attributes) of content
descriptions. The first column of each table specifies the name of an element (or attribute) of a user
preference description. The second column of each table specifies the name(s) of one or more elements (or
attributes) of a content description that the preference element (or attribute) maps to. Note that elements in
both the first and second columns of each table may contain further children elements (or attributes) that
may be included in the mapping implicitly.
Note: This mapping is an example mapping and is not normative.
Table 16 - Informative mapping of elements (and attributes) from UserPreferences/FilteringAndSearchPreferences/CreationPreferences to elements (and attributes) of a content description.
Element/attribute Name | Mapping
Title | CreationInformation/Creation/Title
Creator | CreationInformation/Creation/Creator
Keyword | CreationInformation/Creation/Title; CreationInformation/Creation/Abstract
Location | CreationInformation/Creation/CreationCoordinates/Location
DatePeriod | CreationInformation/Creation/CreationCoordinates/Date
Tool | CreationInformation/Creation/CreationTool/Tool
Table 17 - Informative mapping of elements (and attributes) from UserPreferences/FilteringAndSearchPreferences/ClassificationPreferences to elements (and attributes) of a content description.
Element/attribute Name | Mapping
Country | CreationInformation/Classification/Release/Region
DatePeriod | CreationInformation/Classification/Release/date
LanguageFormat | CreationInformation/Classification/Language; CreationInformation/Classification/CaptionLanguage; CreationInformation/Classification/SignLanguage
Language | CreationInformation/Classification/Language
CaptionLanguage | CreationInformation/Classification/CaptionLanguage
Form | CreationInformation/Classification/Form
Genre | CreationInformation/Classification/Genre
Subject | CreationInformation/Classification/Subject
Review | CreationInformation/Classification/MediaReview
ParentalGuidance | CreationInformation/Classification/ParentalGuidance
Table 18 - Informative mapping of elements (and attributes) from UserPreferences/FilteringAndSearchPreferences/SourcePreferences to elements (and attributes) of a content description.
Element/attribute Name | Mapping
DisseminationFormat | UsageInformation/Availability/Dissemination/Format
DisseminationSource | UsageInformation/Availability/Dissemination/Source
DisseminationLocation | UsageInformation/Availability/Dissemination/Location
DisseminationDate | UsageInformation/Availability/AvailabilityPeriod
Disseminator | UsageInformation/Availability/Dissemination/Disseminator
MediaFormat | MediaInformation/MediaProfile/MediaFormat
noRepeat | UsageInformation/Availability/AvailabilityPeriod/type
noEncryption | UsageInformation/Availability/AvailabilityPeriod/type
noPayPerUse | UsageInformation/Availability/AvailabilityPeriod/type
Table 19 - Informative mapping of elements (and attributes) from UserPreferences/BrowsingPreferences/SummaryPreferences to elements (and attributes) of a content description.
Element/attribute Name | Mapping
SummaryType | HierarchicalSummary/components; SequentialSummary/components
SummaryTheme | HierarchicalSummary/SummaryThemeList/SummaryTheme; SequentialSummary/TextualSummaryComponent/FreeText
SummaryDuration | HierarchicalSummary/SummarySegmentGroup/duration
MinSummaryDuration | HierarchicalSummary/SummarySegmentGroup/duration
MaxSummaryDuration | HierarchicalSummary/SummarySegmentGroup/duration
NumOfKeyFrames | HierarchicalSummary/SummarySegmentGroup/numOfKeyFrames
MinNumOfKeyFrames | HierarchicalSummary/SummarySegmentGroup/numOfKeyFrames
MaxNumOfKeyFrames | HierarchicalSummary/SummarySegmentGroup/numOfKeyFrames
NumOfChars | CreationInformation/Creation/Abstract/FreeTextAnnotation
MinNumOfChars | CreationInformation/Creation/Abstract/FreeTextAnnotation
MaxNumOfChars | CreationInformation/Creation/Abstract/FreeTextAnnotation
3.13.2.2 UserIdentifier datatype
3.13.2.2.1 UserIdentifier datatype examples
The UserIdentifier datatype may be used to identify a particular user preference description and
distinguish it from other user preference descriptions. The Name element may contain the user's actual
name, a nickname, a user's account name or email address, or any other name. The same user may have
multiple user preference descriptions, each identified by a different value of Name, for use under different
usage conditions. Also, a group of persons can use a single set of user preferences, using a single identifier
for the group. The following partial example illustrates a description owned by user "Jane", who desires to
keep the identifier part of this description private.
Jane
3.13.2.3 FilteringAndSearchPreferences DS
3.13.2.3.1 FilteringAndSearchPreferences DS examples
The FilteringAndSearchPreferences DS and its main parts, the CreationPreferences DS,
ClassificationPreferences DS and SourcePreferences DS, are container preference
components that group individual preference components.
In general, the FilteringAndSearchPreferences DS may be used to search or filter multimedia content
by matching the values of components of the filtering and search preferences to the values of parts of
descriptions of multimedia content. For instance, the values of CreationPreferences may be matched to
instances of the Creation DS as defined in subclause 9.1.2. The values of
ClassificationPreferences may be matched to instances of the Classification DS as defined in
subclause 9.1.3. The values of SourcePreferences may be matched to instances of the
MediaInformation DS, CreationInformation DS or UsageInformation DS as defined in clauses
8, 9 and 10.
Each individual component of the FilteringAndSearchPreferences DS corresponds to a test or
matching operation by a content filtering or search application against components of descriptions of
multimedia content. The FilteringAndSearchPreferences DS by itself does not imply any specific
normative behavior of such an application regarding the combination of individual preference components.
The most basic behavior of an application would be consistent with viewing the structure of the
FilteringAndSearchPreferences DS as a simple aggregation of individual preference components.
However, the structure of the DS strongly suggests restrictions on the behavior expected from such an
application by the user, with regard to combinations of preference components. A few examples are
discussed in the following.
For example, when multiple sibling elements of the same type are present in a user preference description,
such as the multiple Genre elements in the following description, an application may tend to favor
multimedia content matching the first Genre ("Daily news") or the second Genre ("Sports"). Note that the
exact behavior may also be made to depend on whether the particular element in the user preference
description can be present only once in the content description or not.
Daily news
Sports
As another example, in the presence of a negative preferenceValue, such as for one Genre element
among multiple sibling Genre preference elements, an application may tend to suppress multimedia content
matching the first Genre ("Daily news") and favor multimedia content that matches the Genre with a positive
preferenceValue ("Sports").
Daily news
...







