Information technology — Multimedia content description interface — Part 3: Visual — Amendment 1: Visual extensions

Technologies de l'information — Interface de description du contenu multimédia — Partie 3: Visuel — Amendement 1: Extensions visuelles

General Information

Status: Published
Publication Date: 27-Jul-2004
Current Stage: 6060 - International Standard published
Start Date: 28-Jul-2004
Due Date: 30-Jul-2004
Completion Date: 30-Dec-2003

Relations

Effective Date
06-Jun-2022
Effective Date
15-Apr-2008

Overview

ISO/IEC 15938-3:2002/Amd 1:2004 is the first amendment to the MPEG-7 Visual component of the ISO/IEC 15938 multimedia content description standard. It extends Part 3 (Visual) by adding visual extensions - new descriptor containers, supporting tools, and formalized syntaxes. The amendment specifies normative syntax using the Description Definition Language (DDL) and defines binary representations and semantics for visual description tools used to annotate still images, video and 3D models.

Key topics and technical requirements

  • Visual description toolset: Extensions cover descriptors and supporting tools in five feature categories - Color, Texture, Shape, Motion, Localization - plus face-recognition descriptors.
  • Supporting and container tools: Adds and formalizes containers and helpers such as GridLayout, TimeSeries, GofGopFeature (Group-of-Frames/Group-of-Pictures), MultipleView, Spatial2DcoordinateSystem, TemporalInterpolation.
  • Color and illumination support: New or extended color descriptors (e.g., DominantColor, ScalableColor, ColorLayout, ColorStructure, ColorTemperature, IlluminationInvariantColor, GoFGoPColor) and supporting tools (ColorSpace, ColorQuantization) to enable illumination-invariant matching and group/frame aggregation.
  • Texture, shape and motion descriptors: Texture (HomogeneousTexture, EdgeHistogram, TextureBrowsing), shape (RegionShape, ContourShape, ShapeVariation, Shape3D) and motion (CameraMotion, MotionTrajectory, ParametricMotion, MotionActivity).
  • Localization and face tools: RegionLocator and SpatioTemporalLocator for ROI annotation; FaceRecognition and AdvancedFaceRecognition for face-based matching.
  • DDL and binary syntax: Normative DDL definitions and a generic binary representation classification scheme (MPEG7… codec terms) are provided, including assignment of descriptor IDs and binary field formats (examples: GofGopFeature binary syntax and fields).
  • Interoperability constraints: The amendment enumerates which descriptor types are allowed within containers (e.g., GofGopFeature supports DominantColor, ColorLayout, EdgeHistogram, HomogeneousTexture) and reserves ID ranges for future use.

Applications and users

  • Applications: Content-based image/video indexing, similarity search, multimedia asset management, video summarization, surveillance analytics, face-based retrieval, 3D model indexing, and cross-frame feature aggregation.
  • Who uses it: Standards implementers, MPEG-7 library and codec developers, multimedia search engine vendors, video analytics researchers, digital library architects and software vendors integrating visual-descriptor interoperability.

Related standards

  • ISO/IEC 15938-3 (Visual - base)
  • ISO/IEC 15938-1 (system and codec configuration)
  • ISO/IEC TR 15938-8 and its amendment (extraction and use of MPEG-7 descriptions)

Keywords: ISO/IEC 15938-3, MPEG-7 Visual, visual descriptors, GofGopFeature, DDL, binary representation, color descriptors, texture descriptors, motion descriptors, multimedia metadata.

Standard

ISO/IEC 15938-3:2002/Amd 1:2004 - Visual extensions

English language
1185 pages


Frequently Asked Questions

ISO/IEC 15938-3:2002/Amd 1:2004 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology — Multimedia content description interface — Part 3: Visual — Amendment 1: Visual extensions".


ISO/IEC 15938-3:2002/Amd 1:2004 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 15938-3:2002/Amd 1:2004 has the following relationship with other standards: it amends ISO/IEC 15938-3:2002. Understanding this relationship helps ensure you are using the most current and applicable version of the standard.


Standards Content (Sample)


INTERNATIONAL STANDARD ISO/IEC 15938-3
First edition 2002-06-20
AMENDMENT 1 2004-08-01
Information technology — Multimedia content description interface —
Part 3: Visual
AMENDMENT 1: Visual extensions
Technologies de l'information — Interface de description du contenu multimédia —
Partie 3: Visuel
AMENDEMENT 1: Extensions visuelles

Reference number
ISO/IEC 15938-3:2002/Amd.1:2004(E)
© ISO/IEC 2004

©  ISO/IEC 2004
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 1 to ISO/IEC 15938-3:2002 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
information.
Introduction
This document specifies the first Amendment to the Visual part of the ISO/IEC 15938 standard. The normative
syntax of the Visual description tools is specified in this document using the Description Definition Language
(DDL) and the normative semantics is specified using text.
The current set of relevant documents for the Visual description tools is given as follows:
ISO/IEC 15938-3 – Visual
ISO/IEC 15938-3/Amd.1 – Visual extensions
ISO/IEC TR 15938-8 – Extraction and use of MPEG-7 descriptions
ISO/IEC TR 15938-8/Amd.1 – Extensions of extraction and use of MPEG-7 descriptions
Information technology — Multimedia content description
interface —
Part 3:
Visual
AMENDMENT 1: Visual extensions
Replace subclause 1.2 with:
1.2. Overview of Visual Description Tools
This part of ISO/IEC 15938 specifies tools for description of visual content, including still images, video and
3D models. These tools are defined by their syntax in DDL and binary representations and semantics
associated with the syntactic elements. They enable description of the visual features of the visual material,
such as color, texture, shape and motion, as well as localization of the described objects in the image or video
sequence. An overview of the visual description tools is shown in Figure 1.
The basic structure description tools include six supporting tools for visual descriptions, defined in clauses 6-11.
They are categorized into two groups: descriptor containers and basic supporting tools. The former consists of
four datatypes: GridLayout, which provides efficient representations of visual features on grids; TimeSeries,
which represents temporal arrays of several descriptions; GofGopFeature, which describes representative
descriptions over a video segment; and MultipleView, which describes a 3D object using several pictures
captured from different view angles. The latter contains two tools: Spatial2DcoordinateSystem, used to specify
the 2D coordinate system, and TemporalInterpolation, which indicates the interpolation method between two
samples on a time axis.
The remaining description tools, except for the FaceRecognition descriptor, are associated with visual
features and are grouped into five feature categories: Color, Texture, Shape, Motion and Localization.
The color description tools include five color descriptors that represent different aspects of color features:
representative colors (DominantColor), color distribution (ScalableColor), spatial distribution of colors
(ColorLayout and ColorStructure) and perceptual feeling of illumination color (ColorTemperature). The group
also contains three supporting tools: ColorSpace and ColorQuantization, which are used in DominantColor, and
IlluminationInvariantColor, which extends four color descriptors (DominantColor, ScalableColor, ColorLayout
and ColorStructure) to support illumination-invariant similarity matching. An extension of ScalableColor to a
group of frames or pictures (GoFGoPColor) is also included in this group. All the color descriptors can be
extracted from arbitrarily shaped regions.
The texture description tools facilitate browsing (TextureBrowsing) and similarity retrieval
(HomogeneousTexture and EdgeHistogram) using the texture of a still or moving image region. All the texture
descriptors can be extracted from arbitrarily shaped regions.
The shape description tools include two descriptors that characterize different shape features of a 2D object or
region. The RegionShape descriptor captures the distribution of all pixels within a region and the
ContourShape descriptor characterizes the shape properties of the contour of an object. An extension of
RegionShape is also defined as ShapeVariation to describe the temporal variation of shape over a video segment.
The Shape3D descriptor provides an intrinsic shape characterization of 3D mesh models.
The motion description tools include four descriptors that characterize various aspects of motion. The
CameraMotion descriptor specifies a set of basic camera operations such as panning and tilting.
The motion of a key point (pixel) from a moving object or region can be characterized by the MotionTrajectory
descriptor. The ParametricMotion descriptor characterizes the evolution of an arbitrarily shaped region over
time in terms of a 2D geometric transformation. Finally, the MotionActivity descriptor captures the pace of the
motion in the sequence, as perceived by the viewer. All motion descriptors except for CameraMotion can be
extracted from arbitrarily shaped regions.
The localization description tools can be used to indicate regions of interest in the spatial (RegionLocator) and
spatio-temporal (SpatioTemporalLocator) domains.
The FaceRecognition descriptor and the AdvancedFaceRecognition descriptor are not associated with any
particular visual feature and can be used to describe a human face for applications requiring the matching and
retrieval of face images.
Basic Structures
  Descriptor Containers: GridLayout, TimeSeries, GofGopFeature, MultipleView
  Basic Supporting Tools: TemporalInterpolation, Spatial2DcoordinateSystem
Visual Features
  Color
    Color Feature Descriptors: DominantColor, ScalableColor, ColorLayout, ColorStructure, GofGopColor, ColorTemperature
    Color Supporting Tools: ColorSpace, ColorQuantization, IlluminationInvariantColor
  Texture: HomogeneousTexture, TextureBrowsing, EdgeHistogram
  Shape: RegionShape, ContourShape, ShapeVariation, Shape3D
  Motion: CameraMotion, MotionTrajectory, ParametricMotion, MotionActivity
  Localization: RegionLocator, SpatioTemporalLocator
Other: FaceRecognition, AdvancedFaceRecognition
Figure 1 - Overview of Visual Description Tools
Extend the definitions in subclause 3.2:
NSVM   Negative Shape Variation Map
SVM    Shape Variation Map
LDA    Linear Discriminant Analysis
PCA    Principal Component Analysis
PCLDA  Linear Discriminant Analysis of Principal Components
MIRED  MIcro REciprocal Degree
Extend the definitions in subclause 3.3:
ℜe[z]  Real part of a complex value z
ℑm[z]  Imaginary part of a complex value z
Replace subclause 4.2.2 with:
4.2.2 Generic binary representation
The use of the video-specific syntax is signaled using the codec configuration mechanism defined in ISO/IEC 15938-1 and
the following classification scheme is defined for this purpose.


MPEG7CameraMotion                 ISO/IEC 15938-3 Binary Camera Motion Codec
MPEG7ColorLayout                  ISO/IEC 15938-3 Binary Color Layout Codec
MPEG7ColorQuantization            ISO/IEC 15938-3 Binary Color Quantization Codec
MPEG7ColorSpace                   ISO/IEC 15938-3 Binary Color Space Codec
MPEG7ColorStructure               ISO/IEC 15938-3 Binary Color Structure Codec
MPEG7ContourShape                 ISO/IEC 15938-3 Binary Contour Shape Codec
MPEG7DominantColor                ISO/IEC 15938-3 Binary Dominant Color Codec
MPEG7EdgeHistogram                ISO/IEC 15938-3 Binary Edge Histogram Codec
MPEG7FaceRecognition              ISO/IEC 15938-3 Binary Face Recognition Codec
MPEG7FoFGoPColor                  ISO/IEC 15938-3 Binary GoFGoP Color Codec
MPEG7GridLayout                   ISO/IEC 15938-3 Binary Grid Layout Codec
MPEG7HomogeneousTexture           ISO/IEC 15938-3 Binary Homogeneous Texture Codec
MPEG7IrregularVisualTimeSeries    ISO/IEC 15938-3 Binary Irregular Time Series Codec
MPEG7MotionActivity               ISO/IEC 15938-3 Binary Motion Activity Codec
MPEG7MotionTrajectory             ISO/IEC 15938-3 Binary Motion Trajectory Codec
MPEG7MultipleView                 ISO/IEC 15938-3 Binary Multiple View Codec
MPEG7ParametricMotion             ISO/IEC 15938-3 Binary Parametric Motion Codec
MPEG7RegionLocator                ISO/IEC 15938-3 Binary Region Locator Codec
MPEG7RegionShape                  ISO/IEC 15938-3 Binary Region Shape Codec
MPEG7RegularVisualTimeSeries      ISO/IEC 15938-3 Binary Regular Time Series Codec
MPEG7ScalableColor                ISO/IEC 15938-3 Binary Scalable Color Codec
MPEG7Shape3D                      ISO/IEC 15938-3 Binary Shape 3D Codec
MPEG7Spatial2DCoordinateSystem    ISO/IEC 15938-3 Binary Spatial 2D Coordinate System Codec
MPEG7SpatioTemporalLocator        ISO/IEC 15938-3 Binary SpatioTemporal Locator Codec
MPEG7TemporalInterpolation        ISO/IEC 15938-3 Binary Temporal Interpolation Codec
MPEG7TextureBrowsing              ISO/IEC 15938-3 Binary Texture Browsing Codec
MPEG7GofGopFeature                ISO/IEC 15938-3 Binary Gof Gop Feature Codec
MPEG7ColorTemperature             ISO/IEC 15938-3 Binary Color Temperature Codec
MPEG7ShapeVariation               ISO/IEC 15938-3 Binary Shape Variation Codec
MPEG7IlluminationInvariantColor   ISO/IEC 15938-3 Binary Illumination Invariant Color Codec
MPEG7AdvancedFaceRecognition      ISO/IEC 15938-3 Binary Advanced Face Recognition Codec


Replace Table 1 in subclause 5.2.4 with:
Table 1 Assignment of IDs to descriptors
ID Descriptor
0 Forbidden
1 CameraMotion
2 ColorLayout
3 ColorSpace
4 ColorStructure
5 ColorQuantization
6 ContourShape
7 DominantColor
8 EdgeHistogram
9 FaceRecognition
10 GoFGoPColor
11 GridLayout
12 HomogeneousTexture
13 IrregularVisualTimeSeries
14 MotionActivity
15 MotionTrajectory
16 MultipleView
17 ParametricMotion
18 RegionLocator
19 RegionShape
20 RegularVisualTimeSeries
21 ScalableColor
22 Shape3D
23 Spatial2DCoordinateSystem
24 SpatioTemporalLocator
25 TemporalInterpolation
26 TextureBrowsing
27 GofGopFeature
28 ColorTemperature
29 ShapeVariation
30 IlluminationInvariantColor
31 AdvancedFaceRecognition
32-255 Reserved
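
For illustration only, the ID assignment in Table 1 can be held as a simple lookup. The Python sketch below is not part of the standard; the names `DESCRIPTOR_IDS` and `descriptor_name` are hypothetical, and the only data used is the table above.

```python
# Descriptor names in ID order, transcribed from Table 1.
# Position 0 is the forbidden ID; IDs 32-255 are reserved.
_NAMES = [
    None, "CameraMotion", "ColorLayout", "ColorSpace", "ColorStructure",
    "ColorQuantization", "ContourShape", "DominantColor", "EdgeHistogram",
    "FaceRecognition", "GoFGoPColor", "GridLayout", "HomogeneousTexture",
    "IrregularVisualTimeSeries", "MotionActivity", "MotionTrajectory",
    "MultipleView", "ParametricMotion", "RegionLocator", "RegionShape",
    "RegularVisualTimeSeries", "ScalableColor", "Shape3D",
    "Spatial2DCoordinateSystem", "SpatioTemporalLocator",
    "TemporalInterpolation", "TextureBrowsing", "GofGopFeature",
    "ColorTemperature", "ShapeVariation", "IlluminationInvariantColor",
    "AdvancedFaceRecognition",
]

# Hypothetical helper table: ID -> descriptor name for the assigned IDs 1-31.
DESCRIPTOR_IDS = {i: name for i, name in enumerate(_NAMES) if name is not None}


def descriptor_name(descriptor_id: int) -> str:
    """Return the descriptor name for an 8-bit ID, rejecting unusable values."""
    if not 0 <= descriptor_id <= 255:
        raise ValueError("descriptor ID must fit in 8 bits")
    if descriptor_id == 0:
        raise ValueError("ID 0 is forbidden")
    if descriptor_id not in DESCRIPTOR_IDS:
        raise ValueError("IDs 32-255 are reserved")
    return DESCRIPTOR_IDS[descriptor_id]
```

A caller would typically use `descriptor_name(27)` to confirm that an ID read from a bitstream refers to GofGopFeature before dispatching to a decoder.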
Add after subclause 5.6:
5.7 GofGopFeature
5.7.1 Introduction
This container is a generic and extensible container that uses several description tools defined in
ISO/IEC 15938-3 to describe the representative feature over a Group of Frames (GoF)/Group of Pictures (GoP).
To describe the transition of a feature across video frames, VisualTimeSeries is preferable.
5.7.2 DDL representation syntax

















5.7.3 Binary representation syntax
GofGopFeature {                  Number of bits           Mnemonic
  AggregationFlag                1                        bsbf
  if(AggregationFlag){
    AggregationType              3                        bsbf
  }
  DescriptorID                   8                        uimsbf
  SizeOfDescriptor               8                        uimsbf
  Descriptor                     see subclauses 5 to 8    bsbf
}
5.7.4 Descriptor component semantics
DescriptorID
This field, which is only present in the binary representation, specifies a descriptor identifier. The descriptor
identifier indicates the descriptor type accommodated in this container. The assignment of IDs to the
descriptors is specified in Table 1 in subclause 5.2.4. The allowed values of this field are 2 for Color Layout, 7
for Dominant Color, 8 for Edge Histogram, and 12 for Homogeneous Texture. All other values are prohibited.
SizeOfDescriptor
This field, which is only present in the binary representation, specifies the size of the following descriptor in bytes.
Descriptor
This element represents the elementary feature using several description tools defined in ISO/IEC 15938-3.
The applicable tools are Dominant Color, Color Layout, Edge Histogram and Homogeneous Texture, which are
specified in subclauses 6.4, 6.6, 7.4 and 7.2, respectively. The size of this element shall be a multiple of 8 bits
and shall be equal to the value of the SizeOfDescriptor element expressed in bits. To align to a byte boundary,
‘0’ bits should be stuffed immediately after the Descriptor bitstream.
AggregationFlag
This field signals the presence of the aggregation attribute. If it is set to “1”, the aggregation attribute follows.
aggregation
This optional field specifies the aggregation method used to create a representative feature from the features
extracted from a group of frames/pictures. The aggregation is performed over all the elementary descriptions of
the group of video frames or images. One of the three types of aggregation presented below is allowed. If the
aggregation is not explicitly specified, one of the allowed aggregation methods for the embedded descriptor is
used.
Average
The average aggregation means that each element of the Descriptor is computed by accumulating the
corresponding elements over the group of frames/pictures and subsequently normalizing each accumulated
value by the number of frames/pictures.
Median
The median aggregation means that each element of the Descriptor is obtained by constructing the
ascending list of element values over the frames/pictures, and assigning the median of this list to the
representative value.
SplitMerge
The SplitMerge aggregation is allowed to be used only for Dominant Color. When this method is used, the
descriptors are first extracted from all frames/pictures and then merged into a single descriptor consisting of
all representative color values (“Value” elements in subclause 6.3). The size of the resulting descriptor is
then brought down to the limit imposed by the syntax of DominantColor descriptor by iterative merging of
the closest dominant color “Value” elements, where closeness is defined by Euclidean distance in color
space. The values of the merged “Value” element fields are as follows:
\[ W = \frac{w_1 + w_2}{N} \]
\[ M = \frac{w_1 m_1 + w_2 m_2}{w_1 + w_2} \]
\[ V = \frac{w_1 v_1 + w_2 v_2}{w_1 + w_2} + \frac{w_1 w_2 (m_1 - m_2)^2}{(w_1 + w_2)^2} \]
Where W, M, and V are the values of the “Percentage”, “Index”, and “ColorVariance” fields of the
merged “Value” element, w_1, w_2, m_1, m_2, and v_1, v_2 are the corresponding fields of the elements being
merged and N is the total number of frames/pictures. If the SpatialCoherency field is used, it is simply averaged
over all frames/pictures.
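
As an informal illustration of one SplitMerge step, the Python sketch below merges two dominant color “Value” elements. It is not taken from the standard: the function name `merge_values` is hypothetical, the color index and variance are treated per color component, and the variance is assumed to follow the usual mixture form, i.e. the weighted mean of the variances plus w1·w2·(m1 − m2)² / (w1 + w2)².

```python
def merge_values(w1, m1, v1, w2, m2, v2, n_frames):
    """Merge two dominant color "Value" elements (hypothetical helper).

    w1, w2   -- "Percentage" fields of the two elements
    m1, m2   -- "Index" fields, given as per-component sequences
    v1, v2   -- "ColorVariance" fields, given as per-component sequences
    n_frames -- N, the total number of frames/pictures
    """
    total = w1 + w2
    if total <= 0 or n_frames <= 0:
        raise ValueError("weights and frame count must be positive")

    # W: accumulated percentage, normalized by the number of frames.
    percentage = total / n_frames
    # M: weighted mean of the two color values, component by component.
    index = [(w1 * a + w2 * b) / total for a, b in zip(m1, m2)]
    # V: weighted mean of the variances plus the spread between the two means
    # (assumed squared-difference form, see the lead-in).
    variance = [
        (w1 * va + w2 * vb) / total + w1 * w2 * (a - b) ** 2 / total ** 2
        for a, b, va, vb in zip(m1, m2, v1, v2)
    ]
    return percentage, index, variance
```

Selecting which pair to merge (the closest pair by Euclidean distance in color space) and repeating until the DominantColor size limit is met are left out of this sketch.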
Note, the use of some of the aggregation methods is prohibited. The applicable methods for each description
tool are specified using “Y” mark in Table Amd1-1.
Table Amd1-1 - Applicable Aggregation Methods
Description Tools      Average   Median   SplitMerge
Color Layout           Y         Y        n/a
Edge Histogram         Y         Y        n/a
Homogeneous Texture    Y         Y        n/a
Dominant Color         n/a       n/a      Y
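
The Average and Median methods, together with the applicability rules of Table Amd1-1, can be sketched as follows. This is a minimal Python illustration, assuming each elementary description is already available as a flat list of numeric elements; the names `ALLOWED_METHODS` and `aggregate` are hypothetical, and any re-quantization of the result to the descriptor's own bit widths is not shown.

```python
from statistics import median

# Applicable aggregation methods per description tool (Table Amd1-1).
ALLOWED_METHODS = {
    "ColorLayout": {"Average", "Median"},
    "EdgeHistogram": {"Average", "Median"},
    "HomogeneousTexture": {"Average", "Median"},
    "DominantColor": {"SplitMerge"},
}


def aggregate(tool, method, descriptions):
    """Aggregate per-frame descriptions element by element.

    descriptions -- one equally sized list of elements per frame/picture.
    """
    if method not in ALLOWED_METHODS.get(tool, set()):
        raise ValueError(f"{method} is not applicable to {tool}")
    if method == "SplitMerge":
        raise NotImplementedError("SplitMerge is specific to DominantColor")
    if not descriptions:
        raise ValueError("at least one description is required")

    # Transpose so that each tuple holds one element across all frames.
    columns = list(zip(*descriptions))
    if method == "Average":
        return [sum(col) / len(col) for col in columns]
    # Median: statistics.median sorts the values and, for an even count,
    # averages the two middle values (a choice made by this sketch).
    return [median(col) for col in columns]
```

For example, `aggregate("EdgeHistogram", "Average", frames)` returns one representative element list for the whole group of frames.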
In the binary description, the following mapping table is used:
AggregationType   Aggregation
000               Reserved
001               Average
010               Median
011               SplitMerge
100-111           Reserved
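
To make the field order of subclause 5.7.3 concrete, the following Python sketch reads a GofGopFeature header from a byte string, most significant bit first. It is an illustration under stated assumptions rather than a conforming decoder: `BitReader` and `parse_gof_gop_feature` are hypothetical names, the embedded descriptor is returned as opaque bytes, and code 010 is read as the Median method.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


# AggregationType codes; 000 and 100-111 are reserved.
AGGREGATION_TYPES = {0b001: "Average", 0b010: "Median", 0b011: "SplitMerge"}
# DescriptorID values permitted inside GofGopFeature (subclause 5.7.4).
ALLOWED_IDS = {2: "ColorLayout", 7: "DominantColor",
               8: "EdgeHistogram", 12: "HomogeneousTexture"}


def parse_gof_gop_feature(data: bytes) -> dict:
    reader = BitReader(data)
    aggregation = None
    if reader.read(1):                      # AggregationFlag
        code = reader.read(3)               # AggregationType
        if code not in AGGREGATION_TYPES:
            raise ValueError("reserved AggregationType")
        aggregation = AGGREGATION_TYPES[code]
    descriptor_id = reader.read(8)          # DescriptorID
    if descriptor_id not in ALLOWED_IDS:
        raise ValueError("DescriptorID not allowed in GofGopFeature")
    size = reader.read(8)                   # SizeOfDescriptor, in bytes
    payload = bytes(reader.read(8) for _ in range(size))
    return {"aggregation": aggregation,
            "descriptor": ALLOWED_IDS[descriptor_id],
            "payload": payload}
```

Because the header is 17 or 20 bits long, the descriptor payload generally does not start on a byte boundary, which is why the sketch reads it bit by bit instead of slicing the input.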
Add after subclause 6.8:
6.9 Color Temperature
6.9.1 Introduction
This descriptor specifies the perceptual temperature feeling of the illumination color in an image for browsing
and display preference control purposes. Four perceptual temperature browsing categories are provided: hot,
warm, moderate, and cool. Each category is used for browsing images based upon its perceptual meaning.
This descriptor can also be used to adjust the display of images or videos in a warmer or cooler
direction to suit the user's preference.
6.9.2 DDL representation syntax






















6.9.3 Binary representation syntax
ColorTemperature {       Number of bits   Mnemonic
  BrowsingCategory       2                bslbf
  SubRangeIndex          6                uimsbf
}
6.9.4 Descriptor component semantics
BrowsingCategory
This element specifies the category of perceptual temperature. A color temperature in the range from
1667K (Kelvin) to 25000K estimates the illumination color of an image. A temperature value outside this range
shall be clipped to the corresponding range boundary value. A color temperature value is rounded at the
first decimal digit to give an integer. This color temperature range is divided and mapped into the
4 corresponding categories as defined in Table Amd1-2. The mapping between binary representation and
semantics is also provided in Table Amd1-2.
Table Amd1-2 - Semantics of the Browsing Category field
Browsing Category   Semantics   Temperature Range
00                  Hot         Temperature < 2251K
01                  Warm        2251K ≤ Temperature < 4171K
10                  Moderate    4171K ≤ Temperature < 8061K
11                  Cool        8061K ≤ Temperature
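
The clipping, rounding and category mapping described above can be sketched in a few lines of Python. The function name `browsing_category` is hypothetical, and rounding is assumed to be round-half-up at the first decimal digit.

```python
def browsing_category(temperature_k: float):
    """Map a color temperature in kelvin to (2-bit code, category name)."""
    # Round at the first decimal digit (half up), then clip to [1667, 25000].
    t = int(temperature_k + 0.5)
    t = min(max(t, 1667), 25000)

    if t < 2251:
        return 0b00, "Hot"
    if t < 4171:
        return 0b01, "Warm"
    if t < 8061:
        return 0b10, "Moderate"
    return 0b11, "Cool"
```

For instance, a daylight-like estimate of 6500 K falls into the Moderate category (code 10).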
SubrangeIndex
This element specifies the sub-range index inside each browsing category. The temperature range for each
category is uniformly quantized into 64 sub-ranges in MIRED (MK^-1). The color temperature range
corresponding to each SubrangeIndex is provided in Table Amd1-3. A color temperature value is rounded at
the first decimal digit to give an integer.
Table Amd1-3 - SubrangeIndex and its color temperature range for each category
Index Hot Warm Moderate Cool
000000 [1667,1674) [2251,2267) [4171,4203) [8061,8147)
000001 [1674,1681) [2267,2284) [4203,4235) [8147,8235)
000010 [1681,1687) [2284,2301) [4235,4268) [8235,8325)
000011 [1687,1694) [2301,2318) [4268,4301) [8325,8417)
000100 [1694,1701) [2318,2335) [4301,4334) [8417,8512)
000101 [1701,1709) [2335,2352) [4334,4369) [8512,8608)
000110 [1709,1716) [2352,2370) [4369,4403) [8608,8706)
000111 [1716,1723) [2370,2388) [4403,4439) [8706,8807)
001000 [1723,1730) [2388,2407) [4439,4475) [8807,8910)
001001 [1730,1737) [2407,2425) [4475,4511) [8910,9015)
001010 [1737,1745) [2425,2444) [4511,4548) [9015,9123)
001011 [1745,1752) [2444,2464) [4548,4586) [9123,9234)
001100 [1752,1760) [2464,2483) [4586,4624) [9234,9347)
001101 [1760,1767) [2483,2503) [4624,4663) [9347,9464)
001110 [1767,1775) [2503,2523) [4663,4703) [9464,9583)
001111 [1775,1782) [2523,2544) [4703,4743) [9583,9705)
010000 [1782,1790) [2544,2564) [4743,4784) [9705,9830)
010001 [1790,1798) [2564,2586) [4784,4826) [9830,9959)
010010 [1798,1806) [2586,2607) [4826,4868) [9959,10091)
010011 [1806,1814) [2607,2629) [4868,4912) [10091,10226)
010100 [1814,1822) [2629,2651) [4912,4956) [10226,10366)
010101 [1822,1830) [2651,2674) [4956,5000) [10366,10509)
010110 [1830,1838) [2674,2697) [5000,5046) [10509,10656)
010111 [1838,1846) [2697,2720) [5046,5092) [10656,10807)
011000 [1846,1855) [2720,2744) [5092,5140) [10807,10962)
011001 [1855,1863) [2744,2769) [5140,5188) [10962,11123)
011010 [1863,1872) [2769,2793) [5188,5237) [11123,11287)
011011 [1872,1880) [2793,2818) [5237,5287) [11287,11457)
011100 [1880,1889) [2818,2844) [5287,5338) [11457,11632)
011101 [1889,1897) [2844,2870) [5338,5390) [11632,11813)
011110 [1897,1906) [2870,2897) [5390,5443) [11813,11999)
011111 [1906,1915) [2897,2924) [5443,5497) [11999,12191)
100000 [1915,1924) [2924,2951) [5497,5552) [12191,12389)
100001 [1924,1933) [2951,2979) [5552,5609) [12389,12594)
100010 [1933,1942) [2979,3008) [5609,5666) [12594,12806)
100011 [1942,1951) [3008,3037) [5666,5725) [12806,13025)
100100 [1951,1961) [3037,3067) [5725,5785) [13025,13252)
100101 [1961,1970) [3067,3097) [5785,5846) [13252,13487)
100110 [1970,1980) [3097,3128) [5846,5908) [13487,13730)
100111 [1980,1989) [3128,3160) [5908,5972) [13730,13982)
101000 [1989,1999) [3160,3192) [5972,6037) [13982,14244)
101001 [1999,2009) [3192,3225) [6037,6104) [14244,14515)
101010 [2009,2018) [3225,3259) [6104,6172) [14515,14797)
101011 [2018,2028) [3259,3293) [6172,6241) [14797,15090)
101100 [2028,2038) [3293,3328) [6241,6313) [15090,15396)
101101 [2038,2049) [3328,3364) [6313,6385) [15396,15713)
101110 [2049,2059) [3364,3400) [6385,6460) [15713,16044)
101111 [2059,2069) [3400,3437) [6460,6536) [16044,16390)
110000 [2069,2080) [3437,3476) [6536,6615) [16390,16750)
110001 [2080,2090) [3476,3515) [6615,6695) [16750,17127)
110010 [2090,2101) [3515,3554) [6695,6777) [17127,17521)
110011 [2101,2112) [3554,3595) [6777,6861) [17521,17934)
110100 [2112,2122) [3595,3637) [6861,6947) [17934,18367)
110101 [2122,2133) [3637,3680) [6947,7035) [18367,18821)
110110 [2133,2145) [3680,3724) [7035,7126) [18821,19298)
110111 [2145,2156) [3724,3768) [7126,7219) [19298,19799)
111000 [2156,2167) [3768,3814) [7219,7314) [19799,20328)
111001 [2167,2179) [3814,3861) [7314,7412) [20328,20886)
111010 [2179,2190) [3861,3910) [7412,7513) [20886,21475)
111011 [2190,2202) [3910,3959) [7513,7616) [21475,22098)
111100 [2202,2214) [3959,4010) [7616,7722) [22098,22758)
111101 [2214,2226) [4010,4062) [7722,7832) [22758,23459)
111110 [2226,2238) [4062,4115) [7832,7944) [23459,24205)
111111 [2238,2251) [4115,4171) [7944,8061) [24205,25001)
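
The sub-range boundaries in Table Amd1-3 follow from uniform quantization in the reciprocal (MIRED) domain, where the MIRED value of a temperature T in kelvin is 10^6 / T. The Python sketch below approximates that computation and packs the two fields into one byte. It is an approximation, not a replacement for the table: bin edges computed this way can differ from the tabulated ones by about 1 K because of rounding, so an implementation that needs exact conformance should look up the table instead. The names `CATEGORY_RANGES`, `subrange_index` and `encode_color_temperature` are hypothetical.

```python
# Category code -> (inclusive lower bound, exclusive upper bound) in kelvin,
# taken from the first and last rows of Table Amd1-3.
CATEGORY_RANGES = {
    0b00: (1667, 2251),    # Hot
    0b01: (2251, 4171),    # Warm
    0b10: (4171, 8061),    # Moderate
    0b11: (8061, 25001),   # Cool
}


def subrange_index(t: int, category: int) -> int:
    """Approximate 6-bit sub-range index by uniform quantization in MIRED."""
    low, high = CATEGORY_RANGES[category]
    mired_low = 1e6 / low      # largest MIRED value in the category
    mired_high = 1e6 / high    # smallest MIRED value in the category
    step = (mired_low - mired_high) / 64
    index = int((mired_low - 1e6 / t) / step)
    return min(max(index, 0), 63)


def encode_color_temperature(temperature_k: float) -> int:
    """Return the 8-bit value: 2-bit BrowsingCategory, 6-bit sub-range index."""
    t = min(max(int(temperature_k + 0.5), 1667), 25000)
    category = next(code for code, (low, high) in CATEGORY_RANGES.items()
                    if low <= t < high)
    return (category << 6) | subrange_index(t, category)
```

The test values used with this sketch were chosen well inside their table intervals, where the approximation and the table agree.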
6.10 Illumination Invariant Color
6.10.1 Introduction
This descriptor wraps the following color descriptors of ISO/IEC 15938-3: Dominant Color, Scalable Color,
Color Layout, and Color Structure. Before extracting one of DominantColor, ScalableColor, ColorLayout, and
ColorStructure from an image, the pixel values shall be converted using a transformation which corresponds to
changing the illumination to 6500K on the daylight locus.
6.10.2 DDL representation syntax











6.10.3 Binary representation syntax
IlluminationInvariantColor {     Number of bits               Mnemonic
  DescriptorID                   8                            uimsbf
  SizeOfDescriptor               8                            uimsbf
  Descriptor                     see subclauses 6.4 to 6.7    bsbf
}
6.10.4 Descriptor component semantics
DescriptorID
This field, which is only present in the binary representation, specifies a descriptor identifier. The descriptor
identifier indicates the descriptor type accommodated in this container. The assignment of IDs to the
descriptors is specified in Table 1 in subclause 5.2.4. The allowed values of this field are 2 for Color Layout, 4
for Color Structure, 7 for Dominant Color, and 21 for Scalable Color. All other values are prohibited.
SizeOfDescriptor
This field, which is only present in the binary representation, specifies the size of the following descriptor in bytes.
Descriptor
This element represents the elementary feature using several description tools defined in ISO/IEC 15938-3.
The applicable tools are Dominant Color, Scalable Color, Color Layout, and Color Structure, which are
specified in subclauses 6.4, 6.5, 6.6, and 6.7, respectively. If DominantColor is instantiated in this tool, the
ColorReferenceFlag shall be set to the default value. The size of this element shall be a multiple of 8 bits and
shall be equal to the value of the SizeOfDescriptor element expressed in bits. To align to a byte boundary,
‘0’ bits should be stuffed immediately after the Descriptor bitstream.
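
Since both header fields are 8 bits wide, the binary form of this container is byte aligned, and wrapping an already encoded color descriptor is straightforward. The Python sketch below shows the idea; `wrap_illumination_invariant` is a hypothetical name, and the payload is assumed to be an encoded descriptor that has already been zero-stuffed to a whole number of bytes.

```python
# DescriptorID values permitted inside IlluminationInvariantColor (6.10.4).
ALLOWED_IDS = {2: "ColorLayout", 4: "ColorStructure",
               7: "DominantColor", 21: "ScalableColor"}


def wrap_illumination_invariant(descriptor_id: int, payload: bytes) -> bytes:
    """Prefix an encoded color descriptor with DescriptorID and SizeOfDescriptor."""
    if descriptor_id not in ALLOWED_IDS:
        raise ValueError("DescriptorID not allowed in IlluminationInvariantColor")
    if len(payload) > 255:
        raise ValueError("SizeOfDescriptor is an 8-bit field (max 255 bytes)")
    # One byte for the ID, one byte for the size in bytes, then the descriptor.
    return bytes([descriptor_id, len(payload)]) + payload
```

The illumination normalization itself (conversion of the pixel values towards 6500K) happens before descriptor extraction and is outside the scope of this sketch.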
Add after subclause 8.4:
8.5 Shape Variation
The descriptor describes shape variations in terms of the Shape Variation Map and the statistics of the region
shape description of each binary shape image in the collection. The Shape Variation Map consists of
StaticShapeVariation and DynamicShapeVariation. The former corresponds to 35 quantized ART coefficients
on a 2-dimensional histogram of the group of shape images, and the latter to the inverse of the histogram
excluding the background. For the statistics, a set of standard deviations of the 35 coefficients of the Region
Shape descriptor, which is defined in ISO/IEC 15938-3, is used.
8.5.1 DDL representation syntax









































8.5.2 Binary representation syntax
Shape Variation {                        Number of bits   Mnemonic
  for( k=0; k<35; k++ ) {
    StaticShapeVariation[k]              4                uimsbf
  }
  for( k=0; k<35; k++ ) {
    DynamicShapeVariation[k]             4                uimsbf
  }
  for( k=0; k<35; k++ ) {
    StatisticalVariation[k]              4                uimsbf
  }
}
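As an illustration of the syntax above, the following sketch serializes and parses the three arrays as 3 x 35 x 4 = 420 bits, each value written most significant bit first (uimsbf). The function names and the use of a '0'/'1' character string as the bit container are assumptions made for this example only.

```python
NUM_COEFFS = 35  # coefficients per array
BITS = 4         # bits per coefficient (uimsbf)


def pack_shape_variation(static, dynamic, statistical):
    """Serialize the three arrays of 4-bit indices into one bit string."""
    bits = []
    for array in (static, dynamic, statistical):
        if len(array) != NUM_COEFFS:
            raise ValueError("each array must hold 35 values")
        for value in array:
            if not 0 <= value < (1 << BITS):
                raise ValueError("value does not fit in 4 bits")
            bits.append(format(value, "04b"))  # most significant bit first
    return "".join(bits)


def unpack_shape_variation(bits):
    """Parse a 420-bit string back into the three arrays."""
    if len(bits) != 3 * NUM_COEFFS * BITS:
        raise ValueError("expected exactly 420 bits")
    values = [int(bits[i:i + BITS], 2) for i in range(0, len(bits), BITS)]
    return (values[:NUM_COEFFS],
            values[NUM_COEFFS:2 * NUM_COEFFS],
            values[2 * NUM_COEFFS:])
```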
8.5.3 Descriptor component semantics
StaticShapeVariation
This is an array of 35 normalized and quantized magnitudes of the ART coefficients computed on the accumulated
binary shape images (called the Shape Variation Map, SVM), which can be considered a two-dimensional
histogram. The value of each bin in the histogram corresponds to how often the object
appears at that pixel location throughout the sequence. The maximum bin value implies that a part of
the object appears at that location all the time, i.e. it is static throughout the whole sequence. The higher
the value of a pixel in the SVM, the more static the object is at that pixel location. Hence the name
StaticShapeVariation is used for the array that describes the histogram. The relationship between the order of k
and the order of the radial and angular indices (n,m) is as follows:
k    0   1   2   3   4   5   6   …   31  32  33  34
n    1   2   3   0   1   2   3   …   0   1   2   3
m    0   0   0   1   1   1   1   …   8   8   8   8
StaticShapeVariation[k] is obtained by quantizing the ART coefficient into 4 bits. Quantization ranges and
reconstruction values are listed in Table Amd1-4 and Table Amd1-5, respectively.
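The index relationship in the table above can be written as a small function. This is a sketch only: it reproduces the columns that are printed (k = 0 to 6 and k = 31 to 34) and assumes that the elided columns continue the same pattern. The function name `art_indices` is hypothetical.

```python
def art_indices(k):
    """Map the array position k (0..34) to the index pair (n, m).

    Assumption: the pair (n, m) = (0, 0) is skipped and n cycles through
    0..3 fastest, which matches every column printed in the table.
    """
    if not 0 <= k <= 34:
        raise ValueError("k must be in the range 0..34")
    return (k + 1) % 4, (k + 1) // 4
```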
DynamicShapeVariation
Similar to StaticShapeVariation, this is an array of 35 normalized and quantized magnitudes of the region
shape coefficients computed on the inverted accumulated image (Negative Shape Variation Map, NSVM), i.e. the
inverted 2-D histogram excluding the background. Hence the term DynamicShapeVariation is used, in contrast to
StaticShapeVariation. The normalization, quantization and indexing schemes are the same as in
StaticShapeVariation.
StatisticalVariation
This is an array of 35 quantized values of the set of standard deviations of each coefficient described by the
region shape descriptor in each frame. The quantization and indexing schemes are the same as in
StaticShapeVariation.
Table Amd1-4 - Quantization table
Range                                    Index
0.000000000 ≤ value < 0.003073263        0000
0.003073263 ≤ value < 0.006358638        0001
0.006358638 ≤ value < 0.009887589        0010
0.009887589 ≤ value < 0.013699146        0011
0.013699146 ≤ value < 0.017842545        0100
0.017842545 ≤ value < 0.022381125        0101
0.022381125 ≤ value < 0.027398293        0110
0.027398293 ≤ value < 0.033007009        0111
0.033007009 ≤ value < 0.039365646        1000
0.039365646 ≤ value < 0.046706155        1001
0.046706155 ≤ value < 0.055388134        1010
0.055388134 ≤ value < 0.066014017        1011
0.066014017 ≤ value < 0.079713163        1100
0.079713163 ≤ value < 0.099021026        1101
0.099021026 ≤ value < 0.132028034        1110
0.132028034 ≤ value                      1111
Table Amd1-5 - Reconstruction values for 4-bit quantized values
Quantization index Reconstruction value
0000 0.001511843
0001 0.004687623
0010 0.008090430
0011 0.011755242
0100 0.015725795
0101 0.020057784
0110 0.024823663
0111 0.030120122
1000 0.036080271
1001 0.042894597
1010 0.050849554
1011 0.060405301
1100 0.072372655
1101 0.088395142
1110 0.112720172
1111 0.165035042
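The two tables can be combined into a quantizer and a dequantizer. The sketch below copies the bin boundaries from Table Amd1-4 and the reconstruction values from Table Amd1-5; the function names and the use of a binary search are choices made for this example, not part of the standard.

```python
from bisect import bisect_right

# Upper bounds of bins 0..14 from Table Amd1-4; bin 15 is open-ended.
THRESHOLDS = [
    0.003073263, 0.006358638, 0.009887589, 0.013699146, 0.017842545,
    0.022381125, 0.027398293, 0.033007009, 0.039365646, 0.046706155,
    0.055388134, 0.066014017, 0.079713163, 0.099021026, 0.132028034,
]

# Reconstruction values from Table Amd1-5, indexed by quantization index.
RECONSTRUCTION = [
    0.001511843, 0.004687623, 0.008090430, 0.011755242,
    0.015725795, 0.020057784, 0.024823663, 0.030120122,
    0.036080271, 0.042894597, 0.050849554, 0.060405301,
    0.072372655, 0.088395142, 0.112720172, 0.165035042,
]


def quantize(value):
    """Return the 4-bit index (0..15) of the bin that contains value."""
    if value < 0:
        raise ValueError("the quantization table starts at 0")
    # Lower bounds are inclusive, so a value equal to a threshold
    # falls into the next bin.
    return bisect_right(THRESHOLDS, value)


def reconstruct(index):
    """Return the reconstruction value for a 4-bit index."""
    if not 0 <= index <= 15:
        raise ValueError("index must be in the range 0..15")
    return RECONSTRUCTION[index]
```

Each reconstruction value lies inside its own bin, so quantizing a reconstructed value returns the original index.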
Add after subclause 11.1:
11.2 AdvancedFaceRecognition
11.2.1 Introduction
The Advanced Face Recognition descriptor describes face identity in a way that is robust to variations in pose
and illumination conditions.
11.2.2 DDL representation syntax

11.2.3 Binary representation syntax
AdvancedFaceRecognition {                                  Number of bits   Mnemonic
  numOfFourierFeature                                      6                uimsbf
  numOfCentralFourierFeature                               6                uimsbf
  extensionFlag                                            1                bslbf
  for (k=0; k<numOfFourierFeature; k++) {
    FourierFeature[k]                                      5                uimsbf
  }
  for (k=0; k<numOfCentralFourierFeature; k++) {
    CentralFourierFeature[k]                               5                uimsbf
  }
  if ( extensionFlag=="1" ) {
    numOfCompositeFeature                                  6                uimsbf
    numOfSubregionCompositeFeature                         6                uimsbf
    for (k=0; k<numOfCompositeFeature; k++) {
      CompositeFeature[k]                                  5                uimsbf
    }
    for (k=0; k<numOfSubregionCompositeFeature; k++) {
      SubregionCompositeFeature[k]                         5                uimsbf
    }
  }
}
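The syntax above can be read with a simple bit reader. The following is a minimal sketch, assuming the bitstream is available as a string of '0' and '1' characters; the class and function names are hypothetical. The range checks follow the component semantics in 11.2.4, and the extension check takes the requirement there to mean that the two extension counts may not both be zero.

```python
class BitReader:
    """Reads unsigned integers, most significant bit first, from a '0'/'1' string."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self, n):
        if self.pos + n > len(self.bits):
            raise ValueError("unexpected end of bitstream")
        value = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return value


def parse_advanced_face_recognition(bits):
    """Parse the fields in the order given by the binary syntax."""
    r = BitReader(bits)
    num_fourier = r.read(6)
    num_central = r.read(6)
    extension_flag = r.read(1)
    for count in (num_fourier, num_central):
        if not 24 <= count <= 63:
            raise ValueError("feature count outside the allowed range 24..63")
    result = {
        "extensionFlag": extension_flag,
        "FourierFeature": [r.read(5) for _ in range(num_fourier)],
        "CentralFourierFeature": [r.read(5) for _ in range(num_central)],
    }
    if extension_flag == 1:
        num_composite = r.read(6)
        num_subregion = r.read(6)
        if num_composite == 0 and num_subregion == 0:
            raise ValueError("extensionFlag is 1 but both extension counts are zero")
        result["CompositeFeature"] = [r.read(5) for _ in range(num_composite)]
        result["SubregionCompositeFeature"] = [r.read(5) for _ in range(num_subregion)]
    return result
```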
11.2.4 Descriptor component semantics
numOfFourierFeature
This field specifies the number of components in FourierFeature. The allowed range is from 24 to 63.
numOfCentralFourierFeature
This field specifies the number of components in CentralFourierFeature. The allowed range is from 24 to 63.
extensionFlag
This field specifies the existence of CompositeFeature and SubregionCompositeFeature. “1” indicates their
presence and “0” their absence. If extensionFlag is set to 1, either numOfCompositeFeature or
numOfSubregionCompositeFeature shall have a non-zero value.
FourierFeature
This element represents a facial feature based on the cascaded LDA of the Fourier characteristics of a
normalized face image. The normalized face image is obtained by scaling an original image into 56 lines with
46 luminance values in each line. The center positions of two eyes in the normalized face image shall be
located on the 24th row and the 16th and 31st columns for the right and left eyes respectively.
The FourierFeature element is derived from two feature vectors: one is a Fourier Spectrum Vector $x_1^f$, and
the other is a Multi-block Fourier Amplitude Vector $x_2^f$. Figure Amd1-1 illustrates the extraction process of
FourierFeature. Given a normalized face image, five steps should be performed to extract the element:
(1) Extraction of a Fourier Spectrum Vector $x_1^f$,
(2) Extraction of a Multi-block Fourier Amplitude Vector $x_2^f$,
(3) Projections of the feature vectors using PCLDA basis matrices $\Psi_1^f$, $\Psi_2^f$, and their normalization to unit
vectors $y_1^f$, $y_2^f$,
(4) Projection of a Joint Fourier Vector $y_3^f$ of the unit vectors using an LDA basis matrix $\Psi_3^f$,
(5) Quantization of the projected vector $z^f$.
[Figure: flow diagram. STEP-1: Fourier transform of the normalized face image f, scanning of Re[F] and Im[F] into the Fourier Spectrum Vector. STEP-2: image clipping, division into 4 blocks and into 16 blocks, Fourier transform, amplitude calculation, scanning into the Multi-block Fourier Amplitude Vector. STEP-3: PCLDA projection and vector normalization, forming the Joint Fourier Vector. STEP-4: LDA projection. STEP-5: quantization, giving FourierFeature.]
Figure Amd1-1 - Extraction of FourierFeature.
STEP 1) Extraction of Fourier Spectrum Vector
Given a normalized face image $f(x,y)$, the Fourier spectrum $F(u,v)$ of $f(x,y)$ is calculated by
$$F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y) \exp\left(-2\pi i \left(\frac{xu}{M} + \frac{yv}{N}\right)\right) \quad (u = 0, \ldots, M-1;\ v = 0, \ldots, N-1).$$
Here, $M = 46$ and $N = 56$. A Fourier Spectrum Vector $x_1^f$ is defined as a set of scanned components of the
Fourier spectrum. Figure Amd1-2 shows the scanning method of the Fourier spectrum. The scanning shall be
performed only on two rectangular regions, region A and region B, in the Fourier domain. The scanning rule is
summarized in Table Amd1-6. Here, $S_R(u,v)$ denotes the top-left coordinate of region R and $E_R(u,v)$ the
bottom-right point of region R. Therefore, the Fourier Spectrum Vector $x_1^f$ is expressed by
$$x_1^f = \begin{pmatrix}
\mathrm{Re}[F(0,0)] \\ \vdots \\ \mathrm{Re}[F(11,0)] \\ \mathrm{Re}[F(35,0)] \\ \vdots \\ \mathrm{Re}[F(45,0)] \\ \vdots \\ \mathrm{Re}[F(45,13)] \\
\mathrm{Im}[F(0,0)] \\ \vdots \\ \mathrm{Im}[F(11,0)] \\ \mathrm{Im}[F(35,0)] \\ \vdots \\ \mathrm{Im}[F(45,0)] \\ \vdots \\ \mathrm{Im}[F(45,13)]
\end{pmatrix}.$$
The dimension of $x_1^f$ is $2 \cdot (M/2) \cdot (N/4)$, i.e., 644.
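STEP 1 can be sketched in plain Python as follows. The index ranges (u from 0 to 11 and from 35 to 45, v from 0 to 13) and the scanning order (v in the outer loop, u in the inner loop, real parts first) are assumptions read off the components listed in the vector above; Table Amd1-6 remains the normative scanning rule. The function name and the image layout `f[y][x]` are choices made for this example.

```python
import cmath

M, N = 46, 56  # columns and rows of the normalized face image

# Assumption: index ranges inferred from the first and last components
# listed for each run in the vector above.
U_INDICES = list(range(0, 12)) + list(range(35, 46))
V_INDICES = list(range(0, 14))


def fourier_spectrum_vector(f):
    """Return the 644 scanned components for an image f[y][x] (56 rows x 46 columns)."""
    if len(f) != N or any(len(row) != M for row in f):
        raise ValueError("expected 56 rows of 46 luminance values")
    # Partial DFT along x, evaluated only at the required u values.
    wx = [[cmath.exp(-2j * cmath.pi * x * u / M) for x in range(M)]
          for u in U_INDICES]
    along_x = [[sum(row[x] * w[x] for x in range(M)) for w in wx] for row in f]
    # Partial DFT along y; v is scanned in the outer loop, u in the inner loop.
    spectrum = []
    for v in V_INDICES:
        wy = [cmath.exp(-2j * cmath.pi * y * v / N) for y in range(N)]
        for iu in range(len(U_INDICES)):
            spectrum.append(sum(along_x[y][iu] * wy[y] for y in range(N)))
    # Real parts of all scanned components first, then imaginary parts.
    return [c.real for c in spectrum] + [c.imag for c in spectrum]
```

The two index lists contain 23 and 14 entries, so the result has 2 x 23 x 14 = 644 components, in agreement with the dimension stated above.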
STEP 2) Extraction of Multi-block Fourier Amplitude Vector
A Multi-block Fourier Amplitude Vector is extracted from the Fourier amplitudes of partial images in the
normalized face image. Three types of partial images are used: (a) a holistic image, (b) quarter
images, and (c) 1/16 images.
(a) holistic image
A holistic image $f_1^0(x,y)$ is obtained by clipping the normalized face image $f(x,y)$ to a size of 44x56,
removing the boundary columns on both sides. It is given by
$$f_1^0(x,y) = f(x+1, y) \quad (x = 0,1,\ldots,43;\ y = 0,1,\ldots,55).$$
(b) quarter images
Quarter images are obtained by dividing the holistic image $f_1^0(x,y)$ equally into 4 blocks
$f_k^1(x,y)$ ($k = 1,2,3,4$) given by
$$f_k^1(x,y) = f_1^0(x + 22 s_k^1,\ y + 28 t_k^1) \quad (x = 0,1,\ldots,21;\ y = 0,1,\ldots,27),$$
where $s_k^1 = (k-1)\,\%\,2$ and $t_k^1 = \lfloor (k-1)/2 \rfloor$.
(c) one-sixteenth image
...
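The clipping in (a) and the block division in (b) can be sketched as follows. The image is assumed to be stored as a list of 56 rows with 46 values each, indexed as `f[y][x]`; the function names are hypothetical, and the vertical block index is computed with integer division. The one-sixteenth division in (c) is not covered.

```python
def holistic_image(f):
    """Clip a 46x56 image f[y][x] to 44x56 by dropping one column on each side."""
    return [row[1:45] for row in f]


def quarter_images(f0):
    """Divide the 44x56 holistic image into four 22x28 blocks, k = 1..4."""
    quarters = []
    for k in range(1, 5):
        s = (k - 1) % 2   # horizontal block offset, in units of 22 columns
        t = (k - 1) // 2  # vertical block offset, in units of 28 rows
        block = [row[22 * s:22 * s + 22] for row in f0[28 * t:28 * t + 28]]
        quarters.append(block)
    return quarters
```

With this layout, block k = 1 is the top-left quarter, k = 2 the top-right, k = 3 the bottom-left and k = 4 the bottom-right.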
