Information technology — Multimedia content description interface — Part 8: Extraction and use of MPEG-7 descriptions — Amendment 3: Technologies for digital photo management using MPEG-7 visual tools

Technologies de l'information — Interface de description du contenu multimédia — Partie 8: Extraction et utilisation des descriptions MPEG-7 — Amendement 3: Technologies pour la gestion des photos numériques à l'aide des outils visuels MPEG-7

General Information

Status
Published
Publication Date
13-Dec-2007
Current Stage
6060 - International Standard published
Due Date
14-Nov-2009
Completion Date
14-Dec-2007
Technical report
ISO/IEC TR 15938-8:2002/Amd 3:2007 - Technologies for digital photo management using MPEG-7 visual tools
English language
34 pages
Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC TR 15938-8
First edition
2002-12-15
AMENDMENT 3
2007-12-15

Information technology — Multimedia
content description interface —
Part 8:
Extraction and use of MPEG-7
descriptions
AMENDMENT 3: Technologies for digital
photo management using MPEG-7 visual
tools
Technologies de l'information — Interface de description du contenu
multimédia —
Partie 8: Extraction et utilisation des descriptions MPEG-7
AMENDEMENT 3: Technologies pour la gestion des photos
numériques à l'aide des outils visuels MPEG-7




Reference number
ISO/IEC TR 15938-8:2002/Amd 3:2007(E)
© ISO/IEC 2007

ISO/IEC TR 15938-8:2002/Amd.3:2007(E)
PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but
shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In
downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat
accepts no liability in this area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation
parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In
the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.


COPYRIGHT PROTECTED DOCUMENT


©  ISO/IEC 2007
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

ii © ISO/IEC 2007 – All rights reserved

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
In exceptional circumstances, the joint technical committee may propose the publication of a Technical Report
of one of the following types:
— type 1, when the required support cannot be obtained for the publication of an International Standard,
despite repeated efforts;
— type 2, when the subject is still under technical development or where for any other reason there is the
future but not immediate possibility of an agreement on an International Standard;
— type 3, when the joint technical committee has collected data of a different kind from that which is
normally published as an International Standard (“state of the art”, for example).
Technical Reports of types 1 and 2 are subject to review within three years of publication, to decide whether
they can be transformed into International Standards. Technical Reports of type 3 do not necessarily have to
be reviewed until the data they provide are considered to be no longer valid or useful.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 3 to ISO/IEC TR 15938-8:2002 was prepared by Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia
Information.



Information technology — Multimedia content description
interface —
Part 8:
Extraction and use of MPEG-7 descriptions
AMENDMENT 3: Technologies for digital photo management using
MPEG-7 visual tools
Add after subclause 4.2.3.3:
4.2.3.4 Dominant Color Temperature
4.2.3.4.1 General
This subclause provides an advanced use scenario for the Dominant Color descriptor. The Dominant Color Temperature is a variation of Dominant Color that is suitable for implementing perceptual-similarity-based retrieval. Images usually have one of a few dominant color temperatures that users perceive when looking at them. Dominant Color Temperatures enable users to search for images in scenarios such as query by example or query by value, and to browse images by color temperature. This is useful for users who want to find images that look similar in color temperature rather than images that have similar color regions.
4.2.3.4.2 Use scenario
Dominant Color Temperatures can be used in query by example and query by value search scenarios.
Examples of such queries are depicted in Figure AMD3.1. In a query by example a user inputs an example
image or draws a colored sketch (query by sketch) and the search application returns the most similar images
regarding their color temperature. In a query by value a user chooses a temperature value, and the system retrieves images whose color temperature appearance is closest to the user's choice.


Figure AMD3.1 — Examples of image retrieval using Dominant Color Temperatures: a) query by
example; b) query by color temperature value given in kelvins
4.2.3.4.3 Feature extraction
The Dominant Color Temperature, which consists of a maximum of eight pairs of color temperature and
percentage, is obtained by the following steps.
1. Get the RGB color values and percentages of the dominant colors from a Dominant Color descriptor instance.
2. Convert each dominant color value from RGB to color temperature using the method specified in the feature extraction method of the Color Temperature descriptor (subclause 6.9.1.1). The number of obtained color temperatures cannot, therefore, exceed the number of dominant colors in the Dominant Color descriptor instance. Colors that do not have a significant color temperature (colors with luminance values below the luminance threshold specified in the extraction method of the Color Temperature descriptor) should be omitted.
3. Use the obtained color temperatures and their percentages, given by the Dominant Color descriptor instance, in queries: query by example, query by color temperature value, ranking of search results, and others.
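Step 2 defers the actual RGB-to-temperature conversion to the Color Temperature descriptor (subclause 6.9.1.1), which is not reproduced here. Purely as an illustrative stand-in (not the method specified by this part of ISO/IEC 15938), a correlated color temperature can be approximated by converting sRGB to CIE xy chromaticity and applying McCamy's formula:

```python
# Hypothetical stand-in for step 2: linearized sRGB -> CIE XYZ -> xy
# chromaticity, then McCamy's CCT approximation. The normative conversion is
# the one in the Color Temperature descriptor, not this sketch.

def srgb_to_linear(c):
    c /= 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def correlated_color_temperature(r, g, b):
    """Approximate CCT in kelvins for one dominant color."""
    rl, gl, bl = (srgb_to_linear(v) for v in (r, g, b))
    # sRGB (D65 white point) to CIE XYZ
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    s = x + y + z
    if s == 0:
        return None  # black: no meaningful color temperature
    cx, cy = x / s, y / s
    # McCamy's approximation
    n = (cx - 0.3320) / (0.1858 - cy)
    return 449.0 * n ** 3 + 3525.0 * n ** 2 + 6823.3 * n + 5520.33

# A neutral grey lands near the D65 white point (about 6 500 K).
cct_grey = correlated_color_temperature(200, 200, 200)
```

The luminance check of step 2 would be applied before this conversion; colors below the threshold are skipped.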
4.2.3.4.4 Similarity matching
The similarity is based on a distance function defined as an integral of the absolute difference between two percentage distributions of dominant color temperature. The percentage distributions of dominant color temperature are obtained first, in the following steps:
1. Convert the color temperature values T_i of the Dominant Color Temperature description to the reciprocal megakelvin scale: RT_i [MK^-1] = 1 000 000 / T_i [K].
2. Sort the dominant color temperatures expressed in the reciprocal scale in ascending order.
3. Create the percentage distribution of dominant color temperature D(RT) using the following equations:
D(RT) = 0                            for RT < RT_0;
D(RT) = p_0 + p_1 + ... + p_{i-1}    for RT_{i-1} ≤ RT < RT_i, 1 ≤ i ≤ n-1;
D(RT) = p_0 + p_1 + ... + p_{n-1}    for RT ≥ RT_{n-1};
where:
n – number of dominant color temperatures;
RT_0, RT_1, ..., RT_{n-1} – sorted dominant color temperatures;
p_0, p_1, ..., p_{n-1} – percentages.
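The three steps above can be sketched as follows (hypothetical helper names; percentages are used as given by the Dominant Color descriptor instance):

```python
# Build the cumulative percentage distribution D(RT) from dominant color
# temperatures T_i [K] and their percentages p_i.

def dominant_ct_distribution(temps_k, percentages):
    """Return sorted reciprocal temperatures RT_i [MK^-1] and the cumulative
    percentages p_0 + ... + p_i, as two parallel lists."""
    # Step 1: reciprocal megakelvin scale RT = 1 000 000 / T
    rt = [1_000_000.0 / t for t in temps_k]
    # Step 2: sort ascending, keeping each percentage with its temperature
    pairs = sorted(zip(rt, percentages))
    # Step 3: cumulative sums give the step heights of D(RT)
    rts, cum, total = [], [], 0.0
    for r, p in pairs:
        total += p
        rts.append(r)
        cum.append(total)
    return rts, cum

def D(rt, rts, cum):
    """Evaluate the step function D at a reciprocal temperature rt."""
    value = 0.0
    for r, c in zip(rts, cum):
        if rt >= r:
            value = c
    return value

# Example: 10 000 K -> 100 MK^-1, 6 500 K -> ~153.8 MK^-1, 4 000 K -> 250 MK^-1
rts, cum = dominant_ct_distribution([4000, 6500, 10000], [25, 40, 35])
```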
Figure AMD3.2 shows an example of a dominant color temperature distribution.
[Figure: cumulative step distribution over RT [MK^-1], rising from RT_min past steps at RT_0 to RT_3 of 25 %, 40 %, 15 % and 20 % up to 100 % at RT_max; vertical axis p [%]]
Figure AMD3.2 — Example of cumulative dominant color temperature distribution
The proposed distance function is given by the following equation, an integral of the absolute difference between the two color temperature distributions:

dist = ∫_{RT_min}^{RT_max} |D_1(RT) − D_2(RT)| dRT
This expression is equivalent to the geometrical area bounded by the two distributions. An example of
distance calculation is depicted in Figure AMD3.3, where the distribution distances are shown graphically on
distribution diagrams.


Figure AMD3.3 — Example of distance calculation
The distance function presented in the above equation can be efficiently implemented using the following steps:
1. Input: two percentage distributions of dominant color temperature:
   RT1, D1 – tables of temperature and percentage distribution for image 1;
   RT2, D2 – tables of temperature and percentage distribution for image 2.
2. Initialize: dist = 0, x_1 = RT_min.
3. Take the next minimum temperature value t_curr from tables RT1, RT2, and let x_2 = t_curr.
4. Find in D1, D2 the lower bound y_1 and the upper bound y_2 of the rectangle corresponding to the current x_1, x_2 coordinates.
5. dist = dist + (x_2 − x_1)(y_2 − y_1).
6. x_1 = x_2.
7. If all values from tables D1, D2 have been taken, return dist; else go to step 3.
The tables used as input to the algorithm above are obtained from the percentage distributions of dominant color temperature as follows: RTX[i] = RT_i, DX[i] = D(RT_i), for 0 ≤ i ≤ n, where X stands for image 1 or 2.
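The stepwise computation above can be sketched as follows. This walks the union of the two tables' breakpoints, which is equivalent to steps 2 to 7; abs() makes the rectangle height |y_2 − y_1| explicit, and rt_min, rt_max are assumed to be supplied by the application:

```python
# Area between two cumulative step distributions of dominant color
# temperature, as in the stepwise algorithm above.

def step_value(rt, rts, cum):
    """Evaluate a cumulative step function (sorted breakpoints rts,
    cumulative heights cum) at reciprocal temperature rt."""
    value = 0.0
    for r, c in zip(rts, cum):
        if rt >= r:
            value = c
    return value

def ct_distance(rts1, cum1, rts2, cum2, rt_min, rt_max):
    # The breakpoints of both step functions partition [rt_min, rt_max]
    # into intervals on which |D1 - D2| is constant.
    xs = sorted(set([rt_min, rt_max] + rts1 + rts2))
    xs = [x for x in xs if rt_min <= x <= rt_max]
    dist = 0.0
    for x1, x2 in zip(xs, xs[1:]):
        y1 = step_value(x1, rts1, cum1)
        y2 = step_value(x1, rts2, cum2)
        dist += (x2 - x1) * abs(y1 - y2)  # one rectangle per interval
    return dist
```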
In the case of query by color temperature value, the same distance function can be used, by assuming that the query value given by the user is a single dominant color temperature with a percentage of 100 %. In this case, the distance function can be simplified to the following:

ΔRT = Σ_{i=0}^{n−1} |RT_REF − RT_i| · p_i

where RT_REF is the value of the query color temperature, RT_i are the dominant color temperatures, p_i are the percentages, and n is the number of dominant color temperatures in the image.
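A sketch of this simplified distance (hypothetical function name; percentages may be given as fractions or as percent, provided both operands use the same convention):

```python
# Simplified query-by-value distance: weighted L1 distance of each dominant
# color temperature RT_i to the query temperature RT_REF.

def query_by_value_distance(rt_ref, rts, percentages):
    """Delta RT = sum over i of |rt_ref - RT_i| * p_i."""
    return sum(abs(rt_ref - rt) * p for rt, p in zip(rts, percentages))
```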
4.2.3.4.5 Condition of usage
The same restrictions apply as for the Dominant Color descriptor. Additionally, Dominant Color Temperatures cannot be used for very dark images in which all dominant colors have luminance values below the luminance threshold specified in the extraction method of the Color Temperature descriptor.


Add after subclause 4.7:
4.8 High-level use scenarios
4.8.1 Content based Image retrieval
4.8.1.1 General
Content-based image retrieval provides an efficient and easy way of managing and retrieving digital images from enormous collections. There are two representative methods. One is query by example, where a user selects an image similar to those expected as the query. The other is query by sketch, in which the user draws a sketch and uses it as the query. Since query by example needs a seed picture, some mechanism to help users find the query image itself is required. One possible solution is to combine text-based image retrieval or query by sketch as a pre-processing step for query by example.
4.8.1.2 Query within region of interest (ROI)
4.8.1.2.1 General
This subclause provides a usage scenario that enables users to dynamically retrieve photographs with a similar region of interest (such as the background) in image space. Region-based image retrieval can be implemented by partitioning an image into several small regions and assigning a StillRegionFeatureDS to each of them. However, in practice, such an approach is difficult, as it requires prior segmentations that are often subjective and may depend on a particular query. ROI-based photo retrieval gives users the benefit of defining the ROI when making a query. Although query by example is very useful for image retrieval, one may want to retrieve photos with similar backgrounds only. For example, if the scenery is well known or quite beautiful, people tend to take pictures with the same background but different persons. For such photos, it is more efficient to retrieve the photos by matching the background regions only. In this scenario, the user selects the region of particular interest and sends it to the system as the query.
4.8.1.2.2 Use Scenario
Figure AMD3.4 shows the flow of the proposed query method. The user first selects a query image. In the
query image, the user selects a ROI by selecting local regions (shown in blue). The ROI is used as a query
image for retrieval. Figure AMD3.5 shows the example of image retrieval within a ROI.

[Figure: a query image with the ROI selected in blue, submitted as a query to the DB]

Figure AMD3.4 — Flow of query by ROI

Figure AMD3.5 — Retrieval of image within ROI
4.8.1.2.3 Tools to be used
StillRegionFeatureDS or VideoSegmentFeatureDS is used for this scenario. Among the several elements
included in these DSs, the Edge Histogram descriptor and the Color Layout descriptor should be instantiated
to implement the functionality of ROI-based retrieval. For video retrieval, shots are extracted from the video
sequence and for each shot, localized features from the specific region are extracted. Then, the ROI is used
as a query for video retrieval.
4.8.1.2.4 Feature Extraction
ROI-based retrieval can be implemented by extracting localized features from the specified region. The
extraction process of the localized feature from the instances of two mandatory description tools, Color Layout
and Edge Histogram, is described in this subclause. Figure AMD3.6 illustrates this process. From the Edge
Histogram Descriptor one can obtain a localized edge distribution in each 4 x 4 local rectangular region. From
the Color Layout descriptor, one can obtain an 8 x 8 region-based DCT: by performing inverse quantization
and taking the 8 x 8 inverse DCT (as described in subclause 4.2.5.2.3), we can obtain average color values
for 8 x 8 local rectangular regions. Feature extraction of each descriptor is defined in ISO/IEC 15938-3,
MPEG-7 Visual. As in Figure AMD3.6, a combination of the Edge Histogram Descriptor and Color Layout
Descriptors can be used for the rectangular region-based query-by-ROI.

[Figure: the CLD on the 8x8 DCT plane is inverse-transformed (8x8 IDCT) to an average color for each block in the 8x8 spatial domain; these are combined with the 4x4 EHD into a combined feature for each 4x4 block]
Figure AMD3.6 — 4x4 block-based “Query-by-ROI” with Edge Histogram Descriptor and Color Layout
Descriptor
4.8.1.2.5 Similarity Matching
For the Color Layout descriptor, we can take an 8 x 8 inverse DCT for the quantized DCT coefficients of Y, Cr,
and Cb. Then, we have representative color values for 8 x 8 blocks of the image. These block-wise color
values are combined with the edge histogram bins for each 4 x 4 image region (see Figure AMD3.7). Thus,
each rectangular image region of the (4 x 4) Edge Histogram descriptor blocks includes 4 (2 x 2) color blocks
obtained by the inverse 8 x 8 DCT of the Color Layout descriptor. Now, a combination of the color and edge
information in each of the (4 x 4) rectangular image regions will form a feature vector for the rectangular
region-based similarity matching.

Figure AMD3.7 — Parameter value example of EHD
Figure AMD3.7 shows an example of parameter values when using the EHD for matching blocks. When the total number of images is N, the j-th (j = 0, 1, 2, ..., 15) block of the i-th (i = 0, 1, 2, ..., N) image has five types of edge value (0°, 45°, 90°, 135°, non-directional). If we denote these edge values by k (k = 0, 1, 2, 3, 4), the parameter value H_ij[k] is the k-th edge value of the j-th block of the i-th image. For a query image Q, the edge value of the selected sub-image is H^Q[k]. The local distance of the Edge Histogram, LD^EHD_ij, is as follows:

LD^EHD_ij = Σ_{k=0}^{4} |H_ij[k] − H^Q[k]|    (AMD1)
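Equation (AMD1) is a plain L1 distance over the five edge bins and can be sketched as:

```python
# Local Edge Histogram distance (AMD1): L1 distance between the five
# edge-bin values of block j in image i and those of the query sub-image.

def ehd_local_distance(h_ij, h_q):
    """h_ij, h_q: five edge-bin values (0, 45, 90, 135 degrees,
    non-directional)."""
    return sum(abs(a - b) for a, b in zip(h_ij, h_q))
```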

Figure AMD3.8 — Parameter of Inverse DCT Color Layout descriptor
Figure AMD3.8 shows the parameters of the inverse DCT Color Layout descriptor. The total number of images is N. We group the 8 x 8 image blocks into 4 x 4 blocks. Each newly grouped block is labeled β, where each grouped block consists of 4 sub-blocks (β = 0, 1, 2, 3). Each of Y, Cb, Cr is labeled α (α = 0 for Y, α = 1 for Cb, α = 2 for Cr). The parameter value C_ijβ[α] represents color value α of sub-block β of the j-th block (sub-image) of the i-th image. C^Q_β[α] are the parameter values of the query image Q. The local distance of the Color Layout descriptor for the j-th sub-image of the i-th image can be obtained as follows:

LD^CLD_ij = Σ_{α=0}^{2} Σ_{β=0}^{3} |C^Q_β[α] − C_ijβ[α]| / 3    (AMD2)
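Equation (AMD2) can be sketched as follows (the absolute value of each per-term difference is assumed, consistent with the L1 form of (AMD1)):

```python
# Local Color Layout distance (AMD2): sum over three channels (alpha: Y, Cb,
# Cr) and four sub-blocks (beta) of the absolute difference to the query,
# divided by 3.

def cld_local_distance(c_ij, c_q):
    """c_ij[beta][alpha], c_q[beta][alpha]: average colors of the four
    sub-blocks of one grouped block."""
    return sum(abs(c_q[beta][alpha] - c_ij[beta][alpha])
               for alpha in range(3) for beta in range(4)) / 3.0
```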
The distance values are then scaled so that the maximum values of LD^EHD_ij (AMD1) and LD^CLD_ij (AMD2) over all images in the database are normalized to 1. The combined distance CD_ij can be obtained from these normalized distances, ND^EHD_ij and ND^CLD_ij, as follows:

CD_ij = (ND^EHD_ij + ND^CLD_ij) / 2    (AMD3)
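The normalization and equation (AMD3) can be sketched as follows (hypothetical container shapes: parallel sequences of LD values over the whole database):

```python
# Database-wide normalization of the EHD and CLD local distances, followed
# by the combined distance (AMD3): the average of the two normalized values.

def combined_distances(ehd_dists, cld_dists):
    """ehd_dists, cld_dists: parallel sequences of LD^EHD and LD^CLD values
    for all (image, block) pairs in the database."""
    max_e = max(ehd_dists) or 1.0  # avoid division by zero for flat tables
    max_c = max(cld_dists) or 1.0
    return [(e / max_e + c / max_c) / 2.0
            for e, c in zip(ehd_dists, cld_dists)]
```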
4.8.1.2.6 Condition of Use
In order to use the ROI, the regions shall be rectangular, and the grid of rectangular regions is restricted to 4 x 4.
4.8.1.2.7 DDL instance examples
xmlns="urn:mpeg:mpeg7:schema:2004">

    xsi:type="StillRegionFeatureType">
 
  40
  34
  30
  16 12 15 12 17
  12 17
  12 14
 
 
 
 2 6 4 4 2 1 7 5 3 2 1 6 4 2 2 2 5 4
  5 3 1 5 5 6 5 2 6 5 4 4 1 6 4 4 4 0 6 3 5
  2 1 5 5 6 6 4 2 3 6 7 3 2 5 5 7 3 2 4 4 7
  1 5 6 4 6 1 5 7 4 5 1 6 4 6 5 1 3 4 7 6
 
 
 



4.8.2 Grouping Technologies
4.8.2.1 Situation-based clustering
4.8.2.1.1 General
A simple but very effective structure is to group images by the occasion on which they were taken. This is
natural for the user since they will often remember the context of the situation much better than a date, time or
explicit label attached to the picture. It is possible to automatically cluster images into such “situations” by
using MPEG-7 visual description, together with the time stamp of the image. Based on the assumption that
each situation is contiguous in time, the organisational structure can be represented by the time-sequence of
images, with a flag or marker to indicate the boundaries between situations (cf. Figure AMD3.9). This provides
the user with a simple, intuitive and effective means to browse through their collection, without placing any
additional burden on them to spend time organising it. Two methods are presented for situation based
clustering.

Boundaries (vertical bars) are inserted between adjacent images in the sequence to denote the grouping.
Figure AMD3.9 — One representation of the grouped sequence of images
4.8.2.1.2 Use scenario
This kind of clustering can easily be implemented in traditional photo-browsing software applications. For the
user, it is very simple to use – the extraction and matching of MPEG-7 descriptors and detection of the
boundaries is fully automatic, so the tool is essentially “one-click”. Of course, some users may choose to
adjust and refine the automatic output to match their individual preferences. This process would still be far
easier than organising all the photos manually.
The clustering information can be used to access and manipulate the image content in a variety of ways:
• Browsing:
o Display a cluster of images per page, or
o Display a single thumbnail / icon for each cluster
• Annotation
o User can easily assign a single label to all the images in a cluster
• Sharing:
o User can select images by cluster and…
o Print
o Copy
o Upload to website
4.8.2.1.3 Method 1: Simple Linear Clustering
4.8.2.1.3.1 General
This method achieves good clustering performance with minimal complexity. The additional computation (after
extracting and matching MPEG-7 visual descriptors) consists of a simple weighted linear summation. It is
therefore well-suited to applications where MPEG-7 descriptors have been extracted from images but
resources are not available for higher-level processing (for example, in low-complexity devices). The input
parameters to the algorithm are also simple and therefore easy to adapt - for example, to different applications
or user preferences.
4.8.2.1.3.2 Tools to be used
Six tools defined in ISO/IEC 15938-3 shall be instantiated in StillRegionFeatureDS:
— Dominant Color (DC)
— Scalable Color (SC)
— Color Layout (CL)
— Color Structure (CS)
— Homogeneous Texture (HT)
— Edge Histogram (EH)
Also, the capturing date/time information should be included. If an image is encoded in the Exif file format (JEITA CP-3451), this information can be obtained from the Exif header:
— EXIF DateTime tag (ID 36867)
Alternatively, the same information can be captured using:
— CreationInformation/CreationCoordinates/Date (mpeg7:TimeType)
4.8.2.1.3.3 Clustering Algorithm
The images are ordered by their time stamps and each potential boundary in the sequence is evaluated in
turn. To determine the presence or absence of a boundary, a number of pair-wise comparisons are made
amongst images lying in a window either side of the transition. This neighbourhood and the comparisons used
are illustrated in Figure AMD3.10.

[Figure: image sequence j-2, j-1, j | j+1, j+2, j+3, with the candidate boundary "?" between j and j+1]
Figure AMD3.10 — Neighbourhood comparisons evaluated to determine if a boundary is present
Comparison of images consists of computing the descriptor distances (by the respective methods suggested in ISO/IEC TR 15938-8) and calculating the time difference. The latter is measured on a logarithmic scale, to compress the range of this feature and allow meaningful comparisons. The time distance is therefore defined as:

D_T(i, i+1) = ln(10^−5 + (T_{i+1} − T_i))

The unit of time used for T_i is days. The natural logarithm is applied to normalize the range of time distances, since potential time differences vary over several orders of magnitude. After this transformation, the variation of the time distance is comparable to the remaining features. The constant 10^−5 sets the minimum scale of the distance (just under one second, in this case) and ensures that ln(0) does not occur.
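The time distance can be sketched as follows; the Exif DateTime string format "YYYY:MM:DD HH:MM:SS" is assumed for tag 36867, and timestamps are converted to days:

```python
# Logarithmic time distance between consecutive images, with timestamps
# parsed from Exif DateTime strings and expressed in days.
import math
from datetime import datetime

def exif_to_days(s):
    """Parse an Exif DateTime string and return a timestamp in days."""
    dt = datetime.strptime(s, "%Y:%m:%d %H:%M:%S")
    return dt.timestamp() / 86400.0

def time_distance(t_i_days, t_next_days):
    """D_T(i, i+1) = ln(1e-5 + (T_{i+1} - T_i)), with T in days."""
    return math.log(1e-5 + (t_next_days - t_i_days))

a = exif_to_days("2007:12:15 10:00:00")
b = exif_to_days("2007:12:15 10:30:00")  # 30 minutes later
```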
The input to the algorithm includes the first-, second- and third-order distances in a short time interval around the boundary to be tested. Here "first-order" refers to the difference, for any given feature, between two images that are adjacent in the sequence, i.e. D_F(i, i+1). A second-order distance is the difference between two images that are separated by one other image, i.e. D_F(i, i+2). Similarly, a third-order distance is the difference between two images that are separated by two other images, i.e. D_F(i, i+3). The total measurement of difference between images j and j+1 is now:

D(j, j+1) = Σ_F { Σ_{i=−2}^{2} α_Fi D_F(j+i, j+i+1) + Σ_{i=−2}^{1} β_Fi D_F(j+i, j+i+2) + Σ_{i=−2}^{0} γ_Fi D_F(j+i, j+i+3) }

This is a summation over a set of 12 distance measurements for each of the 6 visual features (the outer summation being over the set of features, F). For the time difference, only the first-order distances are used, adding 5 more distance measurements, to give a total set of 77 numbers. These are weighted by 77 weights α, β, γ, the recommended values of which are given in Table AMD3.1.
Table AMD3.1 — Weights for distance calculation (feature distances are first normalized to zero mean and unit variance)

        Dominant  Scalable  Color    Color      Homogeneous  Edge        Time
        Color     Color     Layout   Structure  Texture      Histogram
α_0      0,0583    0,2598   0,4546    0,2661    -0,0718       0,4890      2,8952
α_1      0,1976   -0,0077  -0,0986   -0,3279    -0,1370      -0,3108     -0,2911
α_2     -0,0425    0,1117   0,0543    0,0594    -0,0089      -0,1642     -0,0035
α_−1     0,2718   -0,1835  -0,0640   -0,1153     0,0102      -0,3534     -0,3748
α_−2     0,0085   -0,0259  -0,0539   -0,1419    -0,0725       0,0951     -0,0786
β_−1    -0,0249   -0,0107   0,4662    0,3828     0,0567      -0,2351
β_0      0,1718   -0,0788  -0,0086    0,2190     0,2653       0,2186
β_−2     0,0958   -0,2618  -0,0520   -0,0652    -0,0496       0,1157
β_1      0,2785    0,0072  -0,3648   -0,1872    -0,0611       0,1439
γ_−1    -0,1955    0,1203  -0,0767   -0,0567     0,0148       0,1178
γ_−2     0,0324    0,1808  -0,2327    0,2665     0,0167       0,2029
γ_0     -0,1199    0,0196   0,0477    0,1841     0,0288       0,1436

The output, D, is a real-valued indicator of boundary confidence: higher values of D indicate a stronger belief that there is a boundary at the candidate position. A binary boundary indicator is obtained by comparing this number to a threshold. The threshold may be adjusted for sensitivity depending on the image collection, the particular application, or the preferences of the user. A value of around 3.35 is recommended to produce a good balance between false positives (mistakenly detected boundaries) and false negatives (missed boundaries).
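The boundary test can be sketched as follows (a hypothetical data layout: per-feature distances are assumed precomputed and normalized, and the α/β/γ weight tables of Table AMD3.1 are keyed by the offset i):

```python
# Weighted linear summation for one candidate boundary between images j and
# j+1, followed by thresholding into a binary boundary decision.

def boundary_confidence(dist, weights, j):
    """dist[F][(a, b)]: normalized distance between images a and b for
    feature F; weights[F] = (alpha, beta, gamma) dicts keyed by offset i."""
    total = 0.0
    for F, (alpha, beta, gamma) in weights.items():
        for i, a in alpha.items():    # first-order terms, i = -2..2
            total += a * dist[F][(j + i, j + i + 1)]
        for i, b in beta.items():     # second-order terms, i = -2..1
            total += b * dist[F][(j + i, j + i + 2)]
        for i, g in gamma.items():    # third-order terms, i = -2..0
            total += g * dist[F][(j + i, j + i + 3)]
    return total

def is_boundary(confidence, threshold=3.35):
    """Binary boundary indicator with the recommended default threshold."""
    return confidence > threshold
```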
4.8.2.1.4 Method 2: Clustering based on Visual Semantic Hints
4.8.2.1.4.1 General
The proposed method achieves good clustering performance on similar situations by exploiting visual semantic hints. If the visual semantic hints are used for adaptive feature selection, they also help to reduce computational complexity while achieving reasonable clustering performance. For example, a low-performance device such as a mobile phone, which can only extract a limited number of MPEG-7 descriptors, can apply this method while maintaining reasonable clustering performance.
4.8.2.1.4.2 Tools to be used
Seven tools defined in ISO/IEC 15938-3 shall be instantiated in StillRegionFeatureDS:
— Dominant Color (DC)
— Scalable Color (SC)
— Color Layout (CL)
— Color Structure (CS)
— Homogeneous Texture (HT)
— Texture Browsing (TB)
— Edge Histogram (EH)
Also, the capturing date/time information should be included. If an image is encoded in the Exif file format (JEITA CP-3451), this information can be obtained from the Exif header:
— EXIF DateTime tag (ID 36867)
Alternatively, the same information can be captured using:
— CreationInformation/CreationCoordinates/Date (mpeg7:TimeType)
4.8.2.1.4.3 Semantic Hint Extraction
We determine the weight of each visual feature using ‘visual semantic hints’, which are automatically
extracted from a series of photos, in order to improve the Situation/View-based Photo clustering performance.
The visual semantic hints of image represent the visual characteristics that are perceived by human visual
system. The visual semantic hints used are as follows:
1) Colorfulness (CoF) hint: it represents degree of a visual sensation according to the purity of colors on
photo. Figure AMD3.11 shows some exemplary photos with high degree of colorfulness.

Figure AMD3.11 — Exemplary photos with Colorfulness semantics.
To extract the C
...
