Information technology — Multimedia content description interface — Part 8: Extraction and use of MPEG-7 descriptions — Amendment 6: Extraction and matching of video signature tools

Technologies de l'information — Interface de description du contenu multimédia — Partie 8: Extraction et utilisation des descriptions MPEG-7 — Amendement 6: Extraction et correspondance des outils de signature vidéo

General Information

Status: Published
Publication Date: 30-Oct-2011
Current Stage: 6060 - International Standard published
Due Date: 28-Feb-2014
Completion Date: 31-Oct-2011
Technical report
ISO/IEC TR 15938-8:2002/Amd 6:2011 - Extraction and matching of video signature tools
English language
4 pages

Standards Content (Sample)

TECHNICAL REPORT
ISO/IEC TR 15938-8
First edition
2002-12-15
AMENDMENT 6
2011-11-01


Information technology — Multimedia
content description interface —
Part 8:
Extraction and use of MPEG-7
descriptions
AMENDMENT 6: Extraction and matching of
video signature tools
Technologies de l'information — Interface de description du contenu
multimédia —
Partie 8: Extraction et utilisation des descriptions MPEG-7
AMENDEMENT 6: Extraction et correspondance des outils de signature
vidéo



Reference number
ISO/IEC TR 15938-8:2002/Amd.6:2011(E)
© ISO/IEC 2011


COPYRIGHT PROTECTED DOCUMENT


©  ISO/IEC 2011
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
In exceptional circumstances, when the joint technical committee has collected data of a different kind from
that which is normally published as an International Standard (“state of the art”, for example), it may decide to
publish a Technical Report. A Technical Report is entirely informative in nature and shall be subject to review
every five years in the same manner as an International Standard.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Amendment 6 to ISO/IEC TR 15938-8:2002 was prepared jointly by Joint Technical Committee
ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and
hypermedia information.


Information technology — Multimedia content description
interface —
Part 8:
Extraction and use of MPEG-7 descriptions
AMENDMENT 6: Extraction and matching of video signature tools
After 4.9.1, add:
4.9.2 Video Signature
The visual content descriptors in Clauses 6-9 of ISO/IEC 15938-3:2002 are well suited to finding videos with
similar content. Because these descriptors are intended to be general, however, they were found to be unsuitable
for the task of finding duplicate content. The video signature descriptor is designed to identify duplicate video
content. This descriptor is robust to a wide range of common video editing operations, yet sufficiently
distinct for each original item of content to identify it reliably.
The video signature is composed of three main elements: a frame signature, a set of compact summary frame
signatures, referred to as words, and a group-of-frames representation for a temporal segment, referred to as
a bag of words.
4.9.2.1 Extraction
Subclauses 11.4.5 to 11.4.8 of ISO/IEC 15938-3:2002 describe the extraction of the video signature.
4.9.2.2 Matching
A Video Signature is composed of multiple temporal segments, each represented by a BagOfWords element,
and multiple frames, each represented by a FrameSignature element and a FrameConfidence element.
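As an informal illustration only, the composition described above could be modelled along the following lines. The class names, field names and Python types are assumptions made for this sketch; the normative element definitions and encodings are those of ISO/IEC 15938-3.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Frame:
    # FrameSignature element; the byte encoding used here is an assumption.
    frame_signature: bytes
    # FrameConfidence element; the integer type is an assumption.
    frame_confidence: int


@dataclass
class Segment:
    # BagOfWords element: one set of distinct words per vocabulary,
    # so bag_of_words[j] plays the role of BagOfWords[j].
    bag_of_words: List[Set[int]]


@dataclass
class VideoSignature:
    # Multiple temporal segments and multiple frames, as described above.
    segments: List[Segment] = field(default_factory=list)
    frames: List[Frame] = field(default_factory=list)
```

Holding each bag of words as a set keeps only the distinct words, which is what the segment comparison in stage 1 operates on.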
The matching between two Video Signatures $v^1$ and $v^2$ is carried out in three stages, designed to maximize
matching speed and true positives and to minimize false positives. The first stage uses the BagOfWords
element to identify candidate matching segments. The second stage uses the FrameSignature element to
identify candidate values of the frame rate ratio and temporal offset between the candidate matching segments.
The third stage performs frame-by-frame matching to determine candidate matching intervals using the
FrameSignature and FrameConfidence elements, and then determines the best match between the Video
Signatures $v^1$ and $v^2$. These matching stages are explained in more detail below.
Stage 1 (Segment matching with BagOfWords)
All of the temporal segments of Video Signature $v^1$ are compared with all of the temporal segments of Video
Signature $v^2$.
For two segments $f^1$ and $f^2$, their similarity is assessed by comparing the bag-of-words representation for
each vocabulary $j$ and merging the results to reach a decision. More specifically, for $\mathrm{BagOfWords}^1[j]$ and
$\mathrm{BagOfWords}^2[j]$, their distance is measured by the Jaccard distance metric given by

$$D_{J,j}\left(\mathrm{BagOfWords}^1[j], \mathrm{BagOfWords}^2[j]\right) = 1 - \frac{\#\left(\mathrm{BagOfWords}^1[j] \cap \mathrm{BagOfWords}^2[j]\right)}{\#\left(\mathrm{BagOfWords}^1[j] \cup \mathrm{BagOfWords}^2[j]\right)}$$

where # denotes the number of elements in a set. This measures the distance of the segments $f^1$ and $f^2$ in
a given vocabulary as a function of the distinct words they have in common and all the distinct words that they
contain jointly.
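For illustration, the Jaccard distance for one vocabulary can be sketched in Python as below. The function name and the handling of two empty bags are assumptions of this sketch and are not taken from the text.

```python
def jaccard_distance(words_a, words_b):
    """Jaccard distance between two bags of words for one vocabulary.

    Both arguments are iterables of hashable words; duplicates are ignored
    because only distinct words count.
    """
    set_a, set_b = set(words_a), set(words_b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty bags are treated as identical (distance 0).
        return 0.0
    # 1 - (#distinct words in common) / (#distinct words contained jointly)
    return 1.0 - len(set_a & set_b) / len(union)
```

For example, the bags {1, 2, 3} and {2, 3, 4} share two distinct words out of four contained jointly, so their distance is 1 - 2/4 = 0.5.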
For $Q = 5$ vocabularies, we have Jaccard distances $D_{J,0}, D_{J,1}, \ldots, D_{J,Q-1}$. These distances are fused to give
the composite distance $D_J$ as

$$D_J = \sum_{k=0}^{Q-1} D_{J,k}$$
Then a decision on the similarity of the segments is reached by thresholding each of the Jaccard distances
$D_{J,0}, D_{J,1}, \ldots, D_{J,Q-1}$ and the composite distance $D_J$. That is, the segments $f_i^1$ and $f_j^2$ are passed to stage 2 of
matching if more than half of the $Q$ Jaccard distances $D_{J,0}, D_{J,1}, \ldots, D_{J,Q-1}$ are less than a threshold $T_1$ and
the composite distance $D_J$ is less than another threshold $T_2$; otherwise they are declared not matching.
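The stage 1 decision rule can be sketched as follows, assuming the per-vocabulary Jaccard distances for one segment pair have already been computed. The function name and the threshold values used in the example are placeholders; the text does not give values for the two thresholds.

```python
def passes_stage1(distances, t1, t2):
    """Decide whether a segment pair is passed to stage 2.

    distances: the Q per-vocabulary Jaccard distances for the pair.
    t1: threshold applied to each individual distance.
    t2: threshold applied to the composite (summed) distance.
    """
    below_t1 = sum(1 for d in distances if d < t1)
    composite = sum(distances)
    # More than half of the distances must be below t1,
    # and the composite distance must be below t2.
    return below_t1 > len(distances) / 2 and composite < t2


# Placeholder thresholds, chosen only to exercise the rule.
example = [0.1, 0.2, 0.3, 0.9, 0.9]
print(passes_stage1(example, t1=0.5, t2=3.0))  # 3 of 5 below t1, sum below t2
print(passes_stage1(example, t1=0.5, t2=2.0))  # sum is not below t2
```

With five vocabularies, "more than half" means at least three of the individual distances must fall below the first threshold.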
Stage 2 (Frame rate ratio & time shift estimation using Hough transform)
For the segment pairs passed to this stage, a Hough transform is used to estimate the temporal parameter
differences, i.e. frame
...
