ETSI TR 102 493 V1.2.1 (2009-06)
Speech and multimedia Transmission Quality (STQ); Guidelines for the use of Video Quality Algorithms for Mobile Applications
RTR/STQ-00137m
Technical Report
Reference
RTR/STQ-00137m
Keywords
QoS, telephony, video
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered
for the benefit of its Members.
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
LTE™ is a Trade Mark of ETSI currently being registered
for the benefit of its Members and of the 3GPP Organizational Partners.
GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 General
5 Services
5.1 Streaming
5.2 Conversational Multimedia
5.3 Video Telephony
6 QoS Scenarios
6.1 Key Scenarios
6.2 Other scenarios
7 Requirements for test systems for mobile networks
7.1 Sequence and observation length
7.2 Content
7.3 Algorithm Properties
7.3.1 Full reference perceptual algorithms
7.3.2 No reference perceptual algorithms
7.3.3 No reference hybrid algorithms
7.3.4 Full reference hybrid algorithms
7.3.5 Bitstream algorithms
7.3.6 Parametric algorithms
7.3.7 Video Codecs
7.3.8 Calculation time
7.4 Container schemes
7.5 Output
8 Standardization of algorithms
8.1 Perceptual algorithms (J.246 and J.247)
8.1.1 Sequence length
8.1.2 Content
8.1.3 Formats
8.1.4 Bit Rates
8.1.5 Compressing algorithm
8.1.6 Container schemes
8.1.7 Evaluation
8.1.8 Conclusions
8.2 Hybrid, bitstream and parametric algorithms
Annex A (informative): Algorithms
A.1 Measurement Methodologies
A.1.1 Full Reference Approach (FR)
A.1.2 No Reference Approach (NR)
A.1.3 Reduced Reference Approach (RR)
A.1.4 Comparison of FR and NR Approaches
A.2 Degradations and Metrics
A.2.1 Jerkiness
A.2.2 Freezing
A.2.3 Blockiness
A.2.4 Slice Error
A.2.5 Blurring
A.2.6 Ringing
A.2.7 Noise
A.2.8 Colourfulness
A.2.9 MOS Prediction
A.2.10 Comparison of NR and FR regarding metrics and Degradations
History
Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (http://webapp.etsi.org/IPR/home.asp).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This Technical Report (TR) has been produced by ETSI Technical Committee Speech and multimedia Transmission
Quality (STQ).
1 Scope
The present document gives guidelines for the use of video quality algorithms for the different services and scenarios
applied in the mobile environment.
2 References
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific.
• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following
cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the
purposes of the referring document;
- for informative references.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication ETSI cannot guarantee
their long term validity.
2.1 Normative references
The following referenced documents are indispensable for the application of the present document. For dated
references, only the edition cited applies. For non-specific references, the latest edition of the referenced document
(including any amendments) applies.
Not applicable.
2.2 Informative references
The following referenced documents are not essential to the use of the present document but they assist the user with
regard to a particular subject area. For non-specific references, the latest version of the referenced document (including
any amendments) applies.
[i.1] ETSI TS 126 233: "Universal Mobile Telecommunications System (UMTS); LTE; End-to-end
transparent streaming service; General description (3GPP TS 26.233 version 8.0.0 Release 8)".
[i.2] VQEG: "Multimedia Group: Test Plan", Draft Version 1.5, March 2005.
[i.3] ETSI TS 122 960: "Universal Mobile Telecommunications System (UMTS); Mobile Multimedia
services including mobile Intranet and Internet services".
[i.4] Final Report from the Video Quality Experts Group on the validation of the objective models of
multimedia quality assessment, Phase.
[i.5] ITU-T Recommendation J.247: Objective perceptual multimedia video quality measurement in
presence of a full reference.
[i.6] ITU-T Recommendation J.246: Perceptual visual quality measurement techniques for multimedia
services over digital cable television networks in the presence of a reduced bandwidth reference.
[i.7] ETSI TS 126 114: "Universal Mobile Telecommunications System (UMTS); LTE; IP Multimedia
Subsystem (IMS); Multimedia telephony; Media handling and interaction
(3GPP TS 26.114 Release 7)".
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
bitstream model: computational model that predicts the subjectively perceived quality of video, audio or multimedia,
based on analysis of the payload and transport headers
hybrid model: computational model that predicts the subjectively perceived quality of video, audio, or multimedia,
based on the media signal and the payload and transport headers
live Streaming: streaming of live content e.g. web cam, TV programs, etc.
parametric model: computational algorithm that predicts the subjectively perceived quality of video, based on
transport layer and client parameters
perceptual model: computational algorithm that aims to predict the subjectively perceived quality of video, based on
the media signal
streaming on demand: streaming of stored content e.g. movies
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
BLER BLock Error Rates
CIF Common Intermediate Format (352 x 288 pixels)
DMOS Difference Mean Opinion Score
FR Full Reference Algorithm
HRC Hypothetical Reference Circuit
ITU International Telecommunication Union
MOS Mean Opinion Score
NR No Reference Algorithm
PLR Packet Loss Rates
PSNR Peak Signal-to-Noise Ratio
QCIF Quarter Common Intermediate Format (176 x 144 pixels)
RR Reduced Reference
SRC Source Reference Channel (or Circuit)
VGA Video Graphics Array
VQEG Video Quality Experts Group
4 General
Video quality assessment has become a central issue with the increasing use of digital video compression systems and
their delivery over mobile networks. Due to the nature of the coding standards and delivery networks the provided
quality will differ in time and space. Thus, methods for video quality assessment represent important tools to compare
the performance of end-to-end applications.
The present document sets out guidelines for video quality algorithms applicable to mobile applications and the
scenarios in which they are applied. Any eligible algorithm needs to predict the quality perceived by a user of mobile
terminal equipment. The goal is to have one or more objective video quality measurement algorithms that predict
the video quality as perceived by a human viewer and that conform to the minimum requirements given
in the present document.
Based on input from the Video Quality Experts Group (VQEG), the ITU has recommended in ITU-T Recommendation J.247
[i.5] an objective perceptual video quality measurement in the presence of a full reference and in
ITU-T Recommendation J.246 [i.6] a perceptual video quality measurement in the presence of a reduced reference. No
objective perceptual multimedia video quality measurement based on a no-reference algorithm has been recommended. However,
continuing research within the VQEG is directed towards providing further input to the ITU on digital multimedia
objective video quality measurement models. Work is ongoing in ITU-T and VQEG to develop and standardize hybrid,
bitstream and parametric models.
It is common to all services treated in the present document that quality as seen from the user's perspective depends on
the server and client applications used. For example, it has to be expected that under the same network conditions, two
different video streaming clients will exhibit different video quality due to differences in the way these clients use
the available bandwidth. Therefore, for full validation of tools, the type and version of the clients used have to be fully
documented; they are seen as part of the information needed to reproduce and calibrate measurements.
NOTE: The present document focuses on those visual continuous media reproductions where the source and the
player are connected via a (mobile) telecommunication network rather than the replay of a clip that has
been completely stored on the same device as the player and is replayed from there.
5 Services
The aspect of video quality is of interest wherever there are services where the transfer of 'moving pictures' or still
images is involved. Three major fields of transferring video content can be identified that make use of packet switched
and circuit switched services.
Table 1: Requirement profiles of the services
Application                Symmetry  Data rates         One Way Delay       Lip-sync  Information loss
Video telephony            Two-way   32 kbps to 2 Mbps  < 150 ms preferred  < 80 ms   < 1 % pl
                                                        < 400 ms limit
Streaming                  One-way   32 kbps to 2 Mbps  < 10 s              -         < 1 % pl
Conversational Multimedia  Two-way   -                  < 150 ms            -         Mutual service dependency, echo
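As an illustration only, the numeric limits of Table 1 can be held in a small lookup against which measured values are checked. The following Python sketch is not part of the present document: the names `Measurement`, `PROFILES` and `violations` are hypothetical, "pl" is assumed to denote packet loss, the 400 ms limit (rather than the 150 ms preferred value) is used for video telephony, and only the one-way delay is checked for conversational multimedia because its other cells are not numeric.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    """Measured values for one test call or session (hypothetical structure)."""
    one_way_delay_ms: float
    lip_sync_ms: Optional[float] = None
    packet_loss_pct: Optional[float] = None  # "pl" assumed to mean packet loss


# Upper limits taken from Table 1; None means the table gives no numeric limit.
PROFILES = {
    "video telephony": {
        "one_way_delay_ms": 400.0,  # limit; 150 ms is the preferred value
        "lip_sync_ms": 80.0,
        "packet_loss_pct": 1.0,
    },
    "streaming": {
        "one_way_delay_ms": 10_000.0,
        "lip_sync_ms": None,
        "packet_loss_pct": 1.0,
    },
    "conversational multimedia": {
        "one_way_delay_ms": 150.0,
        "lip_sync_ms": None,
        "packet_loss_pct": None,
    },
}


def violations(service: str, m: Measurement) -> list:
    """Return the names of the Table 1 limits that the measurement does not meet."""
    failed = []
    for name, limit in PROFILES[service].items():
        value = getattr(m, name)
        # Table 1 states strict "<" limits, so reaching the limit already fails.
        if limit is not None and value is not None and value >= limit:
            failed.append(name)
    return failed
```

A measurement whose field is left at `None` is simply not checked against the corresponding limit.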
Figure 1: Streaming (TS 126 233 [i.1])
5.1 Streaming
Streaming refers to the ability of an application to play synchronized media streams like audio and video streams in a
continuous way while those streams are being transmitted to the client over a data network. The client plays the
incoming multimedia stream in real time as the data is received.
Typical applications can be classified into on-demand and live information delivery applications. Examples of the first
group are music and news-on-demand applications. Live delivery of radio and television programs is an example of the
second category.
For 3G systems, the 3G packet-switched streaming service (PSS) fills the gap between 3G MMS, e.g. downloading, and
conversational services.
5.2 Conversational Multimedia
Multimedia services combine two or more media components within a call. A service in which two or more parties
exchange video, audio and text, and possibly share documents, is a multimedia service. Microsoft NetMeeting is an
example of a conversational multimedia application [i.3]. It is a peer-to-peer set-up in which, in real time, each party
acts both as the source (server) and as a client. Another example of a new multimedia
conversational service is the 3GPP standardized MTSI service [i.7].
5.3 Video Telephony
Video telephony is a full-duplex system, carrying both video and audio and intended for use in a conversational
environment. In principle the same delay requirements as for conversational voice will apply, i.e. no echo and minimal
effect on conversational dynamics, with the added requirement that the audio and video have to be synchronized within
certain limits to provide "lip-synch".
6 QoS Scenarios
The different services that make use of video can be delivered in a variety of ways and situations. To obtain the
full picture of the quality of these services, they need to be tested accordingly. However, for practical purposes and
general feasibility, key scenarios need to be identified to facilitate video quality measurements.
6.1 Key Scenarios
The key scenarios are live streaming, streaming on demand, video telephony and conversational multimedia. These
services can be tested by drive test or in a static fashion.
The algorithms for estimating video and audiovisual quality can be classified depending on:
• Type of input:
- Perceptual (access to the video signal).
- Bitstream (access to the transport layer payload, but not the video signal).
- Hybrid (access to both the video signal and the transport layer payload).
- Parametric (access to transport header, client information, and knowledge about used codecs).
• Access to reference video: The algorithm models that are used are:
- Full reference model (FR).
- No reference model (NR).
• Media types: An algorithm can estimate:
- Video quality only.
- Audiovisual quality (taking into account the combined effect of audio and video quality).
Table 2: Key scenarios and model applicability for video quality algorithm assessment
Model           Live streaming            Streaming on Demand       Video Telephony           Conversational MM
FR perceptual   Require pre-stored        Applicable. Require       Applicable. Require       Applicable. Require
                source - normally not     pre-stored source.        pre-stored source.        pre-stored source.
                applicable for live
                streaming.
NR perceptual   Applicable. Might have    Applicable. Might have    Applicable. Might have    Applicable. Might have
                bad performance when      bad performance when      bad performance when      bad performance when
                video contains            video contains            video contains            video contains
                artefact-like content.    artefact-like content.    artefact-like content.    artefact-like content.
FR hybrid       Require pre-stored        Applicable. Require       Applicable. Require       Applicable. Require
                source - normally not     pre-stored source.        pre-stored source.        pre-stored source.
                applicable for live
                streaming.
NR Hybrid       Applicable.               Applicable.               Applicable.               Applicable.
Bitstream       Applicable.               Applicable.               Applicable.               Applicable.
Parametric      Applicable.               Applicable.               Applicable.               Applicable.
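For test tool configuration, the content of Table 2 can be expressed as a simple lookup. The following Python sketch is illustrative and not part of the present document; the enumeration names and the function `applicability` are hypothetical, and the remark strings paraphrase the table cells.

```python
from enum import Enum


class Model(Enum):
    FR_PERCEPTUAL = "FR perceptual"
    NR_PERCEPTUAL = "NR perceptual"
    FR_HYBRID = "FR hybrid"
    NR_HYBRID = "NR hybrid"
    BITSTREAM = "Bitstream"
    PARAMETRIC = "Parametric"


class Scenario(Enum):
    LIVE_STREAMING = "Live streaming"
    STREAMING_ON_DEMAND = "Streaming on Demand"
    VIDEO_TELEPHONY = "Video Telephony"
    CONVERSATIONAL_MM = "Conversational MM"


# Full reference models need a pre-stored source sequence.
NEEDS_REFERENCE = {Model.FR_PERCEPTUAL, Model.FR_HYBRID}


def applicability(model: Model, scenario: Scenario) -> tuple:
    """Return (applicable, remark) for a model and scenario, following Table 2."""
    if model in NEEDS_REFERENCE:
        if scenario is Scenario.LIVE_STREAMING:
            return (False, "requires pre-stored source; normally not applicable")
        return (True, "requires pre-stored source")
    if model is Model.NR_PERCEPTUAL:
        return (True, "might perform badly on artefact-like content")
    return (True, "")
```

Encoding the rule (full reference models need a pre-stored source) rather than all 24 cells keeps the lookup short, at the cost of having to revisit the function if the table is extended.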
6.2 Other scenarios
There is a further approach to video testing that does not focus on the perceptual quality of a delivered video but on the
pure availability (delivery) of the desired content in real time. This is referred to as live verification or live monitoring.
As in the previous clause, all four scenarios can be tested with all models. However, owing to their nature, the NR,
parametric and bitstream models are more suitable for that purpose.
7 Requirements for test systems for mobile networks
Testing of mobile networks is a special field of application for a video quality algorithm. To be applicable in practice,
e.g. for drive testing, any algorithm should fulfil the following requirements.
7.1 Sequence and observation length
Since one aspect of mobile network testing is to georeference the results to identify areas with less than optimal quality,
the algorithm should be capable of providing data at a reasonable resolution. Therefore it should be capable of assessing
sequences with a duration of 8 seconds to 30 seconds (comparable with listening quality).
The length of a Video Telephony call and video streaming can vary between a couple of seconds and several hours. For
video streaming sessions where the quality is degraded by rebuffering, the sequence length should be in the range
15 seconds to 30 seconds to be able to estimate the quality for such degradations.
Estimating quality for sequences longer than 30 seconds may be done by collecting and aggregating the results of a
sequence of short samples. The way of aggregation itself needs to be determined.
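The splitting of a long session into short samples can be sketched as follows. This Python example is illustrative and not part of the present document: the function names, the default window of 10 seconds and the dropping of a tail shorter than 8 seconds are assumptions, and the arithmetic mean is used only as a placeholder because the way of aggregation is left open above.

```python
from statistics import mean


def split_into_windows(duration_s: float, window_s: float = 10.0,
                       min_s: float = 8.0) -> list:
    """Cover a session with consecutive (start, end) windows in seconds.

    The window length has to lie in the 8 s to 30 s range given in this
    clause. A remaining tail shorter than min_s is dropped (assumption).
    """
    if not 8.0 <= window_s <= 30.0:
        raise ValueError("window length outside the 8 s to 30 s range")
    windows = []
    start = 0.0
    while start + min_s <= duration_s:
        end = min(start + window_s, duration_s)
        windows.append((start, end))
        start = end
    return windows


def aggregate(scores: list) -> float:
    """Placeholder aggregation: plain arithmetic mean of the per-window scores."""
    if not scores:
        raise ValueError("no scores to aggregate")
    return mean(scores)
```

For a 65 second session the sketch yields six 10 second windows and discards the last 5 seconds; a quality algorithm would then be run on each window and the resulting scores passed to `aggregate`.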
7.2 Content
The algorithm should be capable of assessing the quality of all visual content that is (can be) delivered over mobile
networks. E.g.:
1) Video conferencing.
2) Movies, movie trailers.
3) Sports.
4) Music video.
5) Advertisement.
6) Animation.
7) Broadcasting news (head and shoulders and outside broadcasting).
8) Home video.
9) Video Telephony (low quality input of various content).
10) Pictures /Still images.
Regarding 10), the algorithm is required to process pictures of the content types delivered as moving pictures
(1 to 9) and, in addition, still images and maps. When using a perceptual or hybrid algorithm, the test set-up should
include a variety of content and the final quality should be the average over all contents used. A parametric quality model
normally estimates the average quality for typical video or audiovisual content directly.
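The averaging over contents can be sketched as below. The example is illustrative and not part of the present document; the function name and the content labels are hypothetical, and giving every content type equal weight regardless of how many clips it contributed is an assumption, since the text above only asks for the average of all used contents.

```python
from statistics import mean


def content_averaged_quality(scores_by_content: dict) -> float:
    """Average per-clip scores within each content type, then across content types.

    Each content type contributes equally to the final value (assumption).
    """
    if not scores_by_content:
        raise ValueError("no content scores given")
    per_content = [mean(scores) for scores in scores_by_content.values()]
    return mean(per_content)


# Hypothetical per-clip scores on a MOS-like scale.
example = {"sports": [2.0, 3.0], "news": [4.0]}
overall = content_averaged_quality(example)
```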
7.3 Algorithm Properties
7.3.1 Full reference perceptual algorithms
In order to assure a wide range of applicability, any full reference (FR) algorithm should be capable of working equally
well with the uncompressed and a pre-processed (compressed) version of the reference. In cases where the reference has
not been processed losslessly, and hence the uncompressed original is not recoverable from the pre-processed version, an
adequate mapping function has to be provided to facilitate homogeneous measurement results for both types of references.
For mobile environments the following scenario has to be taken into account:
An operator conveys live streaming as third party content to its users. In order to assess the end user quality of this
content, the capture on the end user side can only be compared with the stream as delivered by the content provider. If
this is not the uncompressed original but a processed version, the operator needs to decompress the delivered stream (see
clause 7.3.3). This decompressed stream serves as the reference for an FR assessment of the quality. If the compression
was not lossless, the 'original' is not recoverable and hence an FR algorithm applicable only to originals cannot be used.
7.3.2 No reference perceptual algorithms
No reference perceptual algorithms evaluate the quality without a dedicated undisturbed reference signal.
For perceptual no reference models, erroneous evaluation is to be avoided. In particular, artefact-like content is not
to be confused with real artefacts. Furthermore, black or frozen received video should not produce high MOS scores
if the source of the video was not black or a still image respectively.
No reference perceptual algorithms need to be capable of scoring live video and are independent of a dedicated
video server providing reference video samples.
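The black video and freezing cases can be illustrated by a simple plausibility check applied to the received frames before a no reference score is accepted. The following Python sketch is not part of the present document and is not a quality model: frames are assumed to be flat lists of 8-bit luma samples, all function names and thresholds are hypothetical, and the knowledge of whether the source was black or a still image is assumed to come from the test set-up.

```python
def is_black(frame: list, threshold: float = 16.0) -> bool:
    """True if the mean luma does not exceed the threshold (16 assumed as black level)."""
    return sum(frame) / len(frame) <= threshold


def frozen_ratio(frames: list, tolerance: float = 0.5) -> float:
    """Fraction of frame transitions with a mean absolute luma difference within tolerance."""
    if len(frames) < 2:
        return 0.0
    frozen = 0
    for prev, cur in zip(frames, frames[1:]):
        mad = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mad <= tolerance:
            frozen += 1
    return frozen / (len(frames) - 1)


def plausibility_flags(frames: list, source_is_black: bool = False,
                       source_is_still: bool = False,
                       max_ratio: float = 0.5) -> dict:
    """Flag sequences whose score should not be trusted if it turns out high."""
    if not frames:
        raise ValueError("no frames given")
    black_ratio = sum(is_black(f) for f in frames) / len(frames)
    return {
        "suspicious_black": black_ratio > max_ratio and not source_is_black,
        "suspicious_freeze": frozen_ratio(frames) > max_ratio and not source_is_still,
    }
```

A test system could use such flags to cap or discard a high score for a sequence that is mostly black or frozen although the source was neither.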
7.3.3 No reference hybrid algorithms
A no reference hybrid algorithm uses both the video signal and the decoded bitstream to estimate the quality, but does
not use the original video sequence
...