Speech and multimedia Transmission Quality (STQ) - Audiovisual QoS for communication over IP networks

The present document addresses the combination of network performance parameters and user-perceived media (audio and video) quality parameters for audiovisual communications on IP networks. The access technologies covered include both wired (e.g. xDSL) and wireless (e.g. UMTS, WLAN) technologies. The display size range covered is from those of small mobile terminals (e.g. 2") up to large TV sets (e.g. 40" or more). It is applicable to:
- Broadcasting and streaming applications such as IPTV and VoD.
- Interactive point-to-point applications such as videotelephony and videoconferencing.
Where the media coding standards define two or more profiles, the baseline profile is addressed in the normative part of the standard. Informative annexes present an overview of network QoS mechanisms and the effects on connection performance as well as guidance on terminal parameters that may influence the user perceived media performance.

Speech and multimedia Transmission Quality (STQ) - Audiovisual service quality for communication over Internet Protocol (IP) networks

General Information

Status: Published
Publication Date: 09-Jun-2009
Current Stage: 6060 - National Implementation/Publication (Adopted Project)
Start Date: 11-May-2009
Due Date: 16-Jul-2009
Completion Date: 10-Jun-2009
Standard
ETSI ES 202 667 V1.1.1 (2009-02) - Speech and multimedia Transmission Quality (STQ); Audiovisual QoS for communication over IP networks
English language
63 pages
Standard
ETSI ES 202 667 V1.1.1 (2009-04) - Speech and multimedia Transmission Quality (STQ); Audiovisual QoS for communication over IP networks
English language
63 pages
Standardization document
SIST ES 202 667 V1.1.1:2009
English language
63 pages

Standards Content (Sample)


Final draft ETSI ES 202 667 V1.1.1 (2009-02)
ETSI Standard
Speech and multimedia Transmission Quality (STQ);
Audiovisual QoS for communication over IP networks

Reference
DES/STQ-00097
Keywords
multimedia, QoS
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered
for the benefit of its Members.
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
LTE™ is a Trade Mark of ETSI currently being registered
for the benefit of its Members and of the 3GPP Organizational Partners.
GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Parameters affecting audiovisual user perceived quality
4.1 Audiovisual user perceived quality model
4.2 Coding algorithms
4.2.1 Speech
4.2.2 Audio
4.2.3 Video
4.2.4 Lip synchronization
4.3 Network transmission protocols and principles
4.3.1 Unreliable communication
4.3.2 Multicast
4.3.3 Reliable communication
4.3.4 Multipoint
4.3.5 DSL Access
4.3.5.1 Asymmetrical DSL
4.3.5.2 Symmetrical DSL
4.3.6 Wireless access
4.3.6.1 2G Mobile access
4.3.6.2 3G Mobile access
4.3.6.3 DECT
4.3.6.4 Broadband Wireless Access
4.3.6.5 WLAN
4.3.7 Broadcasting
4.4 Network performance parameters
4.4.1 Transmission bandwidth
4.4.1.1 General considerations
4.4.1.2 Speech transmission
4.4.1.3 Audio transmission
4.4.1.4 Video transmission
4.4.2 Packet loss
4.4.2.1 General considerations
4.4.2.2 Speech transmission
4.4.2.3 Audio transmission
4.4.2.4 Video transmission
4.4.3 Transmission delay
4.4.3.1 General
4.4.3.2 Transmitting terminal delay
4.4.3.3 Access network delay
4.4.3.4 Core network delay
4.4.3.5 Receiving terminal delay
4.4.4 Transmission delay variations (jitter)
4.5 Terminal characteristics
4.5.1 Packet loss recovery
4.5.2 Playout buffer
4.5.3 Audio characteristics
4.5.4 Video display
4.6 Audio-video interaction
5 Audiovisual applications classification
5.1 Delay sensitive applications
5.2 Delay insensitive applications
6 Delay sensitive audiovisual applications requirements
6.1 Coding algorithms
6.1.1 Narrowband speech
6.1.2 Wideband speech
6.1.3 Video
6.2 Network performance requirements
6.3 Terminal characteristics
6.3.1 Narrowband speech
6.3.2 Wideband speech
6.3.3 Video
7 Delay insensitive audiovisual applications requirements
7.1 Coding algorithms
7.1.1 Audio
7.1.2 Video
7.2 Network performance requirements
7.3 Terminal characteristics
7.3.1 Audio
7.3.2 Video
Annex A (informative): Audio-video quality interaction
A.1 Introduction
A.2 Available information
A.2.1 TR 102 479
A.2.2 Results presented to ITU-T Recommendation SG 12
Annex B (informative): QoS mechanisms and the effects on connection performance
B.1 Introduction
B.2 QoS mechanisms
B.2.1 Integrated services (IntServ)
B.2.2 Differentiated services (DiffServ)
B.2.3 WLAN QoS mechanism
B.2.4 WiMax QoS mechanism
B.2.5 HSPA packet scheduling
B.2.6 RACS
B.3 Effects on connection performance
Annex C (informative): Packet loss recovery
C.1 Introduction
C.2 Application layer packet loss recovery methods
C.2.1 Speech and audio recovery
C.2.2 Video recovery
C.3 Performance improvement
Annex D (informative): Provisional QoS Classes defined in ITU-T Recommendation Y.1541
History

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (http://webapp.etsi.org/IPR/home.asp).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This ETSI Standard (ES) has been produced by ETSI Technical Committee Speech and multimedia Transmission
Quality (STQ), and is now submitted for the ETSI standards Membership Approval Procedure.
1 Scope
The present document addresses the combination of network performance parameters and user-perceived media (audio and
video) quality parameters for audiovisual communications on IP networks.
The access technologies covered include both wired (e.g. xDSL) and wireless (e.g. UMTS, WLAN) technologies.
The display size range covered is from those of small mobile terminals (e.g. 2") up to large TV sets (e.g. 40" or more).
It is applicable to:
• Broadcasting and streaming applications such as IPTV and VoD.
• Interactive point-to-point applications such as videotelephony and videoconferencing.
Where the media coding standards define two or more profiles, the baseline profile is addressed in the normative part of
the standard.
Informative annexes present an overview of network QoS mechanisms and the effects on connection performance as
well as guidance on terminal parameters that may influence the user perceived media performance.
2 References
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific.
• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following
cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the
purposes of the referring document;
- for informative references.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
2.1 Normative references
The following referenced documents are indispensable for the application of the present document. For dated
references, only the edition cited applies. For non-specific references, the latest edition of the referenced document
(including any amendments) applies.
[1] ITU-T Recommendation Y.1540: "Internet protocol data communication service - IP packet
transfer and availability performance parameters".
[2] ITU-T Recommendation Y.1541: "Network performance objectives for IP-based services".
[3] ITU-T Recommendation G.711: "Pulse Code Modulation (PCM) of voice frequencies".
[4] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".
[5] ITU-T Recommendation G.723.1: "Dual rate speech coder for multimedia communications
transmitting at 5,3 and 6,3 kbit/s".
[6] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code
Modulation (ADPCM)".
[7] ITU-T Recommendation G.728: "Coding of speech at 16 kbit/s using low-delay code excited linear
prediction".
[8] ITU-T Recommendation G.729: "Coding of speech at 8 kbit/s using conjugate-structure
algebraic-code-excited linear-prediction (CS-ACELP)".
[9] ITU-T Recommendation G.729.1: "G.729 Embedded Variable bit-rate coder: An 8-32 kbit/s
scalable wideband coder bitstream interoperable with G.729".
[10] ETSI TS 126 071 (V6.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech Codec; General description
(3GPP TS 26.071, version 6.0.0 Release 6)".
[11] 3GPP2 C.S0014-C v1.0 Enhanced Variable Rate Codec, Speech Service Option 3, 68 and 70 for
Wideband Spread Spectrum Digital Systems, January 2007.
[12] ITU-T Recommendation G.722.1: "Low-complexity coding at 24 and 32 kbit/s for hands-free
operation in systems with low frame loss".
[13] ITU-T Recommendation G.711.1: "Wideband embedded extension for G.711 pulse code
modulation".
[14] ETSI TS 126 171 (V6.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech codec, wideband; General description
(3GPP TS 26.171, version 6.0.0 Release 6)".
NOTE: TS 126 171 is identical to ITU-T Recommendation G.722.2.
[15] IETF RFC 3551: "RTP Profile for Audio and Video Conferences with Minimal Control".
[16] ISO/IEC 11172: "Information technology -- Coding of moving pictures and associated audio for
digital storage media at up to about 1,5 Mbit/s (MPEG 1, 5 parts)".
[17] ISO/IEC 13818: "Information technology -- Generic coding of moving pictures and associated
audio information (MPEG 2, 9 parts)".
[18] ISO/IEC 14496: "Information technology -- Coding of audio-visual objects (MPEG 4; currently in
11 parts)".
[19] ETSI TS 126 290 (V6.3.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS) Audio codec processing functions; Extended
Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions
(3GPP TS 26.290, version 6.0.0 Release 6)".
[20] ITU-T Recommendation G.719: "Low-complexity full-band audio coding for high-quality
conversational applications".
[21] ETSI TS 102 366: "Digital Audio Compression (AC-3, Enhanced AC-3) Standard".
[22] ITU-T Recommendation H.261: "Video codec for audiovisual services at p × 64 kbit/s".
[23] ITU-T Recommendation H.262: "Information technology - Generic coding of moving pictures and
associated audio information: Video".
[24] ITU-T Recommendation H.263: "Video coding for low bit rate communication".
[25] ITU-T Recommendation H.264: "Advanced video coding for generic audiovisual services".
NOTE: This recommendation is identical to MPEG 4 Part 10.
[26] SMPTE 421M (2006): "Television - VC-1 Compressed Video Bitstream Format and Decoding
Process".
[27] ITU-R Recommendation BT.1359-1: "Relative timing of sound and vision for broadcasting".
[28] IETF RFC 768: "User Datagram Protocol".
[29] ETSI TS 122 146 (V7.1.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); LTE; Multimedia Broadcast/Multicast Service
(MBMS); Stage 1 (3GPP TS 22.146, version 7.1.0 Release 7)".
[30] IETF RFC 793: "Transmission Control Protocol".
[31] ITU-T Recommendation G.995.1: "Overview of digital subscriber line (DSL) Recommendations".
[32] IEEE 802.3 (2005): "IEEE Standard for Information technology-Telecommunications and
information exchange between systems-Local and metropolitan area networks; Specific
requirements Part 3: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access
Method and Physical Layer Specifications".
[33] ITU-T Recommendation G.992: Parts 1 to 5.
[34] ITU-T Recommendation G.993: Parts 1 and 2.
[35] ITU-T Recommendation G.991.1: "High bit rate Digital Subscriber Line (HDSL) transceivers".
[36] ITU-T Recommendation G.991.2: "Single-pair high-speed digital subscriber line (SHDSL)
transceivers".
[37] ETSI TS 101 113 (V7.5.0): "Digital cellular telecommunications system (Phase 2+) (GSM);
General Packet Radio Service (GPRS); Service description; Stage 1
(GSM 02.60, version 7.5.0 Release 1998)".
[38] ETSI TS 122 228 (V8.5.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Service requirements for the Internet Protocol (IP)
multimedia core network subsystem (IMS); Stage 1 (3GPP TS 22.228, version 8.5.0 Release 8)".
[39] ETSI TS 122 173 (V7.5.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); IP Multimedia Core Network Subsystem (IMS)
Multimedia Telephony Service and supplementary services; Stage 1
(3GPP TS 22.173, version 7.5.0 Release 7)".
[40] ETSI TS 125 308 (V7.7.0): "Universal Mobile Telecommunications System (UMTS); High Speed
Downlink Packet Access (HSDPA); Overall description; Stage 2
(3GPP TS 25.308, version 7.7.0 Release 7)".
[41] ETSI TS 125 319 (V7.6.0): "Universal Mobile Telecommunications System (UMTS); Enhanced
uplink; Overall description; Stage 2 (3GPP TS 25.319, version 7.6.0 Release 7)".
[42] ETSI TS 123 107 (V7.1.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Quality of Service (QoS) concept and architecture
(3GPP TS 23.107, version 7.1.0 Release 7)".
[43] ETSI TS 123 207 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); End-to-end Quality of Service (QoS) concept and
architecture (3GPP TS 23.207, version 7.0.0 Release 7)".
[44] ETSI EN 300 175-2: "Digital Enhanced Cordless Telecommunications (DECT); Common
Interface (CI); Part 2: Physical Layer (PHL)".
[45] IEEE 802.16 (2004): "Standard for Local and metropolitan area networks. Part 16: Air Interface
for Fixed Broadband Wireless Access Systems".
[46] IEEE 802.11 (2007): "Information technology - Telecommunications and information exchange
between systems - Local and metropolitan area networks. Specific requirements. Part 11: Wireless
LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications".
[47] ETSI EN 300 401: "Radio Broadcasting Systems; Digital Audio Broadcasting (DAB) to mobile,
portable and fixed receivers".
[48] ETSI TS 102 428: "Digital Audio Broadcasting (DAB); DMB video service; User Application
Specification".
[49] ETSI EN 302 307: "Digital Video Broadcasting (DVB); Second generation framing structure,
channel coding and modulation systems for Broadcasting, Interactive Services, News Gathering
and other broadband satellite applications".
[50] ETSI EN 300 744: "Digital Video Broadcasting (DVB); Framing structure, channel coding and
modulation for digital terrestrial television".
[51] ETSI EN 300 419: "Access and Terminals (AT); 2 048 kbit/s digital structured leased lines
(D2048S); Connection characteristics".
[52] ETSI EN 302 304: "Digital Video Broadcasting (DVB); Transmission System for Handheld
Terminals (DVB-H)".
[53] ETSI EN 302 583: "Digital Video Broadcasting (DVB); Framing Structure, channel coding and
modulation for Satellite Services to Handheld devices (SH) below 3 GHz".
[54] ETSI TS 101 154: "Digital Video Broadcasting (DVB); Specification for the use of Video and
Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream".
[55] ETSI TS 102 005: "Digital Video Broadcasting (DVB); Specification for the use of Video and
Audio Coding in DVB services delivered directly over IP protocols".
[56] ITU-T Recommendation G.114: "One-way transmission time".
[57] IETF RFC 3550: "RTP: A Transport Protocol for Real-Time Applications".
[58] IETF RFC 3095: "RObust Header Compression (ROHC): Framework and four profiles: RTP,
UDP, ESP, and uncompressed".
[59] ETSI TS 181 005: "Telecommunications and Internet Converged Services and Protocols for
Advanced Networking (TISPAN); Service and Capability Requirements".
[60] ITU-R Recommendation BS.1534-1: "Method for the subjective assessment of intermediate
quality levels of coding systems".
[61] ITU-T Recommendation G.107: "The E-model, a computational model for use in transmission
planning".
[62] ITU-T Recommendation G.1010: "End-user Multimedia QoS Categories".
[63] ETSI ES 202 737: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for narrowband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".
[64] ETSI ES 202 738: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for narrowband VoIP loudspeaking and handsfree terminals from a QoS perspective
as perceived by the user".
[65] ETSI ES 202 739: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for wideband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".
[66] ETSI ES 202 740: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for wideband VoIP loudspeaking and handsfree terminals from a QoS perspective as
perceived by the user".
[67] ETSI TS 126 235 (V7.4.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Packet switched conversational multimedia
applications; Default codecs (3GPP TS 26.235, version 7.4.0 Release 7)".
[68] ITU-T Recommendation J.247: "Objective perceptual multimedia video quality measurement in
the presence of a full reference".
[69] ITU-T Recommendation P.911: "Subjective audiovisual quality assessment methods for
multimedia applications".
[70] ETSI TS 181 018: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Requirements for QoS in a NGN".
[71] ETSI TS 122 105 (V8.4.0): "Universal Mobile Telecommunications System (UMTS); Services
and service capabilities (3GPP TS 22.105, version 8.4.0 Release 8)".
[72] ETSI TS 126 234 (V7.5.0): "Universal Mobile Telecommunications System (UMTS); Transparent
end-to-end Packet-switched Streaming Service (PSS); Protocols and codecs
(3GPP TS 26.234, version 7.5.0 Release 7)".
[73] ETSI TS 126 346 (V7.8.0): "Universal Mobile Telecommunications System (UMTS); Multimedia
Broadcast/Multicast Service (MBMS); Protocols and codecs
(3GPP TS 26.346, version 7.8.0 Release 7)".
2.2 Informative references
The following referenced documents are not essential to the use of the present document but they assist the user with
regard to a particular subject area. For non-specific references, the latest version of the referenced document (including
any amendments) applies.
[i.1] ETSI ETR 310: "Digital Enhanced Cordless Telecommunications (DECT); Traffic capacity and
spectrum requirements for multi-system and multi-service DECT applications co-existing in a
common frequency band".
[i.2] ETSI TS 126 091 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech Codec; Error concealment of lost
frames (3GPP TS 26.091, version 7.0.0 Release 7)".
[i.3] ETSI TS 126 191 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Speech codec speech processing functions;
Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Error concealment of erroneous or
lost frames (3GPP TS 26.191, version 7.0.0 Release 7)".
[i.4] ETSI TR 102 479: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Review of available material on QoS requirements of
Multimedia Services".
[i.5] IETF RFC 1633: "Integrated services in the Internet architecture: An overview".
[i.6] IETF RFC 2205: "Resource ReSerVation Protocol (RSVP) Version 1 Functional Specification".
[i.7] IETF RFC 2475: "An Architecture for Differentiated Services".
[i.8] Cicconetti, C., Lezini, L., Mingozzi, E. and Eklund, C.: "Quality of Service support in 802.16
networks. IEEE Network, vol. 29", March/April 2006.
[i.9] Pedersen, K., Mogensen, P. and Kolding, T.: "Overview of QoS options for HSDPA. IEEE
Communications Magazine vol. 44", July 2006.
[i.10] ETSI ES 282 003: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Resource and Admission Control Sub-System (RACS):
Functional Architecture".
[i.11] Perkins, C., Hodson, O. and Hardman, V.: "A Survey of Packet Loss Recovery Techniques for
Streaming Audio. IEEE Network vol. 12, No. 5", 1998.
[i.12] Wah, B., Su, X. and Lin, D.: "A Survey of Error-Concealment Schemes for Real-Time Audio and
Video Transmissions over the Internet. IEEE International Symposium on Multimedia Software
Engineering 2000 (MSE 2000)"; Taipei, Taiwan, 11-13 December 2000.
[i.13] ITU-T Recommendation G.711 (Appendix I): "Pulse code modulation (PCM) of voice
frequencies; A high quality low-complexity algorithm for packet loss concealment with G.711".
[i.14] ITU-T Recommendation G.722 (Appendix III): "7 kHz audio-coding within 64 kbit/s; A
high-quality packet loss concealment algorithm for G.722".
[i.15] ITU-T Recommendation G.722 (Appendix IV): "7 kHz audio-coding within 64 kbit/s; A
low-complexity algorithm for packet loss concealment with G.722".
[i.16] Wenger, S.: "H.264/AVC over IP. IEEE Transactions on circuits and systems for video
technology, vol. 13, No. 7", 2003.
[i.17] ETSI TR 101 329-6: "Telecommunications and Internet Protocol Harmonization Over Networks
(TIPHON) Release 3; End-to-end Quality of Service in TIPHON systems; Part 6: Actual
measurements of network and terminal characteristics and performance parameters in TIPHON
networks and their influence on voice quality".
[i.18] Kövesi, B. and Ragot, S.: "A low complexity packet loss concealment algorithm for
ITU-T Recommendation G.722. 2008 IEEE International Conference on Acoustics, Speech, and
Signal Processing (ICASSP)". Las Vegas, USA, 30th March - 4th April, 2008.
[i.19] ITU-T Recommendation I.113: "Vocabulary of terms for broadband aspects of ISDN".
[i.20] ITU-T Recommendation G.722.2: "Wideband coding of speech at around 16 kbit/s using Adaptive
Multi-Rate Wideband (AMR-WB)".
[i.21] ITU-T Recommendation SG 12: "Temporary Documents".
[i.22] Layer 1 specifications.
NOTE: Available at http://3GPP specification series: 05series.
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
audio: all signals that are audible to human beings, including speech and music
broadcasting: communication capability which denotes unidirectional distribution from a single source to all users
connected to the network
multipoint: value of the service attribute "communication configuration", which denotes that the communication
involves more than two network terminations
NOTE: Source: ITU-T Recommendation I.113 [i.19].
narrowband speech: speech restricted to the frequency band from 300 Hz to 3 400 Hz
speech: oral production of information by a human being
streaming: mechanism whereby media content can be rendered at the same time that it is being transmitted to the client
over the network
video: signal that contains timing/synchronization information as well as luminance (intensity) and chrominance
(colour) information that when displayed on an appropriate device gives a visual representation of the original image
sequence
videoconferencing: service providing interactive, bi-directional and real time audio-visual communication
NOTE: Normally intended for multiple users at each end.
videotelephony: service providing an interactive, bi-directional, real time audio-visual communication between users
wideband speech: speech restricted to the frequency band from 50 Hz to 7 000 Hz
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
3GPP 3rd Generation Partnership Project
3GPP2 3rd Generation Partnership Project 2
NOTE: A 3G project comprising North American and Asian interests.
AAC Advanced Audio Coding
ADPCM Adaptive Differential Pulse Code Modulation
AMR Adaptive Multi Rate
AMR-WB Adaptive Multi Rate Wide Band
AMR-WB+ Adaptive Multi Rate extended Wide Band
AP Access Point (IEEE 802.11 WLAN [46])
ATM Asynchronous Transfer Mode
AVC Advanced Video Coding
CCIR Comité Consultatif International pour la Radio; Now ITU-R
CELP Code-Excited Linear Predictive
CIF Common Intermediate Format
CPCFC Custom Picture Clock Frequency Code
CPFMT Custom Picture ForMaT
DECT Digital Enhanced Cordless Telecommunications
DPCM Differential Pulse Code Modulation
EUL Enhanced UpLink
FER Frame Error Rate
FP Fixed Part (DECT)
HDTV High Definition TV
HE-AAC High Efficiency AAC
HSPA High-Speed Packet Access
HSDPA High-Speed Downlink Packet Access
HSUPA High-Speed Uplink Packet Access
IETF Internet Engineering Task Force
IMS IP Multimedia Subsystem
IP Internet Protocol
IPDV IP Packet Delay Variation
IPER IP Packet Error Ratio
IPLR IP Packet Loss Ratio
IPTD IP Packet Transfer Delay
ITU-R International Telecommunication Union - Radiocommunication sector
ITU-T International Telecommunication Union - Telecommunication standardization sector
LPC Linear Predictive Coding
MAC Medium Access Control
MBMS Multimedia Broadcast/Multicast Service
MDCT Modified Discrete Cosine Transform
MCU Multipoint Control Unit
MPE Multi-Pulse Excited
MPEG 2 TS MPEG 2 Transport Stream
MPEG Moving Picture Experts Group
MUSHRA MUlti Stimulus with Hidden Reference and Anchors
NTSC National Television System Committee
NOTE: Used to identify an analogue TV standard used outside Europe.
PAL Phase-Alternating Line
NOTE: Colour-encoding system used in television systems.
PBX Private Branch eXchange
PCM Pulse Code Modulation
PP Portable Part (DECT)
QCIF Quarter CIF
QVGA Quarter VGA
RTP Real-time Transport Protocol
RTT Round Trip Time
SDTV Standard Definition TV
SVC Scalable Video Coding
TCP Transmission Control Protocol
TTI Transmission Time Interval
UDP User Datagram Protocol
UMTS Universal Mobile Telecommunications System
VGA Video Graphics Array
W-CDMA Wideband-Code Division Multiple Access
WLAN Wireless Local Area Network
NOTE: IPER, IPDV, IPLR and IPTD are defined in ITU-T Recommendations Y.1540 [1] and Y.1541 [2].
4 Parameters affecting audiovisual user perceived quality
4.1 Audiovisual user perceived quality model
The characteristics affecting audiovisual user perceived quality and their interactions are illustrated in figure 1.
Figure 1: Characteristics affecting audiovisual user perceived quality
4.2 Coding algorithms
4.2.1 Speech
There are two groups of speech codecs:
• narrowband codecs, transmitting the frequency band from 300 Hz to 3 400 Hz;
• wideband codecs, transmitting the frequency band from 50 Hz to 7 000 Hz.
NOTE 1: The transmission bandwidth indicated is the bandwidth supported by the coding algorithm. The actual
bandwidth supported may be restricted due to handset or terminal characteristics.
Speech codecs are often classified into three types:
• waveform codecs;
• source codecs;
• hybrid codecs.
Waveform codecs attempt, without using any knowledge of how the signal to be coded was generated, to produce a
reconstructed signal whose waveform is as close as possible to the original. This means that in theory they should be
signal independent and work well with non-speech signals. An example is the codec standardized in
ITU-T Recommendation G.711 [3].
To reduce the required bit rate, the difference from the previous sample may be transmitted instead of the
actual sample. This technique is called delta modulation or differential PCM (DPCM). It may be further
enhanced by predicting the value of the next sample from the previous samples and transmitting the difference between the
predicted value and the actual sampled value (ADPCM).
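The differential principle described above can be illustrated with a minimal sketch. It assumes the simplest possible predictor (the previous reconstructed sample) and a fixed uniform quantizer; the function names and the `step` parameter are hypothetical and are not taken from any ITU-T Recommendation. A real ADPCM codec such as the one in ITU-T Recommendation G.726 [6] adapts both predictor and quantizer.

```python
def dpcm_encode(samples, step=4):
    """Encode integer samples as quantized differences from the previous reconstruction."""
    codes = []
    prediction = 0  # encoder and decoder both start from zero
    for sample in samples:
        diff = sample - prediction
        code = round(diff / step)      # uniform quantizer (illustrative assumption)
        codes.append(code)
        prediction += code * step      # track exactly what the decoder will rebuild
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the samples by accumulating the dequantized differences."""
    samples = []
    prediction = 0
    for code in codes:
        prediction += code * step
        samples.append(prediction)
    return samples
```

Because the encoder predicts from its own reconstruction rather than from the original input, the quantization error does not accumulate: each decoded sample stays within half a quantizer step of the original.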
The input speech signal may also be split into a number of frequency bands, or sub-bands, each of which is coded
independently. This is called sub-band coding. An example is the codec defined in ITU-T Recommendation G.722 [4],
where the 7 kHz frequency band is divided into two sub-bands, which are coded independently of each other.
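As a rough illustration of the two-band split, the sketch below uses the simplest possible filter pair (pairwise average and difference, followed by decimation by two). This is an assumption made for brevity: ITU-T Recommendation G.722 [4] uses a longer quadrature mirror filter, and the function names here are hypothetical.

```python
def split_two_bands(samples):
    """Split an even-length sequence into low and high sub-bands at half the sample rate."""
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    pairs = range(0, len(samples), 2)
    low = [(samples[i] + samples[i + 1]) / 2 for i in pairs]   # local average
    high = [(samples[i] - samples[i + 1]) / 2 for i in pairs]  # local difference
    return low, high


def merge_two_bands(low, high):
    """Recombine the two sub-bands into the full-rate sequence."""
    samples = []
    for l, h in zip(low, high):
        samples.extend((l + h, l - h))
    return samples
```

Each sub-band can then be passed to its own coder, for example with more bits allocated to the low band than to the high band.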
Source coders operate using a model of how the source is generated, and attempt to extract, from the signal being
coded, the parameters of the model. Coders using this technique require very low bit rate, but the quality is usually not
good enough for public telecommunication applications.
Hybrid codecs attempt to fill the gap between waveform and source codecs. Although other forms of hybrid codecs
exist, the most successful and commonly used are time domain Analysis-by-Synthesis (AbS) codecs. Such coders use
the same linear prediction filter model of the vocal tract as found in LPC vocoders. However, instead of applying a
simple two-state, voiced/unvoiced, model to find the necessary input to this filter, the excitation signal is chosen by
attempting to match the reconstructed speech waveform as closely as possible to the original speech waveform.
Examples are Multi-Pulse Excited (MPE) codecs and Code-Excited Linear Predictive (CELP) codecs.
The narrowband speech codecs standardized by ITU-T and ETSI are listed in table 1.
Table 1: Narrowband speech codecs standardized by ITU-T and ETSI
Codec ID                            Bit rate (kbit/s)    Frame size (ms)   Look ahead (ms)
ITU-T Recommendation G.711 [3]      64                   n/a
ITU-T Recommendation G.723.1 [5]    6,3 and 5,3          30                7,5
ITU
...


ETSI Standard
Speech and multimedia Transmission Quality (STQ);
Audiovisual QoS for communication over IP networks

2 ETSI ES 202 667 V1.1.1 (2009-04)

Reference
DES/STQ-00097
Keywords
multimedia, QoS
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered
for the benefit of its Members.
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
LTE™ is a Trade Mark of ETSI currently being registered
for the benefit of its Members and of the 3GPP Organizational Partners.
GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Parameters affecting audiovisual user perceived quality
4.1 Audiovisual user perceived quality model
4.2 Coding algorithms
4.2.1 Speech
4.2.2 Audio
4.2.3 Video
4.2.4 Lip synchronization
4.3 Network transmission protocols and principles
4.3.1 Unreliable communication
4.3.2 Multicast
4.3.3 Reliable communication
4.3.4 Multipoint
4.3.5 DSL Access
4.3.5.1 Asymmetrical DSL
4.3.5.2 Symmetrical DSL
4.3.6 Wireless access
4.3.6.1 2G Mobile access
4.3.6.2 3G Mobile access
4.3.6.3 DECT
4.3.6.4 Broadband Wireless Access
4.3.6.5 WLAN
4.3.7 Broadcasting
4.4 Network performance parameters
4.4.1 Transmission bandwidth
4.4.1.1 General considerations
4.4.1.2 Speech transmission
4.4.1.3 Audio transmission
4.4.1.4 Video transmission
4.4.2 Packet loss
4.4.2.1 General considerations
4.4.2.2 Speech transmission
4.4.2.3 Audio transmission
4.4.2.4 Video transmission
4.4.3 Transmission delay
4.4.3.1 General
4.4.3.2 Transmitting terminal delay
4.4.3.3 Access network delay
4.4.3.4 Core network delay
4.4.3.5 Receiving terminal delay
4.4.4 Transmission delay variations (jitter)
4.5 Terminal characteristics
4.5.1 Packet loss recovery
4.5.2 Playout buffer
4.5.3 Audio characteristics
4.5.4 Video display
4.6 Audio-video interaction
5 Audiovisual applications classification
5.1 Delay sensitive applications
5.2 Delay insensitive applications
6 Delay sensitive audiovisual applications requirements
6.1 Coding algorithms
6.1.1 Narrowband speech
6.1.2 Wideband speech
6.1.3 Video
6.2 Network performance requirements
6.3 Terminal characteristics
6.3.1 Narrowband speech
6.3.2 Wideband speech
6.3.3 Video
7 Delay insensitive audiovisual applications requirements
7.1 Coding algorithms
7.1.1 Audio
7.1.2 Video
7.2 Network performance requirements
7.3 Terminal characteristics
7.3.1 Audio
7.3.2 Video
Annex A (informative): Audio-video quality interaction
A.1 Introduction
A.2 Available information
A.2.1 TR 102 479
A.2.2 Results presented to ITU-T Recommendation SG 12
Annex B (informative): QoS mechanisms and the effects on connection performance
B.1 Introduction
B.2 QoS mechanisms
B.2.1 Integrated services (IntServ)
B.2.2 Differentiated services (DiffServ)
B.2.3 WLAN QoS mechanism
B.2.4 WiMax QoS mechanism
B.2.5 HSPA packet scheduling
B.2.6 RACS
B.3 Effects on connection performance
Annex C (informative): Packet loss recovery
C.1 Introduction
C.2 Application layer packet loss recovery methods
C.2.1 Speech and audio recovery
C.2.2 Video recovery
C.3 Performance improvement
Annex D (informative): Provisional QoS Classes defined in ITU-T Recommendation Y.1541
History

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (http://webapp.etsi.org/IPR/home.asp).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This ETSI Standard (ES) has been produced by ETSI Technical Committee Speech and multimedia Transmission
Quality (STQ).
1 Scope
The present document addresses the combination of network performance parameters and user perceived media (audio and
video) quality parameters for audiovisual communications on IP networks.
The access technologies covered include both wired (e.g. xDSL) and wireless (e.g. UMTS, WLAN) technologies.
The display size range covered is from those of small mobile terminals (e.g. 2") up to large TV sets (e.g. 40" or more).
It is applicable to:
• Broadcasting and streaming applications such as IPTV and VoD.
• Interactive point-to-point applications such as videotelephony and videoconferencing.
Where the media coding standards define two or more profiles, the baseline profile is addressed in the normative part of
the standard.
Informative annexes present an overview of network QoS mechanisms and the effects on connection performance as
well as guidance on terminal parameters that may influence the user perceived media performance.
2 References
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific.
• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following
cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the
purposes of the referring document;
- for informative references.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
2.1 Normative references
The following referenced documents are indispensable for the application of the present document. For dated
references, only the edition cited applies. For non-specific references, the latest edition of the referenced document
(including any amendments) applies.
[1] ITU-T Recommendation Y.1540: "Internet protocol data communication service - IP packet
transfer and availability performance parameters".
[2] ITU-T Recommendation Y.1541: "Network performance objectives for IP-based services".
[3] ITU-T Recommendation G.711: "Pulse Code Modulation (PCM) of voice frequencies".
[4] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".
[5] ITU-T Recommendation G.723.1: "Dual rate speech coder for multimedia communications
transmitting at 5,3 and 6,3 kbit/s".
[6] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code
Modulation (ADPCM)".
[7] ITU-T Recommendation G.728: "Coding of speech at 16 kbit/s using low-delay code excited linear
prediction".
[8] ITU-T Recommendation G.729: "Coding of speech at 8 kbit/s using conjugate-structure
algebraic-code-excited linear-prediction (CS-ACELP)".
[9] ITU-T Recommendation G.729.1: "G.729 Embedded Variable bit-rate coder: An 8-32 kbit/s
scalable wideband coder bitstream interoperable with G.729".
[10] ETSI TS 126 071 (V6.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech Codec; General description
(3GPP TS 26.071, version 6.0.0 Release 6)".
[11] 3GPP2 C.S0014-C v1.0 Enhanced Variable Rate Codec, Speech Service Option 3, 68 and 70 for
Wideband Spread Spectrum Digital Systems, January 2007.
[12] ITU-T Recommendation G.722.1: "Low-complexity coding at 24 and 32 kbit/s for hands-free
operation in systems with low frame loss".
[13] ITU-T Recommendation G.711.1: "Wideband embedded extension for G.711 pulse code
modulation".
[14] ETSI TS 126 171 (V6.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech codec, wideband; General description
(3GPP TS 26.171, version 6.0.0 Release 6)".
NOTE: TS 126 171 is identical to ITU-T Recommendation G.722.2.
[15] IETF RFC 3551: "RTP Profile for Audio and Video Conferences with Minimal Control".
[16] ISO/IEC 11172: "Information technology -- Coding of moving pictures and associated audio for
digital storage media at up to about 1,5 Mbit/s (MPEG 1, 5 parts)".
[17] ISO/IEC 13818: "Information technology -- Generic coding of moving pictures and associated
audio information (MPEG 2, 9 parts)".
[18] ISO/IEC 14496: "Information technology -- Coding of audio-visual objects (MPEG 4; currently in
11 parts)".
[19] ETSI TS 126 290 (V6.3.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS) Audio codec processing functions; Extended
Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions
(3GPP TS 26.290, version 6.0.0 Release 6)".
[20] ITU-T Recommendation G.719: "Low-complexity full-band audio coding for high-quality
conversational applications".
[21] ETSI TS 102 366: "Digital Audio Compression (AC-3, Enhanced AC-3) Standard".
[22] ITU-T Recommendation H.261: "Video codec for audiovisual services at p × 64 kbit/s".
[23] ITU-T Recommendation H.262: "Information technology - Generic coding of moving pictures and
associated audio information: Video".
[24] ITU-T Recommendation H.263: "Video coding for low bit rate communication".
[25] ITU-T Recommendation H.264: "Advanced video coding for generic audiovisual services".
NOTE: This recommendation is identical to MPEG 4 Part 10.
[26] SMPTE 421M (2006): "Television - VC-1 Compressed Video Bitstream Format and Decoding
Process".
[27] ITU-R Recommendation BT.1359-1: "Relative timing of sound and vision for broadcasting".
[28] IETF RFC 768: "User Datagram Protocol".
[29] ETSI TS 122 146 (V7.1.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); LTE; Multimedia Broadcast/Multicast Service
(MBMS); Stage 1 (3GPP TS 22.146, version 7.1.0 Release 7)".
[30] IETF RFC 793: "Transmission Control Protocol".
[31] ITU-T Recommendation G.995.1: "Overview of digital subscriber line (DSL) Recommendations".
[32] IEEE 802.3 (2005): "IEEE Standard for Information technology-Telecommunications and
information exchange between systems-Local and metropolitan area networks; Specific
requirements Part 3: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access
Method and Physical Layer Specifications".
[33] ITU-T Recommendation G.992: Parts 1 to 5.
[34] ITU-T Recommendation G.993: Parts 1 and 2.
[35] ITU-T Recommendation G.991.1: "High bit rate Digital Subscriber Line (HDSL) transceivers".
[36] ITU-T Recommendation G.991.2: "Single-pair high-speed digital subscriber line (SHDSL)
transceivers".
[37] ETSI TS 101 113 (V7.5.0): "Digital cellular telecommunications system (Phase 2+) (GSM);
General Packet Radio Service (GPRS); Service description; Stage 1
(GSM 02.60, version 7.5.0 Release 1998)".
[38] ETSI TS 122 228 (V8.5.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Service requirements for the Internet Protocol (IP)
multimedia core network subsystem (IMS); Stage 1 (3GPP TS 22.228, version 8.5.0 Release 8)".
[39] ETSI TS 122 173 (V7.5.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); IP Multimedia Core Network Subsystem (IMS)
Multimedia Telephony Service and supplementary services; Stage 1
(3GPP TS 22.173, version 7.5.0 Release 7)".
[40] ETSI TS 125 308 (V7.7.0): "Universal Mobile Telecommunications System (UMTS); High Speed
Downlink Packet Access (HSDPA); Overall description; Stage 2
(3GPP TS 25.308, version 7.7.0 Release 7)".
[41] ETSI TS 125 319 (V7.6.0): "Universal Mobile Telecommunications System (UMTS); Enhanced
uplink; Overall description; Stage 2 (3GPP TS 25.319, version 7.6.0 Release 7)".
[42] ETSI TS 123 107 (V7.1.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Quality of Service (QoS) concept and architecture
(3GPP TS 23.107, version 7.1.0 Release 7)".
[43] ETSI TS 123 207 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); End-to-end Quality of Service (QoS) concept and
architecture (3GPP TS 23.207, version 7.0.0 Release 7)".
[44] ETSI EN 300 175-2: "Digital Enhanced Cordless Telecommunications (DECT); Common
Interface (CI); Part 2: Physical Layer (PHL)".
[45] IEEE 802.16 (2004): "Standard for Local and metropolitan area networks. Part 16: Air Interface
for Fixed Broadband Wireless Access Systems".
[46] IEEE 802.11 (2007): "Information technology - Telecommunications and information exchange
between systems - Local and metropolitan area networks. Specific requirements. Part 11: Wireless
LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications".
[47] ETSI EN 300 401: "Radio Broadcasting Systems; Digital Audio Broadcasting (DAB) to mobile,
portable and fixed receivers".
[48] ETSI TS 102 428: "Digital Audio Broadcasting (DAB); DMB video service; User Application
Specification".
[49] ETSI EN 302 307: "Digital Video Broadcasting (DVB); Second generation framing structure,
channel coding and modulation systems for Broadcasting, Interactive Services, News Gathering
and other broadband satellite applications".
[50] ETSI EN 300 744: "Digital Video Broadcasting (DVB); Framing structure, channel coding and
modulation for digital terrestrial television".
[51] ETSI EN 300 419: "Access and Terminals (AT); 2 048 kbit/s digital structured leased lines
(D2048S); Connection characteristics".
[52] ETSI EN 302 304: "Digital Video Broadcasting (DVB); Transmission System for Handheld
Terminals (DVB-H)".
[53] ETSI EN 302 583: "Digital Video Broadcasting (DVB); Framing Structure, channel coding and
modulation for Satellite Services to Handheld devices (SH) below 3 GHz".
[54] ETSI TS 101 154: "Digital Video Broadcasting (DVB); Specification for the use of Video and
Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream".
[55] ETSI TS 102 005: "Digital Video Broadcasting (DVB); Specification for the use of Video and
Audio Coding in DVB services delivered directly over IP protocols".
[56] ITU-T Recommendation G.114: "One-way transmission time".
[57] IETF RFC 3550: "RTP: A Transport Protocol for Real-Time Applications".
[58] IETF RFC 3095: "RObust Header Compression (ROHC): Framework and four profiles: RTP,
UDP, ESP, and uncompressed".
[59] ETSI TS 181 005: "Telecommunications and Internet Converged Services and Protocols for
Advanced Networking (TISPAN); Service and Capability Requirements".
[60] ITU-R Recommendation BS.1534-1: "Method for the subjective assessment of intermediate
quality levels of coding systems".
[61] ITU-T Recommendation G.107: "The E-model, a computational model for use in transmission
planning".
[62] ITU-T Recommendation G.1010: "End-user Multimedia QoS Categories".
[63] ETSI ES 202 737: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for narrowband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".
[64] ETSI ES 202 738: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for narrowband VoIP loudspeaking and handsfree terminals from a QoS perspective
as perceived by the user".
[65] ETSI ES 202 739: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for wideband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".
[66] ETSI ES 202 740: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for wideband VoIP loudspeaking and handsfree terminals from a QoS perspective as
perceived by the user".
[67] ETSI TS 126 235 (V7.4.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Packet switched conversational multimedia
applications; Default codecs (3GPP TS 26.235, version 7.4.0 Release 7)".
[68] ITU-T Recommendation J.247: "Objective perceptual multimedia video quality measurement in
the presence of a full reference".
[69] ITU-T Recommendation P.911: "Subjective audiovisual quality assessment methods for
multimedia applications".
[70] ETSI TS 181 018: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Requirements for QoS in a NGN".
[71] ETSI TS 122 105 (V8.4.0): "Universal Mobile Telecommunications System (UMTS); Services
and service capabilities (3GPP TS 22.105, version 8.4.0 Release 8)".
[72] ETSI TS 126 234 (V7.5.0): "Universal Mobile Telecommunications System (UMTS); Transparent
end-to-end Packet-switched Streaming Service (PSS); Protocols and codecs
(3GPP TS 26.234, version 7.5.0 Release 7)".
[73] ETSI TS 126 346 (V7.8.0): "Universal Mobile Telecommunications System (UMTS); Multimedia
Broadcast/Multicast Service (MBMS); Protocols and codecs
(3GPP TS 26.346, version 7.8.0 Release 7)".
2.2 Informative references
The following referenced documents are not essential to the use of the present document but they assist the user with
regard to a particular subject area. For non-specific references, the latest version of the referenced document (including
any amendments) applies.
[i.1] ETSI ETR 310: "Digital Enhanced Cordless Telecommunications (DECT); Traffic capacity and
spectrum requirements for multi-system and multi-service DECT applications co-existing in a
common frequency band".
[i.2] ETSI TS 126 091 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech Codec; Error concealment of lost
frames (3GPP TS 26.091, version 7.0.0 Release 7)".
[i.3] ETSI TS 126 191 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Speech codec speech processing functions;
Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Error concealment of erroneous or
lost frames (3GPP TS 26.191, version 7.0.0 Release 7)".
[i.4] ETSI TR 102 479: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Review of available material on QoS requirements of
Multimedia Services".
[i.5] IETF RFC 1633: "Integrated services in the Internet architecture: An overview".
[i.6] IETF RFC 2205: "Resource ReSerVation Protocol (RSVP) Version 1 Functional Specification".
[i.7] IETF RFC 2475: "An Architecture for Differentiated Services".
[i.8] Cicconetti, C., Lezini, L., Mingozzi, E. and Eklund, C.: "Quality of Service support in 802.16
networks. IEEE Network, vol. 29", March/April 2006.
[i.9] Pedersen, K., Mogensen, P. and Kolding, T.: "Overview of QoS options for HSDPA. IEEE
Communications Magazine vol. 44", July 2006.
[i.10] ETSI ES 282 003: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Resource and Admission Control Sub-System (RACS):
Functional Architecture".
[i.11] Perkins, C, Hodson, O. and Hardman, V.: "A Survey of Packet Loss Recovery Techniques for
Streaming Audio. IEEE Network vol. 12, No. 5", 1998.
[i.12] Wah, B., Su, X. and Lin, D.: "A Survey of Error-Concealment Schemes for Real-Time Audio and
Video Transmissions over the Internet. IEEE International Symposium on Multimedia Software
Engineering 2000 (MSE 2000)"; Taipei, Taiwan, 11-13 December 2000.
[i.13] ITU-T Recommendation G.711 (Appendix I): "Pulse code modulation (PCM) of voice
frequencies; A high quality low-complexity algorithm for packet loss concealment with G.711".
[i.14] ITU-T Recommendation G.722 (Appendix III): "7 kHz audio-coding within 64 kbit/s; A
high-quality packet loss concealment algorithm for G.722".
[i.15] ITU-T Recommendation G.722 (Appendix IV): "7 kHz audio-coding within 64 kbit/s; A
low-complexity algorithm for packet loss concealment with G.722".
[i.16] Wenger, S.: "H.264/AVC over IP. IEEE Transactions on circuits and systems for video
technology, vol. 13, No. 7", 2003.
[i.17] ETSI TR 101 329-6: "Telecommunications and Internet Protocol Harmonization Over Networks
(TIPHON) Release 3; End-to-end Quality of Service in TIPHON systems; Part 6: Actual
measurements of network and terminal characteristics and performance parameters in TIPHON
networks and their influence on voice quality".
[i.18] Kövesi, B. and Ragot, S.: "A low complexity packet loss concealment algorithm for
ITU-T Recommendation G.722. 2008 IEEE International Conference on Acoustics, Speech, and
Signal Processing (ICASSP)". Las Vegas, USA, 30th March - 4th April, 2008.
[i.19] ITU-T Recommendation I.113: "Vocabulary of terms for broadband aspects of ISDN".
[i.20] ITU-T Recommendation G.722.2: "Wideband coding of speech at around 16 kbit/s using Adaptive
Multi-Rate Wideband (AMR-WB)".
[i.21] ITU-T Recommendation SG.12: "Temporary Documents".
[i.22] Layer 1 specifications.
NOTE: Available at http://3GPP specification series: 05series.
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
audio: all signals that are audible to human beings, including speech and music
broadcasting: communication capability which denotes unidirectional distribution from a single source to all users
connected to the network
multipoint: value of the service attribute "communication configuration", which denotes that the communication
involves more than two network terminations
NOTE: Source: ITU-T Recommendation I.113 [i.19].
narrowband speech: speech restricted to the frequency band from 300 Hz to 3 400 Hz
speech: oral production of information by a human being
streaming: mechanism whereby media content can be rendered at the same time that it is being transmitted to the client
over the network
video: signal that contains timing/synchronization information as well as luminance (intensity) and chrominance
(colour) information that when displayed on an appropriate device gives a visual representation of the original image
sequence
videoconferencing: service providing interactive, bi-directional and real time audio-visual communication
NOTE: Normally intended for multiple users at each end.
videotelephony: service providing an interactive, bi-directional, real time audio-visual communication between users
wideband speech: speech restricted to the frequency band from 50 Hz to 7 000 Hz
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
3GPP 3rd Generation Partnership Project
3GPP2 3rd Generation Partnership Project 2
NOTE: A 3G project comprising North American and Asian interests.
AAC Advanced Audio Coding
ADPCM Adaptive Differential Pulse Code Modulation
AMR Adaptive Multi Rate
AMR-WB Adaptive Multi Rate Wide Band
AMR-WB+ Adaptive Multi Rate extended Wide Band
AP Access Point (IEEE 802.11 WLAN [46])
ATM Asynchronous Transfer Mode
AVC Advanced Video Coding
CCIR Comité Consultatif International pour la Radio; Now ITU-R
CELP Code-Excited Linear Predictive
CIF Common Intermediate Format
CPCFC Custom Picture Clock Frequency Code
CPFMT Custom Picture ForMaT
DECT Digital Enhanced Cordless Telecommunications
DPCM Differential Pulse Code Modulation
EUL Enhanced UpLink
FER Frame Error Rate
FP Fixed Part (DECT)
HDTV High Definition TV
HE-AAC High Efficiency AAC
HSPA High-Speed Packet Access
HSDPA High-Speed Downlink Packet Access
HSUPA High-Speed Uplink Packet Access
IETF Internet Engineering Task Force
IMS IP Multimedia Subsystem
IP Internet Protocol
IPDV IP Packet Delay Variation
IPER IP Packet Error Ratio
IPLR IP Packet Loss Ratio
IPTD IP Packet Transfer Delay
ITU-R International Telecommunication Union - Radiocommunication sector
ITU-T International Telecommunication Union - Telecommunication standardization sector
LPC Linear Predictive Coding
MAC Medium Access Control
MBMS Multimedia Broadcast/Multicast Service
MDCT Modified Discrete Cosine Transform
MCU Multipoint Control Unit
MPE Multi-Pulse Excited
MPEG 2 TS MPEG 2 Transport Stream
MPEG Moving Picture Experts Group
MUSHRA MUlti Stimulus with Hidden Reference and Anchors
NTSC National Television System Committee
NOTE: Used to identify an analogue TV standard used outside Europe.
PAL Phase-Alternating Line
NOTE: Colour-encoding system used in television systems.
PBX Private Branch eXchange
PCM Pulse Code Modulation
PP Portable Part (DECT)
QCIF Quarter CIF
QVGA Quarter VGA
RTP Real-time Transport Protocol
RTT Round Trip Time
SDTV Standard Definition TV
SVC Scalable Video Coding
TCP Transmission Control Protocol
TTI Transmission Time Interval
UDP User Datagram Protocol
UMTS Universal Mobile Telecommunications System
VGA Video Graphics Array
W-CDMA Wideband-Code Division Multiple Access
WLAN Wireless Local Area Network
NOTE: IPER, IPDV, IPLR and IPTD are defined in ITU-T Recommendations Y.1540 [1] and Y.1541 [2].
4 Parameters affecting audiovisual user perceived quality
4.1 Audiovisual user perceived quality model
The characteristics affecting audiovisual user perceived quality and their interactions are illustrated in figure 1.
Figure 1: Characteristics affecting audiovisual user perceived quality
4.2 Coding algorithms
4.2.1 Speech
There are two groups of speech codecs:
• narrowband codecs, transmitting the frequency band from 300 Hz to 3 400 Hz;
• wideband codecs, transmitting the frequency band from 50 Hz to 7 000 Hz.
NOTE 1: The transmission bandwidth indicated is the bandwidth supported by the coding algorithm. The actual
bandwidth supported may be restricted due to handset or terminal characteristics.
Speech codecs are often classified into three types:
• waveform codecs;
• source codecs;
• hybrid codecs.
Waveform codecs attempt, without using any knowledge of how the signal to be coded was generated, to produce a
reconstructed signal whose waveform is as close as possible to the original. This means that in theory they should be
signal independent and work well with non-speech signals. An example is the codec standardized in ITU-T
Recommendation G.711 [3].
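As an illustration of the waveform approach, the following sketch applies logarithmic companding to each sample before quantizing it to 8 bits. It uses the continuous µ-law curve with µ = 255. ITU-T Recommendation G.711 specifies a piecewise-linear approximation of such a curve, so the function names and the uniform quantizer below are simplifying assumptions and not the normative algorithm.

```python
import math

MU = 255.0  # mu-law parameter of the continuous companding curve (assumed form)


def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(x: float, bits: int = 8) -> int:
    """Compand, then quantize uniformly to a signed integer code."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return round(mu_law_compress(x) * levels)


def decode(code: int, bits: int = 8) -> float:
    """Map an integer code back to a sample value."""
    levels = 2 ** (bits - 1) - 1
    return mu_law_expand(code / levels)


# 8 000 samples/s at 8 bits/sample corresponds to the 64 kbit/s rate of G.711
BIT_RATE_KBITS = 8000 * 8 / 1000
```

Because the curve is steep near zero, small-amplitude samples are reconstructed with a finer effective step than a uniform 8-bit quantizer would give.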
To reduce the required bit rate, the difference compared with the previous sample may be transmitted instead of the
actual sample. This technique is called delta modulation or differential PCM (DPCM). It may be further
enhanced by predicting the value of the next sample from the previous samples and transmitting the difference between the
predicted value and the actual sampled value (ADPCM).
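The differential principle can be sketched as follows. This is a toy illustration and not the algorithm of ITU-T Recommendation G.726: the fixed step size, the previous-sample predictor and the function names are assumptions chosen for brevity, whereas ADPCM codecs adapt the quantizer and the predictor to the signal.

```python
def dpcm_encode(samples, step=4):
    """Quantize the difference between each sample and the value the
    decoder will have reconstructed for the previous sample."""
    codes = []
    predicted = 0  # the encoder tracks the decoder's reconstruction
    for s in samples:
        code = round((s - predicted) / step)
        codes.append(code)
        predicted += code * step
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the signal by accumulating the dequantized differences."""
    out = []
    predicted = 0
    for code in codes:
        predicted += code * step
        out.append(predicted)
    return out


signal = [0, 10, 22, 30, 33, 31, 24, 12]
codes = dpcm_encode(signal)
decoded = dpcm_decode(codes)
```

The transmitted codes span a much smaller range than the samples themselves, which is where the bit rate saving comes from. Running the quantizer inside the prediction loop keeps the reconstruction error within half a step instead of letting it accumulate.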
The input speech signal may also be split into a number of frequency bands, or sub-bands, each of which is coded
independently. This is called sub-band coding. An example is the codec defined in ITU-T Recommendation G.722 [4]
where the 7 kHz frequency band is divided into two sub-bands, which are coded independently of each other.
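The band-splitting step can be illustrated with the simplest possible filter pair. G.722 uses a quadrature mirror filter bank with longer filters, so the two-tap sum and difference filters and the function names below are assumptions made to keep the sketch short.

```python
def split_bands(samples):
    """Split an even-length signal into a low and a high sub-band,
    each at half the input sampling rate."""
    low = [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    return low, high


def merge_bands(low, high):
    """Recombine the two sub-bands into the full-band signal."""
    out = []
    for l, h in zip(low, high):
        out.extend((l + h, l - h))
    return out


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
low, high = split_bands(signal)
restored = merge_bands(low, high)
```

Without quantization the split is lossless. In a codec each band would be quantized separately, which allows more bits to be spent on the band that matters most perceptually.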
Source coders operate using a model of how the source is generated, and attempt to extract, from the signal being
coded, the parameters of the model. Coders using this technique require very low bit rate, but the quality is usually not
good enough for public telecommunication applications.
Hybrid codecs attempt to fill the gap between waveform and source codecs. Although other forms of hybrid codecs
exist, the most successful and commonly used are time domain Analysis-by-Synthesis (AbS) codecs. Such coders use
the same linear prediction filter model of the vocal tract as found in LPC vocoders. However, instead of applying a
simple two-state, voiced/unvoiced, model to find the necessary input to this filter, the excitation signal is chosen by
attempting to match the reconstructed speech waveform as closely as possible to the original speech waveform.
Examples are Multi-Pulse Excited (MPE) codecs and Code-Excited Linear Predictive (CELP) codecs.
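The analysis-by-synthesis search can be sketched as a closed loop: each candidate excitation is passed through the synthesis filter, and the candidate whose output is closest to the target waveform is selected. The first-order filter, the tiny codebook and all names below are hypothetical. Practical CELP codecs use higher-order LPC filters, gain terms and perceptual weighting of the error.

```python
def synthesize(excitation, lpc_coeffs):
    """All-pole synthesis filter: y[n] = x[n] + sum_k a[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(lpc_coeffs):
            if n - 1 - k >= 0:
                y += a * out[n - 1 - k]
        out.append(y)
    return out


def search_codebook(target, codebook, lpc_coeffs):
    """Return (index, error) of the excitation whose synthesized output
    has the smallest squared error against the target waveform."""
    best_index, best_error = -1, float("inf")
    for index, excitation in enumerate(codebook):
        candidate = synthesize(excitation, lpc_coeffs)
        error = sum((t - c) ** 2 for t, c in zip(target, candidate))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


lpc = [0.5]  # hypothetical first-order vocal tract model
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
]
target = synthesize(codebook[2], lpc)  # stand-in for a frame of original speech
index, error = search_codebook(target, codebook, lpc)
```

In this scheme the encoder sends the selected codebook index together with the filter parameters rather than the waveform samples, and the decoder repeats the synthesis step.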
The narrowband speech codecs standardized by ITU-T and ETSI are listed in table 1.
Table 1: Narrowband speech codecs standardized by ITU-T and ETSI
Codec ID                            Bit rate (kbit/s)    Frame size (ms)   Look ahead (ms)
ITU-T Recommendation G.711 [3]      64                   n/a
ITU-T Recommendation G.723.1 [5]    6,3 and 5,3          30                7,5
ITU-T Recommendation G.726 [6]      16, 24, 32 and 40    n/a               n/a
ITU-T Recommendation G.728 [7]      16                   n/a               n/a
ITU-T Recommendation G.729 [8]      6,4, 8 and 11,8      10                5
ITU-T Recommendation G.729.1 [9] 8, 12, 14, 16, 18, 20, 22, 24, 20 25
(see note 1) 26, 28
...


SLOVENIAN STANDARD
01 July 2009
Kakovost prenosa govora in večpredstavnih vsebin (STQ) - Kakovost
avdiovizualne storitve pri komunikaciji prek omrežij z internetnim protokolom (IP)
Speech and multimedia Transmission Quality (STQ) - Audiovisual QoS for
communication over IP networks
This Slovenian standard is identical to: ES 202 667 Version 1.1.1
ICS:
33.040.35 Telephone networks
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.

ETSI Standard
Speech and multimedia Transmission Quality (STQ);
Audiovisual QoS for communication over IP networks

ETSI ES 202 667 V1.1.1 (2009-04)

Reference
DES/STQ-00097
Keywords
multimedia, QoS
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2009.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™, TIPHON™, the TIPHON logo and the ETSI logo are Trade Marks of ETSI registered
for the benefit of its Members.
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
LTE™ is a Trade Mark of ETSI currently being registered
for the benefit of its Members and of the 3GPP Organizational Partners.
GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Parameters affecting audiovisual user perceived quality
4.1 Audiovisual user perceived quality model
4.2 Coding algorithms
4.2.1 Speech
4.2.2 Audio
4.2.3 Video
4.2.4 Lip synchronization
4.3 Network transmission protocols and principles
4.3.1 Unreliable communication
4.3.2 Multicast
4.3.3 Reliable communication
4.3.4 Multipoint
4.3.5 DSL Access
4.3.5.1 Asymmetrical DSL
4.3.5.2 Symmetrical DSL
4.3.6 Wireless access
4.3.6.1 2G Mobile access
4.3.6.2 3G Mobile access
4.3.6.3 DECT
4.3.6.4 Broadband Wireless Access
4.3.6.5 WLAN
4.3.7 Broadcasting
4.4 Network performance parameters
4.4.1 Transmission bandwidth
4.4.1.1 General considerations
4.4.1.2 Speech transmission
4.4.1.3 Audio transmission
4.4.1.4 Video transmission
4.4.2 Packet loss
4.4.2.1 General considerations
4.4.2.2 Speech transmission
4.4.2.3 Audio transmission
4.4.2.4 Video transmission
4.4.3 Transmission delay
4.4.3.1 General
4.4.3.2 Transmitting terminal delay
4.4.3.3 Access network delay
4.4.3.4 Core network delay
4.4.3.5 Receiving terminal delay
4.4.4 Transmission delay variations (jitter)
4.5 Terminal characteristics
4.5.1 Packet loss recovery
4.5.2 Playout buffer
4.5.3 Audio characteristics
4.5.4 Video display
4.6 Audio-video interaction
5 Audiovisual applications classification
5.1 Delay sensitive applications
5.2 Delay insensitive applications
6 Delay sensitive audiovisual applications requirements
6.1 Coding algorithms
6.1.1 Narrowband speech
6.1.2 Wideband speech
6.1.3 Video
6.2 Network performance requirements
6.3 Terminal characteristics
6.3.1 Narrowband speech
6.3.2 Wideband speech
6.3.3 Video
7 Delay insensitive audiovisual applications requirements
7.1 Coding algorithms
7.1.1 Audio
7.1.2 Video
7.2 Network performance requirements
7.3 Terminal characteristics
7.3.1 Audio
7.3.2 Video
Annex A (informative): Audio-video quality interaction
A.1 Introduction
A.2 Available information
A.2.1 TR 102 479
A.2.2 Results presented to ITU-T Recommendation SG 12
Annex B (informative): QoS mechanisms and the effects on connection performance
B.1 Introduction
B.2 QoS mechanisms
B.2.1 Integrated services (IntServ)
B.2.2 Differentiated services (DiffServ)
B.2.3 WLAN QoS mechanism
B.2.4 WiMax QoS mechanism
B.2.5 HSPA packet scheduling
B.2.6 RACS
B.3 Effects on connection performance
Annex C (informative): Packet loss recovery
C.1 Introduction
C.2 Application layer packet loss recovery methods
C.2.1 Speech and audio recovery
C.2.2 Video recovery
C.3 Performance improvement
Annex D (informative): Provisional QoS Classes defined in ITU-T Recommendation Y.1541
History
Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (http://webapp.etsi.org/IPR/home.asp).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This ETSI Standard (ES) has been produced by ETSI Technical Committee Speech and multimedia Transmission
Quality (STQ).
1 Scope
The present document addresses the combination of network performance parameters and user perceived media (audio and
video) quality parameters for audiovisual communications on IP networks.
The access technologies covered include both wired (e.g. xDSL) and wireless (e.g. UMTS, WLAN) technologies.
The display size range covered is from those of small mobile terminals (e.g. 2") up to large TV sets (e.g. 40" or more).
It is applicable to:
• Broadcasting and streaming applications such as IPTV and VoD.
• Interactive point-to-point applications such as videotelephony and videoconferencing.
Where the media coding standards define two or more profiles, the baseline profile is addressed in the normative part of
the standard.
Informative annexes present an overview of network QoS mechanisms and the effects on connection performance as
well as guidance on terminal parameters that may influence the user perceived media performance.
2 References
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific.
• For a specific reference, subsequent revisions do not apply.
• Non-specific reference may be made only to a complete document or a part thereof and only in the following
cases:
- if it is accepted that it will be possible to use all future changes of the referenced document for the
purposes of the referring document;
- for informative references.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
2.1 Normative references
The following referenced documents are indispensable for the application of the present document. For dated
references, only the edition cited applies. For non-specific references, the latest edition of the referenced document
(including any amendments) applies.
[1] ITU-T Recommendation Y.1540: "Internet protocol data communication service - IP packet
transfer and availability performance parameters".
[2] ITU-T Recommendation Y.1541: "Network performance objectives for IP-based services".
[3] ITU-T Recommendation G.711: "Pulse Code Modulation (PCM) of voice frequencies".
[4] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".
[5] ITU-T Recommendation G.723.1: "Dual rate speech coder for multimedia communications
transmitting at 5,3 and 6,3 kbit/s".
[6] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code
Modulation (ADPCM)".
[7] ITU-T Recommendation G.728: "Coding of speech at 16 kbit/s using low-delay code excited linear
prediction".
[8] ITU-T Recommendation G.729: "Coding of speech at 8 kbit/s using conjugate-structure
algebraic-code-excited linear-prediction (CS-ACELP)".
[9] ITU-T Recommendation G.729.1: "G.729 Embedded Variable bit-rate coder: An 8-32 kbit/s
scalable wideband coder bitstream interoperable with G.729".
[10] ETSI TS 126 071 (V6.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech Codec; General description
(3GPP TS 26.071, version 6.0.0 Release 6)".
[11] 3GPP2 C.S0014-C v1.0 Enhanced Variable Rate Codec, Speech Service Option 3, 68 and 70 for
Wideband Spread Spectrum Digital Systems, January 2007.
[12] ITU-T Recommendation G.722.1: "Low-complexity coding at 24 and 32 kbit/s for hands-free
operation in systems with low frame loss".
[13] ITU-T Recommendation G.711.1: "Wideband embedded extension for G.711 pulse code
modulation".
[14] ETSI TS 126 171 (V6.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech codec, wideband; General description
(3GPP TS 26.171, version 6.0.0 Release 6)".
NOTE: TS 126 171 is identical to ITU-T Recommendation G.722.2.
[15] IETF RFC 3551: "RTP Profile for Audio and Video Conferences with Minimal Control".
[16] ISO/IEC 11172: "Information technology -- Coding of moving pictures and associated audio for
digital storage media at up to about 1,5 Mbit/s (MPEG 1, 5 parts)".
[17] ISO/IEC 13818: "Information technology -- Generic coding of moving pictures and associated
audio information (MPEG 2, 9 parts)".
[18] ISO/IEC 14496: "Information technology -- Coding of audio-visual objects (MPEG 4; currently in
11 parts)".
[19] ETSI TS 126 290 (V6.3.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS) Audio codec processing functions; Extended
Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions
(3GPP TS 26.290, version 6.0.0 Release 6)".
[20] ITU-T Recommendation G.719: "Low-complexity full-band audio coding for high-quality
conversational applications".
[21] ETSI TS 102 366: "Digital Audio Compression (AC-3, Enhanced AC-3) Standard".
[22] ITU-T Recommendation H.261: "Video codec for audiovisual services at p × 64 kbit/s".
[23] ITU-T Recommendation H.262: "Information technology - Generic coding of moving pictures and
associated audio information: Video".
[24] ITU-T Recommendation H.263: "Video coding for low bit rate communication".
[25] ITU-T Recommendation H.264: "Advanced video coding for generic audiovisual services".
NOTE: This recommendation is identical to MPEG 4 Part 10.
[26] SMPTE 421M (2006): "Television - VC-1 Compressed Video Bitstream Format and Decoding
Process".
[27] ITU-R Recommendation BT.1359-1: "Relative timing of sound and vision for broadcasting".
[28] IETF RFC 768: "User Datagram Protocol".
[29] ETSI TS 122 146 (V7.1.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); LTE; Multimedia Broadcast/Multicast Service
(MBMS); Stage 1 (3GPP TS 22.146, version 7.1.0 Release 7)".
[30] IETF RFC 793: "Transmission Control Protocol".
[31] ITU-T Recommendation G.995.1: "Overview of digital subscriber line (DSL) Recommendations".
[32] IEEE 802.3 (2005): "IEEE Standard for Information technology-Telecommunications and
information exchange between systems-Local and metropolitan area networks; Specific
requirements Part 3: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access
Method and Physical Layer Specifications".
[33] ITU-T Recommendation G.992: Parts 1 to 5.
[34] ITU-T Recommendation G.993: Parts 1 and 2.
[35] ITU-T Recommendation G.991.1: "High bit rate Digital Subscriber Line (HDSL) transceivers".
[36] ITU-T Recommendation G.991.2: "Single-pair high-speed digital subscriber line (SHDSL)
transceivers".
[37] ETSI TS 101 113 (V7.5.0): "Digital cellular telecommunications system (Phase 2+) (GSM);
General Packet Radio Service (GPRS); Service description; Stage 1
(GSM 02.60, version 7.5.0 Release 1998)".
[38] ETSI TS 122 228 (V8.5.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Service requirements for the Internet Protocol (IP)
multimedia core network subsystem (IMS); Stage 1 (3GPP TS 22.228, version 8.5.0 Release 8)".
[39] ETSI TS 122 173 (V7.5.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); IP Multimedia Core Network Subsystem (IMS)
Multimedia Telephony Service and supplementary services; Stage 1
(3GPP TS 22.173, version 7.5.0 Release 7)".
[40] ETSI TS 125 308 (V7.7.0): "Universal Mobile Telecommunications System (UMTS); High Speed
Downlink Packet Access (HSDPA); Overall description; Stage 2
(3GPP TS 25.308, version 7.7.0 Release 7)".
[41] ETSI TS 125 319 (V7.6.0): "Universal Mobile Telecommunications System (UMTS); Enhanced
uplink; Overall description; Stage 2 (3GPP TS 25.319, version 7.6.0 Release 7)".
[42] ETSI TS 123 107 (V7.1.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Quality of Service (QoS) concept and architecture
(3GPP TS 23.107, version 7.1.0 Release 7)".
[43] ETSI TS 123 207 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); End-to-end Quality of Service (QoS) concept and
architecture (3GPP TS 23.207, version 7.0.0 Release 7)".
[44] ETSI EN 300 175-2: "Digital Enhanced Cordless Telecommunications (DECT); Common
Interface (CI); Part 2: Physical Layer (PHL)".
[45] IEEE 802.16 (2004): "Standard for Local and metropolitan area networks. Part 16: Air Interface
for Fixed Broadband Wireless Access Systems".
[46] IEEE 802.11 (2007): "Information technology - Telecommunications and information exchange
between systems - Local and metropolitan area networks. Specific requirements. Part 11: Wireless
LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications".
[47] ETSI EN 300 401: "Radio Broadcasting Systems; Digital Audio Broadcasting (DAB) to mobile,
portable and fixed receivers".
[48] ETSI TS 102 428: "Digital Audio Broadcasting (DAB); DMB video service; User Application
Specification".
[49] ETSI EN 302 307: "Digital Video Broadcasting (DVB); Second generation framing structure,
channel coding and modulation systems for Broadcasting, Interactive Services, News Gathering
and other broadband satellite applications".
[50] ETSI EN 300 744: "Digital Video Broadcasting (DVB); Framing structure, channel coding and
modulation for digital terrestrial television".
[51] ETSI EN 300 419: "Access and Terminals (AT); 2 048 kbit/s digital structured leased lines
(D2048S); Connection characteristics".
[52] ETSI EN 302 304: "Digital Video Broadcasting (DVB); Transmission System for Handheld
Terminals (DVB-H)".
[53] ETSI EN 302 583: "Digital Video Broadcasting (DVB); Framing Structure, channel coding and
modulation for Satellite Services to Handheld devices (SH) below 3 GHz".
[54] ETSI TS 101 154: "Digital Video Broadcasting (DVB); Specification for the use of Video and
Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream".
[55] ETSI TS 102 005: "Digital Video Broadcasting (DVB); Specification for the use of Video and
Audio Coding in DVB services delivered directly over IP protocols".
[56] ITU-T Recommendation G.114: "One-way transmission time".
[57] IETF RFC 3550: "RTP: A Transport Protocol for Real-Time Applications".
[58] IETF RFC 3095: "RObust Header Compression (ROHC): Framework and four profiles: RTP,
UDP, ESP, and uncompressed".
[59] ETSI TS 181 005: "Telecommunications and Internet Converged Services and Protocols for
Advanced Networking (TISPAN); Service and Capability Requirements".
[60] ITU-R Recommendation BS.1534-1: "Method for the subjective assessment of intermediate
quality levels of coding systems".
[61] ITU-T Recommendation G.107: "The E-model, a computational model for use in transmission
planning".
[62] ITU-T Recommendation G.1010: "End-user Multimedia QoS Categories".
[63] ETSI ES 202 737: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for narrowband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".
[64] ETSI ES 202 738: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for narrowband VoIP loudspeaking and handsfree terminals from a QoS perspective
as perceived by the user".
[65] ETSI ES 202 739: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for wideband VoIP terminals (handset and headset) from a QoS perspective as
perceived by the user".
[66] ETSI ES 202 740: "Speech Processing, Transmission and Quality Aspects (STQ); Transmission
requirements for wideband VoIP loudspeaking and handsfree terminals from a QoS perspective as
perceived by the user".
[67] ETSI TS 126 235 (V7.4.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Packet switched conversational multimedia
applications; Default codecs (3GPP TS 26.235, version 7.4.0 Release 7)".
[68] ITU-T Recommendation J.247: "Objective perceptual multimedia video quality measurement in
the presence of a full reference".
[69] ITU-T Recommendation P.911: "Subjective audiovisual quality assessment methods for
multimedia applications".
[70] ETSI TS 181 018: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Requirements for QoS in a NGN".
[71] ETSI TS 122 105 (V8.4.0): "Universal Mobile Telecommunications System (UMTS); Services
and service capabilities (3GPP TS 22.105, version 8.4.0 Release 8)".
[72] ETSI TS 126 234 (V7.5.0): "Universal Mobile Telecommunications System (UMTS); Transparent
end-to-end Packet-switched Streaming Service (PSS); Protocols and codecs
(3GPP TS 26.234, version 7.5.0 Release 7)".
[73] ETSI TS 126 346 (V7.8.0): "Universal Mobile Telecommunications System (UMTS); Multimedia
Broadcast/Multicast Service (MBMS); Protocols and codecs
(3GPP TS 26.346, version 7.8.0 Release 7)".
2.2 Informative references
The following referenced documents are not essential to the use of the present document but they assist the user with
regard to a particular subject area. For non-specific references, the latest version of the referenced document (including
any amendments) applies.
[i.1] ETSI ETR 310: "Digital Enhanced Cordless Telecommunications (DECT); Traffic capacity and
spectrum requirements for multi-system and multi-service DECT applications co-existing in a
common frequency band".
[i.2] ETSI TS 126 091 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); AMR speech Codec; Error concealment of lost
frames (3GPP TS 26.091, version 7.0.0 Release 7)".
[i.3] ETSI TS 126 191 (V7.0.0): "Digital cellular telecommunications system (Phase 2+); Universal
Mobile Telecommunications System (UMTS); Speech codec speech processing functions;
Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Error concealment of erroneous or
lost frames (3GPP TS 26.191, version 7.0.0 Release 7)".
[i.4] ETSI TR 102 479: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Review of available material on QoS requirements of
Multimedia Services".
[i.5] IETF RFC 1633: "Integrated services in the Internet architecture: An overview".
[i.6] IETF RFC 2205: "Resource ReSerVation Protocol (RSVP) Version 1 Functional Specification".
[i.7] IETF RFC 2475: "An Architecture for Differentiated Services".
[i.8] Cicconetti, C., Lenzini, L., Mingozzi, E. and Eklund, C.: "Quality of Service support in 802.16
networks. IEEE Network, vol. 20", March/April 2006.
[i.9] Pedersen, K., Mogensen, P. and Kolding, T.: "Overview of QoS options for HSDPA. IEEE
Communications Magazine vol. 44", July 2006.
[i.10] ETSI ES 282 003: "Telecommunications and Internet converged Services and Protocols for
Advanced Networking (TISPAN); Resource and Admission Control Sub-System (RACS):
Functional Architecture".
[i.11] Perkins, C, Hodson, O. and Hardman, V.: "A Survey of Packet Loss Recovery Techniques for
Streaming Audio. IEEE Network vol. 12, No. 5", 1998.
[i.12] Wah, B., Su, X. and Lin, D.: "A Survey of Error-Concealment Schemes for Real-Time Audio and
Video Transmissions over the Internet. IEEE International Symposium on Multimedia Software
Engineering 2000 (MSE 2000)"; Taipei, Taiwan, 11-13 December 2000.
[i.13] ITU-T Recommendation G.711 (Appendix I): "Pulse code modulation (PCM) of voice
frequencies; A high quality low-complexity algorithm for packet loss concealment with G.711".
[i.14] ITU-T Recommendation G.722 (Appendix III): "7 kHz audio-coding within 64 kbit/s; A
high-quality packet loss concealment algorithm for G.722".
[i.15] ITU-T Recommendation G.722 (Appendix IV): "7 kHz audio-coding within 64 kbit/s; A
low-complexity algorithm for packet loss concealment with G.722".
[i.16] Wenger, S.: "H.264/AVC over IP. IEEE Transactions on circuits and systems for video
technology, vol. 13, No. 7", 2003.
[i.17] ETSI TR 101 329-6: "Telecommunications and Internet Protocol Harmonization Over Networks
(TIPHON) Release 3; End-to-end Quality of Service in TIPHON systems; Part 6: Actual
measurements of network and terminal characteristics and performance parameters in TIPHON
networks and their influence on voice quality".
[i.18] Kövesi, B. and Ragot, S.: "A low complexity packet loss concealment algorithm for
ITU-T Recommendation G.722. 2008 IEEE International Conference on Acoustics, Speech, and
Signal Processing (ICASSP)". Las Vegas, USA, 30th March - 4th April, 2008.
[i.19] ITU-T Recommendation I.113: "Vocabulary of terms for broadband aspects of ISDN".
[i.20] ITU-T Recommendation G.722.2: "Wideband coding of speech at around 16 kbit/s using Adaptive
Multi-Rate Wideband (AMR-WB)".
[i.21] ITU-T Recommendation SG.12: "Temporary Documents".
[i.22] Layer 1 specifications.
NOTE: Available at http://3GPP specification series: 05series.
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
audio: all signals that are audible to human beings, including speech and music
broadcasting: communication capability which denotes unidirectional distribution from a single source to all users
connected to the network
multipoint: value of the service attribute "communication configuration", which denotes that the communication
involves more than two network terminations
NOTE: Source: ITU-T Recommendation I.113 [i.19].
narrowband speech: speech restricted to the frequency band from 300 Hz to 3 400 Hz
speech: oral production of information by a human being
streaming: mechanism whereby media content can be rendered at the same time that it is being transmitted to the client
over the network
video: signal that contains timing/synchronization information as well as luminance (intensity) and chrominance
(colour) information that when displayed on an appropriate device gives a visual representation of the original image
sequence
videoconferencing: service providing interactive, bi-directional and real time audio-visual communication
NOTE: Normally intended for multiple users at each end.
videotelephony: service providing an interactive, bi-directional, real time audio-visual communication between users
wideband speech: speech restricted to the frequency band from 50 Hz to 7 000 Hz
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
3GPP 3rd Generation Partnership Project
3GPP2 3rd Generation Partnership Project 2
NOTE: A 3G project comprising North American and Asian interests.
AAC Advanced Audio Coding
ADPCM Adaptive Differential Pulse Code Modulation
AMR Adaptive Multi Rate
AMR-WB Adaptive Multi Rate Wide Band
AMR-WB+ Adaptive Multi Rate extended Wide Band
AP Access Point (IEEE 802.11 WLAN [46])
ATM Asynchronous Transfer Mode
AVC Advanced Video Coding
CCIR Comité Consultatif International pour la Radio; Now ITU-R
CELP Code-Excited Linear Predictive
CIF Common Intermediate Format
CPCFC Custom Picture Clock Frequency Code
CPFMT Custom Picture ForMaT
DECT Digital Enhanced Cordless Telecommunications
DPCM Differential Pulse Code Modulation
EUL Enhanced UpLink
FER Frame Error Rate
FP Fixed Part (DECT)
HDTV High Definition TV
HE-AAC High Efficiency AAC
HSPA High-Speed Packet Access
HSDPA High-Speed Downlink Packet Access
HSUPA High-Speed Uplink Packet Access
IETF Internet Engineering Task Force
IMS IP Multimedia Subsystem
IP Internet Protocol
IPDV IP Packet Delay Variation
IPER IP Packet Error Ratio
IPLR IP Packet Loss Ratio
IPTD IP Packet Transfer Delay
ITU-R International Telecommunication Union - Radiocommunication sector
ITU-T International Telecommunication Union - Telecommunication standardization sector
LPC Linear Predictive Coding
MAC Medium Access Control
MBMS Multimedia Broadcast/Multicast Service
MDCT Modified Discrete Cosine Transform
MCU Multipoint Control Unit
MPE Multi-Pulse Excited
MPEG 2 TS MPEG 2 Transport Stream
MPEG Moving Picture Experts Group
MUSHRA MUlti Stimulus with Hidden Reference and Anchors
NTSC National Television System Committee
NOTE: Used to identify an analogue TV standard used outside Europe.
PAL Phase-Alternating Line
NOTE: Colour-encoding system used in television systems.
PBX Private Branch eXchange
PCM Pulse Code Modulation
PP Portable Part (DECT)
QCIF Quarter CIF
QVGA Quarter VGA
RTP Real-time Transport Protocol
RTT Round Trip Time
SDTV Standard Definition TV
SVC Scalable Video Coding
TCP Transmission Control Protocol
TTI Transmission Time Interval
UDP User Datagram Protocol
UMTS Universal Mobile Telecommunications System
VGA Video Graphics Array
W-CDMA Wideband-Code Division Multiple Access
WLAN Wireless Local Area Network
NOTE: IPER, IPDV, IPLR and IPTD are defined in ITU-T Recommendations Y.1540 [1] and Y.1541 [2].
4 Parameters affecting audiovisual user perceived quality
4.1 Audiovisual user perceived quality model
The characteristics affecting audiovisual user perceived quality and their interactions are illustrated in figure 1.
Figure 1: Characteristics affecting audiovisual user perceived quality

4.2 Coding algorithms
4.2.1 Speech
There are two groups of speech codecs:
• narrowband codecs, transmitting the frequency band from 300 Hz to 3 400 Hz;
• wideband codecs, transmitting the frequency band from 50 Hz to 7 000 Hz.
NOTE 1: The transmission bandwidth indicated is the bandwidth supported by the coding algorithm. The actual
bandwidth supported may be restricted due to handset or terminal characteristics.
Speech codecs are often classified into three types:
• waveform codecs;
• source codecs;
• hybrid codecs.
Waveform codecs attempt, without using any knowledge of how the signal to be coded was generated, to produce a
reconstructed signal whose waveform is as close as possible to the original. This means that in theory they should be
signal independent and work well with non-speech signals. An example is the codec standardized in ITU-T
Recommendation G.711 [3].
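As an illustration of waveform coding, the sketch below quantizes each sample independently after logarithmic (mu-law) companding, so that no assumption about the signal source is needed. It uses the continuous mu-law formula and is not the bit-exact encoding of ITU-T Recommendation G.711; the function names and the normalization of samples to the range -1 to 1 are assumptions for illustration. With 8 000 samples per second and 8 bits per sample, the resulting bit rate is the 64 kbit/s associated with G.711.

```python
import math

MU = 255  # mu-law companding parameter


def compress(x):
    """Map x in [-1, 1] onto [-1, 1] with logarithmic companding."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(x, bits=8):
    """Quantize the companded sample to a signed integer code."""
    levels = 2 ** (bits - 1) - 1  # 127 magnitude levels for 8 bits
    return round(compress(x) * levels)


def decode(code, bits=8):
    """Recover an approximation of the original sample from its code."""
    levels = 2 ** (bits - 1) - 1
    return expand(code / levels)


SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # bit/s
```

The companding step gives small amplitudes a finer effective resolution than a uniform 8-bit quantizer would, at the cost of coarser resolution near full scale.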
To reduce the required bit rate, the difference compared with the previous sample may be transmitted instead of the
actual sample. This technique is called delta modulation or differential PCM (DPCM). This technique may be further
enhanced by predicting the value of the next sample from the previous samples and transmitting the difference between the
predicted value and the actual sampled value (ADPCM).
The input speech signal may also be split into a number of frequency bands, or sub-bands, each of which is coded
independently. This is called sub-band coding. An example is the codec defined in ITU-T Recommendation G.722 [4]
where the 7 kHz frequency band is divided into two sub-bands, which are coded independently of each other.
Source coders operate using a model of how the source is generated, and attempt to extract, from the signal being
coded, the parameters of the model. Coders using this technique require very low bit rate, but the quality is usually not
good enough for public telecommunication applications.
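The extraction of model parameters can be illustrated with linear prediction, in which the coder estimates the coefficients of a filter that predicts each sample from the preceding ones and transmits those coefficients instead of the waveform. The sketch below uses the autocorrelation method with the Levinson-Durbin recursion. It is a generic textbook procedure, not taken from any of the referenced codecs, and the function names are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order].

    The predictor is x_hat[n] = sum_k a[k] * x[n-k].
    Returns (coefficients, residual_energy).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error                 # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)          # prediction error shrinks at each stage
    return a[1:], error


# A decaying exponential is predicted almost exactly by one coefficient.
frame = [0.5 ** n for n in range(40)]
coeffs, residual = levinson_durbin(autocorrelation(frame, 1), 1)
```

For this frame the single coefficient is very close to 0,5, so the whole frame is described by one model parameter and a small residual.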
Hybrid codecs attempt to fill the gap between waveform and source codecs. Although other forms of hybrid codecs
exist, the most successful and commonly used
...
