ETSI TR 102 505 V1.2.1 (2011-12)
Technical Report
Speech and multimedia Transmission Quality (STQ);
Development of a Reference Web page
Reference
RTR/STQ-00185m
Keywords
internet, quality
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
Individual copies of the present document can be downloaded from:
http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF).
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive
within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services:
http://portal.etsi.org/chaircor/ETSI_support.asp
Copyright Notification
No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2011.
All rights reserved.
DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members.
3GPP™ and LTE™ are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 General aspects of reference web page design
4.1 Basic dimensions of typical web page content
4.1.1 General considerations
4.1.2 Summary of typical parameters determined for Kepler
4.1.3 Composition rule proposal applied for Kepler
4.2 Accelerators: Basic considerations
4.3 Selection of a web page sample set
4.4 Statistical and methodological considerations
4.5 Validation of reference pages for acceleration effects
4.6 Impact of server-side compression on http test results
4.7 Handling precautions for reference web site file sets
5 Measurement of web page characteristics
5.1 Measurement of object count
5.2 Measurement of total size
5.3 Measurement of object size distribution
5.4 Measurement of web page composition
5.5 Measurement of image optimization
6 Work flow description
Annex A: Example workflows used in creation of the Copernicus reference web page
A.1 Determination of parameters for the Copernicus web page
A.1.1 Selection of samples
A.1.2 Determining typical parameters
A.1.2.1 Compression Factors
A.1.2.2 Object count and Size
A.1.2.3 Object types
A.1.2.4 Object size distribution by type
A.1.2.5 Compression factor for text-type objects
A.1.3 Summary of typical parameters determined
A.2 Parameter assessment of the current Copernicus page
Annex B: Example workflows used in creation of the Kepler reference web page
B.1 Determination of parameters for the Kepler web page
B.1.1 Selection of samples
B.1.2 Determining typical parameters
B.1.2.1 Compression Factors
B.1.2.2 Object count and Size
B.1.2.3 Object types
B.1.2.4 Special considerations on "Flash video" objects
B.2 Multiple reference pages for different usage profiles
History
Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (http://ipr.etsi.org).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This Technical Report (TR) has been produced by ETSI Technical Committee Speech and multimedia Transmission
Quality (STQ).
Introduction
Web browsing is one of the most popular usages of the internet. Therefore, it is also an important target of QoS testing.
While the basics of web browsing are simple (and assumed to be known for the purpose of the present document),
proper performance testing of this service in mobile communications networks is not. There is a complex interplay
between web content, web and fixed network infrastructure, air interface and end-user devices including actual web
browser applications. If this interplay is not properly understood and taken care of, testing results may be meaningless
at best and wrong at worst.
NOTE 1: In the following, the terms "http testing" and "web browsing" are used synonymously.
NOTE 2: For the purpose of the present document, we use the term "transport channel" to describe the entire path
content has to pass from its origin (i.e. the http server) to its destination (the user's computer).
The goal of service testing is to get a quantitatively correct impression of the service usage experience from a typical
user's point of view. For obvious reasons, neither third-party servers nor third-party web content should be used.
Therefore, a specially assigned and well-controlled web server (reference server) will provide specially designed web
content (reference web page) to facilitate the necessary control of environmental conditions and assurance of
reproducibility.
It should be kept in mind that the scope of present work is usage of reference web pages in the context of QoS testing
according to TS 102 250 [i.1]. Therefore, reference to QoS parameters implicitly refers to TS 102 250 [i.1].
The present document attempts to describe reference web page design in a generic way. In annexes A and B, examples are given. Annex A describes the workflow and parameters used to create the ETSI Copernicus reference web page. Annex B likewise describes the background information used for creating Kepler.
1 Scope
The present document describes the way used by a task force within the STQ Mobile working group to create and validate the Copernicus and, subsequently, the Kepler reference web pages to be used for QoS testing of http services. This included acquisition of basic information on parameters of typical "real-world" web pages.
Whereas parts of the present document may appear to the reader as being a general guide, the present document should
be understood primarily as a report on what was done.
It should be clearly mentioned that the issue at hand is not "exact science". We want to design a reference page having
properties of a "typical" web page. However, such typical web pages follow trends, given by available design tools,
design paradigms, and fashions. The limits in which web designers operate are also given by transport channel
capabilities such as higher bandwidth or device capabilities to render multimedia content.
Given these facts, we should be aware that any concrete reference web page will need adjustment and reconsideration
from time to time. QoS testing is about ranking. Given two systems under test having similar performance, choice of
any particular reference web site may well determine ranking order, and open up room for discussions by people who
do not like the results. Since this works in all directions, the conclusion suggested by the authors of the present
document is to keep the qualitative nature of the subject well in mind and to refrain from taking the "winner/loser"
aspect of benchmarking too seriously.
2 References
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
reference document (including any amendments) applies.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
2.1 Normative references
The following referenced documents are necessary for the application of the present document.
Not applicable.
2.2 Informative references
The following referenced documents are not necessary for the application of the present document but they assist the
user with regard to a particular subject area.
[i.1] ETSI TS 102 250 (all parts): "Speech Processing, Transmission and Quality Aspects (STQ); QoS
aspects for popular services in GSM and 3G networks".
NOTE: The Reference web sites worked out so far are to be found at http://docbox.etsi.org/STQ/Open/Kepler/.
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
object: single component of a web page, e.g. a jpg graphic
web page: overall subject of a single web site download, consisting of a main html document and a number of objects
the page contains
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
API Application Programming Interface
CF Compression Factor
CSS Cascading Style Sheets
DSL Digital Subscriber Line
DUN Dial-Up Networking
IP Internet Protocol
LAN Local Area Network
MSIE Microsoft® Internet Explorer
OS Operating System
QoS Quality of Service
RAS Remote Access Service
RSS Really Simple Syndication
4 General aspects of reference web page design
4.1 Basic dimensions of typical web page content
4.1.1 General considerations
A reference web page is meant to represent "typical" content. The first and most important question is therefore: what does "typical" mean exactly? Technically, this question asks for the dimensions, or the parameter space, that describe a web page with respect to QoS testing.
According to the general framework and taxonomy defined in TS 102 250 [i.1], quality in data services usage is ultimately given - once basic availability and sustainability have been secured - by the speed at which a user-initiated activity is performed or, equivalently, by the time required to complete the activity, i.e. the download time, e.g. defined as SeT according to TS 102 250-2 [i.1] (we will, however, use the well-established general term "download time").
In packet data services, end-to-end speed is determined by two basic transport channel properties: throughput and
packet delay (or roundtrip time). A transport channel able to deliver high throughput but having large delay may be
inferior -from the customer's perspective- to another channel having lower throughput but smaller delay.
The degree of influence of throughput and delay does, however, strongly depend on the structure of the content to be transferred.
Two web pages having the same total size (sum of sizes for all objects that page is composed of) may have quite
different loading speed, depending on the number of objects they contain. Also, since typical web browsers load content
using multiple sockets in parallel, size distribution may have an impact on download time.
Therefore, the parameters or dimensions assumed to describe a web page are:
• Total size (sum of all objects the web page is composed of).
• Number of objects.
• Object size distribution.
Any web page can be described by a point in a space having these three dimensions. Graphically, a reference page
would be represented as a point being in the -somehow defined - "centre" of the cloud of points describing a sufficiently
large number of real-world web pages.
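The three dimensions above can be derived directly from a page's object inventory. The following sketch is purely illustrative: the function name and the sample object sizes are assumptions, not values taken from any survey in the present document.

```python
# Illustrative sketch: deriving the three page dimensions (total size,
# object count, object size distribution) from a list of object sizes
# in bytes. The sample values are hypothetical.

def page_dimensions(object_sizes):
    """Return (total size, object count, sorted size distribution)."""
    distribution = sorted(object_sizes)
    return sum(distribution), len(distribution), distribution

# Hypothetical page: one large jpg, a few medium objects, some tiny ones.
sizes = [36000, 8300, 8300, 2500, 2500, 690, 150, 120]
total, count, distribution = page_dimensions(sizes)
```

A survey would place each real-world page at one such point in the three-dimensional space; the reference page is then chosen near the centre of the resulting cloud.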
It should be noted here that many real-world web pages contain active content which is executed on the target computer. This may be simple visible effects (e.g. animated pictures or short movies) or audible effects (e.g. sound clips), but some pages also display dynamic content, i.e. news tickers or streaming video which constantly downloads new data from the web. Of course, the concept of "download time" implies that there is a clear end to the download. Therefore, it is assumed that any web page where "download time" is measured either does not contain dynamic content in the aforementioned sense, or that a clear definition of "end of download" exists. By logic, treatment of dynamic content with a fixed end-of-download criterion does not add value to the task of reference web page design, but may complicate things considerably and is therefore excluded from further considerations.
In the scope of the present document, multimedia content of static nature, such as sound clips or videos which are
downloaded once, does not get special treatment: it is assumed that the related download dynamics are not significantly
different to other content such as jpg pictures.
However, even if this approach proved useful and workable, it turned out that there are more aspects to be considered
than just those described so far. The reason is that in the case of data transfer through mobile communication networks,
the transport channel properties may become content-dependent. The impact of this fact is dealt with in the next clause.
4.1.2 Summary of typical parameters determined for Kepler
The following table shows the Copernicus parameters versus the proposed "Kepler" parameters determined from the
survey described above and subsequent discussion and refinement steps in STQ Mobile.
Table 1

                     Total size, bytes          % non-lossless compressible   Object count   Composition
Kepler               807 076 (Windows® size)    60                            75             125 000 bytes in "flash video" surrogate, 3 jpg's of ~45 000 bytes each
Compare: Copernicus  200 000                    57                            38             Largest object: ~36 000 bytes, type jpg
4.1.3 Composition rule proposal applied for Kepler
The parameters total size, object count and percentage of "non-lossless compressible" allow a multitude of page designs. Apart from application of "common sense", there is no formal rule on how to distribute object sizes, and in the authors' opinion the effort to derive such rules may be unreasonably high, not least considering the fact that web pages constantly evolve. However, some transparency on how the final page is composed appeared to be desirable. Therefore, the following rule set was proposed and positively discussed in the STQ MOBILE group:
• Apply the following rules for compressible and noncompressible objects separately.
• Define four size classes, each representing 25 % of the page's total size.
• Initially, use a "factor 3" object count scheme to populate each class with as many objects as to reach the total
size in this class.
• To fine tune the desired object count, vary the number and individual size of objects in the "last" class.
• Example: Total size is 100 kBytes, target object count is 50:
- 1 object of 25 kBytes.
- 3 objects of ~8,3 kBytes.
- 10 objects of 2,5 kBytes each.
- The remaining objects (here: 36) share the "last" 25 kBytes, which makes an average size per object of approximately 690 bytes.
Remarks:
- Actual data shows that a typical site has quite a lot of very small objects (100 bytes to 200 bytes). Consider applying the above rule set to the last class again.
- Please note that the example is for 100 kBytes; actual sizes will have to be adjusted accordingly.
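The rule set above can be sketched in code. The class counts (1, 3, 10) follow the worked example and the "factor 3" idea; treating them as a fixed input is an interpretation for illustration, not a normative rule.

```python
# Sketch of the composition rule: four size classes of 25 % of total
# size each, object counts roughly tripling per class ("factor 3"),
# with the "last" class absorbing as many objects as are still needed
# to reach the target object count. Integer division makes the result
# approximate, as in the worked example.

def compose(total_size, target_count, class_counts=(1, 3, 10)):
    quarter = total_size // 4
    sizes = []
    for n in class_counts:
        sizes += [quarter // n] * n          # e.g. 1 x 25 kB, 3 x ~8,3 kB, 10 x 2,5 kB
    remaining = target_count - len(sizes)    # objects left for the "last" class
    sizes += [quarter // remaining] * remaining
    return sizes

sizes = compose(100_000, 50)   # the 100 kByte / 50 object example above
```

For the worked example this yields one 25 000 byte object, three of ~8 333 bytes, ten of 2 500 bytes, and 36 objects of ~694 bytes sharing the last quarter.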
4.2 Accelerators: Basic considerations
In mobile communications, channel capacity is a precious resource. Therefore, the use of accelerators is widespread.
While there are various types of accelerators, and their inner workings are not publicly documented, there are some general principles that accelerators work on. Before going into details: accelerators make reference web page design much
more challenging and error-prone. A proper understanding of the differences between transport channels with and
without acceleration is required. We will also show that with accelerators in the transport channel, the general concept
of reference web page design has to be expanded; in particular, the server hosting the reference page needs to be taken
into account too.
Given the fact that any acceleration can only be done by reducing the effective data volume to be transferred,
acceleration can be basically achieved in two ways: either by lossless compression of data, or by reduction of data
volume through removing information (lossy compression).
Basically there is not a real choice between these methods when it comes to efficiency of compression. Lossless
compression works very well with data objects of "text" type (such as htm, js, css), but has a very small or even a
negative effect on image-type objects such as gif or jpg, which are already well compressed. For such objects, only lossy compression will give useful results.
NOTE: We assume that sound clips behave basically the same way as image-type objects. However no analysis
has been made on sound clips so far.
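The asymmetry described above is easy to demonstrate with a generic lossless compressor. In the sketch below, zlib stands in for whatever algorithm an accelerator might use, and pseudo-random bytes stand in for already-compressed jpg data; both inputs are synthetic.

```python
import random
import zlib

# Text-type object: repetitive markup compresses very well losslessly.
text = b"<div class='teaser'><a href='/news/item'>more</a></div>\n" * 200

# Stand-in for an already-compressed image: pseudo-random bytes leave a
# lossless compressor essentially nothing to exploit.
random.seed(0)
image_like = bytes(random.randrange(256) for _ in range(10_000))

text_cf = len(text) / len(zlib.compress(text))              # compression factor >> 1
image_cf = len(image_like) / len(zlib.compress(image_like)) # compression factor ~ 1
```

The same qualitative result is obtained with any lossless algorithm: text-type objects shrink by a large factor, while image-type objects stay essentially unchanged (or even grow slightly).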
Lossy compression may, however, affect quality. This will depend on the properties of the source object, and the way
this object is presented on the target device. If a multi-mega pixel picture is displayed as a 10x10 cm image within a
web site, a lot of compression can be made before there is a really visible effect on image quality. However, if the
image is already optimized, any further reduction of object size will result in visible degradation of quality. According
to our findings, typical web sites use quite well-optimized images, which is reasonable because they lead to less server
traffic and provide a better download experience which both are in the commercial interest of web site owners.
Actually, there are good reasons to discuss the question whether the concept of determining web site download "quality" just by download time continues to make sense at all, even if this question will not be treated further in the present document.
Reference web site design has to take accelerator effects into account, otherwise comparison between networks using
different types of accelerators (or no acceleration) would not be possible. If, for example, a reference web page has the
right size, object count and object size distribution, but contains non-optimized images, measurements using this page
would vastly exaggerate the effect of accelerators, i.e. give accelerated networks an unrealistic "measured" advantage.
Also, since the compression effect on same-sized objects generally differs between lossless and lossy compression, the fraction of text-type and image-type objects needs to be considered.
Therefore, the parameter space introduced in the preceding clause needs to be expanded by two additional dimensions:
• Fraction of text-type objects in relation to total size.
• Degree of optimization in source objects of image type.
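The first of these two additional dimensions can be computed directly from the page inventory; the second (degree of image optimization) requires actual image analysis and is not sketched here. The object list and the type classification below are illustrative assumptions.

```python
# Sketch: fraction of text-type content relative to total page size,
# computed from a hypothetical list of (type, size-in-bytes) objects.
TEXT_TYPES = {"htm", "html", "js", "css"}

def text_fraction(objects):
    total = sum(size for _, size in objects)
    text = sum(size for kind, size in objects if kind in TEXT_TYPES)
    return text / total

frac = text_fraction([("htm", 30_000), ("css", 10_000),
                      ("js", 20_000), ("jpg", 90_000), ("gif", 50_000)])
```

In this hypothetical example, 60 000 of 200 000 bytes are text-type, i.e. a fraction of 0,3.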
While these considerations are necessary, they turned out not to be sufficient in the presence of accelerators. There
is one more aspect to take into consideration, which is the server hosting the reference content.
When a browser requests a web page, it may indicate its capability to support content compression (encoding). If the browser indicates that it supports encoding, and the server also supports this, certain object types such as htm (however probably not js or css) are transmitted compressed. In this case, any accelerator downstream in the transport chain will not be able to significantly reduce object size further. This means that if the server supports compression, the accelerator effect will be much smaller than otherwise.
In other words: it is not sufficient to focus on reference web pages for meaningful http tests; the server is part of the
signal chain, and needs to be considered in test case design.
There is no such thing as an isolated "reference web page" which can just be placed on any server and then be expected
to produce meaningful results.
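The server dependence described above can be captured as a simple decision rule: a downstream accelerator can only usefully apply lossless compression to text-type objects the server did not already encode. The header values are standard HTTP; the function itself is a simplification for illustration, not a description of any real accelerator.

```python
# Simplified decision rule for the interaction of server-side encoding
# and a downstream accelerator's lossless compression.
TEXT_CONTENT_TYPES = {"text/html", "text/css", "application/javascript"}

def accelerator_can_compress(content_type, content_encoding=None):
    """True if a downstream accelerator could still usefully apply
    lossless compression to this server response."""
    if content_encoding in ("gzip", "deflate"):
        return False   # server already compressed; little left to gain
    return content_type in TEXT_CONTENT_TYPES

plain_htm = accelerator_can_compress("text/html")            # uncompressed htm
server_gzip = accelerator_can_compress("text/html", "gzip")  # server-encoded htm
jpeg = accelerator_can_compress("image/jpeg")                # already optimized
```

This is why the reference server's compression configuration must be fixed and documented as part of the test case design.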
Different methods of acceleration are also located at different places in the transport chain.
Lossy compression (quality reduction) of image-type objects takes place on the network infrastructure side and does not require any special software on the client/browser side.
Lossless compression may take place in three basically different ways:
• If the browser indicates support of compression and the server does not, the accelerator could compress objects
(acting as a kind of proxy). This is, however, considered to be a rather weak form of acceleration since most
servers already support compression.
• The accelerator can modify the original html page to download an applet which will then act as counterpart for
accelerator-specific lossless compression. This raises the question of how the download of such an applet is to
be treated, since its size may be non-negligible in relation to total data size per web page download. If every download is to be treated as an independent, complete entity, the applet would have to be downloaded every time.
It is more realistic to assume that such an applet is only downloaded once and therefore would effectively
disappear from the equation. This point requires additional discussion.
• In a typical real-world situation, a user would purchase and install a network-specific driver and front end
package together with a wireless data card. This package may also contain a de-compression utility, acting basically the same way as an applet downloaded through the accelerator, but without creating traffic.
In effect, the first approach is assumed to be unrealistic, while the second and third are considered to be equivalent, provided the applet download is taken out of the size consideration.
For completeness it should be mentioned that another acceleration method is optimization of htm text, e.g. by restructuring it in a more space-saving way or by removing non-functional content such as comments. It is assumed, however, that professional web pages are already optimized in that way and therefore this aspect can be neglected.
4.3 Selection of a web page sample set
The approach of how to select the sample set has been discussed quite extensively in an ad-hoc working group within the ETSI STQ MOBILE group, with the results put to discussion by the whole group. After plausibility checks on available ranking lists created the impression that they are not trustworthy or reliable enough, the approach of manual selection was chosen. A web page is considered a candidate if it can reasonably be assumed that a large number of customers will know and use the respective site. Candidate branches included:
• Mobile network operators.
• Popular newspapers and journals.
• Government authorities and agencies (e.g. labour agency).
• Large holiday travel agencies, public transport, airlines, and airports.
• Car makers.
• Banks and insurance companies.
• Web sites of large companies with internet-related fields of work.
• Search engines and popular web sites such as Wikipedia®.
• Web shops and popular trade platforms (e.g. eBay®).
The candidate identification above yields a candidate list, which then has to be screened for suitability. Pages containing dynamic content have to be removed from the list.
A concrete list of web pages will of course be different for each country. For the work being performed, German sites
have been selected; it would be an interesting topic in its own right to compare, in the future, results for other countries.
After having performed the first surveys, with about 6 months between the first and the second one, it became clear, however, that the set of typical parameters is subject to change anyway, which would make the search for precision below, say, 10 % an academic exercise. We believe that there is a clear and, for the foreseeable future, unbroken trend towards larger total size and a reduction in object count. This trend is easily understandable: internet connections tend to become more broadband, so web designers create more sophisticated and feature-rich pages, which also have more active content.
4.4 Statistical and methodological considerations
If a reference web page is described by a number of parameters, each parameter value should represent a typical value
for a sufficiently large set of real-world pages. There are a number of ways to determine the target values; we will
discuss briefly the most common methods and explain the selection being made.
The criterion used to determine the appropriate method is assumed to be stability of the result against changes in the set of values from which the typical value is derived. Addition or removal of a small fraction of values should not affect the typical value much. Most of the parameters describing a web page, as outlined in the preceding text, can have extreme values in single web sites. Averaging of some type (linear average, root mean square, etc.) is quite sensitive to single extreme values and therefore appears to be a poor choice. In contrast, percentile-based methods are much more stable in the aforementioned sense. Therefore, we will select the typical values for object count and total size as the 50 % point of the distribution of the respective sample values, i.e. as the value where approximately half of the values taken from the basic set are below and above, respectively.
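The stability argument can be illustrated numerically: adding one extreme page to a sample barely moves the 50 % point (median), but shifts the linear average considerably. The sample values below are hypothetical.

```python
from statistics import mean, median

# Hypothetical total sizes of surveyed pages, in kBytes.
sizes = [120, 150, 180, 200, 220, 250, 300]
with_outlier = sizes + [5000]   # one extreme real-world page added

mean_shift = mean(with_outlier) - mean(sizes)        # shifts by hundreds of kBytes
median_shift = median(with_outlier) - median(sizes)  # shifts by a few kBytes
```

This is the sense in which percentile-based methods are stable against addition or removal of a small fraction of values.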
Web site content may change within relatively short periods of time, in particular for web sites of newspapers and similar sources. Therefore, downloads for comparison between accelerated and non-accelerated channels should be made as simultaneously as possible. It is assumed that it is sufficient to have a maximum time shift between download time windows of less than 10 minutes, which should be validated by analysis of respective timestamps. To
in
...







