ISO/TR 13587:2012
Three statistical approaches for the assessment and interpretation of measurement uncertainty
ISO/TR 13587:2012 is concerned with three basic statistical approaches for the evaluation and interpretation of measurement uncertainty: the frequentist approach including bootstrap uncertainty intervals, the Bayesian approach, and fiducial inference. The common feature of these approaches is a clearly delineated probabilistic interpretation or justification for the resulting uncertainty intervals. For each approach, the basic method is described and the fundamental underlying assumptions and the probabilistic interpretation of the resulting uncertainty are discussed. Each of the approaches is illustrated using two examples, including an example from the ISO/IEC Guide 98-3 (Uncertainty of measurement – Part 3: Guide to the expression of uncertainty in measurement (GUM:1995)). This document also includes a discussion of the relationship between the methods proposed in GUM Supplement 1 and these three statistical approaches.
TECHNICAL REPORT
ISO/TR 13587
First edition
2012-07-15
Three statistical approaches for the
assessment and interpretation of
measurement uncertainty
Trois approches statistiques pour l'évaluation et l'interprétation de
l'incertitude de mesure
Reference number
ISO/TR 13587:2012(E)
© ISO 2012
COPYRIGHT PROTECTED DOCUMENT
© ISO 2012
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols (and abbreviated terms)
5 The problem addressed
6 Statistical approaches
6.1 Frequentist approach
6.2 Bayesian approach
6.3 Fiducial approach
6.4 Discussion
7 Examples
7.1 General
7.2 Example 1a
7.3 Example 1b
7.4 Example 1c
8 Frequentist approach to uncertainty evaluation
8.1 Basic method
8.2 Bootstrap uncertainty intervals
8.3 Example 1
8.3.1 General
8.3.2 Example 1a
8.3.3 Example 1b
8.3.4 Example 1c
9 Bayesian approach for uncertainty evaluation
9.1 Basic method
9.2 Example 1
9.2.1 General
9.2.2 Example 1a
9.2.3 Example 1b
9.2.4 Example 1c
9.2.5 Summary of example
10 Fiducial inference for uncertainty evaluation
10.1 Basic method
10.2 Example 1
10.2.1 Example 1a
10.2.2 Example 1b
10.2.3 Example 1c
11 Example 2: calibration of a gauge block
11.1 General
11.2 Frequentist approach
11.3 Bayesian approach
11.4 Fiducial approach
12 Discussion
12.1 Comparison of uncertainty evaluations using the three statistical approaches
12.2 Relation between the methods proposed in GUM Supplement 1 (GUMS1) and the three statistical approaches
13 Summary
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
In exceptional circumstances, when a technical committee has collected data of a different kind from that
which is normally published as an International Standard (“state of the art”, for example), it may decide by a
simple majority vote of its participating members to publish a Technical Report. A Technical Report is entirely
informative in nature and does not have to be reviewed until the data it provides are considered to be no
longer valid or useful.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO/TR 13587:2012 was prepared by Technical Committee ISO/TC 69, Applications of statistical methods,
Subcommittee SC 6, Measurement methods and results.
This Technical Report is primarily based on Reference [10].
Introduction
The adoption of ISO/IEC Guide 98-3 (GUM) [1] has led to an increasing recognition of the need to include
uncertainty statements in measurement results. Laboratory accreditation based on International Standards
like ISO 17025 [2] has accelerated this process. Recognizing that uncertainty statements are required for
effective decision-making, metrologists in laboratories of all types, from National Metrology Institutes to
commercial calibration laboratories, are exerting considerable effort on the development of appropriate
uncertainty evaluations for different types of measurement using methods given in the GUM.
Some of the strengths of the procedures outlined and popularized in the GUM are its standardized approach
to uncertainty evaluation, its accommodation of sources of uncertainty that are evaluated either statistically
(Type A) or non-statistically (Type B), and its emphasis on reporting all sources of uncertainty considered. The
main approach to uncertainty propagation in the GUM, based on linear approximation of the measurement
function, is generally simple to carry out and in many practical situations gives results that are similar to those
obtained more formally. In short, since its adoption, the GUM has sparked a revolution in uncertainty
evaluation.
Of course, there will always be more work needed to improve the evaluation of uncertainty in particular
applications and to extend it to cover additional areas. Among such other work, the Joint Committee for
Guides in Metrology (JCGM), responsible for the GUM since the year 2000, has completed Supplement 1 to
the GUM, namely, “Propagation of distributions using a Monte Carlo method” (referred to as GUMS1) [3]. The
JCGM is developing other supplements to the GUM on topics such as modelling and models with any number
of output quantities.
Because it should apply to the widest possible set of measurement problems, the definition of measurement
uncertainty in ISO/IEC Guide 99:2007 [4] as a “non-negative parameter characterizing the dispersion of the
quantity values being attributed to a measurand, based on the information used” cannot reasonably be given
at more than a relatively conceptual level. As a result, defining and understanding the appropriate roles of
different statistical quantities in uncertainty evaluation, even for relatively well-understood measurement
applications, is a topic of particular interest to both statisticians and metrologists.
Earlier investigations have approached these topics from a metrological point of view, some authors focusing
on characterizing statistical properties of the procedures given in the GUM. Reference [5] shows that these
procedures are not strictly consistent with either a Bayesian or frequentist interpretation. Reference [6]
proposes some minor modifications to the GUM procedures that bring the results into closer agreement with a
Bayesian interpretation in some situations. Reference [7] discusses the relationship between procedures for
uncertainty evaluation proposed in GUMS1 and the results of a Bayesian analysis for a particular class of
models. Reference [8] also discusses different possible probabilistic interpretations of coverage intervals and
recommends approximating the posterior distributions for this class of Bayesian analyses by probability
distributions from the Pearson family of distributions.
Reference [9] compares frequentist (“conventional”) and Bayesian approaches to uncertainty evaluation.
However, the study is limited to measurement systems for which all sources of uncertainty can be evaluated
using Type A methods. In contrast, measurement systems with sources of uncertainty evaluated using both
Type A and Type B methods are treated in this Technical Report and are illustrated using several examples,
including one of the examples from Annex H of the GUM.
Statisticians have historically placed strong emphasis on using methods for uncertainty evaluation that have
probabilistic justification or interpretation. Through their work, often outside metrology, several different
approaches for statistical inference relevant to uncertainty evaluation have been developed. This Technical
Report presents some of those approaches to uncertainty evaluation from a statistical point of view and
relates them to the methods that are currently being used in metrology or are being developed within the
metrology community. The particular statistical approaches under which different methods for uncertainty
evaluation will be described are the frequentist, Bayesian, and fiducial approaches, which are discussed
further after outlining the notational conventions needed to distinguish different types of quantities.
TECHNICAL REPORT ISO/TR 13587:2012(E)
Three statistical approaches for the assessment and
interpretation of measurement uncertainty
1 Scope
This Technical Report is concerned with three basic statistical approaches for the evaluation and
interpretation of measurement uncertainty: the frequentist approach including bootstrap uncertainty intervals,
the Bayesian approach, and fiducial inference. The common feature of these approaches is a clearly
delineated probabilistic interpretation or justification for the resulting uncertainty intervals. For each approach,
the basic method is described and the fundamental underlying assumptions and the probabilistic interpretation
of the resulting uncertainty are discussed. Each of the approaches is illustrated using two examples, including
an example from ISO/IEC Guide 98-3 (Uncertainty of measurement — Part 3: Guide to the expression of
uncertainty in measurement (GUM:1995)). This document also includes a discussion of the
relationship between the methods proposed in GUM Supplement 1 and these three statistical approaches.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 3534-1:2006, Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms used in
probability
ISO 3534-2:2006, Statistics — Vocabulary and symbols — Part 2: Applied statistics
ISO/IEC Guide 98-3:2008, Uncertainty of measurement — Part 3: Guide to the expression of uncertainty in
measurement (GUM:1995)
ISO/IEC Guide 98-3:2008/Suppl 1:2008, Uncertainty of measurement — Part 3: Guide to the expression of
uncertainty in measurement (GUM:1995) — Supplement 1: Propagation of distributions using a Monte Carlo
method
3 Terms and definitions
For the purposes of this document, the terms and definitions in ISO 3534-1, ISO 3534-2 and the following
apply.
3.1
empirical distribution function
empirical cumulative distribution function
distribution function that assigns probability $1/n$ to each of the items in a random sample, i.e., the empirical
distribution function is a step function defined by

$$F_n(x) = \frac{\left|\{x_i : x_i \le x\}\right|}{n},$$

where $x_1, \ldots, x_n$ is the sample and $|A|$ is the number of elements in the set $A$.
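As an illustration of definition 3.1, the step function can be sketched in a few lines of Python. The function name `empirical_cdf` and the sample values are assumptions made for this example only; they do not come from the Technical Report.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return F_n, the empirical distribution function of `sample`."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must contain at least one value")

    def F_n(x):
        # bisect_right counts the sample values x_i with x_i <= x.
        return bisect_right(ordered, x) / n

    return F_n


# Hypothetical sample of five indications (illustrative values only).
F = empirical_cdf([10.1, 9.8, 10.3, 10.1, 9.9])
print(F(9.7), F(10.1), F(10.3))
```

Each distinct sample value contributes a jump of size 1/n, and a repeated value (10.1 here) contributes a jump of 2/n.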
3.2
Bayesian sensitivity analysis
study of the effect of the choices of prior distributions for the parameters of the statistical model on the
posterior distribution of the measurand
3.3
sufficient statistic
function of a random sample $X_1, \ldots, X_n$ from a probability density function with parameter $\theta$ for which the
conditional distribution of $X_1, \ldots, X_n$ given this function does not depend on $\theta$
NOTE A sufficient statistic contains as much information about $\theta$ as $X_1, \ldots, X_n$.
3.4
observation model
mathematical relation between a set of measurements (indications), the measurand, and the associated
random measurement errors
3.5
structural equation
statistical model relating the observable random variable to the unknown parameters and an unobservable
random variable whose distribution is known and free of unknown parameters
3.6
non-central chi-squared distribution
probability distribution that generalizes the typical (or central) chi-squared distribution
NOTE 1 For $k$ independent, normally distributed random variables $X_i$ with mean $\mu_i$ and variance $\sigma_i^2$, the random
variable $X = \sum_{i=1}^{k} (X_i/\sigma_i)^2$ is non-central chi-squared distributed. The non-central chi-squared distribution has two
parameters: $k$, the degrees of freedom (i.e., the number of $X_i$), and $\lambda$, which is related to the means of the random
variables $X_i$ by $\lambda = \sum_{i=1}^{k} (\mu_i/\sigma_i)^2$ and called the non-centrality parameter.
NOTE 2 The corresponding probability density function is expressed as a mixture of central $\chi^2$ probability density
functions as given by

$$g_X(\xi) = \sum_{i=0}^{\infty} \frac{e^{-\lambda/2} (\lambda/2)^i}{i!}\, g_{Y_{k+2i}}(\xi) = \sum_{i=0}^{\infty} \frac{e^{-(\xi+\lambda)/2}\, \xi^{k/2+i-1}\, \lambda^i}{2^{2i+k/2}\, \Gamma(k/2+i)\, i!},$$

where $Y_q$ is distributed as chi-squared with $q$ degrees of freedom.
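The mixture representation in NOTE 2 can be checked numerically. The following is a minimal sketch using only the Python standard library. The function names, the truncation of the series at 60 terms, the midpoint-rule integration, and the values k = 3 and a non-centrality parameter of 4 are all assumptions chosen for illustration. The check relies on two standard facts: a density integrates to 1, and the mean of a non-central chi-squared variable is k plus the non-centrality parameter.

```python
import math


def chi2_pdf(x, q):
    """Central chi-squared density with q degrees of freedom, for x > 0."""
    half_q = q / 2.0
    log_pdf = ((half_q - 1.0) * math.log(x) - x / 2.0
               - half_q * math.log(2.0) - math.lgamma(half_q))
    return math.exp(log_pdf)


def ncx2_pdf(x, k, lam, terms=60):
    """Non-central chi-squared density as a Poisson-weighted mixture.

    Requires lam > 0; the infinite series is truncated after `terms` terms.
    """
    total = 0.0
    for i in range(terms):
        # Poisson(lam / 2) weight, computed in logs for numerical stability.
        log_w = -lam / 2.0 + i * math.log(lam / 2.0) - math.lgamma(i + 1)
        total += math.exp(log_w) * chi2_pdf(x, k + 2 * i)
    return total


def integrate(func, upper=80.0, steps=4000):
    """Midpoint rule on (0, upper); midpoints avoid evaluating at x = 0."""
    h = upper / steps
    return sum(func((j + 0.5) * h) for j in range(steps)) * h


k, lam = 3, 4.0  # illustrative values only
mass = integrate(lambda x: ncx2_pdf(x, k, lam))
mean = integrate(lambda x: x * ncx2_pdf(x, k, lam))
print(round(mass, 3), round(mean, 3))
```

The integrated mass should be close to 1 and the integrated mean close to k plus the non-centrality parameter, up to truncation and quadrature error.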
4 Symbols (and abbreviated terms)
In 4.1.1 of the GUM, it is stated that Latin letters are used to represent both physical quantities to be
determined by measurement (i.e., measurands in GUM terminology) as well as random variables that may
take different observed values of a physical quantity. This use of the same symbols, whose different meanings
are only indicated by context, can be difficult to interpret and sometimes leads to unnecessary ambiguities or
misunderstandings. To mitigate this potential source of confusion, the more traditional notation often used in
the statistical literature is employed in this Technical Report. In this notation, Greek letters are used to
represent parameters in a statistical model (e.g., measurands), which can be either random variables or
constants depending on the statistical approach being used and nature of the model. Upper-case Latin letters
are used to represent random variables that can take different values of an observable quantity (e.g., potential
measured values), and lower-case Latin letters to represent specific observed values of a quantity (e.g.,
specific measured values). Since additional notation may be required to denote other physical, mathematical,
or statistical concepts, there will still always be some possibility for ambiguity 1). In those cases the context
clarifies the appropriate interpretation.
5 The problem addressed
5.1 The concern in this Technical Report is with a measurement model in which $\mu_1, \ldots, \mu_p$ are input
quantities and $\theta$ is the output quantity:

$$\theta = f(\mu_1, \ldots, \mu_p), \tag{1}$$

where $f$ is known as the measurement function. The function $f$ is specified mathematically or as a
calculation procedure. In the GUM (4.1, NOTE 1), the same functional relationship is given as

$$Y = f(X_1, \ldots, X_p), \tag{2}$$

which cannot be easily distinguished from the measurement function evaluated at the values of the
corresponding random variables for each observed input.
Using the procedure recommended in the GUM, the $p$ unknown quantities $\mu_1, \ldots, \mu_p$ are estimated by
values $x_1, \ldots, x_p$ obtained from physical measurement or from other sources. Their associated standard
uncertainties are also obtained from the relevant data by statistical methods or from probability density
functions based on expert knowledge that characterize the variables. The GUM (also see 4.5 in
Reference [11]) recommends that the same measurement model that relates the measurand $\theta$ to the input
quantities $\mu_1, \ldots, \mu_p$ be used to calculate $y$ from $x_1, \ldots, x_p$. Thus, the measured value (or, in statistical
nomenclature, the estimate) $y$ of $\theta$ is obtained as

$$y = f(x_1, \ldots, x_p), \tag{3}$$

that is, the evaluated $Y$, $y = f(x_1, \ldots, x_p)$, is taken to be the measured value of $\theta$. The estimates $y, x_1, \ldots, x_p$
are realizations of $Y, X_1, \ldots, X_p$, respectively.
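For orientation, the plug-in evaluation in Equation (3) can be sketched together with the GUM's first-order (linear approximation) propagation of standard uncertainties, which the Introduction describes as the GUM's main approach. This is not one of the three statistical approaches treated in this Technical Report. The helper `propagate`, the finite-difference step, the assumption of uncorrelated inputs, and the example measurement function and values are all hypothetical.

```python
import math


def propagate(f, x, u, rel_step=1e-6):
    """Return (y, u_y) for y = f(*x) by first-order propagation.

    Assumes uncorrelated inputs. Sensitivity coefficients are approximated
    by central finite differences rather than analytic derivatives.
    """
    y = f(*x)
    variance = 0.0
    for i, (x_i, u_i) in enumerate(zip(x, u)):
        h = rel_step * max(abs(x_i), 1.0)
        upper = list(x)
        lower = list(x)
        upper[i] = x_i + h
        lower[i] = x_i - h
        c_i = (f(*upper) - f(*lower)) / (2.0 * h)  # sensitivity coefficient
        variance += (c_i * u_i) ** 2
    return y, math.sqrt(variance)


def power(voltage, current):
    """Hypothetical measurement function: electrical power P = V * I."""
    return voltage * current


# Illustrative estimates x_1, x_2 and their standard uncertainties.
y, u_y = propagate(power, x=[10.0, 2.0], u=[0.1, 0.05])
print(y, round(u_y, 4))
```

For this bilinear function the sensitivity coefficients are 2 and 10, so the propagated variance is 0.2 squared plus 0.5 squared, that is 0.29.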
5.2 In this Technical Report, three statistical approaches are each used to provide (a) a best estimate $y$ of
$\theta$, (b) the associated standard uncertainty $u(y)$, and (c) a confidence interval or coverage interval for $\theta$ for a
prescribed coverage probability (often taken as 95 %).
5.3 When discussing standard uncertainties, distinction is made between evaluated standard uncertainties
associated with estimates of various quantities and their corresponding theoretical values. Accordingly,
notation such as $\sigma$ or $\sigma_X$ will denote theoretical standard uncertainties and notation such as $S_X$ and $s_x$ will
denote an evaluated standard uncertainty before and after being observed, respectively.
1) For example, not all quantities represented by Greek letters in a statistical model must be parameters of the model.
One common example of this type of quantity is the set of unobservable quantities that represent the random
measurement errors found in most statistical models (i.e., the $\varepsilon_i$ in the model $Y_i = \mu + \varepsilon_i$).
6 Statistical approaches
6.1 Frequentist approach
6.1.1 The first statistical approach to be considered, in which uncertainty can be evaluated probabilistically,
is frequentist. The frequentist approach is sometimes referred to as “classical” or “conventional”. However,
due to the nature of uncertainty in metrology, these familiar methods must often be adapted to obtain
frequentist uncertainty intervals under realistic conditions.
6.1.2 In the frequentist approach, the input quantities $\mu_1, \ldots, \mu_p$ in the measurement model (1) and the output
quantity $\theta$ are regarded as unknown constants. Then, data related to each input parameter, $\mu_i$, is obtained
and used to estimate the value of $\theta$ based on the measurement model or the corresponding statistical
models. Finally, confidence intervals for $\theta$, for a specified level of confidence, are obtained using one of
several mathematical principles or procedures, for example, least-squares, maximum likelihood, or the
bootstrap.
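Of the procedures named above, the bootstrap is the easiest to sketch in code. The following generic percentile bootstrap is offered only as an illustration and is not necessarily the procedure described in 8.2. The function name, the number of replicates, the seed, and the data values are assumptions.

```python
import random
import statistics


def bootstrap_percentile_interval(data, estimator, level=0.95,
                                  replicates=2000, seed=0):
    """Percentile bootstrap interval for `estimator` applied to `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample n values with replacement and re-evaluate the estimator.
    estimates = sorted(
        estimator([rng.choice(data) for _ in range(n)])
        for _ in range(replicates)
    )
    alpha = 1.0 - level
    lower = estimates[int((alpha / 2.0) * replicates)]
    upper = estimates[int((1.0 - alpha / 2.0) * replicates) - 1]
    return lower, upper


# Hypothetical repeated indications (illustrative values only).
data = [10.12, 9.95, 10.08, 10.21, 9.89, 10.03, 10.17, 9.98]
lower, upper = bootstrap_percentile_interval(data, statistics.fmean)
print(lower, upper)
```

Resampling treats the empirical distribution function of the data as a stand-in for the unknown sampling distribution. With only eight observations such an interval can be noticeably too narrow.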
6.1.3 Because $\theta$ is treated as a constant, a probabilistic statement associated with a confidence interval
for $\theta$ is not a direct probability statement about its value. Instead, it is a probability statement about how
frequently the procedure used to obtain the uncertainty interval for the measurand would encompass the value
of $\theta$ with repeated use. “Repeated use” means that the uncertainty evaluation is replicated many times using
different data drawn from the same distributions. Traditional frequentist uncertainty intervals provide a
probability statement about the long-run properties of the procedure used to construct the interval under the
particular set of conditions assumed to apply to the measurement process.
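The "repeated use" interpretation can be made concrete with a small simulation. This sketch assumes normally distributed indications with a known standard deviation, so that the usual z-interval applies. The true value, the standard deviation, the sample size, and the seed are hypothetical.

```python
import random
import statistics


def coverage_of_z_interval(theta=10.0, sigma=0.5, n=5,
                           repeats=20000, seed=1):
    """Fraction of simulated 95 % intervals that contain the fixed theta."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repeats):
        # New data from the same distribution on every repetition.
        mean = statistics.fmean(rng.gauss(theta, sigma) for _ in range(n))
        if mean - half_width <= theta <= mean + half_width:
            hits += 1
    return hits / repeats


coverage = coverage_of_z_interval()
print(coverage)
```

The value of the measurand never changes in the loop; only the interval does. The long-run fraction of intervals containing it is what the 95 % refers to.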
6.1.4 In most practical metrological settings, on the other hand, uncertainty intervals are to account for the
uncertainty associated with estimates of quantities obtained using measured values (observed data) and also
the uncertainty associated with estimates of quantities based on expert knowledge. To obtain an uncertainty
interval analogous to a confidence interval, the quantities that are not based on measured values are treated
as random variables with probability distributions for their values while those quantities whose values can be
estimated using statistical data are treated as unknown constants.
6.1.5 Traditional frequentist procedures for the construction of confidence intervals are then to be modified
to attain the specified confidence level after averaging over the potential values of the quantities assessed
using expert judgment [5]. Such modified coverage intervals provide long-run probability statements about the
procedure used to obtain the interval given probability distributions for the quantities that have not been
measured, just as traditional confidence intervals do when all parameters are treated as constants.
6.1.6 Table 1 summarizes interpretations of the frequentist, Bayesian and fiducial approaches to uncertainty
evaluation.
Table 1 — Interpretations of the approaches to uncertainty evaluation
| Approach | Characterization of quantities in measurement model $\theta = f(\mu_1, \ldots, \mu_p)$ | Uncertainty interval for output quantity $\theta$ | Note |
|---|---|---|---|
| Frequentist | $\theta$ and the $\mu_i$ all unknown constants | Long-run occurrence frequency that interval contains $\theta$ | Classical frequentist approach extended to integrate over uncertainties that are not statistically evaluated |
| Bayesian | $\theta$ and the $\mu_i$ are random variables. Their probability distributions represent beliefs about the values of the input and output quantities | Coverage interval containing $\theta$ based on a posterior distribution for $\theta$ | Possible non-uniqueness of interval due to the choice of priors |
| Fiducial | $\mu_i$ regarded as random variables whose distributions are obtained from assumptions on observed data used to estimate $\mu_i$ and expert knowledge about $\mu_i$ | Coverage interval containing $\theta$ based on a fiducial distribution for $\theta$ | Non-uniqueness due to the choice of the structural equation |
6.2 Bayesian approach
The second approach is called the Bayesian approach. It is named after the fundamental theorem on which it
is based, which was proved by the Reverend Thomas Bayes in the mid-1700s [12]. In this approach,
knowledge about the quantities in measurement model (1) in Clause 5 is modelled as a set of random
variables that follow a joint probability distribution for $\mu_1, \ldots, \mu_p$ and $\theta$. Bayes’ theorem then allows these
probability distributions to be updated based on the observed data (also modelled using probability
distributions) and the interrelationships of the parameters defined by the function $f$ or equivalent statistical
models. Then, a probability distribution is obtained that describes knowledge of $\theta$ given the observed data.
Uncertainty intervals that contain $\theta$ with any specified probability can then be obtained from this distribution.
Because knowledge of the parameter values is described by probability distributions, Bayesian methods
provide direct probabilistic statements about the value of $\theta$ and the other parameters, using a definition of
probability as a measure of belief.
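The updating step can be illustrated with the simplest textbook case: a normal prior for a single quantity, normally distributed observations with known standard deviation, and the resulting normal posterior. This is a sketch under those stated assumptions, not the analysis used in the examples of Clause 9. The function name, the prior parameters, and the data are hypothetical.

```python
import statistics


def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Posterior for a normal mean with known sigma and a normal prior."""
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted average of the prior mean and the sample mean.
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * statistics.fmean(data))
    return statistics.NormalDist(post_mean, post_var ** 0.5)


# Illustrative indications, measurement sd, and prior (all assumed).
posterior = normal_posterior([10.2, 9.9, 10.4, 10.1], sigma=0.2,
                             prior_mean=10.0, prior_sd=1.0)
lower = posterior.inv_cdf(0.025)
upper = posterior.inv_cdf(0.975)
print(posterior.mean, posterior.stdev, lower, upper)
```

The interval from `lower` to `upper` is a direct probability statement about the quantity given the data and the prior. Repeating the calculation with a different `prior_sd` is a simple form of the Bayesian sensitivity analysis defined in 3.2.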
6.3 Fiducial approach
6.3.1 The fiducial approach was developed by R.A. Fisher [13] in the 1930s. In this approach, a probability
distribution, called the fiducial distribution, for $\theta$ conditional on the data is obtained based on the
interrelationship of $\theta$ and the $\mu_i$ described by $f$ and the distributional assumptions about the data used to
estimate the $\mu_i$. Once obtain
...