Accuracy (trueness and precision) of measurement methods and results — Part 3: Intermediate precision and alternative designs for collaborative studies
This document provides a) a discussion of alternative experimental designs for the determination of trueness and precision measures, including reproducibility, repeatability and selected measures of intermediate precision of a standard measurement method, together with a review of the circumstances in which their use is necessary or beneficial and guidance on the interpretation and application of the resulting estimates, and b) worked examples including specific designs and computations.
Each of the alternative designs discussed in this document is intended to address one (or several) of the following issues: a) discussing the implications of the definitions of intermediate precision measures; b) giving guidance on the interpretation and application of the estimates of intermediate precision measures in practical situations; c) determining reproducibility, repeatability and selected measures of intermediate precision; d) improved determination of reproducibility and other measures of precision; e) improving the estimate of the sample mean; f) determining the range of in-house repeatability standard deviations; g) determining other precision components such as operator variability; h) determining the level of reliability of precision estimates; i) reducing the minimum number of participating laboratories by optimizing the reliability of precision estimates; j) avoiding distorted estimations of repeatability (split-level designs); k) avoiding distorted estimations of reproducibility (taking the heterogeneity of the material into consideration).
Often, the performance of the method whose precision is being evaluated in a collaborative study will have previously been assessed in a single-laboratory validation study conducted by the laboratory which developed it. Relevant factors for the determination of intermediate precision will have been identified in this prior single-laboratory study.
COPYRIGHT PROTECTED DOCUMENT
© ISO 2023
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Published in Switzerland
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols
5 General requirements
6 Intermediate measures of the precision of a standard measurement method
6.1 Factors and factor levels
6.1.1 Definitions and examples
6.1.2 Selection of factors of interest
6.1.3 Random and fixed effects
6.1.4 Statistical model
6.2 Within-laboratory study and analysis of intermediate precision measures
6.2.1 Simplest approach
6.2.2 Alternative method
6.2.3 Effect of the measurement conditions on the final quoted result
7 Nested design
7.1 Balanced fully-nested design
7.2 Staggered-nested design
7.3 Balanced partially-nested design
7.4 Orthogonal array design
8 Design for heterogeneous material
8.1 Applications of the design for a heterogeneous material
8.2 Layout of the design for a heterogeneous material
8.3 Statistical analysis
9 Split-level design
9.1 Applications of the split-level design
9.2 Layout of the split-level design
9.3 Statistical analysis
10 Design across levels
10.1 Applications of the design across levels
10.2 Layout of the design across levels
10.3 Statistical analysis
11 Reliability of interlaboratory parameters
11.1 Reliability of precision estimates
11.2 Reliability of estimates of the overall mean
11.2.1 General
11.2.2 Balanced fully-nested design (2 factors)
11.2.3 Staggered nested design (2 factors)
11.2.4 Balanced partially-nested design
11.2.5 Orthogonal array design
11.2.6 Split-level design
Annex A (informative) Fully- and partially-nested designs
Annex B (informative) Analysis of variance for balanced fully-nested design
Annex C (informative) Analysis of variance for staggered design
Annex D (informative) Analysis of variance for the balanced partially-nested design (three
Annex E (informative) Statistical model for an experiment with heterogeneous material
Annex F (informative) Analysis of variance for split-level design
Annex G (informative) Example for split-level design
Annex H (informative) Design across levels
Annex I (informative) Restricted maximum likelihood (REML)
Annex J (informative) Examples of the statistical analysis of intermediate precision
Annex K (informative) Example for an analysis across levels
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use
of (a) patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed
patent rights in respect thereof. As of the date of publication of this document, ISO had not received
notice of (a) patent(s) which may be required to implement this document. However, implementers are
cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents. ISO shall not be held responsible for identifying any or all
such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
This document was prepared by Technical Committee ISO/TC 69, Applications of statistical methods,
Subcommittee SC 6, Measurement methods and results.
This second edition cancels and replaces the first edition (ISO 5725-3:1994), which has been technically
revised. It also incorporates the Technical Corrigendum ISO 5725-3:1994/Cor.1:2001.
The main changes are as follows:
— Several experimental designs have been added in this edition, some of them taken from ISO 5725-5:
orthogonal array designs, split-level designs, designs for heterogeneous sample material and
designs across levels.
— The standard has also been supplemented by considerations on the selection of factors and the
modelling of factorial effects, and by a section that considers the reliability of the various
interlaboratory test parameters (mean and precision parameters).
A list of all parts in the ISO 5725 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
0.1 ISO 5725 uses two terms, “trueness” and “precision”, to describe the accuracy of a measurement
method. “Trueness” refers to the degree of agreement between the average value of a large number
of test results and the true or accepted reference value. “Precision” refers to the degree of agreement
between test results.
0.2 General consideration of these quantities is given in ISO 5725-1 and is not repeated here. It is
stressed that ISO 5725-1 provides underlying definitions and general principles and should be read in
conjunction with all other parts of ISO 5725.
0.3 Many different factors (apart from test material heterogeneity) may contribute to the variability of
results from a measurement method, including:
a) the laboratory;
b) the operator;
c) the equipment used;
d) the calibration of the equipment;
e) the batch of a reagent;
f) the time elapsed between measurements;
g) environment (temperature, humidity, air pollution, etc.);
h) other factors.
0.4 Two conditions of precision, termed repeatability and reproducibility conditions, have been found
necessary and, for many practical cases, useful for describing the variability of a measurement method.
Under repeatability conditions, none of the factors a) to h) in 0.3 are considered to vary, while under
reproducibility conditions, all of the factors are considered to vary and contribute to the variability of
the test results. Thus, repeatability and reproducibility conditions are the two extremes of precision,
the first describing the minimum and the second the maximum variability in results. Intermediate
conditions between these two extreme conditions of precision are also conceivable, when one or more
of the factors listed in b) to g) are allowed to vary.
To illustrate the need for including a consideration of intermediate conditions in method validation,
consider the operation of a present-day laboratory connected with a production plant involving, for
example, a three-shift working system where measurements are made by different operators on
different equipment. Operators and equipment are then some of the factors that contribute to the
variability in the test results.
The standard deviation of test results obtained under repeatability conditions is generally less than
that obtained under intermediate precision conditions. Generally, in chemical analysis, the standard
deviation under intermediate precision conditions may be two or three times larger than that under
repeatability conditions. It should not, of course, exceed the reproducibility standard deviation.
As an example, in the determination of copper in copper ore, a collaborative study among 35 laboratories
revealed that the standard deviation under intermediate precision conditions (different times) was
1,5 times larger than that under repeatability conditions, both for the electrolytic gravimetry and
Na₂S₂O₃ titration methods.
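The relationship between the repeatability and time-different intermediate precision standard deviations described in 0.4 can be illustrated with a one-way analysis of variance on results from a single laboratory. The following is a minimal sketch, not the procedure of this document: the function name `intermediate_precision`, the balanced days-by-replicates layout and the data are all hypothetical, and "day" is assumed to be the only factor that varies.

```python
from statistics import mean


def intermediate_precision(days):
    """Estimate s_r, s_day and s_I from a balanced layout.

    days: list of lists, one inner list of replicate results per day
    (hypothetical layout; every day must have the same number of replicates).
    """
    p = len(days)          # number of days
    n = len(days[0])       # replicates per day
    if p < 2 or n < 2 or any(len(d) != n for d in days):
        raise ValueError("need a balanced layout with >= 2 days and >= 2 replicates")

    day_means = [mean(d) for d in days]
    grand_mean = mean(day_means)

    # Mean squares of the one-way random-effects ANOVA
    ms_within = sum(
        (y - m) ** 2 for d, m in zip(days, day_means) for y in d
    ) / (p * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (p - 1)

    var_r = ms_within                                  # repeatability variance
    var_day = max(0.0, (ms_between - ms_within) / n)   # between-day variance, floored at 0
    return {
        "s_r": var_r ** 0.5,
        "s_day": var_day ** 0.5,
        "s_I": (var_r + var_day) ** 0.5,               # time-different intermediate precision
    }


# Hypothetical data: three days, two replicates per day
result = intermediate_precision([[10.0, 10.2], [10.6, 10.4], [9.9, 10.1]])
print(result, result["s_I"] / result["s_r"])
```

Negative variance estimates are truncated to zero here for simplicity; other treatments (for example REML) are possible.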
0.5 This document focuses on intermediate precision and alternative designs for collaborative studies
of a measurement method. Apart from the determination of intermediate precision measures, the
aims of these alternative designs include reducing the number of required measurements, increasing
the reliability of the estimates for precision and overall mean and taking into account test material
heterogeneity.
Indeed, a t-factor fully-nested experiment with two levels per factor (inside each laboratory, there are
t−1 factors) and two replicates per setting requires 2 · 2^(t−1) test results from each laboratory, which
can be an excessive requirement on the laboratories. For this reason, in the previous version of
ISO 5725-3, the staggered nested design is also discussed. While the estimation of the precision
parameters is more complex and subject to greater uncertainty in a staggered nested design, the
workload is reduced. This document offers alternative strategies to reduce the workload without
compromising the reliability of the precision estimates.
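To make the workload argument concrete, the short sketch below counts the test results each laboratory must deliver in a fully-nested design. It assumes the requirement reads as replicates × levels^(t−1), i.e. 2 · 2^(t−1) for two levels and two replicates; the function name and parameters are hypothetical.

```python
def results_per_lab_fully_nested(t, levels=2, replicates=2):
    """Test results required from each laboratory in a t-factor fully-nested design.

    The laboratory itself counts as one factor, so t - 1 factors are nested
    inside each laboratory (assumed reading of the text above).
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    return replicates * levels ** (t - 1)


# Workload doubles with every additional factor
for t in range(2, 6):
    print(t, results_per_lab_fully_nested(t))
```

With laboratory, operator and calibration as factors (t = 3), this gives 8 results per laboratory, and each further factor doubles the count.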
As far as the special designs for sample heterogeneity are concerned, they were discussed in the
previous version of ISO 5725-5. However, it is convenient to have one part of this standard dedicated to
the question of the design of experiments.
0.6 The repeatability precision as determined in accordance with ISO 5725-2 is computed as a mean
across participating laboratories. Whether it can be used for quality control purposes depends on
whether the repeatability standard deviation can be considered to remain constant across laboratories.
For this reason, it is important to obtain information on how the repeatability standard deviation varies
within and between the laboratories under different conditions.
0.7 In many collaborative studies, the between-laboratory variability is large in comparison to the
repeatability, and it would be useful to a) decompose it into several different precision components, b)
reduce, if possible, some sources of variability which are due to the intermediate precision conditions.
This can be done by identifying factors (e.g. time, calibration, operator or equipment) which contribute
to the variability under intermediate precision conditions of measurement, by quantifying the
corresponding variability components and, wherever achievable, decreasing their contribution. In this
manner, the intermediate precision component of the overall variance is enlarged while the between-
laboratory component of the overall variance is reduced. Only random effects are considered: it is only
reasonable to model a factor as a fixed effect after a method or calibration optimization study has been
conducted. In this standard, different relationships between factors are taken into account, e.g. whether
a particular factor is subsumed under another factor or not.
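The decomposition described in 0.7 can be sketched for the simplest case of one intermediate factor (here, operator) nested within laboratory. The code below is an illustrative sketch using the classical ANOVA estimators for a balanced nested random-effects layout; it is not taken from the annexes of this document, and the function name, data layout and data are hypothetical.

```python
from statistics import mean


def nested_variance_components(data):
    """Variance components for a balanced laboratory / operator / replicate layout.

    data[i][j][k]: replicate k of operator j in laboratory i (hypothetical layout).
    """
    p, q, n = len(data), len(data[0]), len(data[0][0])
    cell = [[mean(reps) for reps in lab] for lab in data]    # operator means
    lab_mean = [mean(ops) for ops in cell]                   # laboratory means
    grand = mean(lab_mean)

    ms_e = sum(
        (y - cell[i][j]) ** 2
        for i in range(p) for j in range(q) for y in data[i][j]
    ) / (p * q * (n - 1))
    ms_op = n * sum(
        (cell[i][j] - lab_mean[i]) ** 2 for i in range(p) for j in range(q)
    ) / (p * (q - 1))
    ms_lab = q * n * sum((m - grand) ** 2 for m in lab_mean) / (p - 1)

    var_r = ms_e                                      # repeatability
    var_op = max(0.0, (ms_op - ms_e) / n)             # operator (intermediate) component
    var_lab = max(0.0, (ms_lab - ms_op) / (q * n))    # remaining between-laboratory component
    return {
        "var_r": var_r,
        "var_op": var_op,
        "var_lab": var_lab,
        "var_I": var_r + var_op,                      # operator-different intermediate precision
        "var_R": var_r + var_op + var_lab,            # reproducibility
    }


# Hypothetical data: 3 laboratories x 2 operators x 2 replicates
data = [
    [[10.0, 10.2], [10.4, 10.6]],
    [[11.0, 11.2], [10.6, 10.8]],
    [[9.8, 10.0], [10.2, 10.4]],
]
print(nested_variance_components(data))
```

Without the operator factor in the design, the operator component would be absorbed into the between-laboratory component; including it moves that share of the variance into the intermediate precision term.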
0.8 Estimates for precision and overall mean are subject to random variability. Accordingly, it
is important to determine the uncertainty associated with each estimate, and to understand the
relationships between this uncertainty, the number of participants and the design. Once these
relationships are understood, it becomes possible to make much more informed decisions concerning
the number of participants and the experimental design.
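As a rough illustration of how the number of participants drives the uncertainty of a precision estimate, the sketch below uses the common large-sample approximation that the relative standard error of a standard deviation estimated with ν degrees of freedom is about 1/√(2ν). This assumes normally distributed results and applies directly only to the repeatability standard deviation in a balanced design with p laboratories and n replicates (ν = p(n − 1)); the function names are hypothetical and the approximation is not taken from this document.

```python
import math


def repeatability_dof(p, n):
    """Degrees of freedom of the pooled repeatability variance (p labs, n replicates each)."""
    if p < 1 or n < 2:
        raise ValueError("need at least one laboratory and two replicates")
    return p * (n - 1)


def relative_se_of_sd(dof):
    """Approximate relative standard error of a standard deviation estimate."""
    if dof < 1:
        raise ValueError("degrees of freedom must be positive")
    return 1.0 / math.sqrt(2.0 * dof)


# Quadrupling the number of laboratories halves the relative uncertainty
for p in (8, 32):
    print(p, relative_se_of_sd(repeatability_dof(p, 2)))
```

Between-laboratory and reproducibility components do not have such a simple degrees-of-freedom count, which is one reason the choice of design matters.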
0.9 Provided different factorial effects do contribute to the variability, determining the respective
precision components may make it possible to reduce the required number of participating laboratories,
since the between-laboratory variability can be expected to be less dominant. However, it is highly
recommended to have a reasonable number of participating laboratories in order to ensure a realistic
assessment of the overall method variability obtained under routine conditions of operation.
0.10 In the uniform-level design according to part 2 of this standard, there is a risk that an operator will
allow the result of a measurement on one sample to influence the result of a subsequent measurement
on another sample of the same material, causing the estimates of the repeatability and reproducibility
standard deviations to be biased. When this risk is considered to be serious, the split-level design
described in this document may be preferred as it reduces this risk. Care should be taken that the
two materials used at a particular level of the experiment are sufficiently similar to ensure that the
same precision measures can be expected (in other words: the question arises whether the precision
component associated with a particular factor remains unchanged across a range of similar matrices).
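One common way to analyse a split-level experiment is sketched below. It assumes each laboratory reports one result on each of two similar materials a and b, and that a result can be modelled as level mean plus laboratory effect plus repeatability error. Under that assumption the within-laboratory differences have variance 2σ_r² and the within-laboratory averages have variance σ_L² + σ_r²/2. The function name and data are hypothetical, and this is not presented as the computation given in the annexes of this document.

```python
from statistics import variance


def split_level_precision(pairs):
    """Repeatability and reproducibility variances from a split-level layout.

    pairs: one (y_a, y_b) tuple per laboratory (hypothetical layout).
    """
    if len(pairs) < 2:
        raise ValueError("need results from at least two laboratories")
    diffs = [a - b for a, b in pairs]        # a constant offset between a and b drops out of the variance
    avgs = [(a + b) / 2 for a, b in pairs]

    var_r = variance(diffs) / 2                       # Var(difference) = 2 * sigma_r^2
    var_lab = max(0.0, variance(avgs) - var_r / 2)    # Var(average) = sigma_L^2 + sigma_r^2 / 2
    return {"var_r": var_r, "var_L": var_lab, "var_R": var_lab + var_r}


# Hypothetical data from four laboratories
print(split_level_precision([(10.0, 9.6), (10.6, 10.0), (9.8, 9.6), (10.4, 10.0)]))
```

Because the repeatability estimate comes from differences between two distinct materials, an operator cannot unconsciously match a second result to the first, which is the risk the split-level design is meant to reduce.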
0.11 The experimental design presented in ISO 5725-2 requires the preparation of a number of
identical samples of the material for use in the experiment. With heterogeneous materials this may not
be possible, so that the use of the basic method then gives estimates of the reproducibility standard
deviation that are inflated by the variation between the samples. The design for a heterogeneous
material given in this document yields information about the variability between samples which is not
obtainable from the basic method; it may be used to calculate an estimate of reproducibility from which
the between-sample variation has been removed.
INTERNATIONAL STANDARD ISO 5725-3:2023(E)
Accuracy (trueness and precision) of measurement
methods and results —
Part 3: Intermediate precision and alternative designs for
collaborative studies
1 Scope
This document provides
a) a discussion of alternative experimental designs for the determination of trueness and precision
measures including reproducibility, repeatability and selected measures of intermediate precision
of a standard measurement method, including a review of the circumstances in which their use
is necessary or beneficial, and guidance as to the interpretation and application of the resulting
estimates, and
b) worked examples including specific designs and computations.
Each of the alternative designs discussed in this document is intended to address one (or several) of the
following issues:
a) discussing the implications of the definitions of intermediate precision measures;
b) giving guidance on the interpretation and application of the estimates of intermediate precision
measures in practical situations;
c) determining reproducibility, repeatability and selected measures of intermediate precision;
d) improved determination of reproducibility and other measures of precision;
e) improving the estimate of the sample mean;
f) determining the range of in-house repeatability standard deviations;
g) determining other precision components such as operator variability;
h) determining the level of reliability of precision estimates;
i) reducing the minimum number of participating laboratories by optimizing the reliability of
precision estimates;
j) avoiding distorted estimations of repeatability (split-level designs);
k) avoiding distorted estimations of reproducibility (taking the heterogeneity of the material into
consideration).
Often, the performance of the method whose precision is being evaluated in a collaborative study will
have previously been assessed in a single-laboratory validation study conducted by the laboratory
which developed it. Relevant factors for the determination of intermediate precision will have been
identified in this prior single-laboratory study.
1) Allowing a reduction in the number of laboratories.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 3534-1, Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms used in
ISO 3534-2, Statistics — Vocabulary and symbols — Part 2: Applied statistics
ISO 5725-1, Accuracy (trueness and precision) of measurement methods and results — Part 1: General
principles and definitions
ISO Guide 33, Reference materials — Good practice in using reference materials
ISO Guide 35, Reference materials — Guidance for characterization and assessment of homogeneity and
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 3534-1, ISO 3534-2 and
ISO 5725-1 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
block
group of settings (3.7) conducted in parallel or within a short time interval, and with the same samples
EXAMPLE Two settings:
Operator 1 + Calibration 1 + Equipment 1 + Batch 1
Operator 1 + Calibration 2 + Equipment 2 + Batch 1
Note 1 to entry: This definition is more specific than the general definition given in ISO 3534-3:2013, 3.1.25,
where block is defined as a collection of experimental units.
3.2
factor
feature under examination as a potential source of variation
EXAMPLE Operator, calibration, equipment, day, reagent batch, storage temperature, shaker orbit, shaker
Note 1 to entry: Strictly speaking, the factor laboratory is a factor just like any other. However, since the ISO 5725
standard focuses on method validation by means of interlaboratory studies, the factor laboratory can be
considered to have a somewhat privileged role. The following characteristics distinguish it from other factors:
— The factor laboratory is indispensable: For each measurement, the name of the particular laboratory where
it was performed will always be provided in a collaborative study.
— The factor laboratory will almost always have more levels than other factors.
It should also be noted that categories such as measurand, sample/matrix and level may also be
considered to be factors. However, in collaborative studies, they are often not taken into account
as such in the factorial design. The reason is that, for these factors, one is interested in a separate
statistical analysis for each separate factor level. In other words, one is interested in obtaining separate
precision measures for each particular measurand or concentration level, not across measurands or
concentration levels. However, in cases where it is required to quantify precision across, say, matrices,
then the factor sample/matrix should also be included in the design. Accordingly, in this document,
designs are discussed to be applied for a particular measurand or concentration level by different
laboratories all applying the same measurement procedure.
[SOURCE: ISO 3534-3:2013, 3.1.5, modified — Note 1 to entry was modified and Note 2 to entry was
3.3
level
setting (3.7), value or assignment of a factor (3.2)
EXAMPLE Operator 1, Operator 2
Note 1 to entry: In many designs, the majority of factors will be varied across two levels.
3.4
fully-nested design
nested design, where there is a nesting hierarchy for every pair of factors (3.2)
EXAMPLE There are 2 operators in each laboratory, and each operator performs 2 calibrations, i.e., the
study includes 2 operators and 4 calibrations for each laboratory.
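The counting in the EXAMPLE above can be checked by enumerating the settings within one laboratory. This is a small illustrative sketch; the function name and the labelling scheme are hypothetical. Calibrations are numbered consecutively so that each one belongs to exactly one operator, which is what distinguishes a nested layout from a crossed one.

```python
def nested_settings(n_operators=2, calibrations_per_operator=2):
    """List the (operator, calibration) settings for one laboratory in a nested layout."""
    settings = []
    calibration = 0
    for operator in range(1, n_operators + 1):
        for _ in range(calibrations_per_operator):
            calibration += 1  # a new calibration, performed only by this operator
            settings.append((f"Operator {operator}", f"Calibration {calibration}"))
    return settings


for setting in nested_settings():
    print(setting)
```

In a crossed layout, by contrast, the same calibration labels would be reused under every operator.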
3.5
partially-nested design
nested design where one factor (3.2) (the factor laboratory) is ranked higher than all other factors (i.e.,
all other factors are nested within the factor laboratory