SIST EN ISO 4259-5:2024
Petroleum and related products - Precision of measurement methods and results - Part 5: Statistical assessment of agreement between two different measurement methods that claim to measure the same property (ISO 4259-5:2023)
This document specifies statistical methodology for assessing the expected agreement between two test methods that purport to measure the same property of a material, and for deciding if a simple linear bias correction can further improve the expected agreement.
This document is applicable for analytical methods which measure quantitative properties of petroleum or petroleum products resulting from a multi-sample-multi-lab study (MSMLS). These types of studies include but are not limited to interlaboratory studies (ILS) meeting the requirements of ISO 4259-1 or equivalent, and proficiency testing programmes (PTP) meeting the requirements of ISO 4259-3 or equivalent.
The methodology specified in this document establishes the limiting value for the difference between two results where each result is obtained by a different operator using different apparatus and two methods X and Y, respectively, on identical material. One of the methods (X or Y) has been appropriately bias-corrected to agree with the other in accordance with this practice. This limit is designated as the between-methods reproducibility. This value is expected to be exceeded with a probability of 5 % under the correct and normal operation of both test methods due to random variation.
NOTE Further conditions for application of this methodology are given in 5.1 and 5.2.
Mineralölerzeugnisse - Präzision von Messverfahren und Ergebnissen - Teil 5: Statistische Bewertung der Übereinstimmung zweier verschiedener Messverfahren die vorgeben, dieselbe Eigenschaft zu messen (ISO 4259-5:2023)
Produits pétroliers et connexes - Fidélité des méthodes de mesure et de leurs résultats - Partie 5: Évaluation statistique de l'accord entre deux méthodes de mesure différentes qui prétendent mesurer la même propriété (ISO 4259-5:2023)
Nafta in sorodni proizvodi - Natančnost merilnih metod in rezultatov - 5. del: Statistična ocena skladnosti med dvema različnima merilnima metodama, za kateri velja trditev, da merita isto lastnost (ISO 4259-5:2023)
SLOVENIAN STANDARD
01 March 2024
This Slovenian standard is identical to: EN ISO 4259-5:2024
ICS:
75.080 Petroleum products in general
75.180.20 Processing equipment
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
EN ISO 4259-5
EUROPEAN STANDARD
NORME EUROPÉENNE
EUROPÄISCHE NORM
January 2024
ICS 75.080
English Version
Petroleum and related products - Precision of measurement methods and results - Part 5: Statistical assessment of agreement between two different measurement methods that claim to measure the same property (ISO 4259-5:2023)
Produits pétroliers et connexes - Fidélité des méthodes de mesure et de leurs résultats - Partie 5: Évaluation statistique de l'accord entre deux méthodes de mesure différentes qui prétendent mesurer la même propriété (ISO 4259-5:2023)
Mineralölerzeugnisse - Präzision von Messverfahren und Ergebnissen - Teil 5: Statistische Bewertung der Übereinstimmung zweier verschiedener Messverfahren die vorgeben, dieselbe Eigenschaft zu messen (ISO 4259-5:2023)
This European Standard was approved by CEN on 10 December 2023.
CEN members are bound to comply with the CEN/CENELEC Internal Regulations which stipulate the conditions for giving this
European Standard the status of a national standard without any alteration. Up-to-date lists and bibliographical references
concerning such national standards may be obtained on application to the CEN-CENELEC Management Centre or to any CEN
member.
This European Standard exists in three official versions (English, French, German). A version in any other language made by
translation under the responsibility of a CEN member into its own language and notified to the CEN-CENELEC Management
Centre has the same status as the official versions.
CEN members are the national standards bodies of Austria, Belgium, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia,
Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway,
Poland, Portugal, Republic of North Macedonia, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Türkiye and
United Kingdom.
EUROPEAN COMMITTEE FOR STANDARDIZATION
COMITÉ EUROPÉEN DE NORMALISATION
EUROPÄISCHES KOMITEE FÜR NORMUNG
CEN-CENELEC Management Centre: Rue de la Science 23, B-1040 Brussels
© 2024 CEN All rights of exploitation in any form and by any means reserved worldwide for CEN national Members.
Ref. No. EN ISO 4259-5:2024 E
European foreword
This document (EN ISO 4259-5:2024) has been prepared by Technical Committee ISO/TC 28
"Petroleum and related products, fuels and lubricants from natural or synthetic sources" in
collaboration with Technical Committee CEN/TC 19 "Gaseous and liquid fuels, lubricants and related
products of petroleum, synthetic and biological origin", the secretariat of which is held by NEN.
This European Standard shall be given the status of a national standard, either by publication of an
identical text or by endorsement, at the latest by July 2024, and conflicting national standards shall be
withdrawn at the latest by July 2024.
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. CEN shall not be held responsible for identifying any or all such patent rights.
Any feedback and questions on this document should be directed to the users’ national standards
body/national committee. A complete listing of these bodies can be found on the CEN website.
According to the CEN-CENELEC Internal Regulations, the national standards organizations of the
following countries are bound to implement this European Standard: Austria, Belgium, Bulgaria,
Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland,
Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Republic of
North Macedonia, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Türkiye and the
United Kingdom.
Endorsement notice
The text of ISO 4259-5:2023 has been approved by CEN as EN ISO 4259-5:2024 without any
modification.
INTERNATIONAL STANDARD ISO 4259-5
First edition
2023-12
Petroleum and related products - Precision of measurement methods and results - Part 5: Statistical assessment of agreement between two different measurement methods that claim to measure the same property
Produits pétroliers et connexes - Fidélité des méthodes de mesure et de leurs résultats - Partie 5: Évaluation statistique de l'accord entre deux méthodes de mesure différentes qui prétendent mesurer la même propriété
Reference number ISO 4259-5:2023(E)
© ISO 2023
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols
5 Procedure overview
5.1 General requirements
5.2 Additional requirements for PTP data
5.2.1 General conditions
5.2.2 Test on existence of extreme samples
5.2.3 Test on distribution of lab results
5.2.4 Comparison of precision
5.3 Brief sequential steps of the procedure
5.4 Flow diagram of the procedure
6 Procedure
6.1 Sample mean and standard error
6.1.1 General
6.1.2 Computation of the means
6.1.3 Calculation of standard errors
6.2 Suitability of the data
6.2.1 Test on property variation
6.2.2 Correlation of the test methods
6.3 Bias correction selection statistics
6.3.1 General
6.3.2 Class 0 - No bias correction
6.3.3 Class 1a - Constant bias correction
6.3.4 Class 1b - Proportional bias correction
6.3.5 Class 2 - Proportional and constant bias correction
6.4 Selection of the appropriate bias correction class
6.5 Confirming the normal distribution of weighted residuals
6.6 Sample-specific biases
7 Report
8 Confirmation of the correlation
Annex A (informative) Worked example using ILS data
Annex B (informative) Worked example using PTP data
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use
of (a) patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed
patent rights in respect thereof. As of the date of publication of this document, ISO had not received
notice of (a) patent(s) which may be required to implement this document. However, implementers are
cautioned that this may not represent the latest information, which may be obtained from the patent
database available at www.iso.org/patents. ISO shall not be held responsible for identifying any or all
such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 28, Petroleum and related products, fuels
and lubricants from natural or synthetic sources, in collaboration with the European Committee for
Standardization (CEN) Technical Committee CEN/TC 19, Gaseous and liquid fuels, lubricants and related
products of petroleum, synthetic and biological origin, in accordance with the Agreement on technical
cooperation between ISO and CEN (Vienna Agreement).
A list of all parts in the ISO 4259 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
This document explains the statistical methodology for assessing the expected agreement between
two standardized test methods that purport to measure the same property of a material. Subsequently,
it is investigated whether a linear bias correction can significantly improve the expected agreement.
The degree of agreement is expressed as a between-methods reproducibility after a bias correction (if
necessary) has been applied.
The method uses numerical results from a set of samples that have been analysed independently using
both test methods by different laboratories. The variation associated with each test method result is
used for assessing the required bias correction.
Annexes A and B give worked out examples showing how the methodology is applied.
INTERNATIONAL STANDARD ISO 4259-5:2023(E)
Petroleum and related products — Precision of
measurement methods and results —
Part 5:
Statistical assessment of agreement between two different
measurement methods that claim to measure the same
property
1 Scope
This document specifies statistical methodology for assessing the expected agreement between two
test methods that purport to measure the same property of a material, and for deciding if a simple
linear bias correction can further improve the expected agreement.
This document is applicable for analytical methods which measure quantitative properties of petroleum
or petroleum products resulting from a multi-sample-multi-lab study (MSMLS). These types of studies
include but are not limited to interlaboratory studies (ILS) meeting the requirements of ISO 4259-1
or equivalent, and proficiency testing programmes (PTP) meeting the requirements of ISO 4259-3 or
equivalent.
The methodology specified in this document establishes the limiting value for the difference between
two results where each result is obtained by a different operator using different apparatus and two
methods X and Y, respectively, on identical material. One of the methods (X or Y) has been appropriately
bias-corrected to agree with the other in accordance with this practice. This limit is designated as the
between-methods reproducibility. This value is expected to be exceeded with a probability of 5 % under
the correct and normal operation of both test methods due to random variation.
NOTE Further conditions for application of this methodology are given in 5.1 and 5.2.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 4259-1, Petroleum and related products — Precision of measurement methods and results — Part 1:
Determination of precision data in relation to methods of test
ISO 4259-3, Petroleum and related products — Precision of measurement methods and results — Part 3:
Monitoring and verification of published precision data in relation to methods of test
ISO 4259-4, Petroleum and related products — Precision of measurement methods and results — Part 4:
Use of statistical control charts to validate 'in-statistical-control' status for the execution of a standard test
method in a single laboratory
3 Terms and definitions
For the purposes of this document, the terms and definitions in ISO 4259-1 and the following terms and
definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
- ISO Online browsing platform: available at https://www.iso.org/obp
- IEC Electropedia: available at https://www.electropedia.org/
3.1
multi-sample-multi-lab study
MSMLS
study in which one or more performance characteristics are determined on the basis of analytical
results from multiple samples and multiple laboratories
Note 1 to entry: Under certain conditions, interlaboratory studies and proficiency testing schemes meet this
definition of multi-sample-multi-lab study.
3.2
interlaboratory study
ILS
study specifically designed to estimate the repeatability and reproducibility of a standard test method
achieved at a fixed point in time by multiple laboratories through the statistical analysis of their test
results obtained on aliquots prepared from multiple materials
3.3
proficiency testing programme
PTP
programme designed for the periodic evaluation of the testing capability of laboratories participating
in a standard test method, through the statistical analysis of their test results obtained on aliquots
prepared from a single batch of homogeneous material
Note 1 to entry: PTP is sometimes referred to as a proficiency testing (PT)-study or an interlaboratory cross
check programme (ILCP).
3.4
between-methods bias correction
quantitative expression of the mathematical correction that, when applied to the outcome of either one
of two methods claiming to measure the same property, can result in a statistically significant
improvement in the agreement between the expected values of the two test methods
3.5
correlation coefficient
ρ
statistical measure of the strength and direction of the relationship between two variables
Note 1 to entry: Values always range between −1 (strong negative relationship) and +1 (strong positive
relationship). Values at or close to zero imply a weak or nonlinear relationship.
3.6
standard error
$\Delta_E$
statistic estimating the standard deviation of the distribution of the average statistic obtained from the
repeat random sampling of a population
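As an illustration of definition 3.6, the following is a minimal sketch of the textbook standard error of a mean, computed as the sample standard deviation divided by the square root of the number of results. It is not the calculation specified in 6.1.3 of this document, which is not reproduced here; the function name and the six laboratory results are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(results):
    """Return s / sqrt(n), the textbook standard error of the mean of `results`."""
    n = len(results)
    if n < 2:
        raise ValueError("at least two results are needed")
    s = statistics.stdev(results)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical results reported by six laboratories on one sample
lab_results = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
se = standard_error_of_mean(lab_results)
print(f"standard error of the mean: {se:.4f}")
```

The standard error shrinks as more results contribute to the mean, which is why the requirements in Clause 5 set minimum numbers of laboratories and degrees of freedom.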
3.7
sample standard deviation
$s_i$
estimator of the population standard deviation using the sample mean and sample size
Note 1 to entry: Sample standard deviation is also referred to as standard deviation of the sample.
3.8
between-methods reproducibility
$R_{XY}$
quantitative expression for the computation of the limiting value that the difference between two single
results is expected to exceed with a probability of 5 % due to random variation, under the correct and
normal operation of both test methods, where each result is obtained by different operators on an
identical test sample using different apparatus and applying the two methods X and Y, respectively;
when the methods have been assessed and an appropriate between-methods bias correction has been
applied to the result from either method (X or Y) in accordance with this practice
3.9
sum of squared residuals
$\Sigma_{SR}$
statistic used to quantify the degree of agreement between the results from two test methods after
between-methods bias-correction (3.4) using the methodology of this practice
Note 1 to entry: $\Sigma_{SR}$ is used as an optimality criterion in parameter selection and bias-correction model selection.
3.10
total sum of squares
$\Sigma_{ST}$
statistic used to quantify the information content from the interlaboratory study (3.2) in terms of total
variation of sample means relative to the standard error (3.6) of each sample mean
3.11
resolution
smallest difference in two results that is represented by a different value
4 Symbols
| Symbol | Explanation |
|---|---|
| $X, Y$ | reference to the X- and Y-methods, respectively |
| $X_{ijk}, Y_{ijk}$ | single k-th result on the i-th common material by the j-th lab using X-method and Y-method, respectively |
| $X_i, Y_i$ | arithmetic mean of the i-th sample using X-method and Y-method, respectively |
| $\bar{X}, \bar{Y}$ | weighted average across the samples used in the calculation of total sum of squares $\Sigma_{ST,\bar{X}}$ and $\Sigma_{ST,\bar{Y}}$ for the X-method and Y-method, respectively |
| $X, Y$ | weighted average across the samples used in the calculation of the correlation coefficient ρ for the X-method and Y-method |
| $\Delta_{xi}, \Delta_{yi}$ | absolute deviation of the weighted means of the i-th sample results from $\bar{X}$ and $\bar{Y}$, respectively |
| $\hat{Y}$ | predicted Y-method value for a sample by applying the bias correction established from this practice to an actual X-method result for the same sample |
| $\hat{Y}_i$ | predicted i-th sample Y-method mean, by applying the bias correction established from this practice to its corresponding X-method mean |
| $S$ | number of samples in the multi-lab-multi-sample data set |
| $L_{Xi}, L_{Yi}$ | number of laboratories that returned results on the i-th sample using the X-method and Y-method, respectively |
| $n_{Xij}, n_{Yij}$ | number of repeated results on the i-th sample of the j-th lab using the X- and Y-methods, respectively |
| $R_X, R_Y$ | reproducibility of the X- and Y-methods, respectively |
| $R_{Xi}, R_{Yi}$ | reproducibility of the X- and Y-methods, evaluated at the method X and Y means of the i-th sample |
| $R_{XY}$ | between-methods reproducibility |
| $s_{R,Xi}, s_{R,Yi}$ | reproducibility standard deviation, evaluated at the i-th sample using method X and Y, respectively |
| $s_{r,Xi}, s_{r,Yi}$ | repeatability standard deviation, evaluated at the i-th sample using method X and Y, respectively |
| $\varepsilon_i$ | weighted residual of Y-method mean values predicted from the corresponding X-method mean values, $\hat{Y}_i$, and mean of Y-method results, $Y_i$, on the i-th sample |
| $\Delta_{E,Xi}, \Delta_{E,Yi}$ | standard error of the means of the i-th sample |
| $\Sigma_{SR,p}$ | weighted sum of squared residuals of the mean results of Y-method and the bias-corrected mean results of the X-method for a given model p, where p = 0, 1a, 1b or 2, over all samples i |
| $\Sigma_{ST,\bar{X}}, \Sigma_{ST,\bar{Y}}$ | total sum of squares, around the weighted averages $\bar{X}$ and $\bar{Y}$ over all samples i |
| $F$ | test statistic for comparing variances, defined by the quotient of two variances |
| $t$ | Student t-value at a specified confidence level and specified degrees of freedom |
| $k$ | class number of selected bias correction class |
| $\nu_X, \nu_Y$ | degrees of freedom for reproducibility variances |
| $w_i$ | weight associated with the difference between (corrected) mean results from the i-th sample |
| $a, b$ | parameters of the bias correction: $\hat{Y} = a + bX$ |
| $h_i$ | leverage of sample i in the set of samples |
| $Z_i$ | natural logarithm of the sample mean, averaged over both methods for sample i |
| $\bar{Z}$ | overall average of natural logarithm $Z_i$ of all samples |
| $t_1, t_2$ | ratio for assessing reductions in sums of squares |
| $\varepsilon_i$ | standardized difference between $Y_i$ and $\hat{Y}_i$, sometimes referred to as error |
| $A, B, C$ | parameters of the quadratic function used for the iterative calculation of the proportional coefficient b for class 1b and class 2 correction class |
| $D$ | difference statistic for confirmation of the correlation |
| $A_i^2, A_i^{2*}$ | Anderson-Darling test statistic and modified test statistic, respectively |
| $\rho$ | correlation coefficient |
5 Procedure overview
5.1 General requirements
The procedures are intended to be executed by an analyst with sufficient working knowledge of the
statistical tools and theories described in the document.
The statistical methodology is based on the premise that a bias correction is not required. In the
absence of statistical evidence that a bias correction would improve the expected agreement between
the two methods, a bias correction is not made.
If a bias correction is required, then the parsimony principle is followed whereby a simple correction
is favoured over a more complex one if the latter does not yield a statistically observable improvement
over the former. Failure to adhere to this generally results in a model that is over-fitted and does not
perform well in practice.
NOTE 1 The parsimony principle is that the most acceptable explanation of an occurrence, phenomenon, or
event is the simplest one, involving the fewest entities and assumptions.
The bias corrections of this practice are limited to a constant correction, proportional correction or a
linear (proportional + constant) correction.
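The three permitted corrections, together with the no-correction case, are all special cases of the relation Ŷ = a + bX given in Clause 4. The following minimal sketch shows how a predicted Y-method value could be obtained for each class. The class labels follow the headings of 6.3.2 to 6.3.5; the function name and the numerical values of a and b are hypothetical and are not estimates produced by this practice.

```python
def apply_bias_correction(x, bias_class, a=0.0, b=1.0):
    """Predict a Y-method value from an X-method result x using Y_hat = a + b * x."""
    if bias_class == "0":    # Class 0: no bias correction
        return x
    if bias_class == "1a":   # Class 1a: constant correction only
        return x + a
    if bias_class == "1b":   # Class 1b: proportional correction only
        return b * x
    if bias_class == "2":    # Class 2: proportional and constant correction
        return a + b * x
    raise ValueError(f"unknown bias correction class: {bias_class!r}")


# Hypothetical X-method result and hypothetical correction parameters
x_result = 10.0
for cls in ("0", "1a", "1b", "2"):
    y_hat = apply_bias_correction(x_result, cls, a=0.5, b=1.1)
    print(f"class {cls}: predicted Y = {y_hat:.2f}")
```

Under the parsimony principle described above, a higher class would only be selected when its extra parameter yields a statistically observable improvement over the simpler class.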
The bias-correction methods of this practice are method symmetric, in the sense that equivalent
corrections are obtained regardless of which method is bias-corrected to match the other.
The methodology described in this document is applicable only if the standard error associated with
each mean test result is known or can be calculated and the degrees of freedom associated with all
standard errors are at least 30.
This methodology is applied to a data source derived from a MSMLS. The study shall be conducted on at
least 10 independent materials that span the intersecting scopes of the test methods. The results shall
be obtained from at least six (6) laboratories using each method.
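The numerical minimums stated so far (at least 10 materials, at least six laboratories per method, and at least 30 degrees of freedom for every standard error) lend themselves to a simple pre-check before any statistics are computed. The following is a minimal sketch of such a check; the function name, argument names and example counts are hypothetical, and the check covers only these three minimums, not the other requirements of this clause.

```python
def check_msmls_minimums(n_materials, n_labs_x, n_labs_y, degrees_of_freedom):
    """Return a list of unmet minimum requirements (empty if all are met)."""
    problems = []
    if n_materials < 10:
        problems.append("fewer than 10 independent materials")
    if n_labs_x < 6 or n_labs_y < 6:
        problems.append("fewer than 6 laboratories for at least one method")
    if any(dof < 30 for dof in degrees_of_freedom):
        problems.append("a standard error has fewer than 30 degrees of freedom")
    return problems


# Hypothetical study: 12 materials, 8 and 7 laboratories, two degrees-of-freedom values
issues = check_msmls_minimums(12, 8, 7, [35, 40])
print("requirements met" if not issues else issues)
```

Returning a list of unmet requirements rather than a single true/false value makes it easier to report why a data set was rejected.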
The results are obtained on the same comparison set of samples. It is recommended that the two test
methods are not performed by the same laboratory. If the same laboratory does perform both methods,
care shall be taken to ensure independence of test results, for example by double-blind testing of
samples in random order.
This methodology shall not be used on the basis of interim or temporary published precision
statements. Such statements are generally based on too little data and, as a result, insufficient
degrees of freedom are available.
Combining multiple data sources is permissible provided the quality requirements for the data set as
specified in this document are met.
The test methods used by each laboratory shall be under statistical control, meeting the requirements
in ISO 4259-4.
This methodology requires data with sufficient resolution to permit variation to be observable in a
statistically meaningful manner. Statistically meaningful variation implies that the total number
of unique values in a set of data, i.e. the lab results of each sample for each test method, should be
sufficiently large. If, in the opinion of the analyst, the number of individual values in the data set is
insufficient, the data shall be requested again from the relevant laboratories with sufficient resolution.
If the data are only available with insufficient resolution, this evaluation should not be continued.
In case the data for the procedure originates from an ILS, all requirements of ISO 4259-1 shall be met
and the additional requirements regarding proficiency testing programme (PTP) data do not apply.
NOTE 2 Leverage is a measure of how far away the independent variables of an observation are from those of
the other observations.
NOTE 3 Cook’s distance is an estimate of the influence of a data point. It is used within the context of the
reference to indicate influential data points that are particularly worth checking for validity.
5.2 Additional requirements for PTP data
5.2.1 General conditions
The statistical calculations are also applicable for this evaluation, provided the results and associated
statistics for the test method are obtained from a PTP, which shall meet the requirements of ISO 4259-3.
A characteristic of data derived from such a PTP is that for each sample, a single result is provided by
each laboratory for the test method.
The following requirements apply when using PTP data:
— the results shall be obtained from at least 10 laboratories using the test method and shall be
equidistantly distributed over the range;
— the leverage of each sample in the data set shall not exceed the limiting value of 0,5 (see 5.2.2);
— the Anderson-Darling statistic for the test on normal distribution of the lab results shall be
≤ 1,12 for each sample (see 5.2.3);
— the sample standard deviations shall not significantly exceed the published reproducibility standard
deviations for at least 80 % of the samples at the 0,05 significance level (see 5.2.4).
5.2.2 Test on existence of extreme samples
The leverage value $h_i$ for each sample i in the data set is examined and may not exceed the limiting
value of 0,5. If the value of $h_i$ for a sample exceeds this limiting value, this sample is characterized as
extreme. For each of the two methods, the average of the laboratory results is calculated per sample.
Subsequently, each laboratory average per sample is averaged over both test methods.
The leverage value $h_i$ is defined by Formula (1):

$$h_i = \frac{1}{S} + \frac{(Z_i - \bar{Z})^2}{\sum_{k=1}^{S} (Z_k - \bar{Z})^2} \qquad (1)$$

where
$h_i$ is the leverage of sample i, i = 1 … S,
$S$ is the total number of samples,
$Z_i$ is the natural logarithm (ln) of the sample mean, averaged over both methods,
$\bar{Z}$ is the overall average of all $Z_i$.
If one or more samples are characterized as extreme, they shall be removed and the procedure should
be repeated. The minimum number of remaining samples shall be taken into account. If the minimum
requirement for a number of samples can no longer be met, the procedure shall be discontinued.
5.2.3 Test on distribution of lab results
The distribution of the lab results for each sample is tested for normality by confirming the
goodness-of-fit of the normal distribution using the Anderson-Darling statistic per sample.
NOTE 1 The Anderson-Darling test is a statistical test of whether a given sample of data is drawn from a
given probability distribution. Within the context of this document, this test is used as a test on normality, with
probability distribution parameters (mean and standard deviation) estimated from the sample. See Reference [7]
for further details.
NOTE 2 The critical value of 1,12 is based on a significance level of approximately 1 %, taking into account the
effects of rounding of the input data on the resolution.
The test statistic $A_i^{2*}$ is calculated according to Formula (2):

$$A_i^{2*} = A_i^2 \left(1 + \frac{0{,}75}{N_i} + \frac{2{,}25}{N_i^2}\right) \qquad (2)$$
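where
$N_i$ is the total number of lab results in the set,
$A_i^2 = -N - \frac{1}{N} \sum_{i=1}^{N} (2i - 1) \left\{ \ln[F(x_i)] + \ln[1 - F(x_{N-i+1})] \right\}$, where within this sum N denotes $N_i$ and i indexes the sorted results of the sample,
$F(x_i)$ is the cumulative normal distribution function based on sample average and standard deviation,
$x_i$ is the data sorted in increasing order, $x_1 \le x_2 \le x_3 \dots \le x_N$.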
The distribution of the results is assumed to follow a normal distribution if the corresponding
$A_i^{2*}$ value is ≤ 1,12.
If this test shows that the distribution of one or more samples does not meet the above criterion, these
samples shall be removed. The minimum number of samples for this procedure should be considered.
If the minimum requirement for a number of samples can no longer be met, the procedure shall be
discontinued.
Rounding that leaves the data with insufficient resolution can inflate the normality assessment statistic.
See 5.1 for resolution provisions.
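As an illustration, the modified Anderson-Darling statistic for one sample can be computed with the Python standard library. This is a sketch under stated assumptions rather than a reference implementation: the function name `anderson_darling_modified` is hypothetical, and the normal distribution is fitted with the sample mean and the (n − 1) sample standard deviation.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_modified(results):
    """Modified statistic A^2* for one sample, following Formula (2).

    results: the lab results for a single sample (at least 2 values,
    not all identical).
    """
    x = sorted(results)                       # x_1 <= x_2 <= ... <= x_N
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))    # parameters estimated from the sample
    total = 0.0
    for i in range(1, n + 1):
        f_low = fitted.cdf(x[i - 1])          # F(x_i)
        f_high = fitted.cdf(x[n - i])         # F(x_{N-i+1})
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
```

A sample is retained when the returned value is ≤ 1,12. In floating-point arithmetic a very extreme result can make `fitted.cdf` return exactly 0 or 1, in which case `math.log` raises an error; such a sample would fail the normality criterion in any case.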
5.2.4 Comparison of precision
The sample standard deviations $s_i$ should not significantly exceed the published reproducibility
standard deviations $s_{Ri}$ for at least 80 % of the samples at a significance level of 0,05 using a statistical
F-test for the comparison of the two variances $s_i^2$ and $s_{Ri}^2$.
For any sample i where $s_i$ is numerically larger than $s_{Ri}$, perform the F-test specified in
Formula (3):

$$F = \frac{s_i^2}{s_{Ri}^2} \qquad (3)$$

where
$s_i$ is the standard deviation of the sample i, calculated over the lab results,
$s_{Ri}$ is the published reproducibility standard deviation evaluated at the concentration level of the
average results for sample i.
The number of degrees of freedom associated with $s_i$ equals N − 1, where N equals the number of results
for sample i.
The number of degrees of freedom associated with $s_{Ri}$ is preferably taken from the published precision
statement of the test method or underlying research report. If $s_{Ri}$ is not given as such, it is permitted to
estimate $s_{Ri}$ based on the published reproducibility $R_i$, according to $s_{Ri} = R_i/(t\sqrt{2})$, where t represents
the Student's t-value at a two-sided significance level of 0,05 and the degrees of freedom associated with $R_i$.
If in this latter case the number of degrees of freedom for $R_i$ is unknown, it may be estimated by the minimum
value of 30, and the published reproducibility standard deviation is estimated by $s_{Ri} = R_i/2{,}888$.
If the above criterion is not met for one or more samples, the failing samples shall be removed. The
minimum number of samples for this procedure should be considered. If the minimum requirement for
a number of samples can no longer be met, the procedure shall be discontinued.
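The precision comparison can be sketched as follows. The sketch assumes that Formula (3) is read as a ratio of variances, the usual form of an F-test, and that the critical F-values at the 0,05 significance level are looked up separately (from tables or a statistics package) and passed in by the caller. All function names are hypothetical. The constant 2,0423 is the two-sided Student's t-value for 30 degrees of freedom, which reproduces the divisor 2,888 quoted in the text.

```python
import math

# Two-sided Student's t at 0,05 for 30 degrees of freedom
T_30_DF = 2.0423

def s_from_published_R(R, t=T_30_DF):
    """Estimate s_R from a published reproducibility R as R / (t * sqrt(2))."""
    return R / (t * math.sqrt(2.0))

def f_statistic(s_i, s_Ri):
    """Variance ratio (assumed reading of Formula (3))."""
    return (s_i / s_Ri) ** 2

def precision_ok(s, s_R, f_crit, min_fraction=0.8):
    """True if enough samples pass the precision comparison.

    s, s_R, f_crit: per-sample lists of the sample standard deviation,
    the published reproducibility standard deviation, and the critical
    F-value supplied by the caller for that sample's degrees of freedom.
    """
    passed = 0
    for s_i, s_Ri, f_c in zip(s, s_R, f_crit):
        # The F-test is only needed when s_i is larger than s_Ri
        if s_i <= s_Ri or f_statistic(s_i, s_Ri) <= f_c:
            passed += 1
    return passed / len(s) >= min_fraction
```

For example, with five samples of which one fails the F-test, `precision_ok` returns `True` because exactly 80 % of the samples pass.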
5.3 Brief sequential steps of the procedure
The following compressed overview summarizes the steps of the procedure. See Figures 1 and 2 for a
flow diagram of these procedural steps.
1) Checking the adequacy of the available data
The available data are checked against the general requirements (see 5.1). If applicable, the
additional requirements when using PTP data (see 5.2, 5.2.1, 5.2.2, 5.2.3 and 5.2.4) are also checked.
2) Calculate the means and standard error of the samples
The arithmetic means of the results for each common sample obtained by each method are
calculated (see 6.1.2) and the estimates of the standard errors of these means are computed (see
6.1.3).
3) Test the suitability of the data
Test for sufficient variation in the properties of both methods by computing the weighted sums of
squared residuals for the total variation of the mean results across all common samples for each
method. These sums of squares are assessed against the standard errors of the mean results for
each method to ensure that the samples are sufficiently varied before continuing with the practice
(see 6.2.1).
Test for sufficient correlation between both methods by assessing the weighted sums of squared
residuals for the linear correction against the total variation in the mean results for both methods
to ensure that there is sufficient correlation between the two methods (see 6.2.2).
4) Calculate the bias correction statistics for each bias correction class
The closeness of agreement of the mean results by each method is evaluated using appropriate
weighted sums of squared residuals. Such sums of squares are computed from the data, first with no
bias correction, then with a constant bias correction, then, when appropriate, with a proportional
correction, and finally, with a linear (proportional + constant) correction (see 6.3).
5) Select the appropriate bias correction class
The most parsimonious bias correction is selected based on the weighted sum of squared residuals
from each bias correction and the appropriate t- and F-tests (see 6.4).
6) Test on distribution of residuals for normality
The (weighted) residuals per sample are tested for normality. The residuals are defined by the
difference between each individual $Y_i$ and bias-corrected $X_i$. The test for normality is performed
using the Anderson-Darling test for normality. When the weighted residuals are not found to be
normally distributed this practice is considered terminated (see 6.5).
7) Test for sample-specific biases
The weighted sum of squared residuals is assessed to determine whether additional unexplained
sources of variation remain in the residual data (see 6.6).
Any remaining, unexplained variation is attributed to sample-specific biases, also known as
method-material interactions or matrix effects. If sample-specific biases are found to be consistent
with a random-effects model, then their contribution to the between-methods reproducibility is
estimated, and accumulated into an all-encompassing between-methods reproducibility estimate.
8) Compute the between-methods reproducibility
Calculate the between-methods reproducibility taking into account possible sample specific biases.
When residuals are found to be normally distributed and sample-specific biases are not found to be
present, the between-methods reproducibility is defined by Formula (40).
When residuals are found to be normally distributed and sample-specific biases are present, the
between-methods reproducibility is defined by Formula (41).
9) Reporting
The results of this practice are reported in the precision and bias section of the appropriate
standard(s) (see Clause 7).
10) Confirmation of the correlation
The results of the assessment are periodically confirmed by users of the correlation by monitoring
the difference statistics by means of control charts (see Clause 8).
5.4 Flow diagram of the procedure
Figure 1 — Flowchart for suitability and applicability of the data
Figure 2 — Procedure for determining the bias correction
6 Procedure
6.1 Sample mean and standard error
6.1.1 General
Calculate sample means $X_i$ and $Y_i$ and standard errors from results from the MSMLS. Published precision
estimates are used to estimate the standard errors of these means, $\Delta_{S,Xi}$ and $\Delta_{S,Yi}$.
NOTE The i-th material is the same for both data sets, but the j-th lab in one data set is not generally the same
lab as the j-th lab in the other data set.
6.1.2 Computation of the means
The arithmetic mean X-method result for the i-th sample is shown in Formula (4):

$$X_i = \frac{1}{L_{Xi}} \sum_j \frac{\sum_k X_{ijk}}{n_{Xij}} \qquad (4)$$

where $X_i$ is the average of the cell averages on the i-th sample by method X.
Similarly, the mean Y-method result for the i-th sample is given by the analogous Formula (5):

$$Y_i = \frac{1}{L_{Yi}} \sum_j \frac{\sum_k Y_{ijk}}{n_{Yij}} \qquad (5)$$
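The "average of the cell averages" differs from a pooled average whenever laboratories report different numbers of results. A minimal Python sketch, with the hypothetical function name `sample_mean` and a list-of-lists input layout chosen for illustration:

```python
def sample_mean(results_by_lab):
    """Mean of one sample by one method, as the average of cell averages.

    results_by_lab: one inner list per laboratory, holding that
    laboratory's results on the sample (each inner list non-empty).
    """
    # Cell average: mean of the results reported by one laboratory
    cell_means = [sum(cell) / len(cell) for cell in results_by_lab]
    # Sample mean: unweighted average of the cell averages
    return sum(cell_means) / len(cell_means)
```

With `[[10.0, 12.0], [14.0]]` the cell averages are 11 and 14, so the sample mean is 12,5, whereas pooling all three results would give 12. For PTP data, where each laboratory reports a single result, the two coincide.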
6.1.3 Calculation of standard errors
The standard errors are assigned to the standard deviations of the means and are calculated as follows.
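If $s_{R,Xi}$ is the reproducibility standard deviation from the X-method, and $s_{r,Xi}$ is the repeatability standard
deviation, then an estimate of the standard error for $X_i$ is given by Formula (6):

$$\Delta_{E,Xi} = \sqrt{\frac{1}{L_{Xi}} \left[ s_{R,Xi}^2 - s_{r,Xi}^2 \left( 1 - \frac{1}{L_{Xi}} \sum_j \frac{1}{n_{Xij}} \right) \right]} \qquad (6)$$

The estimated standard error for $Y_i$ is given by the analogous Formula (7):

$$\Delta_{E,Yi} = \sqrt{\frac{1}{L_{Yi}} \left[ s_{R,Yi}^2 - s_{r,Yi}^2 \left( 1 - \frac{1}{L_{Yi}} \sum_j \frac{1}{n_{Yij}} \right) \right]} \qquad (7)$$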
The repeatability standard deviations and reproducibility standard deviations are calculated from
published repeatability and published reproducibility by dividing these by t√2. Here, t refers to the
Student's t-value at a two-sided significance level of 0,05 and the number of degrees of freedom associated
with the precision figures.
In case the repeatability and reproducibility are known, but the number of degrees of freedom
associated with these precision figures is unknown, a value of 30 for the number of degrees of freedom is
permitted.
Since repeatability and reproducibility may vary with the mean X-method results $X_i$, even if the $L_{Xi}$ were
the same for all materials and the $n_{Xij}$ were the same for all laboratories and all materials, the $\Delta_{E,Xi}$ can
still differ from one material to the next. The same is also true for method Y.
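The standard error calculation can be sketched as below. The sketch assumes the variance form of Formulae (6) and (7), that is, squared standard deviations inside a square root, which is the form that follows from treating the sample mean as an average of laboratory cell averages. The function name `standard_error` and its argument layout are hypothetical.

```python
import math

def standard_error(s_R, s_r, n_per_lab):
    """Standard error of a sample mean for one method.

    s_R: reproducibility standard deviation at the sample's level
    s_r: repeatability standard deviation at the sample's level
    n_per_lab: number of results reported by each laboratory
    """
    L = len(n_per_lab)                                   # number of laboratories
    mean_inv_n = sum(1.0 / n for n in n_per_lab) / L     # (1/L) * sum_j 1/n_j
    variance = (s_R ** 2 - s_r ** 2 * (1.0 - mean_inv_n)) / L
    return math.sqrt(variance)
```

When every laboratory reports a single result, as with PTP data, the repeatability term vanishes and the standard error reduces to $s_R/\sqrt{L}$.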