Standard Practice for Validation of Empirically Derived Multivariate Calibrations

SIGNIFICANCE AND USE
5.1 This practice outlines a universally applicable procedure to validate the performance of a quantitative or qualitative, empirically derived, multivariate calibration relative to an accepted reference method.  
5.2 This practice provides procedures for evaluating the capability of a calibration to provide reliable estimations relative to an accepted reference method.  
5.3 This practice provides purchasers of a measurement system that incorporates an empirically derived multivariate calibration with options for specifying validation requirements to ensure that the system is capable of providing estimations with an appropriate degree of agreement with an accepted reference method.  
5.4 This practice provides the user of a measurement system that incorporates an empirically derived multivariate calibration with procedures capable of providing information that may be useful for ongoing quality assurance of the performance of the measurement system.  
5.5 Validation information obtained in the application of this practice is applicable only to the material type and property range of the materials used to perform the validation and only for the individual measurement system on which the practice is completely applied. It is the user's responsibility to select the property levels and the compositional characteristics of the validation samples such that they are suitable to the application. This practice allows the user to write a comprehensive validation statement for the analyzer system including specific limits for the validated range of application and specific restrictions to the permitted uses of the measurement system. Users are cautioned against extrapolation of validation results beyond the material type(s) and property range(s) used to obtain these results.  
5.6 Users are cautioned that a validated empirically derived multivariate calibration is applicable only to samples that fall within the subset population represented in the validation set. The esti...
SCOPE
1.1 This practice covers requirements for the validation of empirically derived calibrations (Note 1) such as calibrations derived by Multiple Linear Regression (MLR), Principal Component Regression (PCR), Partial Least Squares (PLS), Artificial Neural Networks (ANN), or any other empirical calibration technique whereby a relationship is postulated between a set of variables measured for a given sample under test and one or more physical, chemical, quality, or membership properties applicable to that sample.
Note 1: Empirically derived calibrations are sometimes referred to as “models” or “calibrations.” In the following text, for conciseness, the term “calibration” may be used instead of the full name of the procedure.  
1.2 This practice does not cover procedures for establishing said postulated relationship.  
1.3 This practice serves as an overview of techniques used to verify the applicability of an empirically derived multivariate calibration to the measurement of a sample under test and to verify equivalence between the properties calculated from the empirically derived multivariate calibration and the results of an accepted reference method of measurement to within control limits established for the prespecified statistical confidence level.  
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.  
1.5 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

General Information

Status
Published
Publication Date
14-Dec-2017


Overview

ASTM E2617-17: Standard Practice for Validation of Empirically Derived Multivariate Calibrations is a globally recognized guideline developed by ASTM International for validating empirically derived calibrations, sometimes referred to as "models." This standard applies to both quantitative and qualitative multivariate calibrations, which are crucial in modern measurement systems relying on analytical techniques such as Multiple Linear Regression (MLR), Principal Component Regression (PCR), Partial Least Squares (PLS), Artificial Neural Networks (ANN), and other empirical modeling approaches.

The main objective of ASTM E2617-17 is to ensure that these calibrations reliably estimate physical, chemical, quality, or membership properties of samples when compared to results from accepted reference methods. This practice is a critical part of quality assurance programs in laboratories and industries that use advanced analytical instrumentation and modeling.

Key Topics

  • Validation Process: Outlines requirements and procedures for validating empirically derived multivariate calibrations relative to reference methods.
  • Applicability: Limits the validity of a validation to the material type and property range of the materials used in the validation, and to the specific measurement system validated.
  • Initial and Ongoing Validation: Differentiates between initial validation upon system installation or revision and ongoing periodic revalidation for quality assurance.
  • Performance Metrics: Addresses evaluation criteria like precision, accuracy, bias (for quantitative calibrations), and positive/negative fraction identified (for qualitative calibrations).
  • Statistical Comparison: Requires statistically comparing calibration estimates to known reference values.
  • Eligibility Testing: Incorporates statistical tests such as Mahalanobis Distance, Nearest Neighbor Mahalanobis Distance (NNMD), and Standard Residual Variance in the Independent Variables (SRVIV) to ensure the sample under test falls within the validated space.
  • Selection of Validation Samples: Emphasizes the need for representative and sufficiently diverse sample sets that cover the expected range of application.
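
The performance metrics listed above can be illustrated with a minimal Python sketch. It follows the definitions of bias, Positive Fraction Identified, and Negative Fraction Identified quoted in the standard's terminology section. The function names, the sign convention for bias (estimate minus reference), and the sample data are assumptions made for illustration; they are not taken from the standard.

```python
from statistics import mean


def bias(estimates, references):
    """Arithmetic average difference between estimates and reference values.

    Assumption: the difference is taken as (estimate - reference).
    """
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two equal-length, non-empty sequences")
    return mean(e - r for e, r in zip(estimates, references))


def fractions_identified(predicted, actual):
    """Return (positive_fraction_identified, negative_fraction_identified).

    `actual` holds the reference result (True = characteristic present);
    `predicted` holds the calibration's result for the same samples.
    Only two-outcome tests are supported.
    """
    pairs = list(zip(predicted, actual))
    with_trait = [p for p, a in pairs if a]
    without_trait = [p for p, a in pairs if not a]
    if not with_trait or not without_trait:
        raise ValueError("need at least one sample with and one without the characteristic")
    pfi = sum(1 for p in with_trait if p) / len(with_trait)
    nfi = sum(1 for p in without_trait if not p) / len(without_trait)
    return pfi, nfi


# Hypothetical validation data, for illustration only.
quant_bias = bias([10.2, 9.8, 10.5, 10.1], [10.0, 10.0, 10.0, 10.0])
pfi, nfi = fractions_identified(
    predicted=[True, True, False, False, True],
    actual=[True, True, True, False, False],
)
```

Whether a given bias or fraction is acceptable depends on the performance criteria the user specifies before validation; the sketch only computes the quantities.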

Applications

ASTM E2617-17 is widely applicable across industries and research domains that rely on analytical measurement systems using multivariate calibration. Practical applications include:

  • Pharmaceutical Analysis: Ensuring drug quality and composition through validated spectroscopic models.
  • Chemical Manufacturing: Verifying the accuracy of process analyzers and at-line measurement systems.
  • Food & Agriculture: Assessing food quality, composition, and authenticity using near-infrared (NIR) or other spectroscopic techniques.
  • Environmental Testing: Monitoring pollutants via validated multivariate calibrations from complex sample matrices.
  • Quality Assurance Laboratories: Routine calibration verification and maintenance of analytical models to ensure ongoing compliance with regulatory or internal standards.

Organizations and purchasers of measurement systems can use this standard to specify validation requirements, improving confidence in the analytical results generated by their systems.

Related Standards

ASTM E2617-17 references and aligns with other ASTM and international standards to ensure a comprehensive approach to multivariate calibration validation:

  • ASTM E131: Terminology Relating to Molecular Spectroscopy
  • ASTM E1655: Practices for Infrared Multivariate Quantitative Analysis
  • ASTM E1790: Practice for Near Infrared Qualitative Analysis

These related standards provide supplementary terminology, methodological, and performance assessment guidance relevant to molecular spectroscopy and multivariate techniques.


Keywords: multivariate calibration validation, empirical model validation, ASTM E2617, spectroscopic calibration, quality assurance, reference method comparison, calibration model, laboratory standards, analytical measurements.

Frequently Asked Questions

ASTM E2617-17 is a standard published by ASTM International. Its full title is "Standard Practice for Validation of Empirically Derived Multivariate Calibrations".

ASTM E2617-17 is classified under the following ICS (International Classification for Standards) categories: 17.020 - Metrology and measurement in general. The ICS classification helps identify the subject area and facilitates finding related standards.

ASTM E2617-17 has the following relationships with other standards: It is linked to ASTM E2617-10, ASTM E131-10, ASTM E1790-04(2010), ASTM E131-05, ASTM E1655-04, ASTM E1790-04, ASTM E131-02, ASTM E1655-00, ASTM E1790-00, ASTM E131-00a, ASTM D7889-21, ASTM D8431-22, ASTM E2475-10(2016), ASTM D7825-18, ASTM E2891-20. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

Standards Content (Sample)


This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

Designation: E2617 − 17
Standard Practice for Validation of Empirically Derived Multivariate Calibrations

This standard is issued under the fixed designation E2617; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

This practice is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of Subcommittee E13.11 on Multivariate Analysis. Current edition approved Dec. 15, 2017. Published February 2018. Originally approved in 2008. Last previous edition approved in 2010 as E2617–10. DOI: 10.1520/E2617-17.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

1. Scope
1.1 This practice covers requirements for the validation of empirically derived calibrations (Note 1) such as calibrations derived by Multiple Linear Regression (MLR), Principal Component Regression (PCR), Partial Least Squares (PLS), Artificial Neural Networks (ANN), or any other empirical calibration technique whereby a relationship is postulated between a set of variables measured for a given sample under test and one or more physical, chemical, quality, or membership properties applicable to that sample.
NOTE 1—Empirically derived calibrations are sometimes referred to as “models” or “calibrations.” In the following text, for conciseness, the term “calibration” may be used instead of the full name of the procedure.
1.2 This practice does not cover procedures for establishing said postulated relationship.
1.3 This practice serves as an overview of techniques used to verify the applicability of an empirically derived multivariate calibration to the measurement of a sample under test and to verify equivalence between the properties calculated from the empirically derived multivariate calibration and the results of an accepted reference method of measurement to within control limits established for the prespecified statistical confidence level.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.5 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Referenced Documents
2.1 ASTM Standards:
E131 Terminology Relating to Molecular Spectroscopy
E1655 Practices for Infrared Multivariate Quantitative Analysis
E1790 Practice for Near Infrared Qualitative Analysis

3. Terminology
3.1 For terminology related to molecular spectroscopic methods, refer to Terminology E131. For terminology related to multivariate quantitative modeling refer to Practices E1655. While Practices E1655 is written in the context of multivariate spectroscopic methods, the terminology is also applicable to other multivariate technologies.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 accuracy—the closeness of agreement between a test result and an accepted reference value.
3.2.2 bias—the arithmetic average difference between the reference values and the values produced by the analytical method under test, for a set of samples.
3.2.3 detection limit—the lowest level of a property in a sample that can be detected, but not necessarily quantified, by the measurement system.
3.2.4 estimate—the constituent concentration, identification, or other property of a sample as determined by the analytical method being validated.
3.2.5 initial validation—validation that is performed when an analyzer system is initially installed or after major maintenance.
3.2.6 Negative Fraction Identified—the fraction of samples not having a particular characteristic that is identified as not having that characteristic.
3.2.6.1 Discussion—Negative Fraction Identified assumes that the characteristic that the test measures either is or is not present. It is not applicable to tests with multiple possible outcomes.
3.2.7 ongoing periodic revalidation—the quality assurance process by which, in the case of quantitative calibrations, the bias and precision or, in the case of qualitative calibrations, the Positive Fraction Identified and Negative Fraction Identified performance determined during initial validation are shown to be sustained.
3.2.8 Positive Fraction Identified—the fraction of samples having a particular characteristic that is identified as having that characteristic.
3.2.8.1 Discussion—Positive Fraction Identified assumes that the characteristic that the test measures either is or is not present. It is not applicable to tests with multiple possible outcomes.
3.2.9 precision—the closeness of agreement between independent test results obtained under stipulated conditions.
3.2.9.1 Discussion—Precision may be a measure of either the degree of reproducibility or degree of repeatability of the analytical method under normal operating conditions. In this context, reproducibility refers to the use of the analytical procedure in different laboratories, as in a collaborative study.
3.2.10 quantification limit—the lowest level of a sample property which can be determined with acceptable precision and accuracy under the stated experimental conditions.
3.2.11 range—the interval between the upper and lower levels of a property (including these levels) that has been demonstrated to be determined with a suitable level of precision and accuracy using the method as specified.
3.2.12 reference value—the metric of a property as determined by well-characterized method, the accuracy of which has been stated or defined, that is, another, already-validated method.
3.2.13 validation—the statistically quantified judgment that an empirically derived multivariate calibration is applicable to the measurement on which the calibration is to be applied and can perform property estimates with, in the case of quantitative calibrations, acceptable precision, accuracy and bias or, in the case of qualitative calibrations, acceptable Positive Fraction Identified and Negative Fraction Identified, as compared with results from an accepted reference method.
3.2.14 validation space—the region(s) of a calibration’s multivariate sample space populated by the independent validation samples which are used to validate the calibration.

4. Summary of Practice
4.1 Validating an empirically derived multivariate calibration (model) consists of four major procedures: validation at initial development, revalidation at initial deployment or after a revision, ongoing periodic revalidation, and qualification of each measurement before using the calibration to estimate the property(s) of the sample being measured.

5. Significance and Use
5.1 This practice outlines a universally applicable procedure to validate the performance of a quantitative or qualitative, empirically derived, multivariate calibration relative to an accepted reference method.
5.2 This practice provides procedures for evaluating the capability of a calibration to provide reliable estimations relative to an accepted reference method.
5.3 This practice provides purchasers of a measurement system that incorporates an empirically derived multivariate calibration with options for specifying validation requirements to ensure that the system is capable of providing estimations with an appropriate degree of agreement with an accepted reference method.
5.4 This practice provides the user of a measurement system that incorporates an empirically derived multivariate calibration with procedures capable of providing information that may be useful for ongoing quality assurance of the performance of the measurement system.
5.5 Validation information obtained in the application of this practice is applicable only to the material type and property range of the materials used to perform the validation and only for the individual measurement system on which the practice is completely applied. It is the user’s responsibility to select the property levels and the compositional characteristics of the validation samples such that they are suitable to the application. This practice allows the user to write a comprehensive validation statement for the analyzer system including specific limits for the validated range of application and specific restrictions to the permitted uses of the measurement system. Users are cautioned against extrapolation of validation results beyond the material type(s) and property range(s) used to obtain these results.
5.6 Users are cautioned that a validated empirically derived multivariate calibration is applicable only to samples that fall within the subset population represented in the validation set. The estimation from an empirically derived multivariate calibration can only be validated when the applicability of the calibration is explicitly established for the particular measurement for which the estimation is produced. Applicability cannot be assumed.

6. Methods and Considerations
6.1 When validating an empirically derived multivariate calibration, it is the responsibility of the user to describe the measurement system and the required level of agreement between the estimations produced by the calibration and the accepted reference method(s).
6.2 When validating a measurement system incorporating an empirically derived multivariate calibration, it is the responsibility of the user to satisfy the requirements of any applicable tests specific to the measurement system including any Installation Qualification (IQ), Operational Qualification (OQ), and Performance Qualification (PQ) requirements; which may be mandated by competent regulatory authorities, an applicable Quality Assurance (QA), or Standard Operating Procedure (SOP) or be recommended by the instrument or equipment manufacturer.
6.3 Reference Values and Quality Controls for the Accepted Reference Method:
6.3.1 The reference (or true) value which is compared with each respective estimate produced by the empirically derived multivariate calibration is established by applying an accepted reference method, the characteristics of which are known and stated, to the sample from which the measurement system derives the measurement.
6.3.2 To ensure the reliability of the reference values provided by an accepted reference method, appropriate quality controls should be applied to the accepted reference method.

7. Procedure
7.1 The objective of the validation procedure is to quantify the performance of an empirically derived multivariate calibration in terms of, in the case of quantitative calibrations, precision, accuracy and bias or, in the case of qualitative calibrations, Positive Fraction Identified and Negative Fraction Identified relative to an accepted reference method for each property of interest. The user must specify, based on the intended use of the calibration, acceptable precision and bias or Positive Fraction Identified and Negative Fraction Identified performance criteria before initiating the validation. These criteria will be dependent on the intended use of the analyzer and may be based, all or in part, on risk based criteria.
7.1.1 The acceptable performance criteria specified by the user may be constant over the entire range of sample variability. Alternatively, different acceptable performance criteria may be specified by the user for different sub-ranges of the full sample variability.
7.2 Validation of calibration is accomplished by using the calibration to estimate the property(s) of a set of validation samples and statistically comparing the estimates for these samples to known reference values. Validation requires thorough testing of the model with a sufficient number of representative validation samples to ensure that it performs adequately over the entire range of possible sample variability.
7.3 Initial Validation Sample Set:
7.3.1 For the initial validation of a multivariate model, an ideal validation sample set will:
7.3.1.1 Contain samples that provide sufficient examples of all combinations of variation in the sample properties which are expected to be present in the samples which are to be analyzed using the calibration;
7.3.1.2 Contain samples for which the ranges of variation in
7.3.2 For simple systems, sufficient validation samples can generally be obtained to meet the criteria in 7.3.1.1 – 7.3.1.4. For complex mixtures, obtaining an ideal validation set may be difficult if not impossible. In such cases, it may be necessary to validate discrete subranges of the calibration incrementally, over time as samples become available.
7.3.3 The number of samples needed to validate a calibration depends on the complexity of the calibration, the ranges of property variation over which the calibration is to be applied, and the degree of confidence required. It is important to validate a calibration with as many samples as possible to maximize the likelihood of challenging the calibration with rarely occurring, but potentially troublesome samples. The number and range of validation samples should be sufficient to validate the calibration to the statistical degree of confidence required for the application. In all cases, a minimum of 20 validation samples is recommended. In addition, the validation samples should:
7.3.3.1 Multivariately span the ranges of sample property values over which the calibration will be used; that is, the span and the standard deviation of the ranges of sample property values for the validation samples should be at least 100% of the spans of the sample property values over which the calibration will be used, and the sample property values for the validation samples should be distributed as uniformly as possible throughout their respective ranges, and the variations of the sample property values among the samples should be as mutually independent as possible; and
7.3.3.2 Span the ranges of the independent variables over which the calibration will be used; that is, if the range of an independent variable is expected to vary from a to b, and the standard deviation of the independent variable is c, then the variations of that independent variable in the set of validation samples should cover at least 100% of the range from a to b, and should be distributed as uniformly as possible across the range such that the standard deviation in that independent variable estimated for the validation samples will be at least 95% of c.
(1) When validating a calibration for which detection limit or quantification limit is an important consideration, the user should include a number of validation samples whose property(s) are close to the detection or quantification limit(s) sufficient to validate the respective limit(s) to the statistical degree of confidence required for the application.
7.4 For quantitative calibrations, the validation error for each property in each sample is given by the Standard Error of Validation (SEV) and bias for that property.
7.4.1 The validation bias, ev-, is a measure of the average
the sample properties is comparable to the ranges of variation
difference between the estimates made based on the empirical
expected for samples that are to be analyzed using the model;
model and the results obtained on the same validation samples
7.3.1.3 Contain samples for which the respective variations
using the reference method.
of the sample properties are uniformly and mutually indepen-
7.4.1.1 If there are single reference values and estimates for
dently distributed over their full respective ranges or, when
each validation sample, the validation bias is calculated as:
applicable, subranges of variation; and
v
7.3.1.4 Contain a sufficient number of samples to statisti-
vˆ 2 v
~ !
i i
(
callytesttherelationshipsbetweenthemeasuredvariablesand
i51
e¯ 5 (1)
v
the properties that are modeled by the calibration. v
where:
$\hat{v}_i$ = estimate from the model for the ith sample,
$v_i$ = accepted reference value for the ith sample, and
$v$ = number of validation samples.
7.4.1.2 If replicate estimates and a single reference value are available for the validation samples, then the validation bias is calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} (\hat{v}_{ij} - v_i)}{\sum_{i=1}^{v} r_i} \tag{2}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample, and
$r_i$ = number of replicate estimates for the ith validation sample.
7.4.1.3 If a single estimate and multiple reference values are available for the validation samples, then the validation bias is calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \sum_{j=1}^{s_i} (\hat{v}_i - v_{ij})}{\sum_{i=1}^{v} s_i} \tag{3}$$
where:
$\hat{v}_i$ = estimate for the ith validation sample,
$v_{ij}$ = the jth reference value for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
7.4.1.4 If multiple estimates and multiple reference values are available for the validation samples, then the validation bias is calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \sum_{k=1}^{s_i} (\hat{v}_{ij} - v_{ik})}{\sum_{i=1}^{v} r_i s_i} \tag{4}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample,
$v_{ik}$ = the kth reference value for the ith validation sample,
$r_i$ = number of replicate estimates for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
7.4.2 The SEV, also called the Standard Error of Prediction (SEP), and the Standard Deviation of Validation Residuals (SDV) are measures of the expected agreement between the empirical model and the reference method. The calculation of SEV and SDV depends on whether replicate estimates or reference values, or both, are used.
7.4.2.1 If there are single reference values and estimates for each validation sample, then SEV and SDV are calculated as:
$$SEV = \sqrt{\frac{\sum_{i=1}^{v} (\hat{v}_i - v_i)^2}{v}} \tag{5}$$
$$SDV = \sqrt{\frac{\sum_{i=1}^{v} (\hat{v}_i - v_i - \bar{e}_v)^2}{v}}$$
where:
$\hat{v}_i$ = estimate from the model for the ith sample,
$v_i$ = accepted reference value for the ith sample, and
$v$ = number of validation samples.
7.4.2.2 If replicate estimates and a single reference value are available for the validation samples, then SEV and SDV are calculated as:
$$SEV = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} (\hat{v}_{ij} - v_i)^2}{\sum_{i=1}^{v} r_i}} \tag{6}$$
$$SDV = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} (\hat{v}_{ij} - v_i - \bar{e}_v)^2}{\sum_{i=1}^{v} r_i}}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample, and
$r_i$ = number of replicate estimates for the ith validation sample.
NOTE 2—If each validation sample is estimated r times, an average estimate could be used in 7.4.2.1, but then the SEV calculated would represent the expected agreement between the average of r estimations and a single reference measurement, not the agreement based on a single estimation from the empirical model.
7.4.2.3 If a single estimate and multiple reference values are available for the validation samples, then SEV and SDV are calculated as:
$$SEV = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{s_i} (\hat{v}_i - v_{ij})^2}{\sum_{i=1}^{v} s_i}} \tag{7}$$
$$SDV = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{s_i} (\hat{v}_i - v_{ij} - \bar{e}_v)^2}{\sum_{i=1}^{v} s_i}}$$
where:
$\hat{v}_i$ = estimate for the ith validation sample,
$v_{ij}$ = the jth reference value for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
NOTE 3—If each validation sample has s reference values, an average reference value could be used in 7.4.2.1, but then the SEV calculated would
represent the expected agreement between an estimate from the empirical model and the average of s reference measurements, not the agreement relative to a single reference measurement.
7.4.2.4 If multiple estimates and multiple reference values are available for the validation samples, then SEV and SDV are calculated as:
$$SEV = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \sum_{k=1}^{s_i} (\hat{v}_{ij} - v_{ik})^2}{\sum_{i=1}^{v} r_i s_i}} \tag{8}$$
$$SDV = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \sum_{k=1}^{s_i} (\hat{v}_{ij} - v_{ik} - \bar{e}_v)^2}{\sum_{i=1}^{v} r_i s_i}}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample,
$v_{ik}$ = the kth reference value for the ith validation sample,
$r_i$ = number of replicate estimates for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
NOTE 4—If each validation sample has r estimates and s reference values, average estimates and reference values could be used in 7.4.1.1, but then the SEV calculated would represent the expected agreement between the average of r estimates from the empirical model and the average of s reference measurements, not the agreement between a single estimate and a single reference measurement.
7.4.3 Significance of Validation Bias—A t-value can be calculated as:
$$t = \frac{|\bar{e}_v| \sqrt{d_v}}{SDV} \tag{9}$$
where:
$d_v$ = degrees of freedom and is equal to the denominator in the bias calculation.
NOTE 5—The t-value is compared to a critical t-value for the desired probability level (typically 95 %).
7.4.3.1 If the calculated t-value is less than the critical t-value, then the validation bias is not statistically significant and the empirical model and reference method are expected to yield, on average, the same result. In this case, either SEV or SDV is an adequate measure of the expected agreement between the empirical model and the reference method. If the validation bias is of practical significance relative to the user specified bias requirement, then the precision of the empirical model results is insufficient to achieve the user requirement.
7.4.3.2 If the calculated t-value is greater than the critical t-value, then the validation bias is statistically significant. In this case SDV is a better measure of the expected agreement between the results of the empirical model and the reference method. While the bias may be statistically significant, it may not be of practical significance relative to the user specified requirements for the empirical model.
7.5 Positive and Negative Fractions Identified:
7.5.1 The Positive Fraction Identified of the calibration is given by: Positive Fraction Identified = (number of samples identified as having a stated characteristic) / (total number of samples having the stated characteristic).
7.5.2 The Negative Fraction Identified of the calibration is given by: Negative Fraction Identified = (number of samples identified as not having a stated characteristic) / (total number of samples not having the stated characteristic).
7.5.3 The equations for Positive Fraction Identified and Negative Fraction Identified assume that the characteristic being measured either is or is not present. They are not applicable to tests with multiple possible outcomes.
7.6 The user should use statistical tests and decision criteria appropriate to the application to decide if the SEV and bias are within statistically acceptable limits.
7.7 Samples for Revalidation After Initial Deployment and Ongoing Periodic Revalidation Samples:
7.7.1 The user must determine, based on the particulars of each application, the appropriate timing and number of samples required for revalidation after initial deployment and for ongoing periodic revalidation.
7.7.1.1 The timing and number of revalidation samples may be adjusted from time to time as experience is gained in applying the calibration under actual conditions.
7.7.1.2 In many cases revalidation samples are restricted to “samples of opportunity” and limited to samples from actual production operations. In such cases, care should be taken to schedule revalidation samples as asynchronously as possible with respect to recurring conditions such as time of day, production process operating conditions, phase or stage of production process, ambient conditions, operating personnel, etc. This listing of potential conditions for consideration is exemplary, not comprehensive; the user should take into account any external conditions pertinent to the application.
7.7.2 It is recommended that the results of ongoing periodic revalidation be monitored or tracked by control charting.
8. Qualification of Each Measurement Prior to Application of the Validated Calibration
8.1 The independent variables measured from a sample under test must be evaluated to ensure that this measurement is eligible to be processed by the calibration to produce estimates of the property(s) of interest. The purpose of this eligibility test is to determine, within user specified statistical limits, if the validation samples used to validate the calibration are sufficiently representative of (similar to) the sample under test. In other words, the purpose of this step is to confirm that the measurement from the sample under test is within the calibration’s validation space. If the measurement is eligible, the estimates should fall within accuracy and precision bounds determined during the validation. If the measurement is not eligible, then the accuracy and precision of the estimates are not known based on the validation. The measurement of a sample under test may be tested for eligibility using Mahalanobis distance, Nearest Neighbor Mahalanobis Distance (NNMD), or Standard Residual Variance in the Independent Variables (SRVIV), either singly or in combination. The user may also specify additional eligibility criteria if and as appropriate to the application.
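As an illustration of the Mahalanobis distance eligibility test named in 8.1, the following is a minimal sketch only. It assumes that the independent variables have already been reduced to mutually uncorrelated scores (for example, PCA scores), so that the covariance matrix is diagonal. The function names and the acceptance rule (the largest distance observed among the validation samples) are hypothetical choices made for this example and are not prescribed by this practice.

```python
from statistics import mean, variance


def fit_validation_space(validation_scores):
    """Return the per-axis mean and sample variance of the validation scores."""
    columns = list(zip(*validation_scores))
    centre = [mean(col) for col in columns]
    spread = [variance(col) for col in columns]  # assumes non-zero variance on every axis
    return centre, spread


def mahalanobis_sq(scores, centre, spread):
    """Squared Mahalanobis distance for uncorrelated score axes (diagonal
    covariance): a sum of squared, variance-scaled deviations."""
    return sum((s - c) ** 2 / v for s, c, v in zip(scores, centre, spread))


def is_eligible(scores, validation_scores):
    """Hypothetical rule: accept a measurement whose distance does not exceed
    the largest distance found among the validation samples themselves."""
    centre, spread = fit_validation_space(validation_scores)
    limit = max(mahalanobis_sq(row, centre, spread) for row in validation_scores)
    return mahalanobis_sq(scores, centre, spread) <= limit


# Made-up two-factor scores for five validation samples.
validation_scores = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]
print(is_eligible((0.5, 0.5), validation_scores))  # inside the validation space
print(is_eligible((3.0, 0.0), validation_scores))  # far outside it
```

In practice the limit would be set from the user specified statistical limits, and correlated variables would require the full covariance matrix rather than the diagonal shortcut used here.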
8.1.1 The development of an empirical model will typically involve transformation of the independent variables. By way of illustration, such transformation may include one or more of the following:
8.1.1.1 Linearization of the independent variables (for example, conversion from transmission to absorbance, from reflectance to log(1-reflectance), etc.);
8.1.1.2 Digital filtering (smoothing, digital derivatives);
8.1.1.3 Orthogonalization (Orthogonal Signal Correction);
8.1.1.4 Rank reduction (Principal Components Analysis (PCA) or PLS);
8.1.1.5 Squares, cross products or nonlinear functions of variables;
8.1.1.6 Explicit artifact removal (cosmic ray event removal);
8.1.1.7 Centering or baseline correction;
8.1.1.8 Arbitrary scaling, variance scaling, or auto scaling;
8.1.1.9 Exclusion of one or more independent variables from use in the calibration; and
8.1.1.10 Integration of peaks with or without baseline correction.
8.1.2 Mahalanobis distance, NNMD, and SRVIV statistics are calculated after applying the same transformations to the measurement being qualified which were applied to the measurements used to produce and validate the calibration.
8.2 SRVIV can sometimes be employed to determine if the samples used to validate the empirical model are sufficiently representative of (similar to) the sample under test. SRVIV is intended to detect any anomalous variance which may be present in the measurement from new signals (for example, new chemical components, new instrumental or sample conditions, etc.) that were not represented in the validation samples. If the validation samples are sufficiently representative of the (unknown) sample under test, then the amount of residual variance in the independent variables of the sample under test will be statistically indistinguishable from the amount of residual variance in the validation samples. This is always a necessary criterion for qualification testing, but it may not always be solely sufficient. If the empirical calibration utilizes most of the non-noise portion of variance in the independent variables, the residual variance will be a very
matrix $P$ comprises column vectors, each of which contains one of the factors comprising the PCA basis space; the f × k matrix $T$ comprises the scores of the v validation samples for the k basis vectors; the f × v matrix $\hat{X}$ comprises column vectors, each of which contains the reconstructions of respective columns of $X$ by the k factors comprising the basis space in $P$; and the f × v matrix $R$ comprises column vectors, each of which contains the residual variance in each corresponding measurement in $X$ which is not spanned by the basis space; then:
$$\begin{aligned} R &= X - \hat{X} \\ &= X - P T^{T} \\ &= X - P (P^{T} P)^{-1} P^{T} X \end{aligned} \tag{10}$$
and the standard residual (SRVIV) is then given by:
$$SRVIV = \sqrt{\frac{\sum_{i=1}^{f} \sum_{j=1}^{v} R_{ij}^{2}}{f (v - k)}} \tag{11}$$
where:
$^{T}$ = matrix transpose, and
$f(v - k)$ = number of degrees of freedom of the residuals.
For PCA, an alternative method of calculating the residual variance uses the loadings, $L$, singular values, $\Sigma$, and scores, $S$, from the singular value decomposition (SVD) (see Practices E1655) of $X$. The equation for SVD is:
$$X = L_v \Sigma_v S_v^{T} \tag{12}$$
If the f × k matrix $L_v$ comprises column vectors, each of which contains one of the SVD factors (loadings) comprising the basis space; the k × k matrix $\Sigma_v$ is a diagonal matrix which contains the k singular values for the respective factors in $L_v$; the v × k matrix $S_v$ comprises column vectors, each of which contains the k scores of each validation sample in $X$ against the respective factors in $L_v$; and the f × v matrix $R$ comprises column vectors, each of which contains the residual variance in each corresponding measurement in $X$ which is not spanned by the basis space; then:
$$R = X - \hat{X} \tag{13}$$
where
...
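The residual calculation of Eq 10 and Eq 11 can be sketched as follows. This is an illustrative example only, not a definitive implementation: it assumes that the columns of P are orthonormal, so that the term (P^T P)^-1 reduces to the identity matrix, and it stores X as f rows (independent variables) by v columns (validation samples). The function names and the numerical values are hypothetical.

```python
import math


def residual_matrix(X, P):
    """R = X - P P^T X, valid when the columns of P are orthonormal.

    X is f x v (variables by samples) and P is f x k, both as lists of rows.
    """
    f, v, k = len(X), len(X[0]), len(P[0])
    # Scores T^T = P^T X, a k x v matrix.
    scores_t = [[sum(P[i][a] * X[i][j] for i in range(f)) for j in range(v)]
                for a in range(k)]
    # Residual = measurement minus its reconstruction P T^T.
    return [[X[i][j] - sum(P[i][a] * scores_t[a][j] for a in range(k))
             for j in range(v)]
            for i in range(f)]


def srviv(X, P):
    """Square root of the summed squared residuals over f(v - k); needs v > k."""
    f, v, k = len(X), len(X[0]), len(P[0])
    R = residual_matrix(X, P)
    sum_sq = sum(r * r for row in R for r in row)
    return math.sqrt(sum_sq / (f * (v - k)))


# Made-up data: three variables, three samples, one basis vector.
P = [[1.0], [0.0], [0.0]]
X = [[1.0, 2.0, 3.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 2.0]]
print(srviv(X, P))
```

When the basis vectors are not orthonormal, the full projection in Eq 10, including the inverse of P^T P, would be needed.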


This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because
it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version
of the standard as published by ASTM is to be considered the official document.
Designation: E2617 − 17
Standard Practice for
Validation of Empirically Derived Multivariate Calibrations
This standard is issued under the fixed designation E2617; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (´) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This practice covers requirements for the validation of empirically derived calibrations (Note 1) such as calibrations derived
by Multiple Linear Regression (MLR), Principal Component Regression (PCR), Partial Least Squares (PLS), Artificial Neural
Networks (ANN), or any other empirical calibration technique whereby a relationship is postulated between a set of variables
measured for a given sample under test and one or more physical, chemical, quality, or membership properties applicable to that
sample.
NOTE 1—Empirically derived calibrations are sometimes referred to as “models” or “calibrations.” In the following text, for conciseness, the term
“calibration” may be used instead of the full name of the procedure.
1.2 This practice does not cover procedures for establishing said postulated relationship.
1.3 This practice serves as an overview of techniques used to verify the applicability of an empirically derived multivariate
calibration to the measurement of a sample under test and to verify equivalence between the properties calculated from the
empirically derived multivariate calibration and the results of an accepted reference method of measurement to within control
limits established for the prespecified statistical confidence level.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility
of the user of this standard to establish appropriate safety, health, and environmental practices and determine the
applicability of regulatory limitations prior to use.
1.5 This international standard was developed in accordance with internationally recognized principles on standardization
established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued
by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:
E131 Terminology Relating to Molecular Spectroscopy
E1655 Practices for Infrared Multivariate Quantitative Analysis
E1790 Practice for Near Infrared Qualitative Analysis
3. Terminology
3.1 For terminology related to molecular spectroscopic methods, refer to Terminology E131. For terminology related to
multivariate quantitative modeling refer to Practices E1655. While Practices E1655 is written in the context of multivariate
spectroscopic methods, the terminology is also applicable to other multivariate technologies.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 accuracy—the closeness of agreement between a test result and an accepted reference value.
3.2.2 bias—the arithmetic average difference between the reference values and the values produced by the analytical method
under test, for a set of samples.
3.2.3 detection limit—the lowest level of a property in a sample that can be detected, but not necessarily quantified, by the
measurement system.
This practice is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of Subcommittee
E13.11 on Multivariate Analysis.
Current edition approved Dec. 15, 2017. Published February 2018. Originally approved in 2008. Last previous edition approved in 2010
as E2617 – 10. DOI: 10.1520/E2617-17.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards
volume information, refer to the standard’s Document Summary page on the ASTM website.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3.2.4 estimate—the constituent concentration, identification, or other property of a sample as determined by the analytical
method being validated.
3.2.5 initial validation—validation that is performed when an analyzer system is initially installed or after major maintenance.
3.2.6 Negative Fraction Identified—the fraction of samples not having a particular characteristic that is identified as not having
that characteristic.
3.2.6.1 Discussion—
Negative Fraction Identified assumes that the characteristic that the test measures either is or is not present. It is not applicable to
tests with multiple possible outcomes.
3.2.7 ongoing periodic revalidation—the quality assurance process by which, in the case of quantitative calibrations, the bias
and precision or, in the case of qualitative calibrations, the Positive Fraction Identified and Negative Fraction Identified
performance determined during initial validation are shown to be sustained.
3.2.8 Positive Fraction Identified—the fraction of samples having a particular characteristic that is identified as having that
characteristic.
3.2.8.1 Discussion—
Positive Fraction Identified assumes that the characteristic that the test measures either is or is not present. It is not applicable to
tests with multiple possible outcomes.
3.2.9 precision—the closeness of agreement between independent test results obtained under stipulated conditions.
3.2.9.1 Discussion—
Precision may be a measure of either the degree of reproducibility or degree of repeatability of the analytical method under normal
operating conditions. In this context, reproducibility refers to the use of the analytical procedure in different laboratories, as in a
collaborative study.
3.2.10 quantification limit—the lowest level of a sample property which can be determined with acceptable precision and
accuracy under the stated experimental conditions.
3.2.11 range—the interval between the upper and lower levels of a property (including these levels) that has been demonstrated
to be determined with a suitable level of precision and accuracy using the method as specified.
3.2.12 reference value—the metric of a property as determined by a well-characterized method, the accuracy of which has been
stated or defined, that is, another, already-validated method.
3.2.13 validation—the statistically quantified judgment that an empirically derived multivariate calibration is applicable to the
measurement on which the calibration is to be applied and can perform property estimates with, in the case of quantitative
calibrations, acceptable precision, accuracy and bias or, in the case of qualitative calibrations, acceptable Positive Fraction
Identified and Negative Fraction Identified, as compared with results from an accepted reference method.
3.2.14 validation space—the region(s) of a calibration’s multivariate sample space populated by the independent validation
samples which are used to validate the calibration.
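The Positive Fraction Identified and Negative Fraction Identified defined in 3.2.6 and 3.2.8 can be illustrated with a short sketch. The function name and the sample data below are hypothetical; the reference flags stand for the results of an accepted reference method and the identified flags for the outputs of a qualitative calibration.

```python
def fractions_identified(reference, identified):
    """Return (Positive Fraction Identified, Negative Fraction Identified).

    reference:  True where the reference method says the characteristic is present.
    identified: True where the calibration identifies the characteristic as present.
    """
    pairs = list(zip(reference, identified))
    positives = sum(1 for ref, _ in pairs if ref)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be represented in the validation set")
    true_pos = sum(1 for ref, est in pairs if ref and est)
    true_neg = sum(1 for ref, est in pairs if not ref and not est)
    return true_pos / positives, true_neg / negatives


# Four samples with the characteristic, two without (made-up data).
reference = [True, True, True, True, False, False]
identified = [True, True, True, False, False, True]
pfi, nfi = fractions_identified(reference, identified)
print(pfi, nfi)
```

Consistent with the Discussion notes, the sketch handles only a two-outcome characteristic (present or not present).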
4. Summary of Practice
4.1 Validating an empirically derived multivariate calibration (model) consists of four major procedures: validation at initial
development, revalidation at initial deployment or after a revision, ongoing periodic revalidation, and qualification of each
measurement before using the calibration to estimate the property(s) of the sample being measured.
5. Significance and Use
5.1 This practice outlines a universally applicable procedure to validate the performance of a quantitative or qualitative,
empirically derived, multivariate calibration relative to an accepted reference method.
5.2 This practice provides procedures for evaluating the capability of a calibration to provide reliable estimations relative to an
accepted reference method.
5.3 This practice provides purchasers of a measurement system that incorporates an empirically derived multivariate calibration
with options for specifying validation requirements to ensure that the system is capable of providing estimations with an
appropriate degree of agreement with an accepted reference method.
5.4 This practice provides the user of a measurement system that incorporates an empirically derived multivariate calibration
with procedures capable of providing information that may be useful for ongoing quality assurance of the performance of the
measurement system.
5.5 Validation information obtained in the application of this practice is applicable only to the material type and property range
of the materials used to perform the validation and only for the individual measurement system on which the practice is completely
applied. It is the user’s responsibility to select the property levels and the compositional characteristics of the validation samples
such that they are suitable to the application. This practice allows the user to write a comprehensive validation statement for the
analyzer system including specific limits for the validated range of application and specific restrictions to the permitted uses of the
measurement system. Users are cautioned against extrapolation of validation results beyond the material type(s) and property
range(s) used to obtain these results.
5.6 Users are cautioned that a validated empirically derived multivariate calibration is applicable only to samples that fall within
the subset population represented in the validation set. The estimation from an empirically derived multivariate calibration can only
be validated when the applicability of the calibration is explicitly established for the particular measurement for which the
estimation is produced. Applicability cannot be assumed.
6. Methods and Considerations
6.1 When validating an empirically derived multivariate calibration, it is the responsibility of the user to describe the
measurement system and the required level of agreement between the estimations produced by the calibration and the accepted
reference method(s).
6.2 When validating a measurement system incorporating an empirically derived multivariate calibration, it is the responsibility
of the user to satisfy the requirements of any applicable tests specific to the measurement system including any Installation
Qualification (IQ), Operational Qualification (OQ), and Performance Qualification (PQ) requirements, which may be mandated by
competent regulatory authorities, an applicable Quality Assurance (QA), or Standard Operating Procedure (SOP) or be
recommended by the instrument or equipment manufacturer.
6.3 Reference Values and Quality Controls for the Accepted Reference Method:
6.3.1 The reference (or true) value which is compared with each respective estimate produced by the empirically derived
multivariate calibration is established by applying an accepted reference method, the characteristics of which are known and stated,
to the sample from which the measurement system derives the measurement.
6.3.2 To ensure the reliability of the reference values provided by an accepted reference method, appropriate quality controls
should be applied to the accepted reference method.
7. Procedure
7.1 The objective of the validation procedure is to quantify the performance of an empirically derived multivariate calibration
in terms of, in the case of quantitative calibrations, precision, accuracy and bias or, in the case of qualitative calibrations, Positive
Fraction Identified and Negative Fraction Identified relative to an accepted reference method for each property of interest. The user
must specify, based on the intended use of the calibration, acceptable precision and bias or Positive Fraction Identified and
Negative Fraction Identified performance criteria before initiating the validation. These criteria will be dependent on the intended
use of the analyzer and may be based, in whole or in part, on risk-based criteria.
7.1.1 The acceptable performance criteria specified by the user may be constant over the entire range of sample variability.
Alternatively, different acceptable performance criteria may be specified by the user for different sub-ranges of the full sample
variability.
7.2 Validation of calibration is accomplished by using the calibration to estimate the property(s) of a set of validation samples
and statistically comparing the estimates for these samples to known reference values. Validation requires thorough testing of the
model with a sufficient number of representative validation samples to ensure that it performs adequately over the entire range of
possible sample variability.
7.3 Initial Validation Sample Set:
7.3.1 For the initial validation of a multivariate model, an ideal validation sample set will:
7.3.1.1 Contain samples that provide sufficient examples of all combinations of variation in the sample properties which are
expected to be present in the samples which are to be analyzed using the calibration;
7.3.1.2 Contain samples for which the ranges of variation in the sample properties are comparable to the ranges of variation
expected for samples that are to be analyzed using the model;
7.3.1.3 Contain samples for which the respective variations of the sample properties are uniformly and mutually independently
distributed over their full respective ranges or, when applicable, subranges of variation; and
7.3.1.4 Contain a sufficient number of samples to statistically test the relationships between the measured variables and the
properties that are modeled by the calibration.
7.3.2 For simple systems, sufficient validation samples can generally be obtained to meet the criteria in 7.3.1.1 – 7.3.1.4. For
complex mixtures, obtaining an ideal validation set may be difficult if not impossible. In such cases, it may be necessary to validate
discrete subranges of the calibration incrementally, over time as samples become available.
7.3.3 The number of samples needed to validate a calibration depends on the complexity of the calibration, the ranges of
property variation over which the calibration is to be applied, and the degree of confidence required. It is important to validate a
calibration with as many samples as possible to maximize the likelihood of challenging the calibration with rarely occurring, but
potentially troublesome samples. The number and range of validation samples should be sufficient to validate the calibration to the
statistical degree of confidence required for the application. In all cases, a minimum of 20 validation samples is recommended. In
addition, the validation samples should:
7.3.3.1 Multivariately span the ranges of sample property values over which the calibration will be used; that is, the span and
the standard deviation of the ranges of sample property values for the validation samples should be at least 100 % of the spans
of the sample property values over which the calibration will be used, and the sample property values for the validation samples
should be distributed as uniformly as possible throughout their respective ranges, and the variations of the sample property values
among the samples should be as mutually independent as possible; and
7.3.3.2 Span the ranges of the independent variables over which the calibration will be used; that is, if the range of an
independent variable is expected to vary from a to b, and the standard deviation of the independent variable is c, then the variations
of that independent variable in the set of validation samples should cover at least 100 % of the range from a to b, and should be
distributed as uniformly as possible across the range such that the standard deviation in that independent variable estimated for
the validation samples will be at least 95 % of c.
(1) When validating a calibration for which detection limit or quantification limit is an important consideration, the user should
include a number of validation samples whose property(s) are close to the detection or quantification limit(s) sufficient to validate
the respective limit(s) to the statistical degree of confidence required for the application.
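The coverage criteria of 7.3.3, 7.3.3.1, and 7.3.3.2 can be checked one variable at a time. The following is a minimal sketch, not part of this practice. The function name `check_variable_coverage` and its parameters are hypothetical. It assumes that "at least 100 % of the range" means the validation values reach or exceed both a and b, and that the standard deviation is the sample (n - 1) standard deviation.

```python
from statistics import stdev


def check_variable_coverage(values, a, b, c, min_std_fraction=0.95, min_samples=20):
    """Check one variable of a validation set against the criteria of 7.3.3.

    values : values of the variable in the validation samples
    a, b   : lower and upper limits expected in use (a < b)
    c      : standard deviation of the variable expected in use
    """
    std_ratio = stdev(values) / c  # sample standard deviation (n - 1): an assumption
    return {
        "enough_samples": len(values) >= min_samples,
        # "at least 100 % of the range" is read here as min <= a and max >= b
        "covers_range": min(values) <= a and max(values) >= b,
        "std_ratio": std_ratio,
        "std_ok": std_ratio >= min_std_fraction,
    }


# Hypothetical validation set: 21 samples evenly spaced from 0 to 10.
report = check_variable_coverage([0.5 * i for i in range(21)], a=0.0, b=10.0, c=3.0)
print(report)
```

The check does not test uniformity of distribution or mutual independence among properties; those criteria require additional, application-specific tests.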
7.4 For quantitative calibrations, the validation error for each property in each sample is given by the Standard Error of
Validation (SEV) and bias for that property.
7.4.1 The validation bias, $\bar{e}_v$, is a measure of the average difference between the estimates made based on the empirical model
and the results obtained on the same validation samples using the reference method.
7.4.1.1 If there are single reference values and estimates for each validation sample, the validation bias is calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \left(\hat{v}_i - v_i\right)}{v} \tag{1}$$
where:
$\hat{v}_i$ = estimate from the model for the ith sample,
$v_i$ = accepted reference value for the ith sample, and
$v$ = number of validation samples.
7.4.1.2 If replicate estimates and a single reference value are available for the validation samples, then the validation bias is
calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \left(\hat{v}_{ij} - v_i\right)}{\sum_{i=1}^{v} r_i} \tag{2}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample, and
$r_i$ = number of replicate estimates for the ith validation sample.
7.4.1.3 If a single estimate and multiple reference values are available for the validation samples, then the validation bias is
calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \sum_{j=1}^{s_i} \left(\hat{v}_i - v_{ij}\right)}{\sum_{i=1}^{v} s_i} \tag{3}$$
where:
$\hat{v}_i$ = estimate for the ith validation sample,
$v_{ij}$ = the jth reference value for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
7.4.1.4 If multiple estimates and multiple reference values are available for the validation samples, then the validation bias is
calculated as:
$$\bar{e}_v = \frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \sum_{k=1}^{s_i} \left(\hat{v}_{ij} - v_{ik}\right)}{\sum_{i=1}^{v} r_i s_i} \tag{4}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample,
$v_{ik}$ = the kth reference value for the ith validation sample,
$r_i$ = number of replicate estimates for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
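The four bias cases in 7.4.1.1 through 7.4.1.4 differ only in how many estimates and reference values each sample carries, so a single routine can cover all of them. The following is a minimal sketch under that reading. The function name `validation_bias` and the nested-list data layout are assumptions made for illustration.

```python
def validation_bias(estimates, references):
    """Return (bias, d_v) for a validation set.

    estimates[i]  : list of the r_i model estimates for sample i
    references[i] : list of the s_i reference values for sample i
    d_v is the denominator of the bias calculation, sum of r_i * s_i.
    With r_i = 1 and/or s_i = 1 this reduces to the simpler cases.
    """
    total = 0.0
    count = 0
    for est_i, ref_i in zip(estimates, references):
        for est in est_i:
            for ref in ref_i:
                total += est - ref  # estimate minus reference value
                count += 1
    if count == 0:
        raise ValueError("no estimate/reference pairs supplied")
    return total / count, count


# Hypothetical data: three samples with mixed replication.
estimates = [[10.2, 10.4], [20.1], [29.7, 29.9]]
references = [[10.0], [20.0, 20.2], [30.0]]
bias, d_v = validation_bias(estimates, references)
print(bias, d_v)
```

A positive bias indicates that the empirical model reads high, on average, relative to the reference method.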
7.4.2 The SEV, also called the Standard Error of Prediction (SEP), and the Standard Deviation of Validation Residuals (SDV)
are measures of the expected agreement between the empirical model and the reference method. The calculation of SEV and SDV depends
on whether replicate estimates or reference values, or both, are used.
7.4.2.1 If there are single reference values and estimates for each validation sample, then SEV and SDV are calculated as:
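$$\mathrm{SEV} = \sqrt{\frac{\sum_{i=1}^{v} \left(\hat{v}_i - v_i\right)^2}{v}} \tag{5}$$
$$\mathrm{SDV} = \sqrt{\frac{\sum_{i=1}^{v} \left(\hat{v}_i - v_i - \bar{e}_v\right)^2}{v}}$$
where:
$\hat{v}_i$ = estimate from the model for the ith sample,
$v_i$ = accepted reference value for the ith sample, and
$v$ = number of validation samples.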
7.4.2.2 If replicate estimates and a single reference value are available for the validation samples, then SEV and SDV are
calculated as:
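$$\mathrm{SEV} = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \left(\hat{v}_{ij} - v_i\right)^2}{\sum_{i=1}^{v} r_i}} \tag{6}$$
$$\mathrm{SDV} = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \left(\hat{v}_{ij} - v_i - \bar{e}_v\right)^2}{\sum_{i=1}^{v} r_i}}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample, and
$r_i$ = number of replicate estimates for the ith validation sample.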
NOTE 2—If each validation sample is estimated r times, an average estimate could be used in 7.4.2.1, but then the SEV calculated would represent
the expected agreement between the average of r estimations and a single reference measurement, not the agreement based on a single estimation from
the empirical model.
7.4.2.3 If a single estimate and multiple reference values are available for the validation samples, then SEV and SDV are
calculated as:
$$\mathrm{SEV} = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{s_i} \left(\hat{v}_i - v_{ij}\right)^2}{\sum_{i=1}^{v} s_i}} \tag{7}$$
$$\mathrm{SDV} = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{s_i} \left(\hat{v}_i - v_{ij} - \bar{e}_v\right)^2}{\sum_{i=1}^{v} s_i}}$$
where:
$\hat{v}_i$ = estimate for the ith validation sample,
$v_{ij}$ = the jth reference value for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
NOTE 3—If each validation sample has s reference values, an average reference value could be used in 7.4.2.1, but then the SEV calculated would represent
the expected agreement between an estimate from the empirical model and the average of s reference measurements, not the agreement relative to a
single reference measurement.
7.4.2.4 If multiple estimates and multiple reference values are available for the validation samples, then SEV and SDV are
calculated as:
$$\mathrm{SEV} = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \sum_{k=1}^{s_i} \left(\hat{v}_{ij} - v_{ik}\right)^2}{\sum_{i=1}^{v} r_i s_i}} \tag{8}$$
$$\mathrm{SDV} = \sqrt{\frac{\sum_{i=1}^{v} \sum_{j=1}^{r_i} \sum_{k=1}^{s_i} \left(\hat{v}_{ij} - v_{ik} - \bar{e}_v\right)^2}{\sum_{i=1}^{v} r_i s_i}}$$
where:
$\hat{v}_{ij}$ = the jth estimate for the ith validation sample,
$v_{ik}$ = the kth reference value for the ith validation sample,
$r_i$ = number of replicate estimates for the ith validation sample, and
$s_i$ = number of replicate reference values for the ith validation sample.
NOTE 4—If each validation sample has r estimates and s reference values, average estimates and reference values could be used in 7.4.2.1, but then
the SEV calculated would represent the expected agreement between the average of r estimates from the empirical model and the average of s reference measurements,
not the agreement between a single estimate and a single reference measurement.
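The SEV and SDV calculations of 7.4.2.1 through 7.4.2.4 can be sketched in the same way as the bias. The example below is illustrative only. It assumes that both statistics are root-mean-square quantities over all estimate/reference differences, with the same denominator as the bias calculation. The function name `sev_sdv` and the nested-list layout are hypothetical.

```python
from math import sqrt


def sev_sdv(estimates, references):
    """Return (SEV, SDV, bias) for a validation set.

    estimates[i]  : list of the r_i model estimates for sample i
    references[i] : list of the s_i reference values for sample i
    """
    # Every estimate of a sample is paired with every reference value of that sample.
    diffs = [
        est - ref
        for est_i, ref_i in zip(estimates, references)
        for est in est_i
        for ref in ref_i
    ]
    n = len(diffs)  # equals sum of r_i * s_i
    bias = sum(diffs) / n
    sev = sqrt(sum(d * d for d in diffs) / n)            # residuals about zero
    sdv = sqrt(sum((d - bias) ** 2 for d in diffs) / n)  # residuals about the bias
    return sev, sdv, bias


# Hypothetical data: four samples, single estimate and single reference value each.
sev, sdv, bias = sev_sdv([[1.1], [2.0], [2.9], [4.2]], [[1.0], [2.0], [3.0], [4.0]])
print(sev, sdv, bias)
```

Under these assumptions SEV squared equals SDV squared plus the bias squared, so the two statistics coincide when the bias is zero.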
7.4.3 Significance of Validation Bias—A t-value can be calculated as:
$$t = \frac{\left|\bar{e}_v\right| \sqrt{d_v}}{\mathrm{SDV}} \tag{9}$$
where:
$d_v$ = degrees of freedom, equal to the denominator in the bias calculation.
NOTE 5—The t-value is compared to a critical t-value for the desired probability level (typically 95 %).
7.4.3.1 If the calculated t-value is less than the critical t-value, then the validation bias is not statistically significant, and the
empirical model and reference method are expected, on average, to yield the same result. In this case, either SEV or SDV is an
adequate measure of the expected agreement between the empirical model and the reference method. If the validation bias is of
practical significance relative to the user-specified bias requirement, then the precision of the empirical model results is insufficient
to achieve the user requirement.
7.4.3.2 If the calculated t-value is greater than the critical t-value, then the validation bias is statistically significant. In this case
SDV is a better measure of the expected agreement between the results of the empirical model and the reference method. While
the bias may be statistically significant, it may not be of practical significance relative to the user specified requirements for the
empirical model.
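The bias significance test of 7.4.3 can be sketched as follows. This is an illustration only. It assumes the t-value is the absolute bias multiplied by the square root of the degrees of freedom and divided by SDV, and that the critical t-value is supplied by the user from a t-table for the chosen probability level. The function names are hypothetical, and the critical value 2.086 used in the example is the two-sided 95 % value for 20 degrees of freedom.

```python
from math import sqrt


def bias_t_value(bias, sdv, d_v):
    """t-value for the validation bias; d_v is the denominator of the bias calculation."""
    return abs(bias) * sqrt(d_v) / sdv


def bias_is_significant(bias, sdv, d_v, t_critical):
    """True if the calculated t-value exceeds the user-supplied critical t-value."""
    return bias_t_value(bias, sdv, d_v) > t_critical


# Hypothetical validation results for 20 single-estimate, single-reference samples.
t = bias_t_value(bias=0.05, sdv=0.10, d_v=20)
print(t, bias_is_significant(0.05, 0.10, 20, t_critical=2.086))
```

A statistically significant result still needs to be judged against the user-specified requirements before it is treated as practically significant.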
7.5 Positive and Negative Fractions Identified:
7.5.1 The Positive Fraction Identified of the calibration is given by: Positive Fraction Identified = (number of samples identified
as having a stated characteristic) / (total number of samples having the stated characteristic).
7.5.2 The Negative Fraction Identified of the calibration is given by: Negative Fraction Identified = (number of samples
identified as not having a stated characteristic) / (total number of samples not having the stated characteristic).
7.5.3 The equations for Positive Fraction Identified and Negative Fraction Identified assume that the characteristic being
measured either is or is not present. They are not applicable to tests with multiple possible outcomes.
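For a qualitative calibration, the two fractions of 7.5.1 and 7.5.2 can be computed from paired true and identified labels. The following is a minimal sketch. It assumes that each numerator counts only correctly identified samples, and the function name `fractions_identified` is hypothetical.

```python
def fractions_identified(actual, identified):
    """Return (positive_fraction, negative_fraction).

    actual[i]     : True if sample i has the stated characteristic (reference result)
    identified[i] : True if the calibration identified sample i as having it
    """
    pairs = list(zip(actual, identified))
    n_pos = sum(1 for a, _ in pairs if a)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("validation set needs samples with and without the characteristic")
    pos_hits = sum(1 for a, p in pairs if a and p)          # correctly identified as having it
    neg_hits = sum(1 for a, p in pairs if not a and not p)  # correctly identified as not having it
    return pos_hits / n_pos, neg_hits / n_neg


# Hypothetical results: 4 samples with the characteristic, 6 without.
actual = [True, True, True, True, False, False, False, False, False, False]
identified = [True, True, True, False, False, False, False, False, True, False]
pfi, nfi = fractions_identified(actual, identified)
print(pfi, nfi)
```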
7.6 Users should use statistical tests and decision criteria appropriate to the application to decide if the SEV and bias are
within statistically acceptable limits.
7.7 Samples for Revalidation After Initial Deployment and Ongoing Periodic Revalidation Samples:
7.7.1 The user must determine, based on the particulars of each application, the appropriate timing and number of samples
required for revalidation after initial deployment and for ongoing periodic revalidation.
7.7.1.1 The timing and number of revalidation samples may be adjusted from time to time as experience is gained in applying
the calibration under actual conditions.
7.7.1.2 In many cases revalidation samples are restricted to “samples of opportunity” and limited to samples from actual
production operations. In such cases, care should be taken to schedule revalidation samples as asynchronously as possible with
respect to recurring conditions such as time of day, production process operating conditions, phase or stage of production process,
ambient conditions, operating personnel, etc. This list of potential conditions is illustrative, not comprehensive;
the user should take into account any external conditions pertinent to the application.
7.7.2 It is recommended that the results of ongoing periodic revalidation be monitored or tracked by control charting.
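This practice does not prescribe a particular chart type. As one possible illustration, the sketch below tracks revalidation residuals (estimate minus reference value) against limits set at the baseline mean plus or minus three baseline standard deviations. The choice of an individuals-type chart, the three-sigma limits, and the function names are all assumptions made for the example.

```python
from statistics import mean, stdev


def control_limits(baseline_residuals, n_sigma=3.0):
    """Return (lower, center, upper) limits from baseline validation residuals."""
    center = mean(baseline_residuals)
    sigma = stdev(baseline_residuals)  # sample standard deviation: an assumption
    return center - n_sigma * sigma, center, center + n_sigma * sigma


def out_of_control(residuals, limits):
    """Return the indices of revalidation residuals outside the control limits."""
    lower, _, upper = limits
    return [i for i, r in enumerate(residuals) if r < lower or r > upper]


# Hypothetical residuals from the initial validation and from later revalidation.
baseline = [0.1, -0.1, 0.0, 0.2, -0.2, 0.1, -0.1, 0.0]
limits = control_limits(baseline)
flagged = out_of_control([0.05, -0.3, 0.5, 0.1], limits)
print(limits, flagged)
```

Users who already follow a documented control-charting practice for the application should apply its rules in place of this sketch.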
8. Qualification of Each Measurement Prior to Application of the Validated Calibration
8.1 The independent variables measured from a sample under test must be evaluated to ensure that this measurement is eligible
to be processed by the calibration to produce estimates of the property(s) of interest. The purpose of this eligibility test is to
determine, within user specified statistical limits, if the validation samples used to validate the calibration are sufficiently
representative of (similar to) the sample under test. In other words, the purpose of this step is to confirm that the measurement from
the sample under test is within the calibration’s validation space. If the measurement is eligible, the estimates should fall within
accuracy and precision bounds determined during the validation. If the measurement is not eligible, then the accuracy and precision
of the estimates are not known based on the validation. The measurement of a sample under test may be tested for eligibility using
Mahalanobis distance, Nearest Neighbor Mahalanobis Distance (NNMD), or Standard Residual Variance in the Independent
Variables (SRVIV), either singly or in combination. The user may also specify additional eligibility criteria if and as appropriate
to the application.
8.1.1 The development of an empirical model will typically involve transformation of the independent variables. By way of
illustration, such transformation may include one or more of the following:
8.1.1.1 Linearization of the independent variables (for example, conversion from transmission to absorbance, from reflectance
to log(1/reflectance), etc.);
8.1.1.2 Digital filtering (smoothing, digital derivatives);
8.1.1.3 Orthogonalization (Orthogonal Signal Correction);
8.1.1.4 Rank reduction (Principal Components Analysis (PCA) or PLS);
8.1.1.5 Squares, cross products or nonlinear functions of variables;
8.1.1.6 Explicit artifact removal (cosmic ray event removal);
8.1.1.7 Centering or baseline correction;
8.1.1.8 Arbitrary scaling, variance scaling, or auto scaling;
8.1.1.9 Exclusion of one or more independent variables from use in the calibration; and
8.1.1.10 Integration of peaks with or without baseline correction.
8.1.2 Mahalanobis distance, NNMD, and SRVIV statistics are calculated after applying the same transformations to the
measurement being qualified which were applied to the measurements used to produce and validate the calibration.
8.2 SRVIVs can sometimes be employed to determine if the samples used to validate the empirical model are sufficiently
representative of (similar to) the sample under test. SRVIV is intended to detect any anomalous variance which may be present
in the measurement from new signals (for example, new chemical components, new instrumental or sample conditions, etc.) that
were not represented in the validation samples. If the validation samples are sufficiently representative of the (unknown) sample
under test, then the amount of residual variance in the independent variables of the sample under test will be statistically
indistinguishable from the amount of residual variance in the validation samples. This is always a necessary criterion for
qualification testing, but it may not always be solely sufficient. If the empirical calibration utilizes most of the non-noise portion
of variance in the independent variables, the residual variance will be a very sensitive measure of any aberrant variance present
in the data for the sample under test. Alternatively, if the empirical model is based on a small fraction of the non-noise portion of
the variance in the independent variables, then tests based on the statistics of the SRVIV, used alone, are unlikely to provide
adequate warning of measurements that are not qualified for estimation by the calibration.
8.2.1 The residual variance in the independent variables is defined as that fraction of the variance in the variables which is not
spanned by the validation samples’ basis space comprising an appropriate number of abstract factors determined by either PCA
(1, 2) or PLS (1, 2). If the f × v matrix X comprises column vectors, each of which contains the f independent variables (for
example, the spectrum) of one of the v validation samples; the f × k matrix P comprises column vectors, each of which contains one of the
k factors comprising the PCA or PLS basis space; the v × k matrix T comprises the scores of the v validation samples for the k basis
vectors; the f × v matrix $\hat{X}$ comprises column vectors, each of which contains the reconstruction of the respective column of X by
the k factors comprising the basis space in P; and the f × v matrix R comprises column vectors, each of which contains the residual
in the corresponding measurement in X which is not spanned by the basis space; then:
$$R = X - \hat{X} = X - P T^{T} = X - P\left(P^{T}P\right)^{-1}P^{T}X \tag{10}$$
and the standard residual (SRVIV) is then given by:
$$\mathrm{SRVIV} = \sqrt{\frac{\sum_{i=1}^{f} \sum_{j=1}^{v} R_{ij}^{2}}{f \cdot k}} \tag{11}$$
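The residual and SRVIV calculations of 8.2.1 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reference implementation. Rather than inverting the k × k product of P transposed and P, it orthonormalizes the columns of P by Gram-Schmidt, which yields the same projection when the columns are linearly independent. It assumes the SRVIV is the square root of the sum of squared residuals divided by f·k. The function names and the list-of-rows matrix layout are hypothetical.

```python
from math import sqrt


def residual_matrix(X, P):
    """Residual of X (f x v) after projection onto the column space of P (f x k).

    Matrices are lists of rows. The result equals X - P (P^T P)^-1 P^T X.
    """
    f, v, k = len(X), len(X[0]), len(P[0])
    basis = []  # orthonormal vectors spanning the columns of P
    for c in range(k):
        q = [P[i][c] for i in range(f)]
        for b in basis:
            dot = sum(qi * bi for qi, bi in zip(q, b))
            q = [qi - dot * bi for qi, bi in zip(q, b)]
        norm = sqrt(sum(qi * qi for qi in q))
        if norm < 1e-12:
            raise ValueError("columns of P are linearly dependent")
        basis.append([qi / norm for qi in q])
    R = [row[:] for row in X]
    for j in range(v):
        x = [X[i][j] for i in range(f)]  # measurement of validation sample j
        for b in basis:
            dot = sum(xi * bi for xi, bi in zip(x, b))
            for i in range(f):
                R[i][j] -= dot * b[i]
    return R


def srviv(R, k):
    """Square root of the sum of squared residuals divided by f * k."""
    f = len(R)
    return sqrt(sum(r * r for row in R for r in row) / (f * k))


# Hypothetical data: f = 3 variables, v = 2 samples, k = 1 factor.
X = [[1.0, 2.0], [3.0, 0.0], [0.0, 4.0]]
P = [[2.0], [0.0], [0.0]]
R = residual_matrix(X, P)
print(R, srviv(R, k=1))
```

For production use, a numerical linear algebra library would normally replace the hand-written projection.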
The boldface numbers in parentheses refer to the list of references at the end of this standard.
...
