Oilseeds — Application of near infrared spectrometry

This International Standard specifies a procedure for the prediction, by near infrared spectroscopy, of constituents such as moisture, fat and protein, and of some minor parameters such as total glucosinolates, in oilseeds and oilseed meals. The determinations are based on spectrometric measurement in the near infrared spectral region.

Graines oléagineuses — Application de la spectrométrie dans le proche infrarouge

General Information

Status: Not Published
Current Stage: 5020 - FDIS ballot initiated: 2 months. Proof sent to secretariat
Start Date: 13-Nov-2025
Completion Date: 13-Nov-2025
Ref Project
Draft
ISO/FDIS 18419 - Oilseeds — Application of near infrared spectrometry Released:30. 10. 2025
English language
29 pages
Draft
REDLINE ISO/FDIS 18419 - Oilseeds — Application of near infrared spectrometry Released:30. 10. 2025
English language
29 pages
Draft
ISO/FDIS 18419 - Graines oléagineuses — Application de la spectrométrie dans le proche infrarouge Released:12/2/2025
French language
32 pages

Standards Content (Sample)


FINAL DRAFT
International Standard
ISO/TC 34/SC 2
Secretariat: AFNOR
Oilseeds — Application of near infrared spectrometry
Graines oléagineuses — Application de la spectrométrie dans le proche infrarouge
Voting begins on: 2025-11-13
Voting terminates on: 2026-01-08
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principle
5 Apparatus
6 Calibration and initial validation
6.1 General
6.2 Reference methods
6.3 Development of the prediction model
6.4 Cross-validation
6.5 Outliers
6.6 Validation of prediction models
6.6.1 General
6.6.2 External validation
6.6.3 Bias correction
6.6.4 Slope adjustment
6.6.5 Expansion of prediction sample set
6.7 Changes in measuring and instrument conditions
7 Statistics for performance measurement
7.1 General
7.2 Plot the results
7.3 Bias
7.4 Standard error of prediction
7.5 Root mean square error of prediction
7.6 Slope
7.7 Coefficient of determination
7.8 Ratio of performance to deviation
8 Sampling
9 Procedure
9.1 Preparation of test sample
9.2 Measurement
9.3 Evaluation of results
10 Checking instrument stability
10.1 Instrument diagnostics
10.2 Control sample
10.3 Instruments in a network
11 Running performance check of the prediction models
11.1 General
11.2 Control charts using the difference between reference and NIR results (validation samples)
12 Precision and accuracy
12.1 Repeatability
12.2 Accuracy
13 Test report
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 34, Food products, Subcommittee SC 2,
Oleaginous seeds and fruits and oilseed meals.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
This document has been drafted using, as a basis, ISO 12099:2017 (prepared by Technical Committee
ISO/TC 34, Food products, Subcommittee SC 10, Animal feeding stuffs) and ISO 21543:2020 (prepared
by Technical Committee ISO/TC 34, Food products, Subcommittee SC 5, Milk and milk products) and
References [19], [20], [21], [22], [23], [24], [25], [26], [27], [28] and [29].

FINAL DRAFT International Standard ISO/FDIS 18419:2025(en)
Oilseeds — Application of near infrared spectrometry
1 Scope
This document specifies a procedure for the prediction by near infrared spectroscopy (NIRS) of constituents
such as moisture, fat and protein and some minor parameters such as total glucosinolates in oilseeds and
oilseed meals.
The determinations are based on spectrometric measurement in the near infrared spectral region.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
near infrared spectroscopy
NIRS
analytical technique that uses the near-infrared region of the electromagnetic spectrum, typically 780 nm
to 2 500 nm (approximately 12 820 cm−1 to 4 000 cm−1) or segments of this region or selected wavelengths or
wavenumbers, to probe the constituent content (3.8) of a sample
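Wavelength limits in nanometres and wavenumber limits in cm−1 are related by wavenumber = 10^7 / wavelength. The following is a minimal sketch of that conversion; the function names are illustrative and are not taken from this document.

```python
def wavelength_nm_to_wavenumber_cm1(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    # 1 cm = 1e7 nm, so wavenumber (cm^-1) = 1e7 / wavelength (nm)
    return 1e7 / wavelength_nm


def wavenumber_cm1_to_wavelength_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nanometres."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm1


if __name__ == "__main__":
    # Visible limit, lower NIR limit and upper NIR limit quoted in the text
    for nm in (400, 780, 2500):
        print(nm, "nm ->", round(wavelength_nm_to_wavenumber_cm1(nm)), "cm^-1")
```

With these formulas, 400 nm corresponds to 25 000 cm−1, 780 nm to about 12 820 cm−1 and 2 500 nm to 4 000 cm−1.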
3.2
near infrared spectrometer
NIR spectrometer
analytical instrument designed to measure the absorbance or reflectance of near-infrared light (typically in
the wavelength range of 780 nm to 2 500 nm) of oilseed samples (seed and meals)
Note 1 to entry: When used under specified conditions, an NIR spectrometer can predict the constituent contents (3.8)
of oilseeds and oilseed meals by modelling relationships between the sample constituent composition and their near
infrared spectra (3.7) via a prediction model (3.13) using multivariate prediction techniques (3.14).
Note 2 to entry: Some instruments span a wider range of wavelengths (400 nm to 2 500 nm) and are able to measure
in the visible range (400 nm to 800 nm).
Note 3 to entry: Differences can still arise between NIR spectrometers of the same make and model, due to several
factors, even if the hardware appears identical. Even small differences in alignment of mirrors, gratings, or detector
performances can affect spectral output.
3.3
near infrared reflectance
$\mathrm{NIR}_{\mathrm{R}}$
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near-infrared light
diffusely reflected back from the surface of a sample collected by a detector in front of the sample

3.4
near infrared transmittance
$\mathrm{NIR}_{\mathrm{T}}$
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near-infrared light
that has travelled through a sample and is then collected by a detector behind the sample
3.5
near infrared transflectance
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near-infrared light
that has travelled through a sample and is then reflected back from a reflective surface behind it
Note 1 to entry: The detector captures the combined transmitted (3.4) and reflected (3.3) light.
3.6
near infrared spectroscopy network
NIRS network
number of near infrared spectrometers (3.2), operated using the same prediction models (3.13), which are
usually standardized so that the differences in predicted values (3.11) for a set of test samples (3.27) are
minimized
3.7
near infrared spectra
NIR spectra
graphical representation of the absorbance, obtained by near infrared reflectance (3.3), near infrared
transmittance (3.4) or near infrared transflectance (3.5), of NIR radiation by a sample as a function of
wavelength or wavenumber
Note 1 to entry: It captures the molecular overtones and combination vibrations primarily associated with C-H, N-H,
O-H, and S-H bonds.
Note 2 to entry: The term “spectra” is used rather than “spectrum” because modern NIR instruments typically acquire
and automatically average multiple individual spectra to generate a representative composite output.
Note 3 to entry: In literature, the term “scans” is sometimes used in place of spectra.
3.8
constituent content
mass fraction of substances of oilseeds and oilseed meals determined using reference methods (3.9) or
forecasted using the prediction model (3.13)
Note 1 to entry: Examples of constituents that can be predicted by near infrared (NIR) spectroscopy include moisture,
fat, protein, total glucosinolate, etc. For examples of appropriate methods, see ISO 10565, ISO 10632, ISO 16634-1,
ISO 1705, ISO 659, ISO 771, ISO 9167 and ISO 10519. The measuring units of the parameters that will be predicted shall
be the units used in the reference methods.
Note 2 to entry: It is possible to develop and validate NIR methods for other constituents than the ones listed in Note 1
to entry, as long as the procedure from this document is observed.
3.9
reference method
method or protocol producing definitive results, called reference values (3.10), against which the predicted
NIR results are compared
Note 1 to entry: These reference methods are typically direct analytical techniques capable of quantifying one or
more seed constituents without reliance on intermediate or inferential measurements. These methods can be
internationally standardized methods or in-house validated methods recognized by experts or by agreement between
parties.
Note 2 to entry: The reference method shall be a validated method of known accuracy, repeatability and reproducibility.
Note 3 to entry: International standardized methods that have been developed and approved by one or more
internationally recognized standards organizations such as ISO are often used as reference methods.

3.10
reference value
true
experimentally determined or certified value of a constituent of oilseeds or oilseed meals, obtained using a
reference method (3.9)
3.11
predicted value
constituent content (3.8) of oilseeds or oilseed meals forecasted by a prediction model (3.13) using the near
infrared spectra (3.7) from oilseeds and oilseed meals produced by a near infrared spectrometer (3.2)
3.12
standardization of the instrument
process whereby a group of near infrared instruments is adjusted so that it predicts similar constituent
contents (3.8) when operating the same prediction model(s) (3.13) on the same sample(s)
Note 1 to entry: A number of techniques can be used but these can be broadly defined as either pre-prediction methods
where the spectra of samples are adjusted to minimize the differences between the response of a master instrument
and each instrument in the group or post-prediction methods where linear regression is used to adjust the predicted
values (3.11) produced by each instrument to make them as similar as possible to those from a master instrument. See
the Bibliography for more information.
3.13
prediction model
calibration model
multivariate prediction equation designed to forecast the constituent content (3.8) of oilseeds and oilseed
meals based on their near infrared (NIR) spectra (3.7)
Note 1 to entry: These models are typically developed using multivariate prediction techniques (3.14) such as principal
component regression (3.16), partial least squares regression (3.17), multiple linear regression (3.18) or
more advanced nonlinear methods.
Note 2 to entry: Prediction models can also be developed to predict the physical properties of a sample.
Note 3 to entry: The term “calibration model” is well known and well understood by advanced NIR spectroscopy users.
To outline that NIR spectroscopy is a secondary form of measurement, the term “prediction model” is used in this
document instead of the term “calibration model”.
3.14
multivariate prediction technique
statistical modelling approach used to establish quantitative relationships between multiple predictor
variables (e.g. near infrared spectra (3.7)) and one or more response variables (e.g. constituent content (3.8))
Note 1 to entry: This technique is commonly applied in NIR spectroscopy, where overlapping and correlated signals
from an NIR scan require methods such as principal component regression (3.16), partial least squares regression (3.17),
multiple linear regression (3.18) or more advanced nonlinear methods to extract meaningful information to develop
a prediction model (3.13).
Note 2 to entry: In literature, multivariate prediction techniques are often referred to as “chemometrics”.
3.15
principal component analysis
PCA
form of data compression, which for a set of samples works solely with the x data, e.g. near infrared spectra
(3.7) for near infrared spectroscopy (3.1), and finds principal components (PCs) (factors) according to a rule
that says that each PC expresses the maximum variation in the data at any time and is uncorrelated with any
other PC
Note 1 to entry: The first PC expresses as much as possible of the variability in the original spectra data of the
samples. Its effect is then subtracted from the x data and a new PC derived again expressing as much as possible of the
variability in the remaining data. It is possible to derive as many PCs as there are either data points in the spectrum
or samples in the data set, but the major effects in spectra can be shown to be concentrated in the first few PCs and
therefore the number of data that need to be considered is dramatically reduced.

Note 2 to entry: PCA produces two new sets of variables at each stage:
a) PC scores represent the response of each sample on each PC;
b) principal component loadings represent the relative importance of each data point in the original spectra to the
principal component.
Note 3 to entry: PCA has many uses (e.g. in spectral interpretation) but is most widely used in the identification of
spectral outliers (3.21).
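The stepwise extraction described in Note 1 to entry (derive a PC, subtract its effect from the x data, derive the next PC) can be sketched with the NIPALS algorithm. This is an illustrative pure-Python sketch under stated assumptions: the function name `nipals_pca`, the tolerance and the iteration limit are hypothetical choices, not part of this document, and commercial software may use other algorithms (e.g. singular value decomposition).

```python
import math


def nipals_pca(spectra, n_components, tol=1e-18, max_iter=500):
    """Return (scores, loadings) for mean-centred spectra.

    spectra: list of samples, each a list of absorbances (the x data).
    scores[k][i] is the score of sample i on PC k;
    loadings[k][j] is the loading of data point j on PC k.
    """
    n, m = len(spectra), len(spectra[0])
    means = [sum(row[j] for row in spectra) / n for j in range(m)]
    # E holds the part of the x data not yet explained by earlier PCs
    E = [[row[j] - means[j] for j in range(m)] for row in spectra]
    scores, loadings = [], []
    for _ in range(n_components):
        col_ss = [sum(E[i][j] ** 2 for i in range(n)) for j in range(m)]
        j0 = max(range(m), key=col_ss.__getitem__)
        if col_ss[j0] == 0.0:
            break  # no variation left to describe
        t = [E[i][j0] for i in range(n)]  # starting guess for the scores
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]  # unit-length loading vector
            t_new = [sum(E[i][j] * p[j] for j in range(m)) for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(t, t_new))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        scores.append(t)
        loadings.append(p)
        # Subtract the effect of this PC before deriving the next one
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
    return scores, loadings
```

Each successive PC describes less variation than the one before, and the score vectors of different PCs are uncorrelated, which is the rule stated in the definition.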
3.16
principal component regression
PCR
technique which uses the scores on each principal component as regressors in a multiple linear regression
(3.18) against y values representing the composition of samples
Note 1 to entry: As each principal component is orthogonal to every other principal component, the scores form an
uncorrelated data set with better properties than the original spectra. While it is possible to select a combination
of principal components for regression based on how well each principal component correlates to the constituent of
interest, most commercial software forces the regression to use all principal components up to the highest principal
component selected for the model (“the top-down approach”).
Note 2 to entry: When used in near infrared spectroscopy (3.1), the regression coefficients in the principal component
space are usually converted back to a prediction model (3.13) using all the data points in wavelength space.
3.17
partial least squares regression
PLS
form of data compression which uses a rule to derive the factors consisting of allowing each factor in turn to
maximize the covariance (3.47) between the y data and all possible linear combinations of the x data
Note 1 to entry: PLS is a balance between variance and correlation with each factor being influenced by both effects.
PLS factors are therefore more directly related to variability in y values than are principal components. PLS produces
three new variables: loading weights (which are not orthogonal to each other), loadings and scores which are both
orthogonal.
Note 2 to entry: PLS models are produced by regressing PLS scores against y values. As with principal component
regression (3.16), when used in near infrared spectroscopy (3.1), the regression coefficients in PLS space are usually
converted back to a prediction model (3.13) using all the data points in wavelength space.
3.18
multiple linear regression
MLR
technique using a combination of several x variables to predict a single y variable
Note 1 to entry: In near infrared spectroscopy (3.1), the x values are either absorbance values at selected wavelengths
in the near infrared reflectance (3.3) or derived variables such as principal component analysis (3.15) or partial least
squares regression (3.17) scores.
3.19
artificial neural network
ANN
non-linear modelling technique loosely based on the architecture of biological neural systems
Note 1 to entry: The network is initially trained by supplying a data set with several x (spectral or derived variables
such as principal component analysis scores) values and reference y values. During the training process, the
architecture of the network may be modified, and the neurons assigned weighting coefficients for both inputs and
outputs to produce the best possible predictions of the parameter values.
Note 2 to entry: Neural networks require a lot of data in training.

3.20
Mahalanobis distance
global h-value
distance in principal component (PC) space between a data point and the centre of the principal component space
Note 1 to entry: Mahalanobis distance is a nonlinear measurement. In PC space, a set of samples usually form a curve
shaped distribution. The ellipsoid that best represents the probability distribution of the set can be estimated by
building the covariance (3.47) matrix of the samples. The Mahalanobis distance is simply the distance of the test point
from the centre of mass divided by the width of the ellipsoid in the direction of the test point. In some software, the
Mahalanobis distance is referred to as the “global h-value” and outlier detection depends upon how many standard
deviations of h a sample is from the centre.
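As an illustration of Note 1 to entry, the following sketch computes the Mahalanobis distance of a test point from the centre of a set of samples described by two PC scores each. Restricting the example to two PCs is a simplifying assumption that keeps the covariance matrix inversion explicit; the function name is hypothetical.

```python
import math


def mahalanobis_2d(scores, point):
    """Mahalanobis distance of `point` from the centre of `scores`.

    scores: list of (pc1, pc2) tuples for the reference population.
    point:  (pc1, pc2) tuple for the sample being tested.
    """
    n = len(scores)
    m1 = sum(s[0] for s in scores) / n
    m2 = sum(s[1] for s in scores) / n
    # Elements of the 2 x 2 sample covariance matrix
    s11 = sum((s[0] - m1) ** 2 for s in scores) / (n - 1)
    s22 = sum((s[1] - m2) ** 2 for s in scores) / (n - 1)
    s12 = sum((s[0] - m1) * (s[1] - m2) for s in scores) / (n - 1)
    det = s11 * s22 - s12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    d1, d2 = point[0] - m1, point[1] - m2
    # d^2 = d' S^-1 d, written out for the 2 x 2 case
    d_squared = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.sqrt(d_squared)
```

For a population that is twice as wide along PC1 as along PC2, a point two units out along PC1 and a point one unit out along PC2 have the same Mahalanobis distance, because each distance is scaled by the width of the ellipsoid in that direction.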
3.21
outlier
member of a set of values which is inconsistent with the other members of that set
Note 1 to entry: As defined by ISO 5725-1.
Note 2 to entry: For near infrared spectroscopy (3.1) data, outliers are points in any data set that can be shown
statistically to have values that lie well outside an expected distribution. Outliers are normally classified as either x-
(spectral) outliers (3.22) or y- (reference data) outliers (3.23).
3.22
x-outlier
outlier (3.21) related to near infrared (NIR) spectra (3.7)
Note 1 to entry: An x-outlier can arise from a spectrum with instrumental faults or from a sample type that is radically
different from the other samples, or in prediction, a sample type not included in the original prediction sample set (3.25).
Note 2 to entry: NIR spectra outlier detection can be performed on either raw or pre-processed data using principal
component analysis (3.15) or Mahalanobis distance (3.20) analysis.
3.23
y-outlier
outlier (3.21) related to an error in the reference value (3.10) of the constituent content (3.8)
EXAMPLE An error in transcription or in the value obtained by the reference laboratory.
3.24
leverage
measure of how far a sample lies from the centre of the population space defined by a model
Note 1 to entry: Samples with high leverage have high influence on the model. Leverage is calculated by measuring the
distance between a projected point and the centre of the model.
3.25
prediction sample set
calibration sample set
samples of known constituent contents (3.8), established by reference methods (3.9), used to develop one or
several prediction models (3.13) using their near infrared spectra (3.7)
3.26
validation sample set
samples of known constituent contents (3.8), established by reference methods (3.9), used to validate or prove
a prediction model (3.13)
Note 1 to entry: The validation set usually contains samples having the same characteristics as those selected in the
prediction sample set (3.25). Often alternate or nth samples (ranked in order of the constituent of interest) are allocated
to the prediction and validation data sets from the same pool of samples. None of the samples of the validation sample
set shall be used in the prediction sample set.
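The allocation of every nth ranked sample to the validation sample set can be sketched as follows. The function name, the (identifier, reference value) tuple layout and the default n = 3 are assumptions made for illustration only.

```python
def split_every_nth(samples, n=3):
    """Rank samples by reference value and send every nth one to validation.

    samples: list of (sample_id, reference_value) tuples.
    Returns (prediction_set, validation_set); the two sets never overlap.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    ranked = sorted(samples, key=lambda s: s[1])
    prediction_set, validation_set = [], []
    for position, sample in enumerate(ranked):
        if position % n == n - 1:
            validation_set.append(sample)
        else:
            prediction_set.append(sample)
    return prediction_set, validation_set
```

Ranking before splitting means that both sets span the full range of the constituent of interest.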

3.27
test sample
when using or testing a prediction model (3.13), any sample or set of samples, excluding those used to develop
the model
3.28
independent sample test set
test sample set that consists of samples that are from a different geographical region or industrial plant, or that
have been collected at a different time (e.g. different harvest), than those used to create and validate a prediction
model (3.13)
Note 1 to entry: Independent test samples are never used to develop the prediction model.
Note 2 to entry: These samples form a true test of a prediction model.
3.29
monitoring sample set
set of samples that is used for the routine control of prediction models (3.13)
Note 1 to entry: This sample set shall be independent from the prediction sample set (3.25).
3.30
cross-validation
method of generating prediction statistics where, repeatedly, a subset of samples is removed from a
prediction sample set (3.25), a model is calculated on the remaining samples, residuals (3.33) are calculated
on the validation subset, and when this process has been run a number of times, a calculation of prediction
statistics is made on all the residuals
Note 1 to entry: Full cross-validation omits one sample at a time and is run n times (where there are n prediction
samples). Where a larger subset is removed, the cross-validation cycle is usually run at least eight times before the
statistics are calculated. Finally, a model is calculated using all the prediction samples.
Note 2 to entry: There are several disadvantages to the use of cross-validation methods to validate the near infrared
reflectance (3.3) model. First, cross-validation statistics tend to be optimistic when compared with those for an
independent test sample set. Second, if there is any duplication in the calibration data (e.g. the same sample scanned
on several instruments or at different times), it is necessary to always assign all copies of the same sample to the same
cross-validation segment, otherwise very optimistic statistics are produced.
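The requirement in Note 2 to entry, that all copies of the same sample fall in the same cross-validation segment, can be sketched as follows. The function name and the round-robin assignment of samples to segments are illustrative assumptions; any assignment rule works provided it is applied per sample rather than per spectrum.

```python
def grouped_segments(sample_ids, n_segments):
    """Assign spectra to cross-validation segments by sample identity.

    sample_ids: one identifier per spectrum; replicate scans of the same
                physical sample share an identifier.
    Returns a list of segments, each a list of spectrum indices.
    """
    unique_ids = sorted(set(sample_ids))
    # Decide the segment once per sample, not once per spectrum
    segment_of = {sid: k % n_segments for k, sid in enumerate(unique_ids)}
    segments = [[] for _ in range(n_segments)]
    for index, sid in enumerate(sample_ids):
        segments[segment_of[sid]].append(index)
    return segments
```

Each segment is then left out in turn, a model is calculated on the remaining spectra, and the residuals for the left-out spectra are pooled before the statistics are calculated.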
3.31
overfitting
addition of too many regression terms in a multiple linear regression (3.18)
Note 1 to entry: This is a modelling error where a prediction model (3.13) captures not only the essential chemical
information but also the noise and/or irrelevant variability in the spectra (3.7) of the material being analysed.
Note 2 to entry: A result of overfitting, when samples not in the prediction sample set (3.25) are predicted, is that
statistics such as root mean square error of prediction (3.41) or SEP are much poorer than expected.
3.32
z-score
performance criterion calculated by dividing the difference between the near infrared predicted result (3.11)
and the reference value (3.10) by a target value for the standard deviation, usually the standard deviation for
proficiency assessment
Note 1 to entry: This is a standardized measure of laboratory bias, calculated using the reference value and the
standard deviation of the validation sample set (3.26) or the independent sample test set (3.28).
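The definition above can be written as a one-line calculation. In this sketch the function and argument names are illustrative, and the example values are invented for demonstration.

```python
def z_score(predicted, reference, sigma_target):
    """(NIR predicted value - reference value) / target standard deviation."""
    if sigma_target <= 0:
        raise ValueError("target standard deviation must be positive")
    return (predicted - reference) / sigma_target


if __name__ == "__main__":
    # Invented example: predicted 42,6 %, reference 42,0 %, target SD 0,3 %
    print(round(z_score(42.6, 42.0, 0.3), 2))
```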
3.33
residual
difference between an observed value of the response variable and the corresponding predicted value (3.11)
of the response variable
Note 1 to entry: As defined by ISO 3534-3.

Note 2 to entry: For near infrared spectroscopy (3.1) data, a residual is the difference between a reference value (3.10)
and the value predicted by a regression model. Residuals are used in the calculation of regression statistics.
3.34
spectral residual
residual (3.33) after chemometric treatment of a spectrum arising from spectral variation not described by
the model
3.35
bias
e
difference between the mean reference value (3.10), ȳ, and the mean value, x̄, forecasted by the NIR prediction
model (3.13) for a set of samples
3.36
bias confidence limit
BCL
$T_b$
value greater than which a bias (3.35) is significantly different from zero at the confidence level specified
Note 1 to entry: See also 7.3.
3.37
standard error of calibration
SEC
$s_{\mathrm{SEC}}$
for a prediction model (3.13), an expression of the average difference between predicted values (3.11) and
reference values (3.10) for samples used to develop the model
Note 1 to entry: As with standard error of cross-validation (3.38), root mean square error of cross-validation (3.39),
standard error of prediction (3.40) and root mean square error of prediction (3.41), this expression of the average
difference refers to the square root of the sum of squared residual values divided by the number of values corrected
for degrees of freedom, where 68 % of the errors are below this value.
3.38
standard error of cross-validation
SECV
$s_{\mathrm{SECV}}$
used with the data from the prediction sample set (3.25), expression of the bias-corrected average difference
between predicted values (3.11) and reference values (3.10) for the subset of samples selected as prediction
sample set during the cross-validation (3.30) process
3.39
root mean square error of cross-validation
RMSECV
$s_{\mathrm{RMSECV}}$
expression of the average difference between predicted values (3.11) and reference values (3.10) for the subset
of samples selected as prediction samples (3.25) during the cross-validation (3.30) process
Note 1 to entry: RMSECV includes any bias (3.35) in the predictions.
3.40
standard error of prediction
SEP
standard error of prediction corrected for the bias
SEP(C)
$s_{\mathrm{SEP}}$
expression of the bias-corrected average difference between predicted values (3.11) and reference values
(3.10) predicted by a prediction model (3.13) when applied to a set of samples not included in the prediction
model development
Note 1 to entry: SEP shall only be calculated for the validation sample set (3.26) or the independent sample test set (3.28).

Note 2 to entry: The SEP covers a confidence interval of 68 %; multiplied by 1,96, it covers an interval of 95 %.
3.41
root mean square error of prediction
RMSEP
$s_{\mathrm{RMSEP}}$
expression of the average difference between reference values (3.10) and predicted values (3.11) by a prediction
model (3.13) when applied to a set of samples not included in the development of the prediction model
Note 1 to entry: RMSEP shall only be calculated for the validation sample set (3.26) or the independent sample test set (3.28).
Note 2 to entry: RMSEP includes any bias (3.35) in the predictions.
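The relationship between bias (3.35), SEP (3.40) and RMSEP (3.41) can be illustrated with a short calculation. The formulas below are the conventional ones (residual = reference value minus predicted value, SEP with n − 1 degrees of freedom, RMSEP with n); they are stated here as an assumption because the defining equations belong to Clause 7, which is outside this extract.

```python
import math


def validation_statistics(reference, predicted):
    """Return (bias, sep, rmsep) for paired reference and predicted values."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(reference)
    # Residual: reference value minus NIR predicted value
    residuals = [y - p for y, p in zip(reference, predicted)]
    bias = sum(residuals) / n
    # RMSEP includes the bias
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    # SEP is corrected for the bias
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    return bias, sep, rmsep
```

With these formulas, RMSEP² = bias² + (n − 1)/n × SEP², which is why RMSEP and SEP are close when the bias is small and n is large.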
3.42
unexplained error confidence limit
UECL
$T_{\mathrm{UE}}$
maximum standard error of prediction (3.40) limit significantly different from the standard error of
calibration (3.37) at the confidence limit specified
3.43
RSQ
$r_{xy}$
coefficient of determination, square of the multiple correlation coefficient between predicted values (3.11)
and reference values (3.10)
Note 1 to entry: When expressed as a percentage, it represents the proportion of the variance explained by the
regression model.
3.44
slope
b
in a regression line, representation of the amount y increases per increase in x
3.45
intercept
a
in a regression line, value of y when x is zero
3.46
residual standard deviation
$S_{\mathrm{RES}}$
expression of the average size of the difference between reference and fitted values after a slope (3.44) and
intercept (3.45) correction has been performed
3.47
covariance
$s_{\hat{y}y}$
measure of how much two random variables vary together
Note 1 to entry: If, for a population of samples, an increase in x is matched by an increase in y, then the covariance
between the two variables will be positive. If an increase in x is matched by a decrease in y, then the covariance will be
negative. When values are uncorrelated then the covariance is zero.
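The terms RSQ (3.43), slope (3.44), intercept (3.45), residual standard deviation (3.46) and covariance (3.47) are linked through a simple linear regression. The sketch below assumes that x holds the NIR predicted values and y the reference values, and that the residual standard deviation uses n − 2 degrees of freedom; both are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def regression_statistics(x, y):
    """Simple linear regression of y on x: y = a + b * x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equally long series with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y)) / (n - 1)
    var_x = sum((u - mean_x) ** 2 for u in x) / (n - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (n - 1)
    slope = covariance / var_x              # b
    intercept = mean_y - slope * mean_x     # a, the value of y when x is zero
    rsq = covariance ** 2 / (var_x * var_y)
    # Spread of y about the fitted line after slope and intercept correction
    ss_res = sum((v - intercept - slope * u) ** 2 for u, v in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))
    return {"covariance": covariance, "slope": slope,
            "intercept": intercept, "rsq": rsq, "s_res": s_res}
```

A positive covariance gives a positive slope, and for an ideal prediction model the slope is close to 1 and the intercept close to 0.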
3.48
prediction model robustness
ability of an NIR prediction model (3.13), built from a known sample set using multivariate prediction techniques
(3.14), to maintain accurate and reliable predictions across varying conditions such as sample types,
environmental factors, and time for any test sample (3.27)

3.49
ratio of performance to deviation
RPD
statistical parameter used to evaluate the predictive quality, robustness (3.48), of a prediction model (3.13)
during validation using a validation sample set (3.26) or testing using an independent sample test set (3.28)
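RPD is commonly calculated as the standard deviation of the reference values divided by the SEP. That formula is an assumption here, since the defining equation is in 7.8, outside this extract; some authors use RMSEP in the denominator instead. A minimal sketch with a hypothetical function name:

```python
import math
import statistics


def ratio_of_performance_to_deviation(reference, predicted):
    """RPD = standard deviation of reference values / SEP (assumed formula)."""
    n = len(reference)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    residuals = [y - p for y, p in zip(reference, predicted)]
    bias = sum(residuals) / n
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    if sep == 0:
        raise ValueError("SEP is zero; RPD is undefined")
    return statistics.stdev(reference) / sep
```

A larger RPD indicates that the prediction error is small relative to the natural spread of the constituent in the sample set.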
4 Principle
NIR spectroscopy is an indirect analytical technique: it requires developing prediction models using samples
of known composition determined by standard/reference wet-chemical methods. These prediction models
are based on the correlations of NIR spectra of samples with their reference data. Once a prediction model
is developed, quantitative predictions for the composition of a sample can be attempted from the sample
NIR spectra alone. The accuracy of prediction models shall be validated by predicting the composition
of “unknown” samples using the NIR spectrometer for which the prediction models were developed and
comparing the predicted NIR results with the reference results obtained by the standard/reference wet-
chemistry methods used for the prediction model development.
5 Apparatus
5.1 Near-infrared instruments, using diffuse reflectance, transmittance or transflectance measurement
covering the NIR wavelength region, 780 nm to 2 500 nm (approximately 12 820 cm−1 to 4 000 cm−1), or segments of this
region or selected wavelengths (or wavenumbers). Some instruments span a wider range of wavelengths
(400 nm to 2 500 nm) and are able to measure in the visible range (400 nm to 800 nm). The optical principle
may be dispersive (e.g. grating monochromators), interferometric or non-thermal (e.g. light-emitting diodes,
laser diodes, lasers). The instrument should be provided with a diagnostic test system for testing photometric
noise and reproducibility, wavelength or wavenumber accuracy and wavelength or wavenumber precision
(for scanning spectrophotometers).
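Wavelength and wavenumber describe the same spectral axis and are related by wavenumber (cm⁻¹) = 10⁷ / wavelength (nm). The sketch below illustrates that conversion; the function names are illustrative and not part of this document.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nanometres."""
    if wavenumber_cm <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm


# Limits of the visible and NIR regions mentioned in the text.
for nm in (400, 780, 2500):
    print(f"{nm} nm -> {nm_to_wavenumber(nm):.0f} cm^-1")
```

Under this relation, 2 500 nm corresponds to 4 000 cm⁻¹ and 400 nm to 25 000 cm⁻¹.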
A background measurement (reference or baseline measurement) shall be taken routinely before analysing
a sample set. It captures the spectral response of the system without the sample. It can be done using air, a
reference material provided by the manufacturer or an empty sample holder.
The scanning resolution (spectral resolution) of an NIR spectrometer is a key parameter that affects
the quality and interpretability of the spectral data. Higher scanning resolution improves the distinction
between overlapping absorption bands, usually leading to improved precision in the chemometric models.
However, higher scanning resolution means slower scans, more noise in the signal and larger data
files. For NIR spectrometers measuring in transmittance, the sample pathlength (sample thickness)
measurements should be optimized according to the manufacturer’s recommendation with respect to signal
intensity for obtaining linearity and maximum signal/noise ratio.
For NIR spectrometers measuring in reflectance, reflective surface quality is important; a quartz window or
other appropriate material can be used to cover the interacting sample surface layer to eliminate drying effects.
For NIR spectrometers measuring in transflectance, both sample thickness and reflective surface quality
shall be optimized.
For all instruments, sampling effects can be reduced by scanning a sufficiently large sample volume or
surface to eliminate any significant influence by the non-homogeneity of chemical composition or physical
properties of the test sample. The sample size effect can be determined through validation of constituent
predictions using a representative set of samples of interest.
NOTE Some compounds need instruments with the visible range to be analysed. For example, chlorophyll is an
important parameter for canola seed quality. Chlorophyll content can only be predicted by including wavelengths in
the 650 nm to 700 nm range (visible range).
5.2 Appropriate milling or grinding device, for preparing the sample (if needed).
NOTE Changes in grinding or milling conditions can affect the NIR measurements.

When a prediction model has been developed for a constituent for a ground sample with a specific particle
size, it is recommended to:
— use the same particle size for all the samples analysed with the model; or
— ensure the prediction model was developed using representative samples spanning the range of particle
sizes intended to be measured, and the validation of the prediction model is conducted using representative
samples spanning all particle sizes.
6 Calibration and initial validation
6.1 General
Prior to any analyses with the NIR spectrometer, run the system diagnostic and the manufacturer’s daily
check according to the manufacturer’s guidelines.
NIR spectroscopy allows for the prediction of compounds of interest by the development of statistical models,
prediction models, acquired via a process called “instrument calibration”. A number of different multivariate
prediction techniques can be applied to the NIR spect
...


ISO/FDIS 18419:2024(en)
ISO/TC 34/SC 2
Secretariat: AFNOR
Date: 2025-10-29
Oilseeds — Application of near infrared spectrometry
Graines oléagineuses — Application de la spectrométrie dans le proche infrarouge
FDIS stage
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication
may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying,
or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO
at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: + 41 22 749 01 11
E-mail: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principle
5 Apparatus
6 Calibration and initial validation
6.1 General
6.2 Reference methods
6.3 Development of the prediction model
6.4 Cross-validation
6.5 Outliers
6.6 Validation of prediction models
6.7 Changes in measuring and instrument conditions
7 Statistics for performance measurement
7.1 General
7.2 Plot the results
7.3 Bias
7.4 Standard error of prediction
7.5 Root mean square error of prediction
7.6 Slope
7.7 Coefficient of determination
7.8 Ratio of performance to deviation
8 Sampling
9 Procedure
9.1 Preparation of test sample
9.2 Measurement
9.3 Evaluation of results
10 Checking instrument stability
10.1 Instrument diagnostics
10.2 Control sample
10.3 Instruments in a network
11 Running performance check of the prediction models
11.1 General
11.2 Control charts using the difference between reference and NIR results (validation samples)
12 Precision and accuracy
12.1 Repeatability
12.2 Accuracy
13 Test report
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of
ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent rights
in respect thereof. As of the date of publication of this document, ISO had not received notice of (a) patent(s)
which may be required to implement this document. However, implementers are cautioned that this may not
represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 34, Food products, Subcommittee SC 2,
Oleaginous seeds and fruits and oilseed meals.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
This document has been drafted using, as a basis, ISO 12099:2017 (prepared by Technical Committee ISO/TC
34, Food products, Subcommittee SC 10, Animal feeding stuffs) and ISO 21543:2020 (prepared by Technical
Committee ISO/TC 34, Food products, Subcommittee SC 5, Milk and milk products) and other reference
material listed in the Bibliography (References [1] to [5] and [27] to [32]).
Oilseeds — Application of near infrared spectrometry
1 Scope
This document specifies a procedure for the prediction by near infrared spectroscopy (NIRS) of constituents
such as moisture, fat and protein and some minor parameters such as total glucosinolates in oilseeds and
oilseed meals.
The determinations are based on spectrometric measurement in the near infrared spectral region.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
near infrared spectroscopy
NIRS
analytical technique that uses the near-infrared region of the electromagnetic spectrum, typically 780 nm to
2 500 nm (12 820 cm⁻¹ to 4 000 cm⁻¹) or segments of this region or selected wavelengths or
wavenumbers to probe the constituent content (3.8) of a sample
3.2
near infrared spectrometer
NIR spectrometer
analytical instrument designed to measure the absorbance or reflectance of near-infrared light (typically in
the wavelength range of 780 nm to 2 500 nm) of oilseed samples (seed and meals)
Note 1 to entry: When used under specified conditions, an NIR spectrometer can predict the constituent contents (3.8) of
oilseeds and oilseed meals by modelling relationships between the sample constituent composition and their near
infrared spectra (3.7) via a prediction model (3.13) using multivariate prediction techniques (3.14).
Note 2 to entry: Some instruments span a wider range of wavelengths (400 nm to 2 500 nm) and are able to measure
in the visible range (400 nm to 800 nm).
Note 3 to entry: Differences can still arise between NIR spectrometers (3.2) of the same make and model, due to several
factors, even if the hardware appears identical. Even small differences in alignment of mirrors, gratings, or detector
performances can affect spectral output.
3.3
near infrared reflectance
NIRR
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near-infrared light
diffusely reflected back from the surface of a sample collected by a detector in front of the sample

3.4
near infrared transmittance
NIRT
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near-infrared light
that has travelled through a sample and is then collected by a detector behind the sample
3.5
near infrared transflectance
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near-infrared light
that has travelled through a sample and is then reflected back from a reflective surface behind it
Note 1 to entry: The detector captures the combined transmitted (3.4) and reflected (3.3) light.
3.6
near infrared spectroscopy network
NIRS network
number of near infrared spectrometers (3.2), operated using the same prediction models (3.13), which are
usually standardized so that the differences in predicted values (3.11) for a set of test samples (3.27) are
minimized
3.7
near infrared spectra
NIR spectra
graphical representation of the absorbance, obtained by near infrared reflectance (3.3), near
infrared transmittance (3.4) or near infrared transflectance (3.5), of NIR radiation by a sample as a function of
wavelength or wavenumber
Note 1 to entry: It captures the molecular overtones and combination vibrations primarily associated with C-H, N-H,
O-H, and S-H bonds.
Note 2 to entry: The term “spectra” is used rather than “spectrum” because modern NIR instruments
typically acquire and automatically average multiple individual spectra to generate a representative composite output.
Note 3 to entry: In literature, the term “scans” is sometimes used in place of spectra.
3.8
constituent content
mass fraction of substances of oilseeds and oilseed meals determined using reference methods (3.9) or
forecasted using the prediction model (3.13)
Note 1 to entry: Examples of constituents that can be predicted by near infrared (NIR) spectroscopy include moisture,
fat, protein, total glucosinolate, etc. For examples of appropriate methods, see ISO 10565, ISO 10632, ISO 16634-1, ISO 1705, ISO 659, ISO 771,
ISO 9167 and ISO 10519. The measuring units of the parameters that will be predicted shall be the units used in the
reference methods.
Note 2 to entry: It is possible to develop and validate near infrared (NIR) methods for constituents other than the ones
listed in Note 1 to entry, as long as the procedure from this document is observed.
3.9
reference method
method or protocol producing definitive results, called reference results or true value (3.10), against which the
predicted NIR results (3.11) are compared
Note 1 to entry: These reference methods are typically direct analytical techniques capable of quantifying one or more
seed constituents without reliance on intermediate or inferential measurements. These methods can be
internationally standardized methods or in-house validated methods recognized by experts or by agreement between
parties.
Note 2 to entry: The reference method (3.9) shall be a validated method of known accuracy, repeatability and
reproducibility.
Note 3 to entry: International standardized methods that have been developed and approved by one or more
internationally recognized standards organizations such as the International Organization for Standardization (ISO) are
often used as reference methods.
3.10
reference value
true value
experimentally determined or certified value of a constituent (3.8) of oilseeds or oilseed meals,
obtained using a reference method (3.9)
3.11
predicted value
constituent content (3.8) of oilseeds or oilseed meals forecasted by a prediction model (3.13) using the
near infrared spectra (3.7) from oilseeds and oilseed meals produced by a near infrared
spectrometer (3.2)
3.12
standardization of the instrument
process whereby a group of near infrared instruments (3.2) is adjusted so that it predicts
similar constituent contents (3.8) when operating the same prediction model(s) (3.13) on the same sample(s)
Note 1 to entry: A number of techniques can be used but these can be broadly defined as either pre-prediction methods,
where the spectra of samples are adjusted to minimize the differences between the response of a master instrument and
each instrument in the group, or post-prediction methods, where linear regression is used to adjust the predicted values
(3.11) produced by each instrument to make them as similar as possible to those from a master instrument. See
the Bibliography for more information.
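As an illustration of the post-prediction approach described in Note 1 to entry, the sketch below fits a least-squares line that maps the predicted values of one instrument onto those of a master instrument for the same samples, then applies that line as a correction. The instrument names, the data and the function names are hypothetical; real standardization procedures may differ.

```python
from statistics import mean


def fit_slope_intercept(x, y):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


# Hypothetical predicted oil contents (% mass fraction) for the same
# standardization samples measured on two instruments of a network.
master = [38.2, 41.5, 44.1, 46.8, 49.3]
secondary = [37.6, 41.1, 43.9, 46.9, 49.6]

a, b = fit_slope_intercept(secondary, master)


def standardize(predicted_value):
    """Adjust a secondary-instrument prediction towards the master."""
    return a + b * predicted_value


adjusted = [standardize(p) for p in secondary]
print(f"intercept = {a:.3f}, slope = {b:.3f}")
```

The correction is only meaningful within the range of values covered by the standardization samples.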
3.13
prediction model
calibration model
multivariate prediction equation designed to forecast the constituent content (3.8) of oilseeds and oilseed
meals based on their near infrared (NIR) spectra (3.7)
Note 1 to entry: These models are typically developed using multivariate prediction techniques (3.14)
such as principal component regression (3.16), partial least squares regression (3.17), multiple linear regression (3.18)
or more advanced nonlinear methods (3.19).
Note 2 to entry: Prediction models can also be developed to predict the physical properties of a sample.
Note 3 to entry: The term “calibration model” is well known and well understood by advanced NIR spectroscopy users.
To outline that NIR spectroscopy is a secondary form of measurement, the term “prediction model” is used in this
document instead of the term “calibration model”.
3.14
multivariate prediction technique
statistical modelling approach used to establish quantitative relationships between multiple predictor
variables (e.g. near infrared spectra (3.7)) and one or more response variables (e.g. constituent content (3.8))
Note 1 to entry: This technique is commonly applied in NIR spectroscopy, where overlapping and correlated
signals from an NIR scan (3.7) require methods such as principal component regression (3.16), partial least squares
regression (3.17), multiple linear regression (3.18) or more advanced nonlinear
methods (3.19) to extract meaningful information to develop a prediction model (3.13).
Note 2 to entry: In literature, multivariate prediction techniques are often referred to as “chemometrics”.
3.15
principal component analysis
PCA
form of data compression, which for a set of samples works solely with the x data, e.g. near infrared spectra
(3.7) for near infrared spectroscopy (3.1), and finds principal components (PCs) (factors) according to a
rule that says that each PC expresses the maximum variation in the data at any time and is uncorrelated with
any other PC
Note 1 to entry: The first PC expresses as much as possible of the variability in the original spectra data of the samples.
Its effect is then subtracted from the x data and a new PC derived again expressing as much as possible of the variability
in the remaining data. It is possible to derive as many PCs as there are either data points in the spectrum or samples in
the data set, but the major effects in spectra can be shown to be concentrated in the first few PCs and therefore the
number of data that need to be considered is dramatically reduced.
Note 2 to entry: PCA produces two new sets of variables at each stage:
a) PC scores represent the response of each sample on each PC;
b) principal component loadings represent the relative importance of each data point in the original spectra to the
principal component.
Note 3 to entry: PCA has many uses (e.g. in spectral interpretation) but is most widely used in the identification of spectral
outliers (3.21).
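The stepwise derivation described in Note 1 to entry (find the PC with the largest variation, subtract its effect, repeat) can be sketched with a simple iterative scheme of the NIPALS type. This is an illustrative sketch only, not the algorithm of any particular software: the toy "spectra" are hypothetical and the code assumes the data are not all identical.

```python
import math


def mean_centre(X):
    """Subtract the column (wavelength) means from every spectrum."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def dominant_pc(E, max_iter=500, tol=1e-20):
    """Scores t and unit-length loadings p of the largest PC of E."""
    # Start from the data column with the largest sum of squares.
    t = list(max(zip(*E), key=lambda col: sum(v * v for v in col)))
    for _ in range(max_iter):
        # Loadings: project every column of E onto the current scores.
        p = [sum(row[j] * ti for row, ti in zip(E, t)) for j in range(len(E[0]))]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: project every spectrum onto the loadings.
        t_new = [sum(e * pj for e, pj in zip(row, p)) for row in E]
        converged = sum((a - b) ** 2 for a, b in zip(t_new, t)) < tol
        t = t_new
        if converged:
            break
    return t, p


def pca(X, n_components):
    """Return (scores, loadings), one list per principal component."""
    E = mean_centre(X)
    scores, loadings = [], []
    for _ in range(n_components):
        t, p = dominant_pc(E)
        scores.append(t)
        loadings.append(p)
        # Deflation: subtract this PC's effect before deriving the next one.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
    return scores, loadings


# Hypothetical absorbances: 5 samples measured at 4 wavelengths.
spectra = [
    [0.52, 0.61, 0.75, 0.70],
    [0.48, 0.58, 0.69, 0.66],
    [0.55, 0.66, 0.82, 0.74],
    [0.50, 0.57, 0.71, 0.69],
    [0.58, 0.70, 0.85, 0.79],
]
scores, loadings = pca(spectra, n_components=2)
```

Each entry of `scores` holds one value per sample and each entry of `loadings` one value per wavelength, matching items a) and b) of Note 2 to entry.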
3.16
principal component regression
PCR
technique which uses the scores on each principal component as regressors in a multiple linear regression
(3.18) against y values representing the composition of samples
Note 1 to entry: As each principal component is orthogonal to every other principal component, the scores form an
uncorrelated data set with better properties than the original spectra. While it is possible to select a combination of
principal components for regression based on how well each principal component correlates to the constituent of
interest, most commercial software forces the regression to use all principal components up to the highest principal
component selected for the model (“the top-down approach”).
Note 2 to entry: When used in near infrared spectroscopy (3.1), the regression coefficients in the principal component
space are usually converted back to a prediction model (3.13) using all the data points in wavelength space.
3.17
partial least squares regression
PLS
form of data compression which uses a rule to derive the factors consisting of allowing each factor in turn to
maximize the covariance (3.47) between the y data and all possible linear combinations of the x data
Note 1 to entry: PLS is a balance between variance and correlation with each factor being influenced by both effects. PLS
factors are therefore more directly related to variability in y values than are principal components. PLS produces three
new variables: loading weights (which are not orthogonal to each other), loadings, and scores which are both orthogonal.
Note 2 to entry: PLS models are produced by regressing PLS scores against y values. As with principal component
regression (3.16), when used in near infrared spectroscopy (3.1), the regression coefficients in PLS space are usually
converted back to a prediction model (3.13) using all the data points in wavelength space.
3.18
multiple linear regression
MLR
technique using a combination of several x variables to predict a single y variable
Note 1 to entry: In near infrared spectroscopy (3.1), the x values are either absorbance values at selected wavelengths in
the near infrared reflectance (3.3) or derived variables such as principal component analysis (3.15) or partial least squares
regression (3.17) scores.
3.19
artificial neural network
ANN
non-linear modelling technique loosely based on the architecture of biological neural systems
Note 1 to entry: The network is initially trained by supplying a data set with several x (spectral or derived variables such
as principal component analysis scores) values and reference y values. During the training process, the architecture of
the network may be modified, and the neurons assigned weighting coefficients for both inputs and outputs to produce
the best possible predictions of the parameter values.
Note 2 to entry: Neural networks require a lot of data in training.
3.20
Mahalanobis distance
global h-value
distance in principal component (PC) space between a data point and the centre of the principal component
space
Note 1 to entry: Mahalanobis distance is a nonlinear measurement. In PC space, a set of samples usually forms
a curve-shaped distribution. The ellipsoid that best represents the probability distribution of the set can be estimated by
building the covariance (3.47) matrix of the samples. The Mahalanobis distance is simply the distance of the test point
from the centre of mass divided by the width of the ellipsoid in the direction of the test point. In some software, the
Mahalanobis distance is referred to as the “global h-value” and outlier detection depends upon how many standard
deviations of h a sample is from the centre.
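A minimal sketch of the calculation in two dimensions is given below: the covariance matrix of the sample scores is built, inverted explicitly and used to scale the distance of a test point from the centre. The score values are hypothetical, and no outlier threshold is implied.

```python
import math
from statistics import mean


def mahalanobis_2d(point, samples):
    """Mahalanobis distance of a 2-D point from the centre of the samples."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = mean(xs), mean(ys)
    n = len(samples)
    # Elements of the sample covariance matrix (n - 1 in the denominator).
    s_xx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    s_yy = sum((y - my) ** 2 for y in ys) / (n - 1)
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = s_xx * s_yy - s_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = (dx, dy) S^-1 (dx, dy)^T with the 2 x 2 inverse written out.
    d2 = (s_yy * dx * dx - 2 * s_xy * dx * dy + s_xx * dy * dy) / det
    return math.sqrt(d2)


# Hypothetical scores of six samples on the first two principal components.
pc_scores = [(-1.2, 0.3), (-0.5, -0.4), (0.1, 0.2), (0.6, -0.1), (1.0, 0.5), (0.0, -0.5)]
print(f"distance of a new sample: {mahalanobis_2d((4.0, -2.0), pc_scores):.2f}")
```

Dividing by the covariance is what makes the distance relative to the width of the ellipsoid in the direction of the test point.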
3.21
outlier
member of a set of values which is inconsistent with the other members of that set
Note 1 to entry: As defined by ISO 5725-1.
Note 2 to entry: For near infrared spectroscopy (3.1) data, outliers are points in any data set that can be shown statistically
to have values that lie well outside an expected distribution. Outliers are normally classified as either x- (spectral) outliers
(3.22) or y- (reference data) outliers (3.23).
3.22
x-outlier
outlier (3.21) related to near infrared (NIR) spectra (3.7)
Note 1 to entry: An x-outlier can arise from a spectrum with instrumental faults or from a sample type that is radically
different from the other samples, or in prediction, a sample type not included in the original prediction sample set (3.25).
Note 2 to entry: NIR spectra outlier detection can be performed on either raw or pre-processed data using
principal component analysis (3.15) or Mahalanobis distance (3.20) analysis.
3.23
y-outlier
outlier (3.21) related to an error in the reference value (3.10) of the constituent content (3.8)
EXAMPLE An error in transcription or in the value obtained by the reference laboratory.
3.24
leverage
measure of how far a sample lies from the centre of the population space defined by a model
Note 1 to entry: Samples with high leverage have high influence on the model. Leverage is calculated by measuring the
distance between a projected point and the centre of the model.
3.25
prediction sample set
calibration sample set
samples of known constituent contents (3.8) established by reference methods (3.9) to develop one or
several prediction models (3.13) using their near infrared spectra (3.7)
Note 1 to entry: In some publications, the prediction sample set is referred to as the calibration sample set.
3.26
validation sample set
samples with known constituent contents (3.8) established by reference methods (3.9) to validate or prove a
prediction model (3.13)
Note 1 to entry: The validation set usually contains samples having the same characteristics as those selected in the
prediction sample set (3.25). Often alternate or nth samples (ranked in order of the constituent of interest) are
allocated to the prediction (3.25) and validation (3.26) data sets from the same pool of samples. None of the samples of
the validation sample set (3.26) shall be used in the prediction sample set (3.25).
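The allocation described in Note 1 to entry can be sketched as follows: samples are ranked by the constituent of interest and every nth sample is assigned to the validation set, the rest to the prediction set. The sample identifiers, protein values and the choice n = 3 are hypothetical.

```python
def split_every_nth(samples, n=3):
    """Rank (id, value) pairs by value; every nth goes to validation."""
    if n < 2:
        raise ValueError("n must be at least 2")
    ranked = sorted(samples, key=lambda s: s[1])
    validation = ranked[n - 1::n]
    prediction = [s for i, s in enumerate(ranked) if (i + 1) % n != 0]
    return prediction, validation


# Hypothetical pool: (sample id, reference protein content in % mass fraction).
pool = [
    ("S1", 21.4), ("S2", 19.8), ("S3", 23.1), ("S4", 20.5), ("S5", 22.2),
    ("S6", 18.9), ("S7", 24.0), ("S8", 21.0), ("S9", 22.7),
]
prediction_set, validation_set = split_every_nth(pool, n=3)
print("validation:", [sid for sid, _ in validation_set])
```

Because the two lists are built from disjoint positions of the ranked pool, no validation sample can appear in the prediction sample set.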
3.27
test sample
when using or testing a prediction model (3.13), any sample or set of samples, excluding those used to develop
the model
3.28
independent sample test set
test sample set that consists of samples that are from a different geographical region or industrial plant, or have
been collected at a different time (e.g. different harvest), than those used to create (3.25) and validate (3.26) a
prediction model (3.13)
Note 1 to entry: Independent test samples are never used to develop the prediction model.
Note 2 to entry: These samples form a true test of a prediction model.
3.29
monitoring sample set
set of samples that is used for the routine control of prediction models (3.13)
Note 1 to entry: This sample set shall be independent from the prediction sample set (3.25).
3.30
cross-validation
method of generating prediction statistics where, repeatedly, a subset of samples is removed from a prediction
sample set (3.25), a model is calculated on the remaining samples, residuals (3.33) are calculated on the
validation subset and, when this process has been run a number of times, a calculation of prediction
statistics is made on all the residuals
Note 1 to entry: Full cross-validation omits one sample at a time and is run n times (where there are n prediction
samples). Where a larger subset is removed, the cross-validation cycle is usually run at least eight times before the
statistics are calculated. Finally, a model is calculated using all the prediction samples.
Note 2 to entry: There are several disadvantages to the use of cross-validation methods to validate the near infrared
reflectance (3.3) model. First, cross-validation statistics tend to be optimistic when compared with those for an
independent test sample set. Second, if there is any duplication in the calibration data (e.g. the same sample scanned on
several instruments or at different times), it is necessary to always assign all copies of the same sample to the same cross-
validation segment, otherwise very optimistic statistics are produced.
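The sketch below illustrates segmented cross-validation in which all scans of the same sample are kept in the same segment, as Note 2 to entry requires. For brevity it uses a one-variable linear model instead of a multivariate one; the data, the function names and the number of segments are hypothetical.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    return y_bar - b * x_bar, b


def grouped_cross_validation(x, y, sample_ids, n_segments):
    """Return (RMSECV, residuals) with whole samples assigned to segments."""
    unique_ids = sorted(set(sample_ids))
    # Segments are assigned per sample, never per individual scan.
    segment_of = {sid: k % n_segments for k, sid in enumerate(unique_ids)}
    residuals = []
    for segment in range(n_segments):
        train = [i for i, sid in enumerate(sample_ids) if segment_of[sid] != segment]
        held_out = [i for i, sid in enumerate(sample_ids) if segment_of[sid] == segment]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        residuals.extend(y[i] - (a + b * x[i]) for i in held_out)
    rmsecv = math.sqrt(mean(r * r for r in residuals))
    return rmsecv, residuals


# Hypothetical data: four samples, each scanned twice (x = absorbance at one
# wavelength, y = reference moisture content in % mass fraction).
sample_ids = ["A", "A", "B", "B", "C", "C", "D", "D"]
x = [0.21, 0.22, 0.30, 0.29, 0.41, 0.40, 0.52, 0.50]
y = [6.1, 6.1, 7.0, 7.0, 8.2, 8.2, 9.1, 9.1]
rmsecv, residuals = grouped_cross_validation(x, y, sample_ids, n_segments=4)
print(f"RMSECV = {rmsecv:.3f}")
```

If the duplicate scans were split across segments, the model would effectively see each held-out sample during training, which is the source of the overly optimistic statistics mentioned above.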
3.31
overfitting
addition of too many regression terms in a multiple linear regression (3.18)
Note 1 to entry: This is a modelling error where a prediction model (3.13) captures not only the essential chemical
information but also the noise and/or irrelevant variability in the spectra (3.7) of the material being analysed.
Note 2 to entry: A result of overfitting, when samples not in the prediction sample set (3.25) are predicted, is that
statistics such as root mean square error of prediction (3.41) or SEP are much poorer than expected.
3.32
z-score
performance criterion calculated by dividing the difference between the near infrared predicted result (3.11)
and the reference value (3.10) by a target value for the standard deviation, usually the standard deviation for
proficiency assessment
Note 1 to entry: This is a standardized measure of laboratory bias, calculated using the reference value and the standard
deviation of the validation sample set (3.26) or the independent sample test set (3.28).
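The definition above corresponds to a simple quotient, sketched below. The function name, the example values and the target standard deviation are hypothetical.

```python
def z_score(predicted: float, reference: float, target_sd: float) -> float:
    """(NIR predicted result - reference value) / target standard deviation."""
    if target_sd <= 0:
        raise ValueError("target standard deviation must be positive")
    return (predicted - reference) / target_sd


# Hypothetical oil content (% mass fraction) with a target SD of 0.25.
print(f"z = {z_score(41.2, 40.7, 0.25):.1f}")
```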
3.33
residual
difference between an observed value of the response variable and the corresponding predicted value (3.11)
of the response variable
Note 1 to entry: As defined by ISO 3534-3.
Note 2 to entry: For near infrared spectroscopy (3.1) data, a residual is the difference between a reference value (3.10)
and the value predicted by a regression model. Residuals are used in the calculation of regression statistics.
3.34
spectral residual
residual (3.33) after chemometric treatment (3.14) of a spectrum arising from spectral variation not described
by the model
3.35
bias
e
difference between the mean reference value (3.10), ȳ, and the mean value, x̄, forecasted by the NIR
prediction model (3.13) for a set of samples (3.25, 3.26, 3.27, 3.28)
3.36
bias confidence limit
BCL
Tb
value greater than which a bias (3.35) is significantly different from zero at the confidence level specified
Note 1 to entry: See also 7.3.
3.37
standard error of calibration
SEC
sSEC
for a prediction model (3.13), an expression of the average difference between predicted values (3.11) and
reference values (3.10) for samples used to develop the model
Note 1 to entry: As defined by standard error of cross-validation (3.38), root mean square error of cross-validation (3.39),
standard error of prediction (3.40), root mean square error of prediction (3.41), this illustration of the average difference refers to the square root of
the sum of squared residual values divided by the number of values corrected for degrees of freedom, where 68 % of the
errors are below this value.
3.38
standard error of cross-validation
SECV
sSECV
used with the data from the prediction sample set (3.25), expression of the bias-corrected average
difference between predicted values (3.11) and reference values (3.10) for the subset of samples selected
as prediction sample set (3.25) during the cross-validation (3.30) process
3.39
root mean square error of cross-validation
RMSECV
sRMSECV
expression of the average difference between predicted values (3.11) and reference values (3.10) for the
subset of samples selected as prediction samples (3.25) during the cross-validation (3.30) process
Note 1 to entry: RMSECV includes any bias (3.35) in the predictions.
3.40
standard error of prediction
SEP
standard error of prediction corrected for the bias
SEP(C)
sSEP
expression of the bias-corrected average difference between reference values (3.10) and the
values (3.11) predicted by a prediction model (3.13) when applied to a set of samples not included in the
prediction model (3.13) development
Note 1 to entry: SEP shall only be calculated for the validation sample set (3.26) or the independent sample test set (3.28).
Note 2 to entry: The SEP covers a confidence interval of 68 % (multiplied by 1,96, an interval of 95 %).
3.41
root mean square error of prediction
RMSEP
sRMSEP
expression of the average difference between reference values (3.10) and the values (3.11) predicted by
a prediction model (3.13) when applied to a set of samples not included in the development of the prediction
model (3.13)
Note 1 to entry: RMSEP shall only be calculated for the validation sample set (3.26) or the independent sample test set (3.28).
Note 2 to entry: RMSEP includes any bias (3.35) in the predictions.
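The relationship between bias (3.35), SEP (3.40) and RMSEP (3.41) can be illustrated with the formulae commonly used in chemometrics, which are assumed here: residual e = reference value minus predicted value, bias = mean of e, SEP = bias-corrected standard deviation of e with n − 1 degrees of freedom, and RMSEP = root mean square of e. The data are hypothetical.

```python
import math
from statistics import mean


def prediction_errors(reference, predicted):
    """Return (bias, SEP, RMSEP) for paired reference and predicted values."""
    n = len(reference)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    residuals = [y - y_hat for y, y_hat in zip(reference, predicted)]
    bias = mean(residuals)
    # SEP removes the bias before averaging the squared residuals.
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    # RMSEP keeps the bias in the squared residuals.
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    return bias, sep, rmsep


# Hypothetical protein contents (% mass fraction) of a validation sample set.
reference = [20.1, 21.4, 22.0, 23.3, 24.8, 25.5]
predicted = [19.8, 21.0, 21.9, 22.8, 24.5, 25.0]
bias, sep, rmsep = prediction_errors(reference, predicted)
print(f"bias = {bias:.3f}, SEP = {sep:.3f}, RMSEP = {rmsep:.3f}")
```

With these formulae, RMSEP² = bias² + SEP²·(n − 1)/n, which is why RMSEP exceeds SEP whenever the predictions carry a substantial bias.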
3.42
unexplained error confidence limit
UECL
$T_{\mathrm{UE}}$
maximum standard error of prediction (3.40) limit significantly different from the standard error of
calibration (3.37) at the confidence limit specified
3.43
RSQ
$r_{xy}$
coefficient of determination, square of the multiple correlation coefficient between predicted values (3.11) and
reference values (3.10)
Note 1 to entry: When expressed as a percentage, it represents the proportion of the variance explained by the regression
model.
3.44
slope
b
in a regression line, amount by which y increases per unit increase in x
3.45
intercept
a
in a regression line, value of y when x is zero
3.46
residual standard deviation
sres
SRES
expression of the average size of the difference between reference and fitted values after a slope (3.44) and
intercept (3.45) correction has been performed
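The slope (3.44), intercept (3.45) and residual standard deviation (3.46) can be computed together, as in the following sketch. It assumes an ordinary least-squares line y = a + b·x and the divisor n − 2 for the residual standard deviation. Which of the reference and predicted values is treated as x and which as y is a convention not fixed by this sketch. The function name and the data are hypothetical.

```python
import math

def regression_statistics(x, y):
    """Return (intercept a, slope b, residual standard deviation s_res)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx          # increase in y per unit increase in x
    a = my - b * mx        # value of y when x is zero
    # Differences that remain after the slope and intercept correction
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))  # assumed divisor: n - 2
    return a, b, s_res

# Hypothetical data
x = [10.2, 11.1, 12.3, 13.0, 14.1]
y = [10.0, 11.0, 12.0, 13.0, 14.0]
a, b, s_res = regression_statistics(x, y)
print(round(a, 3), round(b, 3), round(s_res, 3))
```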
3.47
covariance
$s_{\hat{y}y}$
measure of how much two random variables vary together
Note 1 to entry: If, for a population of samples, an increase in x is matched by an increase in y, then the covariance
between the two variables will be positive. If an increase in x is matched by a decrease in y, then the covariance will be
negative. When values are uncorrelated then the covariance is zero.
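The sign behaviour described in Note 1 to entry, and the link between covariance and RSQ (3.43), can be illustrated as follows. The sketch assumes the sample covariance with divisor n − 1. The function names and the data are hypothetical.

```python
def covariance(x, y):
    """Sample covariance of two equally long sequences (assumed divisor n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

def rsq(predicted, reference):
    """Squared correlation coefficient between predicted and reference values."""
    s_pr = covariance(predicted, reference)
    s_pp = covariance(predicted, predicted)   # variance of predicted values
    s_rr = covariance(reference, reference)   # variance of reference values
    return s_pr ** 2 / (s_pp * s_rr)

x = [1.0, 2.0, 3.0]
print(covariance(x, [2.0, 4.0, 6.0]))  # y rises with x: positive covariance
print(covariance(x, [6.0, 4.0, 2.0]))  # y falls as x rises: negative covariance
print(rsq(x, [2.0, 4.0, 6.0]))
```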
3.48
prediction model robustness
ability of a NIR prediction model (3.13), built using multivariate prediction techniques (3.14) and
a known sample set (3.25), to maintain accurate and reliable predictions across varying conditions such
as sample types, environmental factors, and time for any test sample (3.27)
3.49
ratio of performance to deviation
RPD
statistical parameter used to evaluate the predictive quality and robustness (3.48) of a prediction model
(3.13) during validation using a validation sample set (3.26) or during testing using an independent sample test
set (3.28)
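The definition above does not give a formula. RPD is commonly computed as the standard deviation of the reference values divided by the SEP (3.40), and the following sketch assumes that form. The function names and the data are hypothetical.

```python
import math

def standard_deviation(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def rpd(predicted, reference):
    """Assumed form: SD of the reference values divided by SEP."""
    errors = [p - r for p, r in zip(predicted, reference)]
    sep = standard_deviation(errors)  # bias-corrected, as in 3.40
    return standard_deviation(reference) / sep

# Hypothetical validation data (% mass fraction)
predicted = [10.2, 11.1, 12.3, 13.0]
reference = [10.0, 11.0, 12.0, 13.0]
print(round(rpd(predicted, reference), 2))
```

A higher RPD means that the spread of the reference values is large relative to the prediction error.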
4 Principle
NIR spectroscopy (3.1) is an indirect analytical technique: it requires developing prediction models (3.13)
(also known as “calibration models”) using samples of known composition determined by standard/reference
wet-chemical methods (3.9). These prediction models are based on the correlations of NIR spectra (3.7) of
samples with their reference data (3.10). Once a prediction model (3.13) is developed, quantitative
predictions for the composition of a sample can be attempted from the sample NIR spectra (3.7) alone. The
accuracy of prediction models (3.13) shall be validated by predicting the composition of “unknown” samples
(3.26, 3.28) using the NIR spectrometer (3.2) for which the prediction models (3.13) were developed and
comparing the predicted NIR results (3.11) with the reference results (3.10) obtained by the
standard/reference wet-chemistry methods (3.9) used for the prediction model development (3.14).
NOTE The term “calibration model” is well known and well understood by advanced NIR spectroscopy users. To
outline that NIR spectroscopy is a secondary form of measurement, the term “prediction model” is used in the document
instead of the term “calibration model”.
5 Apparatus
5.1 Near-infrared instruments (3.2), using diffuse reflectance (3.3), transmittance (3.4) or
transflectance (3.5) measurement covering the NIR wavelength region, 780 nm to 2 500 nm (approximately
12 800 cm−1 to 4 000 cm−1), or segments of this region or selected wavelengths (or wavenumbers). Some instruments span
a wider range of wavelengths (400 nm to 2 500 nm) and are able to measure in the visible range (400 nm to
800 nm). The optical principle may be dispersive (e.g. grating monochromators), interferometric or non-
thermal (e.g. light-emitting diodes, laser diodes, lasers). The instrument should be provided with a diagnostic
test system for testing photometric noise and reproducibility, wavelength or wavenumber accuracy and
wavelength or wavenumber precision (for scanning spectrophotometers).
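Wavelength and wavenumber are related by wavenumber (cm−1) = 10^7 / wavelength (nm). A minimal sketch of the conversion follows; the function name is hypothetical.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm

# Limits of the visible and NIR ranges mentioned in 5.1
for wavelength in (400, 780, 2500):
    print(wavelength, "nm ->", round(nm_to_wavenumber(wavelength)), "cm-1")
```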
A background measurement (reference or baseline measurement) shall be taken routinely before analysing a
sample set. It captures the spectral response of the system without the sample. It can be done using air, a
reference material provided by the manufacturer or an empty sample holder.
The scanning resolution (spectral resolution) of an NIR spectrometer is a key parameter that affects
the quality and interpretability of the spectral data. Higher scanning resolution improves the distinction
between overlapping absorption bands, usually leading to improved precision in the chemometric models.
However, higher scanning resolution means slower scans, more noise in the signal and larger data files.
For NIR spectrometers (3.2) measuring in transmittance (3.4), the sample pathlength (sample thickness)
should be optimized according to the manufacturer’s recommendation with respect to signal
intensity, to obtain linearity and the maximum signal-to-noise ratio.
For NIR spectrometers measuring in reflectance (3.3), reflective surface quality is important. A quartz window
or other appropriate material can be used to cover the interacting sample surface layer and eliminate
drying effects.
For NIR spectrometers measuring in transflectance (3.5), both sample thickness and reflective surface quality
shall be optimized.
For all instruments, sampling effects can be reduced by scanning a sufficiently large sample volume or surface
to eliminate any significant influence by the non-homogeneity of chemical composition or physical properties
of the test sample. The sample size effect can be determined through validation of constituent predictions
using a representative set of samples of interest.
NOTE Some compounds need instruments with the visible range to be analysed. For example, chlorophyll is an
important parameter for canola seed quality. Chlorophyll content can only be predicted by including wavelengths in the
650 nm to 700 nm range (visible range).
5.2 Appropriate milling or grinding device, for preparing the sample (if needed).
NOTE Changes in grinding or milling conditions can affect the NIR measurements.
When a prediction model (3.13) has been developed for a constituent for a ground sample with a specific
particle size, it is recommended to:
— use the same particle size for all the samples analysed with the model; or
— ensure the prediction model (3.13) was developed using representative samples spanning the range of
particle sizes intended to be measured, and that the validation of the prediction model is conducted using
representative samples (3.26, 3.28) spanning all particle sizes.
6 Calibration and initial validation
6.1 General
Prior to any analyses with the NIR spectrometer (3.2), run the system diagnostic and the manufacturer’s daily
check according to the manufacturer’s guidelines.
NIR spectroscopy (3.1) allows for the prediction of compounds of interest by the development of statistical
models, called prediction models (3.13), acquired via a process called “instrument calibration”. A number of
different multivariate prediction techniques (3.14) can be applied to the NIR spectra (3.7)
obtained by the various NIR spectrometers (3.2) and therefore no specific procedure can be given for
calibration. The raw spectra of the samples can be subjected to some mathematical pre-treatments before
model development to reduce the effect of some of the samples’ physical variations (e.g. sample presentation)
on the scattering of the light.
EXAMPLE Raw spectra (3.7) can be processed first with a standard normal variate (SNV) transformation to remove
multiplicative interferences such as baseline shift. A Savitzky-Golay (SG) second derivative transformation can then be
applied to ensure that:
a) peak positions are maintained at the same place as in the original spectra;
b) scattering effects are removed from the background;
c) spectral resolution is improved, which assists in resolving overlapping peaks (in this case, the prediction models
will be developed using the second derivative of the scan and not the raw spectra).
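The two pre-treatments named in the EXAMPLE can be sketched as follows. This is an illustration only: it assumes a 5-point Savitzky-Golay window with a quadratic polynomial, for which the second-derivative coefficients are (2, −1, −2, −1, 2)/7, and it drops the two points at each end of the spectrum rather than extrapolating. The function names and the data are hypothetical. Library implementations such as `scipy.signal.savgol_filter` offer other window lengths and edge handling.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

# Savitzky-Golay second-derivative coefficients, 5-point window, quadratic fit
SG_D2_5PT = [2 / 7, -1 / 7, -2 / 7, -1 / 7, 2 / 7]

def sg_second_derivative(spectrum, spacing=1.0):
    """Second derivative at interior points; the two points at each edge are dropped."""
    half = len(SG_D2_5PT) // 2
    out = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out.append(sum(c * v for c, v in zip(SG_D2_5PT, window)) / spacing ** 2)
    return out

# Hypothetical absorbance values at equally spaced wavelengths
raw = [0.41, 0.43, 0.48, 0.57, 0.66, 0.70, 0.68, 0.61, 0.55]
pretreated = sg_second_derivative(snv(raw))
print([round(v, 3) for v in pretreated])
```

The derivative is applied after SNV, in the order given in the EXAMPLE, and the pre-treated values rather than the raw spectra would then be passed to model development.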
The NIR prediction model (3.13) accuracy and robustness are strongly correlated with the samples chosen to
develop the prediction models (3.25), the method (3.9) used to determine the constituent content (3.8) and
the methods chosen to develop the statistical prediction model (3.14).
The state of the sample varies as a function of the types of sample (e.g. intact seeds, ground seeds, ground
meals) and the instrument. The sample presentation should be standardized for all samples
across all stages of analysis, prediction, validation and routine analysis to ensure consistency and reliability.
As NIR spectroscopy (3.1) is highly sensitive to sample presentation, standardization is essential for achieving
accurate, repeatable, and meaningful results.
6.2 Reference methods (3.9)
Internationally accepted standardized methods or well-validated methods should be used for the
determination of the constituents of interest (e.g. moisture, fat, protein, glucosinolate, other) by direct
analytical techniques. Oilseed ISO standardized methods are given in the
Bibliography as examples.
The standardized/validated methods used for prediction should be in statistical control, i.e. for
any sample, the variability should consist of random variat
...


FINAL DRAFT
International Standard
ISO/TC 34/SC 2
Secretariat: AFNOR
Graines oléagineuses — Application de la spectrométrie dans le proche infrarouge
Oilseeds — Application of near infrared spectrometry
Voting begins on: 2025-11-13
Voting terminates on: 2026-01-08
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY
RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL,
COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE
TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE
MAY BE MADE IN NATIONAL REGULATIONS.
COPYRIGHT PROTECTED DOCUMENT
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this
publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical,
including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can
be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principle
5 Apparatus
6 Calibration and initial validation
6.1 General
6.2 Reference methods
6.3 Prediction model development
6.4 Cross-validation
6.5 Outliers
6.6 Validation of prediction models
6.6.1 General
6.6.2 External validation
6.6.3 Bias correction
6.6.4 Slope adjustment
6.6.5 Expansion of the prediction sample set
6.7 Changes in measuring and instrument conditions
7 Statistics for performance measurement
7.1 General
7.2 Plot of the results
7.3 Bias
7.4 Standard error of prediction
7.5 Root mean square error of prediction
7.6 Slope
7.7 Correlation coefficient
7.8 Ratio of performance to deviation
8 Sampling
9 Procedure
9.1 Preparation of the test sample
9.2 Measurement
9.3 Evaluation of results
10 Checking instrument stability
10.1 Instrument diagnostics
10.2 Control sample
10.3 Instruments in a network
11 Checking the performance of prediction models
11.1 General
11.2 Control charts based on the difference between reference results and NIR results (validation samples)
12 Precision and accuracy
12.1 Repeatability
12.2 Trueness
13 Test report
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO
collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of one
or more patents. ISO takes no position concerning the evidence, validity or applicability of any claimed
patent rights in respect thereof. As of the date of publication of this document, ISO had not received notice
of any patent which may be required to implement this document. However, implementers are cautioned
that more recent information may be found in the patent database available at
www.iso.org/brevets. ISO shall not be held responsible for failing to identify any such patent rights or to
give notice of their existence.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/avant-propos.
This document was prepared by Technical Committee ISO/TC 34, Food products, Subcommittee SC 2,
Oleaginous seeds and fruits and oilseed meals.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/fr/members.html.
Introduction
This document was developed on the basis of ISO 12099:2017 (prepared by Technical Committee
ISO/TC 34, Food products, Subcommittee SC 10, Animal feeding stuffs) and ISO 21543:2020 (prepared by
Technical Committee ISO/TC 34, Food products, Subcommittee SC 5, Milk and milk products), as well as
References [19], [20], [21], [22], [23], [24], [25], [26], [27], [28] and [29].
FINAL DRAFT International Standard ISO/FDIS 18419:2025(fr)
Oilseeds — Application of near infrared spectrometry
1 Scope
This document specifies a procedure for the prediction by near infrared spectroscopy (NIRS) of
constituents such as moisture, fat and protein, and some minor parameters such as total glucosinolates, in
oilseeds and oilseed meals.
The determinations are based on spectrometric measurement in the near infrared spectral region.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
near infrared spectroscopy
NIRS
analytical technique that uses the near infrared region of the electromagnetic spectrum, typically
780 nm to 2 500 nm (approximately 12 800 cm−1 to 4 000 cm−1), or segments of this region or selected
wavelengths, to analyse the constituent content (3.8) of a sample
3.2
near infrared spectrometer
NIR spectrometer
analytical instrument designed to measure the absorbance or reflectance of near infrared light (typically
in the wavelength range 780 nm to 2 500 nm) by oilseed samples (seeds and meals)
Note 1 to entry: When used under specified conditions, the NIR spectrometer can predict the constituent contents
(3.8) of oilseeds and oilseed meals by modelling the relationship between the constituent composition of the sample
and its near infrared spectra (3.7) through a prediction model (3.13) built using multivariate prediction techniques
(3.14).
Note 2 to entry: Some instruments span a wider range of wavelengths (400 nm to 2 500 nm) and are able to measure
in the visible range (400 nm to 800 nm).
Note 3 to entry: Differences can exist between NIR spectrometers of the same make and model owing to several
factors, even if the equipment appears identical. Even slight differences in mirror alignment, gratings or detector
performance can affect the spectral results.
3.3
near infrared reflectance
NIR_R
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near infrared
light diffusely reflected back from the surface of a sample and collected by a detector in front of the sample
3.4
near infrared transmittance
NIR_T
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near infrared
light that has travelled through a sample and is then collected by a detector behind the sample
3.5
near infrared transflectance
type of near infrared spectroscopy (3.1) where the basic measurement is the absorption of near infrared
light that has travelled through a sample and is then reflected back by a reflective surface placed behind
the sample
Note 1 to entry: The detector captures both transmitted (3.4) and reflected (3.3) light.
3.6
near infrared spectroscopy network
NIRS network
number of near infrared spectrometers (3.2), operated using the same prediction models (3.13), which are
usually standardized so that the differences in predicted values (3.11) for a set of test samples (3.27) are
minimized
3.7
near infrared spectra
NIR spectra
graphical representation of the absorbance of NIR radiation by a sample, obtained by near infrared
reflectance (3.3), near infrared transmittance (3.4) or near infrared transflectance (3.5), as a function of
wavelength or wavenumber
Note 1 to entry: They capture the molecular overtones and combination vibrations primarily associated with C-H,
N-H, O-H and S-H bonds.
Note 2 to entry: The term “spectra” is preferred to “spectrum” because modern NIR instruments generally acquire
and automatically average several individual spectra to generate a representative composite result.
Note 3 to entry: In the literature, the term “scans” is sometimes used in place of “spectra”.
3.8
constituent content
mass fraction of substances in oilseeds and oilseed meals determined using reference methods (3.9) or
predicted using the prediction model (3.13)
Note 1 to entry: Moisture, fat, protein, total glucosinolates content, etc. are examples of constituents that can be
predicted by near infrared (NIR) spectrometry. For examples of appropriate methods, see ISO 10565, ISO 10632,
ISO 16634-1, ISO 1705, ISO 659, ISO 771, ISO 9167 and ISO 10519. The units of measurement of the parameters to be
predicted shall be the same as the units used in the reference methods.
Note 2 to entry: NIR methods can be developed and validated for constituents other than those given in Note 1 to
entry, provided the procedure in this document is followed.
3.9
reference method
method or protocol producing definitive results, called reference values (3.10), against which the predicted
NIR results are compared
Note 1 to entry: These reference methods are generally direct analytical techniques able to quantify one or more
oilseed constituents without relying on intermediate or inferential measurements. They can be internationally
standardized methods or in-house validated methods, recognized by experts or by agreement between the parties.
Note 2 to entry: The reference method shall be a validated method with known accuracy, repeatability and
reproducibility.
Note 3 to entry: Internationally standardized methods that have been developed and approved by one or more
internationally recognized standards organizations, such as ISO, are often used as reference methods.
3.10
reference value
true
certified or experimentally determined value of a constituent of oilseeds or oilseed meals, obtained using a
reference method (3.9)
3.11
predicted value
constituent content (3.8) of oilseeds or oilseed meals predicted by a prediction model (3.13) from the near
infrared spectra (3.7) of oilseeds and oilseed meals produced by a near infrared spectrometer (3.2)
3.12
instrument standardization
process whereby a group of near infrared instruments is adjusted so that they predict similar constituent
contents (3.8) when operating the same prediction model(s) (3.13) on the same sample(s)
Note 1 to entry: A number of techniques can be used, but they can be broadly defined either as pre-prediction
methods, where the spectra of samples are adjusted to minimize the differences between the response of a master
instrument and each instrument in the group, or as post-prediction methods, where linear regression is used to
adjust the predicted values (3.11) produced by each instrument to make them as similar as possible to those from a
master instrument. See the Bibliography for more information.
3.13
prediction model
calibration model
multivariate prediction equation for predicting the constituent content (3.8) of oilseeds and oilseed meals
from their near infrared (NIR) spectra (3.7) data
Note 1 to entry: These models are generally developed using multivariate prediction techniques (3.14) such as
principal component regression (3.16), partial least squares regression (3.17), multiple linear regression (3.18) or more
advanced non-linear methods.
Note 2 to entry: Prediction models can also be developed to predict physical properties of the sample.
Note 3 to entry: The term “calibration model” is well known and well understood by advanced NIR spectroscopy
users. To outline that NIR spectroscopy is a secondary form of measurement, the term “prediction model” is used in
this document instead of the term “calibration model”.
3.14
multivariate prediction technique
statistical modelling approach used to establish quantitative relationships between several predictor
variables (e.g. near infrared spectra (3.7)) and one or more response variables (e.g. constituent content (3.8))
Note 1 to entry: This technique is commonly applied in NIR spectrometry, where overlapping and correlated signals
require methods such as principal component regression (3.16), partial least squares regression (3.17), multiple linear
regression (3.18) or more advanced non-linear methods to extract relevant information for developing a prediction
model (3.13).
Note 2 to entry: In the literature, multivariate prediction techniques are often grouped under the term
“chemometrics”.
3.15
principal component analysis
PCA
form of data compression which, for a set of samples, works solely with the x data, e.g. near infrared spectra
(3.7) for near infrared spectroscopy (3.1), and finds principal components (PCs) (factors) according to a rule
that says that each PC expresses the maximum variation in the data at any time and is uncorrelated with
any other PC
Note 1 to entry: The first PC expresses as much as possible of the variability in the original spectral data of the
samples. Its effect is then subtracted from the x data and a new PC is derived, again expressing as much as possible of
the variability in the remaining data. It is possible to derive as many PCs as there are either data points in the
spectrum or samples in the data set, but the major effects in the spectra can be shown to be concentrated in the first
few PCs and therefore the number of data that need to be considered is considerably reduced.
Note 2 to entry: PCA produces two new sets of variables at each stage:
a) PC scores represent the response of each sample on each PC;
b) PC loadings represent the relative importance of each data point in the original spectra to the PC.
Note 3 to entry: PCA has many uses (e.g. in spectral interpretation), but is most widely used in the identification of
spectral outliers (3.21).
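The derivation of the first PC can be sketched as follows. This illustration uses power iteration on the sample covariance matrix, a simple numerical method chosen for brevity and not necessarily the algorithm used by commercial software. The function name and the data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first PC of mean-centred x data."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix of the x data (m x m)
    cov = [[sum(c[j] * c[k] for c in centred) / (n - 1) for k in range(m)]
           for j in range(m)]
    # Power iteration; the start vector must not be orthogonal to the first PC
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores: projection of each centred sample onto the loading vector
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
    return v, scores

# Hypothetical x data: four samples measured at two wavelengths
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
loadings, scores = first_principal_component(rows)
print([round(x, 3) for x in loadings], [round(s, 3) for s in scores])
```

Subtracting the contribution of this PC from the centred data and repeating the procedure would give the next PC, as described in Note 1 to entry.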
3.16
principal component regression
PCR
technique which uses the scores on each principal component as regressors in a multiple linear regression
(3.18) against y values representing the composition of the samples
Note 1 to entry: As each principal component is orthogonal to every other principal component, the scores form an
uncorrelated data set with better properties than the original spectra. While it is possible to select a combination of
principal components for regression based on how well each one correlates with the constituent of interest, most
commercial software forces the regression to use all principal components up to the highest one selected for the
model (“top-down approach”).
Note 2 to entry: When used in near infrared spectroscopy (3.1), the regression coefficients in principal component
space are usually converted back to a prediction model (3.13) using all the data points in wavelength space.
3.17
partial least squares regression
PLS
form of data compression which uses a rule to derive the factors that allows each factor in turn to maximize
the covariance (3.47) between the y data and all possible linear combinations of the x data
Note 1 to entry: PLS is a balance between variance and correlation, with each factor being influenced by both effects.
PLS factors are therefore more directly related to variability in y values than are principal components. PLS
produces three new variables: loading weights (which are not orthogonal to each other), and loadings and scores,
which are both orthogonal.
Note 2 to entry: PLS models are produced by regressing PLS scores against y values. As with principal component
regression (3.16), when used in near infrared spectroscopy (3.1), the regression coefficients in PLS space are usually
converted back to a prediction model (3.13) using all the data points in wavelength space.
3.18
multiple linear regression
MLR
technique using a combination of several x variables to predict a single y variable
Note 1 to entry: In near infrared spectroscopy (3.1), the x values are either absorbance values at selected wavelengths
in near infrared reflectance (3.3) or derived variables such as principal component analysis (3.15) or partial least
squares regression (3.17) scores.
3.19
artificial neural network
ANN
non-linear modelling technique based on the architecture of biological neural systems
Note 1 to entry: The network is initially “trained” by supplying a data set with several x values (spectral or derived
variables such as principal component analysis scores) and reference y values. During the training process, the
architecture of the network can be modified and the neurons assigned weighting coefficients for both inputs and
outputs to produce the best possible predictions of the parameter values.
Note 2 to entry: Training neural networks requires a large amount of data.
3.20
Mahalanobis distance
global h value
distance in principal component (PC) space between a data point and the centre of the principal
component space
Note 1 to entry: The Mahalanobis distance is a non-linear measurement. In PC space, a set of samples
usually forms a curve-shaped distribution. The ellipsoid that best represents the probability
distribution of the set can be estimated by building the covariance (3.47) matrix of the samples. The
Mahalanobis distance is simply the distance between the test point and the centre of mass divided by
the width of the ellipsoid in the direction of the test point. In some software, the Mahalanobis
distance is called the "global h value" and outlier detection depends on the number of standard
deviations of h separating a sample from the centre.
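The description in Note 1 (distance from the centre, scaled by the width of the ellipsoid in the direction of the test point) can be illustrated with a minimal sketch. It is not part of the standard: the function name `mahalanobis_2d` and the restriction to two principal component scores are assumptions made to keep the example self-contained.

```python
import math


def mahalanobis_2d(point, scores):
    """Mahalanobis distance of `point` from the centre of 2-D PC `scores`.

    Hypothetical helper for illustration only; `scores` is a list of
    (pc1, pc2) pairs for the sample set.
    """
    n = len(scores)
    mean_1 = sum(s[0] for s in scores) / n
    mean_2 = sum(s[1] for s in scores) / n
    # Sample covariance matrix of the scores (n - 1 in the denominator).
    s11 = sum((s[0] - mean_1) ** 2 for s in scores) / (n - 1)
    s22 = sum((s[1] - mean_2) ** 2 for s in scores) / (n - 1)
    s12 = sum((s[0] - mean_1) * (s[1] - mean_2) for s in scores) / (n - 1)
    det = s11 * s22 - s12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    d1, d2 = point[0] - mean_1, point[1] - mean_2
    # d^T S^-1 d, using the closed-form inverse of a 2x2 matrix.
    squared = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.sqrt(squared)


# The set is spread widely along PC1 and narrowly along PC2.
scores = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
along_wide_axis = mahalanobis_2d((2.0, 0.0), scores)
along_narrow_axis = mahalanobis_2d((0.0, 2.0), scores)
```

Both test points lie at the same Euclidean distance from the centre, but the point along the narrow axis of the ellipsoid receives the larger Mahalanobis distance.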
3.21
outlier
member of a set of values which is inconsistent with the other members of that set
Note 1 to entry: As defined in ISO 5725-1.
Note 2 to entry: For near infrared spectroscopy (3.1) data, outliers are points in any data set that
can be shown statistically to have values well outside an expected distribution. Outliers are
normally classified as either x (spectral) outliers (3.22) or y (reference data) outliers (3.23).

3.22
x-outlier
outlier (3.21) related to the near infrared (NIR) spectra (3.7)
Note 1 to entry: An x-outlier can arise from a spectrum with instrumental faults, from a sample type
that is radically different from the other samples or, in prediction, from a sample type not
included in the prediction sample set (3.25).
Note 2 to entry: Outlier detection for near infrared (NIR) spectra can be carried out on raw or
pre-treated data by means of principal component analysis (3.15) or Mahalanobis distance (3.20)
analysis.
3.23
y-outlier
outlier (3.21) related to an error in the reference value (3.10) of the constituent content (3.8)
EXAMPLE An error in transcription or in the value obtained by the reference laboratory.
3.24
leverage
measure of the distance separating a sample from the centre of the population space defined by a
model
Note 1 to entry: Samples with high leverage have a high influence on the model. Leverage is
calculated by measuring the distance between a projected point and the centre of the model.
3.25
prediction sample set
calibration sample set
samples of known constituent content (3.8), established by reference methods (3.9), used to develop
one or more prediction models (3.13) from their near infrared spectra (3.7)
3.26
validation sample set
samples of known constituent content (3.8), established by reference methods (3.9), used to
validate or prove a prediction model (3.13)
Note 1 to entry: The validation set usually contains samples with the same characteristics as those
selected for the prediction sample set (3.25). Often, alternate or every nth sample (ranked in order
of the constituent under study) are allocated to the prediction and validation data sets from the
same pool of samples. None of the samples in the validation sample set shall be used in the
prediction sample set.
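The allocation described in Note 1 can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the function name `split_every_nth`, the `(sample_id, reference_value)` tuple layout and the default `n=3` are all hypothetical.

```python
def split_every_nth(samples, n=3):
    """Rank samples by reference value and send every nth one to validation.

    `samples` is a list of (sample_id, reference_value) pairs. Returns
    (prediction_set, validation_set); the two sets never share a sample.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    ranked = sorted(samples, key=lambda s: s[1])
    # Positions n, 2n, 3n, ... in the ranking go to the validation set.
    validation = ranked[n - 1::n]
    prediction = [s for i, s in enumerate(ranked) if (i + 1) % n != 0]
    return prediction, validation


samples = [("a", 5.0), ("b", 1.0), ("c", 3.0), ("d", 2.0), ("e", 4.0), ("f", 6.0)]
prediction_set, validation_set = split_every_nth(samples, n=3)
```

Ranking before splitting keeps both sets spread over the same range of the constituent under study.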
3.27
test sample
when using or testing a prediction model (3.13), any sample or set of samples, excluding those used
to develop the model
3.28
independent test sample set
test sample set consisting of samples from a different geographical region or industrial site, or
collected at a different time (for example, a different harvest), than those used to create and
validate a prediction model (3.13)
Note 1 to entry: Independent test samples are never used to develop the prediction model.
Note 2 to entry: These samples form a true test of a prediction model.

3.29
monitoring sample set
sample set used for routine control of prediction models (3.13)
Note 1 to entry: This sample set shall be independent of the prediction sample set (3.25).
3.30
cross-validation
method of generating prediction statistics in which, repeatedly, a subset of samples is removed from
a prediction sample set (3.25), the model is calculated on the remaining samples and the residuals
(3.33) are calculated for the validation subset; when this process has been run a number of times,
prediction statistics are calculated on all the residuals
Note 1 to entry: Full cross-validation omits one sample at a time and is run n times (where there are
n prediction samples). Where a larger subset is removed, the cross-validation cycle is usually run
at least eight times before the statistics are calculated. Finally, a model is calculated using all
the prediction samples.
Note 2 to entry: The use of cross-validation methods to validate the near infrared reflectance (3.3)
model has several drawbacks. First, cross-validation statistics tend to be optimistic compared with
those from an independent test sample set. Second, where there is duplication in the calibration data
(for example, the same sample scanned on several instruments or at different times), all copies of
the same sample must always be assigned to the same cross-validation segment in order to avoid
producing very optimistic statistics.
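A minimal sketch of segmented cross-validation follows, including the precaution from Note 2 that all copies of one sample stay in the same segment. Several things here are assumptions for illustration: a single-variable straight-line fit stands in for the NIR prediction model, the helper names `fit_line` and `cross_validate` are hypothetical, and segments are assigned by simple round-robin over sample labels.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def cross_validate(xs, ys, groups, n_segments):
    """Return (residuals, rmsecv) from segmented cross-validation.

    `groups` labels the physical sample behind each measurement, so that
    replicate scans of one sample are always left out together.
    """
    labels = sorted(set(groups))
    segment_of = {g: i % n_segments for i, g in enumerate(labels)}
    residuals = []
    for segment in range(n_segments):
        train = [i for i, g in enumerate(groups) if segment_of[g] != segment]
        held_out = [i for i, g in enumerate(groups) if segment_of[g] == segment]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Residual = reference value minus value predicted for the left-out sample.
        residuals += [ys[i] - (intercept + slope * xs[i]) for i in held_out]
    rmsecv = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return residuals, rmsecv


# Four samples, each scanned twice; reference values follow y = 2x + 1 exactly.
xs = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
ys = [3.0, 3.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0]
groups = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
residuals, rmsecv = cross_validate(xs, ys, groups, n_segments=4)
```

After the statistics have been computed, the final model would be fitted on all prediction samples, as stated in Note 1.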
3.31
overfitting
addition of too many regression terms in a multiple linear regression (3.18)
Note 1 to entry: This is a modelling error in which a prediction model (3.13) captures not only the
essential chemical information but also noise and/or irrelevant variability in the spectra (3.7) of
the material analysed.
Note 2 to entry: When samples not included in the prediction sample set (3.25) are predicted,
overfitting results in statistics, such as the root mean square error of prediction (3.41) or the
SEP, that are markedly poorer than expected.
3.32
z-score
performance criterion calculated by dividing the difference between the near infrared predicted
result (3.11) and the reference value (3.10) by a target value for the standard deviation, usually
the standard deviation for proficiency assessment
Note 1 to entry: This is a standardized measure of laboratory bias, calculated from the reference
value and the standard deviation of the validation sample set (3.26) or the independent test sample
set (3.28).
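The calculation in this definition can be written out as a short sketch. The function name `z_score` and the example figures are hypothetical and serve only to show the arithmetic.

```python
def z_score(predicted, reference, target_sd):
    """z-score: (NIR predicted result - reference value) / target standard deviation."""
    if target_sd <= 0:
        raise ValueError("target_sd must be positive")
    return (predicted - reference) / target_sd


# Hypothetical figures: NIR result 8.3 %, reference value 8.0 %, target SD 0.2 %.
z = z_score(8.3, 8.0, 0.2)
```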
3.33
residual
difference between an observed value of the response variable and the predicted value (3.11) of the
response variable
Note 1 to entry: As defined in ISO 3534-3.
Note 2 to entry: For near infrared spectroscopy (3.1) data, a residual is the difference between a
reference value (3.10) and the value predicted by a regression model. Residuals are used in the
calculation of regression statistics.
3.34
spectral residual
residual (3.33) after chemometric treatment of a spectrum, resulting from spectral variation not
described by the model
3.35
bias
e
difference between the mean reference value (3.10), ȳ, and the mean value, x̄, predicted by the NIR
prediction model (3.13) for a set of samples
3.36
bias confidence limit
BCL
T_b
value above which a bias (3.35) is significantly different from zero at the specified confidence
level
Note 1 to entry: See also 7.3.
3.37
standard error of calibration
SEC
s_SEC
for a prediction model (3.13), expression of the average difference between the predicted values
(3.11) and the reference values (3.10) for the samples used to develop the model
Note 1 to entry: As with the standard error of cross-validation (3.38), the root mean square error of
cross-validation (3.39), the standard error of prediction (3.40) and the root mean square error of
prediction (3.41), this expression of the average difference refers to the square root of the sum of
the squared residual values divided by the number of values corrected for degrees of freedom, where
68 % of the errors are below this value.
3.38
standard error of cross-validation
SECV
s_SECV
for prediction sample set (3.25) data, expression of the bias-corrected average difference between
the predicted values (3.11) and the reference values (3.10) for the subset of samples selected as
the prediction sample set during the cross-validation (3.30) process
3.39
root mean square error of cross-validation
RMSECV
s_RMSECV
expression of the average difference between the predicted values (3.11) and the reference values
(3.10) for the subset of samples selected as prediction samples (3.25) during the cross-validation
(3.30) process
Note 1 to entry: The RMSECV includes any bias (3.35) in the predictions.
3.40
standard error of prediction
SEP
standard error of prediction corrected for bias
SEP(C)
s_SEP
expression of the bias-corrected average difference between the predicted values (3.11) and the
reference values (3.10) given by a prediction model (3.13) when applied to a set of samples not
included in the development of the prediction model
Note 1 to entry: The SEP shall only be calculated for the validation sample set (3.26) or the
independent test sample set (3.28).
Note 2 to entry: The SEP covers a 68 % confidence interval (multiplied by 1,96, a 95 % interval).

3.41
root mean square error of prediction
RMSEP
s_RMSEP
expression of the average difference between the reference values (3.10) and the values predicted
(3.11) by a prediction model (3.13) when applied to a set of samples not included in the development
of the prediction model
Note 1 to entry: The SEP shall only be calculated for the validation sample set (3.26) or the
independent test sample set (3.28).
Note 2 to entry: The RMSEP includes any bias (3.35) in the predictions.
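The relationship between bias (3.35), SEP (3.40) and RMSEP (3.41) can be illustrated with a short sketch. The divisors used here (n - 1 for the bias-corrected SEP, n for the RMSEP) are the commonly used forms and are stated as an assumption, since this clause gives the definitions in words only; the function name `validation_statistics` and the data are hypothetical.

```python
import math


def validation_statistics(reference, predicted):
    """Return (bias, sep, rmsep) for a validation or independent test set."""
    # Residual = reference value minus predicted value.
    residuals = [y - p for y, p in zip(reference, predicted)]
    n = len(residuals)
    bias = sum(residuals) / n
    # SEP: spread of the residuals about the bias (assumed divisor n - 1).
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    # RMSEP: spread of the residuals about zero, so it includes the bias.
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    return bias, sep, rmsep


reference = [10.0, 12.0, 14.0, 16.0]
predicted = [9.0, 12.0, 13.0, 14.0]
bias, sep, rmsep = validation_statistics(reference, predicted)
```

With these divisors, RMSEP squared equals bias squared plus (n - 1)/n times SEP squared, which is why the RMSEP exceeds the SEP whenever the bias is substantial.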
3.42
unexplained error confidence limit
UECL
T_UE
upper limit beyond which a standard error of prediction (3.40) is significantly different from the
standard error of calibration (3.37) at the specified confidence level
3.43
RSQ
r_xy
correlation coefficient, square of the multiple correlation coefficient between the predicted
values (3.11) and the reference values (3.10)
Note 1 to entry: When expressed as a percentage, it represents the proportion of the variance
explained by the regression model.
3.44
slope
b
for a regression line, representation of the increase in y as a function of the increase in x
3.45
intercept
a
for a regression line, value of y when x is zero
3.46
residual standard deviation
S_RES
expression of the average size of the difference between the reference values and the fitted values
after correction for slope (3.44) and intercept (3.45)
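The regression statistics defined in 3.43 to 3.46 can be computed together, as in the sketch below. It rests on assumptions that this clause does not state: x is taken as the predicted value and y as the reference value, the line is fitted by ordinary least squares, and the residual standard deviation uses n - 2 degrees of freedom. The function name `regression_statistics` is hypothetical.

```python
import math


def regression_statistics(x, y):
    """Return (intercept a, slope b, RSQ, residual standard deviation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope (3.44)
    a = mean_y - b * mean_x            # intercept (3.45)
    rsq = sxy ** 2 / (sxx * syy)       # squared correlation (3.43)
    # Spread of y about the fitted line a + b*x (assumed divisor n - 2).
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))
    return a, b, rsq, s_res


predicted = [1.0, 2.0, 3.0, 4.0]   # x: hypothetical NIR predicted values
reference = [2.0, 4.0, 5.0, 7.0]   # y: hypothetical reference values
a, b, rsq, s_res = regression_statistics(predicted, reference)
```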
3.47
covariance
s_ŷy
measure of the joint variation of two random variables
Note 1 to entry: If, for a population of samples, an increase in x is matched by an increase in y,
then the covariance between the two variables is positive. If an increase in x is matched by
...
