Standard Practice for Probability of Detection Analysis for â Versus a Data

SIGNIFICANCE AND USE
5.1 The POD analysis method described herein is based on well-known and well-established statistical methods. It shall be used to quantify the demonstrated POD for a specific set of examination parameters and known range of discontinuity sizes under the following conditions.  
5.1.1 The initial response from a nondestructive evaluation inspection system is measurable and can be classified as a continuous variable.  
5.1.2 Discontinuity size is the predictor variable and can be accurately quantified.  
5.1.3 The relationship between discontinuity size (a) and measured signal response (â) exists and is best described by a linear regression model with an error structure that is normally distributed with mean zero and constant variance, σ². (Note that in linear regression, “linear” means linear with respect to the model coefficients. Though a quadratic model $\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2$ does not have a linear shape when plotted, for example, it is classified as a linear model in regression analysis since it is linear with respect to the model coefficients.)
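The point in 5.1.3 that a quadratic model is still a linear regression model can be illustrated with a minimal Python sketch. The data below are hypothetical and noise-free, and NumPy is an assumed tool here, not something the practice prescribes. The curve is nonlinear in x, but the coefficients are found by ordinary linear least squares because the model is linear in them.

```python
import numpy as np

# Hypothetical, noise-free illustration data generated from y = 1 + 2*x + 3*x**2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

# Design matrix with columns 1, x, x^2. The fitted curve is not a straight
# line in x, but the model is linear in the coefficients (b0, b1, b2).
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares solves for the three coefficients directly.
coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coef
print(b0, b1, b2)
```

A model such as y = b0 * exp(b1 * x) would not qualify, because b1 enters nonlinearly and no design matrix of this kind exists for it.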
5.2 This practice does not limit the use of a linear regression model with more than one predictor variable or other statistical models if justified as more appropriate for the â versus a data.  
5.3 This practice is not appropriate for data resulting from a POD examination on nondestructive evaluation systems that generate an initial response that is binary in nature (for example, hit/miss). Practice E2862 is appropriate for systems that generate a hit/miss-type response (for example, fluorescent penetrant).  
5.4 Prior to performing the analysis, it is assumed that the discontinuity of interest is clearly defined; the number and distribution of induced discontinuity sizes in the POD specimen set is known and well documented; the POD examination administration procedure (including data collection method) is well designed, well defined, under control, and unbiased (see X1.2.2 for more detail); the initial inspection system response is m...
SCOPE
1.1 This practice defines the procedure for performing a statistical analysis on Nondestructive Testing (NDT) â versus a data to determine the demonstrated probability of detection (POD) for a specific set of examination parameters. Topics covered include the standard â versus a regression methodology, POD curve formulation, validation techniques, and correct interpretation of results.  
1.2 Units—The values stated in inch-pound units are to be regarded as standard. The values given in parentheses are mathematical conversions to SI units that are provided for information only and are not considered standard.  
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.  
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

General Information

Status: Published
Publication Date: 14-Feb-2021
Technical Committee: E07 - Nondestructive Testing

Overview

ASTM E3023-21: Standard Practice for Probability of Detection Analysis for â Versus a Data is an internationally recognized guideline developed by ASTM International. This standard outlines a rigorous approach for performing Probability of Detection (POD) analysis using continuous response data (â) as a function of known discontinuity sizes (a) in nondestructive testing (NDT). This practice is vital for quantifying how reliably an NDT system can detect flaws or defects of specified sizes, using statistical regression methods. The standard ensures uniformity and comparability in POD reporting and analysis, supporting quality assurance, risk management, and regulatory compliance across industries.

Key Topics

  • POD Analysis Methodology: The document describes statistical models, particularly linear regression, for evaluating the relationship between flaw size and measurable system response.
  • Data Requirements: POD analysis is only suitable when the system response is measurable as a continuous variable and the defect size can be reliably quantified.
  • Statistical Assumptions:
    • The relationship between â (signal response) and a (discontinuity size) is linear with respect to model coefficients.
    • The error structure of the regression is normally distributed with mean zero and constant variance.
  • Model Selection and Validation: Procedures for assessing the validity of regression models, handling censored data (values beyond detection or saturation limits), and verifying regression assumptions using diagnostic tools.
  • POD Curve Construction: Instructions for formulating and interpreting POD curves and their confidence bounds, which visually communicate the detection probability as a function of defect size.
  • Reporting Guidelines: Specifies the required details for reporting a POD analysis, including data used, model selected, noise and saturation thresholds, diagnostic assessments, and final POD results.
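As a rough illustration of how a POD curve follows from an â versus a regression, the sketch below assumes a single-predictor model â = b0 + b1·a + e with e normally distributed with mean zero and standard deviation sigma, and a decision threshold â_dec. Under those assumptions, POD(a) is the probability that â exceeds â_dec. The function names and the numeric values are hypothetical, the variables are taken as untransformed, and confidence bounds (needed for results such as a_90/95) are deliberately left out.

```python
from statistics import NormalDist


def pod(a, b0, b1, sigma, a_hat_dec):
    """POD(a) = P(a_hat > a_hat_dec) when a_hat ~ Normal(b0 + b1*a, sigma^2)."""
    return 1.0 - NormalDist(mu=b0 + b1 * a, sigma=sigma).cdf(a_hat_dec)


def a_p(p, b0, b1, sigma, a_hat_dec):
    """Point estimate of the size detected with probability p (assumes b1 > 0)."""
    location = (a_hat_dec - b0) / b1   # size at which POD is 50 %
    scale = sigma / b1                 # controls the steepness of the POD curve
    return location + scale * NormalDist().inv_cdf(p)


# Hypothetical fitted values, for illustration only.
b0, b1, sigma, a_hat_dec = 0.5, 20.0, 1.0, 2.5

a50 = a_p(0.50, b0, b1, sigma, a_hat_dec)
a90 = a_p(0.90, b0, b1, sigma, a_hat_dec)
print(a50, a90, pod(a90, b0, b1, sigma, a_hat_dec))
```

These are point estimates only. A value such as a_90/95 additionally requires a confidence bound that accounts for the uncertainty in the fitted coefficients, which this sketch does not attempt.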

Applications

ASTM E3023-21 plays an essential role in sectors where nondestructive testing is critical for safety, performance, and reliability. Its practical applications include:

  • Aerospace Industry: For ensuring the integrity of airframe components and engines by determining detection limits of cracks, corrosion, or thickness loss.
  • Automotive Manufacturing: Quantifying the capability of NDT methods (e.g., eddy current, ultrasonic) for flaw detection in safety-critical components.
  • Energy Sector: Supporting pipeline, pressure vessel, or power generation inspections, especially where regulatory compliance and risk minimization are paramount.
  • Quality Assurance & Certification: Supplying a standardized methodology for system reliability demonstrations required by regulators or quality auditors.
  • Research and Development: Establishing reliable detection thresholds in the development and qualification of new NDT technologies or inspection protocols.

Related Standards

ASTM E3023-21 references and aligns with several key standards and documents, including:

  • ASTM E2862: Standard Practice for Probability of Detection Analysis for Hit/Miss Data (applicable when system response is binary, not continuous).
  • ASTM E3080: Practice for Regression Analysis with a Single Predictor Variable (for in-depth discussion of linear regression in similar contexts).
  • MIL-HDBK-1823A: U.S. Department of Defense handbook for NDT system reliability assessment, which includes guidance consistent with E3023.
  • ASTM E456: Terminology Relating to Quality and Statistics.
  • ASTM E2586: Practice for Calculating and Using Basic Statistics.
  • ASTM E2782: Guide for Measurement Systems Analysis (MSA).
  • ASTM E178: Practice for Dealing With Outlying Observations.

Adhering to ASTM E3023-21 ensures best practices, data traceability, and robust, repeatable assessments of nondestructive testing system performance, facilitating international trade, product quality, and operational safety.

Keywords: Probability of Detection, POD curve, â versus a data, nondestructive testing, NDT, linear regression, flaw detection, statistical analysis, ASTM E3023, quality assurance.

Frequently Asked Questions

ASTM E3023-21 is a standard published by ASTM International. Its full title is "Standard Practice for Probability of Detection Analysis for â Versus a Data". It defines the procedure for performing a statistical analysis on NDT â versus a data to determine the demonstrated probability of detection (POD) for a specific set of examination parameters.

ASTM E3023-21 is classified under the following ICS (International Classification for Standards) categories: 19.100 - Non-destructive testing. The ICS classification helps identify the subject area and facilitates finding related standards.

ASTM E3023-21 is linked to the following standards and editions: ASTM E1316-24, ASTM E3080-23, ASTM E456-13a(2022)e1, ASTM E1316-19b, ASTM E3080-19, ASTM E2586-19e1, ASTM E1316-19, ASTM E1316-18, ASTM E3080-17, ASTM E456-13A(2017)e3, ASTM E456-13A(2017)e1, ASTM E1316-17a, ASTM E1316-17, ASTM E3080-16, ASTM E1316-16a. Checking these links helps confirm that the most current and applicable edition of each referenced standard is being used.

Standards Content (Sample)


Designation: E3023 − 21
Standard Practice for Probability of Detection Analysis for â Versus a Data
This standard is issued under the fixed designation E3023; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
This practice is under the jurisdiction of ASTM Committee E07 on Nondestructive Testing and is the direct responsibility of Subcommittee E07.10 on Specialized NDT Methods. Current edition approved Feb. 15, 2021. Published March 2021. Originally approved in 2015. Last previous edition approved in 2015 as E3023 – 15. DOI: 10.1520/E3023-21.

1. Scope
1.1 This practice defines the procedure for performing a statistical analysis on Nondestructive Testing (NDT) â versus a data to determine the demonstrated probability of detection (POD) for a specific set of examination parameters. Topics covered include the standard â versus a regression methodology, POD curve formulation, validation techniques, and correct interpretation of results.
1.2 Units—The values stated in inch-pound units are to be regarded as standard. The values given in parentheses are mathematical conversions to SI units that are provided for information only and are not considered standard.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Referenced Documents
2.1 ASTM Standards:
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E1316 Terminology for Nondestructive Examinations
E1325 Terminology Relating to Design of Experiments
E2586 Practice for Calculating and Using Basic Statistics
E2782 Guide for Measurement Systems Analysis (MSA)
E2862 Practice for Probability of Detection Analysis for Hit/Miss Data
E3080 Practice for Regression Analysis with a Single Predictor Variable
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
2.2 Department of Defense Document:
MIL-HDBK-1823A Nondestructive Evaluation System Reliability Assessment
Available from Standardization Documents Order Desk, DODSSP, Bldg. 4, Section D, 700 Robbins Ave., Philadelphia, PA 19111-5098, http://dodssp.daps.dla.mil.

3. Terminology
3.1 Definitions of Terms Specific to This Standard:
3.1.1 analyst, n—the person responsible for performing a POD analysis on â versus a data resulting from a POD examination.
3.1.2 decision threshold, â_dec, n—the value of â above which the signal is interpreted as a find and below which the signal is interpreted as a miss.
3.1.2.1 Discussion—A decision threshold is required to create a POD curve. The decision threshold is always greater than or equal to the noise threshold and is the value of â that corresponds with the flaw size that can be detected with 50 % POD.
3.1.3 demonstrated probability of detection, n—the calculated POD value resulting from the statistical analysis of the â versus a data.
3.1.4 false call, n—the perceived detection of a discontinuity that is identified as a find during a POD examination when no discontinuity actually exists at the inspection site.
3.1.4.1 Discussion—A synonym for “false call” is “false positive.”
3.1.5 noise, n—signal response containing no useful target characterization information.
3.1.6 noise threshold, â_noise, n—the value of â below which the signal is indistinguishable from noise.
3.1.6.1 Discussion—The noise threshold is always less than or equal to the decision threshold. The noise threshold is used to determine left censored data.
3.1.7 probability of detection (POD), n—the fraction of nominal discontinuity sizes expected to be found given their existence.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3.1.8 regression analysis, n—a statistical procedure used to characterize the association between two or more numerical variables for prediction of the response variable from the predictor variable. E456, E3080
3.1.8.1 Discussion—This practice focuses on (but is not limited to) regression analysis with a single predictor variable. Appendix X1 in this standard includes details on this topic as applied to Probability of Detection. See also Practice E3080 for an overview of linear regression with a single predictor variable.
3.1.9 saturation threshold, â_sat, n—the value of â associated with the maximum output of the system or the largest value of â that the system can record.
3.1.9.1 Discussion—The saturation threshold is used to determine right censored data.
3.2 Symbols:
3.2.1 a—discontinuity size.
3.2.2 â—the measured signal response for a given discontinuity size, a.
3.2.2.1 Discussion—The measured signal response is assumed to be continuous in nature. Units depend on the NDT inspection system and can be, for example, scale divisions, number of contiguous illuminated pixels, or millivolts.
3.2.3 a_p—the discontinuity size that can be detected with probability p.
3.2.3.1 Discussion—Each discontinuity size has an independent probability of being detected and corresponding probability of being missed. For example, being able to detect a specific discontinuity size with probability p does not guarantee that a larger size discontinuity will be found.
3.2.4 a_p/c—the discontinuity size that can be detected with probability p with a statistical confidence level of c.
3.2.4.1 Discussion—According to the formula in MIL-HDBK-1823A, a_p/c is a one-sided upper confidence bound on a_p. a_p/c represents how large the true a_p could be given the statistical uncertainty associated with limited sample data. Hence a_p/c > a_p. Note that POD is equal to p for both a_p/c and a_p. a_p is based solely on the observed relationship between the â and a data and represents a snapshot in time, whereas a_p/c accounts for the uncertainty associated with limited sample data.

4. Summary of Practice
4.1 In general, the POD examination process is comprised of a specimen set design, study design, examination administration, statistical analysis of examination data, documentation of analysis results, and specimen set maintenance. This practice is focused only on and describes, step-by-step, the process for analyzing nondestructive testing â versus a data resulting from a POD examination, including minimum requirements for validating the resulting POD curve, and documenting the results.
4.2 This practice also includes definitions and discussions for results of interest (for example, a_90/95) to provide for correct interpretation of results.
4.3 Definitions of statistical terminology used in the body of this practice can be found in Annex A1.
4.4 A more general discussion of the POD analysis process can be found in Appendix X1.
4.5 A mathematical overview of the underlying model commonly used with â versus a data resulting from a POD examination can be found in Practice E3080.

5. Significance and Use
5.1 The POD analysis method described herein is based on well-known and well-established statistical methods. It shall be used to quantify the demonstrated POD for a specific set of examination parameters and known range of discontinuity sizes under the following conditions.
5.1.1 The initial response from a nondestructive evaluation inspection system is measurable and can be classified as a continuous variable.
5.1.2 Discontinuity size is the predictor variable and can be accurately quantified.
5.1.3 The relationship between discontinuity size (a) and measured signal response (â) exists and is best described by a linear regression model with an error structure that is normally distributed with mean zero and constant variance, σ². (Note that in linear regression, “linear” means linear with respect to the model coefficients. Though a quadratic model $\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2$ does not have a linear shape when plotted, for example, it is classified as a linear model in regression analysis since it is linear with respect to the model coefficients.)
5.2 This practice does not limit the use of a linear regression model with more than one predictor variable or other statistical models if justified as more appropriate for the â versus a data.
5.3 This practice is not appropriate for data resulting from a POD examination on nondestructive evaluation systems that generate an initial response that is binary in nature (for example, hit/miss). Practice E2862 is appropriate for systems that generate a hit/miss-type response (for example, fluorescent penetrant).
5.4 Prior to performing the analysis, it is assumed that the discontinuity of interest is clearly defined; the number and distribution of induced discontinuity sizes in the POD specimen set is known and well documented; the POD examination administration procedure (including data collection method) is well designed, well defined, under control, and unbiased (see X1.2.2 for more detail); the initial inspection system response is measurable and continuous in nature; the inspection system is calibrated; and the measurement error has been evaluated and deemed acceptable. The analysis results are only valid if the â versus a data are accurate and precise and the linear model adequately represents the â versus a data.
5.5 The POD analysis method described herein is consistent with the analysis method for â versus a data described in MIL-HDBK-1823A and is included in several widely utilized POD software packages to perform a POD analysis on â versus a data. It is also found in statistical software packages that have linear regression analysis capability. This practice requires that the analyst has access to either POD software or other software with linear regression analysis capability.

6. Procedure
6.1 The POD analysis objective shall be clearly defined by the responsible engineer or by the customer.
6.2 The analyst shall obtain the â versus a data resulting from the POD examination, which shall include at a minimum the documented known induced discontinuity sizes, the associated measured signal response, and any false calls.
6.3 The analyst shall also obtain specific information about the POD examination, which shall include at a minimum the specimen standard geometry (for example, flat panels), specimen standard material (for example, Nickel), examination date, number of inspectors, type of inspection method (for example, Eddy Current Inspection), pertinent information about the instrument and instructions for use (for example, settings, probe type, scan path), and pertinent comments from the inspector(s) and test administrator.
6.3.1 In general, the results of an experiment apply to the conditions under which the experiment was conducted. Hence, the POD analysis results apply to the conditions under which the POD examination was conducted.
6.4 Prior to performing the analysis, the analyst shall conduct a preliminary review of the POD examination procedure to identify any issues with the administration of the examination. The analyst shall identify and attempt to resolve any issues prior to conducting the POD analysis. Identified issues and their resolution shall be documented in the report. Examples of examination administration issues and possible resolutions are outlined in the following subsections.
6.4.1 If problems or interruptions occurred during the POD examination that may bias the results, the POD examination should be re-administered.
6.4.2 If the examination procedure was poorly designed or executed, or both, the validity of the resulting data is questionable. In this case, the examination procedure design and execution should be reevaluated. For design guidelines, see MIL-HDBK-1823A.
6.5 Prior to performing the analysis, the analyst shall conduct a preliminary review of the â versus a data to identify any data issues. The analyst shall identify and attempt to resolve any issues prior to conducting the POD analysis. Identified issues and their resolution shall be documented in the report. Examples of data issues and possible resolutions are outlined in the following subsections.
6.5.1 Any apparent outlying observations shall be reviewed for correctness. If a typo is identified, the typo shall be corrected prior to performing the analysis. If the value is correct, it shall be retained in the analysis and its influence on the â versus a model shall be evaluated during the model diagnostic assessment. The analyst should also reference Practice E178.
6.5.2 POD cannot be modeled as a continuous function of discontinuity size if all the measured signal responses are below the noise threshold or above the saturation threshold. If this occurs, the adequacy of the nondestructive testing system should be evaluated.
6.6 Only â versus a data for induced discontinuities shall be used in the development of the linear regression model. False call data shall not be included in the development of the linear model when using standard linear regression analysis methods.
6.7 The analyst in conjunction with the responsible engineer shall determine the value of the noise threshold, â_noise, and saturation threshold, â_sat, prior to performing the analysis.
6.7.1 The value of â_noise is determined by performing a noise analysis. A noise analysis is typically accomplished by assessing the distribution of measured signal responses from sites with no known discontinuity (false calls) or measured signal responses, or both, that are not influenced by the size of the discontinuities (noise). Details on performing a noise analysis can be found in MIL-HDBK-1823A.
6.8 The analyst shall select an appropriate linear regression model to establish the relationship between â and a. Selection of a linear model may be an iterative process as the significance of the predictor variable(s) and the appropriateness of the selected model are typically assessed after the model has been developed.
6.8.1 “Linear” refers to linear with respect to the model coefficients. For example, $\hat{y}_i = b_0 + b_1 \cdot (x_i)$ and $\hat{y} = b_0 + b_1 \cdot x_1 + b_2 \cdot \ln(x_2)$ are linear regression models. (For more detail, see definition in Annex A1 and discussion in Practice E3080 Annex A1.1.)
6.8.2 In general, only significant and uncorrelated predictor variables are included in a regression model. If more than one predictor variable is being considered for inclusion in the model, a preliminary graphical analysis of the response variable against each predictor variable may help identify which predictor variables appear to influence the response and the type of relationship (for example, direct, inverse, quadratic). In addition, a preliminary graphical analysis of all possible pairings of predictor variables shall be performed to verify independence of the predictor variables. When plotted against each other, there should be no apparent relationship between any two predictor variables.
6.8.3 The appropriateness of a selected model is determined by how well the model fits the observed data and how well the underlying regression analysis assumptions are met.
6.9 The analyst shall use software that has the appropriate linear regression analysis capabilities to perform a linear regression analysis on the â versus a data.
6.9.1 If censored data are present, the analyst shall do the following:
6.9.1.1 Include and identify the censored data in the analysis (according to the notation required by the software).
6.9.1.2 Use the method of maximum likelihood to estimate the model coefficients.
6.9.1.3 Verify that convergence was achieved. If convergence is not achieved, the resulting â versus a model shall not be used to develop a POD curve.
6.9.1.4 Check the number of iterations it took to converge, provided that information on convergence and the number of iterations it took to converge is included in the analysis software output. If more than 20 iterations were needed to reach convergence, the model may not be reliable.
6.9.1.5 Include a statement in the report indicating that convergence was achieved and the number of iterations needed to achieve convergence.
6.9.2 If no censored data are present, the method of maximum likelihood or the method of least squares shall be used.
6.10 If included in the analysis software output, the analyst shall assess the significance of the predictor variables in the model. Only significant predictor variables should be included in the model. (See X1.2 for more detail.)
6.11 Once the â versus a model is estimated, the analyst shall use, at a minimum, the model diagnostic methods listed below to assess the underlying linear regression analysis assumptions. The methods listed below shall be performed using only non-censored data. If available, other formal diagnostic methods (noted in X1.2) should be used to assess the linear regression analysis assumptions.
6.11.1 There are three main underlying assumptions in a linear regression analysis: (1) residuals are normally distributed with mean 0 and constant variance, $\sigma^2$, (2) the residuals are independent, and (3) the relationship is in fact linear. The residual is calculated as $e_i = y_i - \hat{y}_i$ and represents the difference between the observed result, $y_i$, and the predicted value, $\hat{y}_i$, for the $i$th case. In general, the results of a linear regression analysis are not valid unless these assumptions hold. At a minimum, the following analyses of the residuals shall be performed to verify the assumptions.
6.11.1.1 A histogram of the residuals shall be constructed to assess the normality assumption and centering of the residuals. A histogram of the residuals should be roughly bell-shaped and symmetric around zero. In general, bell shape and symmetry around zero are more important than strict normality since traditional estimation procedures are typically only sensitive to large departures from normality (particularly with respect to skewness).
6.11.1.2 The constant variance and linearity assumptions shall be verified by plotting the residuals (y-axis) against the predicted values (x-axis). If the residuals fall in a horizontal band centered around zero, with no systematic preference for being positive or negative, then the assumption of constant variance and a linear relationship holds. (See Fig. X1.2 in Appendix X1.) In general, meeting the constant variance assumption is more important than meeting the normality assumption.
6.12 The analyst shall use at a minimum the methods listed below to assess the goodness-of-fit, influential points, and multicollinearity among predictor variables. If available, more formal methods (noted in Appendix X1) should be used.
6.12.1 A plot of predicted values versus actual values shall be used to assess goodness-of-fit. The plotted points should fall roughly on the y = x line. Plotted points deviating from the y = x line in a systematic way may be an indication of poor fit.
6.12.2 The analyst shall assess the influence of data that appears to be outlying on the established â versus a model. The histogram of the residuals and plot of the residuals versus predicted values can help identify outlying values. The influence of a suspected outlying value shall at a minimum be evaluated by removing the outlying value from the data and re-running the analysis to assess its influence on the â versus a model. A data point is said to be influential (or have high leverage) if its exclusion from the analysis has a relatively large effect on the â versus a model. Both analysis results (with and without the outlying data) shall be included in the report along with a discussion of the impact to the resulting POD curve and confidence bound (if applicable).
6.12.3 If the model includes more than one predictor variable, a graphical analysis shall be performed to verify independence of the predictor variables. (This step may be done during model selection as described in Appendix X1.)
6.13 The responsible engineer shall determine the value of $\hat{a}_{\mathrm{dec}}$ that is most appropriate with respect to end use of the POD analysis results. A value for the decision threshold is required to create a POD curve. The value must be greater than or equal to the value of the noise threshold. That is, $\hat{a}_{\mathrm{dec}} \ge \hat{a}_{\mathrm{noise}}$.
6.14 The analyst shall use the decision threshold to determine a POD value for each discontinuity size given the established relationship between â and a, the formula for which can be found in Appendix X1. The resulting POD values shall be plotted against discontinuity size to produce a POD curve.
6.14.1 POD curves tend to be s-shaped when a simple linear regression model is selected.
6.14.2 If more than one predictor variable is included in the model, POD is a response surface rather than a single curve.
6.14.3 The analyst shall determine the most appropriate way to plot the results.
6.15 If a c% level of confidence is specified by the responsible engineer or the customer, the analyst shall put a c% lower confidence bound on the POD curve by calculating a c% lower confidence bound on the â versus a model fit. Methods for constructing a confidence bound around a regression fit can be found in MIL-HDBK-1823A as well as statistics textbooks on linear regression.
6.15.1 If, for example, the objective of the analysis is to determine the discontinuity size that can be detected with 90 % probability and 95 % confidence, denoted $a_{90/95}$, then the analyst shall put a 95 % lower confidence bound on the POD curve by calculating a 95 % lower confidence bound on the â versus a model fit. The formula for the 95 % lower confidence bound on the POD curve, which is based on the 95 % lower confidence bound around the regression fit, can be found in Appendix X1.
6.16 The analyst shall analyze any false call data and shall report the false call rate.
6.16.1 The responsible engineer or the customer shall clearly define what constitutes a false call.
6.16.2 A distributional analysis of false call or noise data, or both, is typically performed to assess the false call rate, a discussion of which can be found in MIL-HDBK-1823A.
6.17 Acceptable false call rates shall be determined by the responsible engineer or by the customer.
Neter, J, Kutner, M, Nachtsheim, C, Wasserman, W. Applied Linear Statistical Models, The McGraw-Hill Companies.
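The POD calculation in 6.14 can be illustrated with a minimal sketch. It assumes a simple model with one predictor, â = b0 + b1·a plus normally distributed error with standard deviation sigma, so that POD(a) is the probability that â exceeds the decision threshold. The function names and the numeric values are hypothetical, the result is a point estimate with no confidence bound, and the formula in Appendix X1 remains the authoritative reference.

```python
from statistics import NormalDist


def pod(a: float, b0: float, b1: float, sigma: float, a_hat_dec: float) -> float:
    """Probability that the response exceeds a_hat_dec at discontinuity size a.

    Assumes the response is normal with mean b0 + b1*a and standard deviation sigma.
    """
    return 1.0 - NormalDist(mu=b0 + b1 * a, sigma=sigma).cdf(a_hat_dec)


def size_at_pod(p: float, b0: float, b1: float, sigma: float, a_hat_dec: float) -> float:
    """Invert pod() to find the size detected with probability p (requires b1 > 0)."""
    if b1 <= 0:
        raise ValueError("this inversion assumes the response increases with size")
    z_p = NormalDist().inv_cdf(p)  # standard normal quantile for probability p
    return (a_hat_dec - b0 + z_p * sigma) / b1


# Hypothetical fitted coefficients and decision threshold, for illustration only.
b0, b1, sigma = 1.0, 2.0, 0.5
a_hat_dec = 3.0

# POD evaluated at a few sizes; plotting these pairs gives the POD curve.
curve = [(a, pod(a, b0, b1, sigma, a_hat_dec)) for a in (0.5, 1.0, 1.5, 2.0)]

# Point estimate of the size detected with 90 % probability (no confidence bound).
a_90 = size_at_pod(0.90, b0, b1, sigma, a_hat_dec)
```

If a transformed size such as ln(a) is used as the predictor, the same functions apply on the transformed scale and the result is transformed back afterwards.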
7. Report
7.1 At a minimum the following information about the POD analysis shall be included in the report.
7.1.1 The specimen standard geometry (for example, flat panels).
7.1.2 The specimen standard material (for example, nickel).
7.1.3 Examination date.
7.1.4 Number of inspectors.
7.1.5 Type of inspection (for example, Eddy Current).
7.1.6 Pertinent information about the instrument and instructions for use (for example, settings, probe type, scan path).
7.1.7 Any comments from the inspector(s) or test administrator.
7.1.8 The documented known induced discontinuity sizes.
7.1.9 The associated measured signal responses, including information about censored data.
7.1.10 Any false calls.
7.1.11 The linear regression model describing the relationship between the observed â versus a data and confidence bound (if applicable).
7.1.12 A statement indicating that convergence was achieved and the number of iterations to convergence, if maximum likelihood estimation was used.
7.1.13 A statement about the model diagnostic methods used and conclusions.
7.1.14 The estimate of the error around the regression fit (calculated as the square root of the mean square error, which is typically included in the software output).
7.1.15 Summary of the noise analysis and rationale for selection of the decision threshold.
7.1.16 A plot of the resulting POD curve and confidence bound (if applicable).
7.1.17 Specific results of interest as required by the analysis objective (for example, $a_{90/95}$).
7.1.18 Any deviations from the POD examination procedure or standard POD analysis.
7.1.18.1 If the POD examination was re-administered, the original results and rationale for re-administration shall be documented in the report.
7.1.18.2 If a discontinuity is removed from the analysis, the specific discontinuity and rationale for removal shall be documented in the final report.
7.1.18.3 If the impact of outlying data was assessed, the results shall be included in the report along with an explanation.
7.1.19 Summary of false call analysis, including a definition of what constitutes a false call, the false call rate, and the method used to estimate the false call rate.
7.1.20 Name of analyst and company responsible for the POD calculation.
8. Keywords
8.1 â versus a; eddy current inspection; eddy current POD; linear regression; POD; POD analysis; probability of detection; regression
ANNEX
(Mandatory Information)
A1. TERMINOLOGY
A1.1 Definitions:
A1.1.1 $a_{90}$—the discontinuity size that can be detected with 90 % probability.
A1.1.1.1 Discussion—The value for $a_{90}$ resulting from a POD analysis is a single point estimate of the true value based on the outcome of the POD examination. It represents the typical value and does not account for variability due to sampling or inherent variability in the inspection system, which is always present.
A1.1.2 $a_{90/95}$—the discontinuity size that can be detected with 90 % probability with a statistical confidence level of 95 %.
A1.1.2.1 Discussion—The value for $a_{90}$ resulting from a POD analysis is an estimate of the true $a_{90}$ based on the outcome of the POD examination. If the examination were repeated, the outcome is not expected to be exactly the same. Hence the estimate of $a_{90}$ will not be the same. To account for variability due to sampling, a statistical confidence bound with a 95 % level of confidence is often applied to the estimated value for $a_{90}$, resulting in an $a_{90/95}$ value. POD is still 90 %. The 95 % refers to the ability of the statistical method to capture (or bound) the true $a_{90}$. That is, if the examination were repeated over and over under the same conditions, the value for $a_{90/95}$ will be larger than the true $a_{90}$ 95 % of the time. In practice the POD examination will be conducted once. Using a 95 % confidence level implies a 95 % chance that the $a_{90/95}$ value bounds the true $a_{90}$ and a 5 % risk that the true $a_{90}$ is actually larger than the $a_{90/95}$ value.
A1.1.3 $a_{90/50}$—the discontinuity size that can be detected with 90 % probability with a statistical confidence level of 50 %.
A1.1.3.1 Discussion—Using a one-sided 50 % confidence bound implies a 50 % chance that the $a_{90/50}$ value bounds the true $a_{90}$ and a 50 % risk that the true $a_{90}$ is actually larger than the $a_{90/50}$ value. Given this, $a_{90/50}$ is really the same as $a_{90}$.
A1.1.4 censored data, n—a censored data point is one in which the value is not known exactly.
A1.1.4.1 Discussion—The two most common types of censoring encountered in an â versus a POD analysis are right-censored and left-censored.
A1.1.4.2 Discussion—A right-censored data point is one in which there is a lower bound $y_i$ for the $i$th response. That is, the response value is not known exactly but is known to be some value above $y_i$. In practice, right-censoring occurs when the signal generated by a large flaw saturates the system. For example, suppose that the maximum amplitude that can be reported by an inspection system is 25. The underlying assumption is that the measured signal increases as flaw size increases. In some cases, the measured signal from a large flaw may be greater than or equal to 25. If the measured signal from a large flaw exceeds 25, then the exact measured signal is some amplitude to the “right” of 25 but cannot be measured because the inspection system cannot report values beyond 25. Note that the censoring in this case is predetermined by the limitations of the instrument electronics.
A1.1.4.3 Discussion—A left-censored data point is one in which there is an upper bound $y_i$ for the $i$th response. That is, the response value is not known exactly but is known to be ...
A1.1.5 histogram, n—graphical representation of the frequency distribution of a characteristic consisting of a set of rectangles with area proportional to the frequency. E456, E2586
A1.1.5.1 Discussion—While not required, equal bar or class widths are recommended for histograms according to Practice E2586. A histogram provides information on the central tendency of the distribution, reveals the amount of variation in the data, provides information on the shape of the distribution, and reveals potential outlying values.
A1.1.6 linear regression model, n—any theoretical model built of the form $Y_i = \beta_0 + \beta_1 \cdot x_1 + \beta_2 \cdot x_2 + \ldots + \beta_{p-1} \cdot x_{p-1} + \varepsilon_i$, where $Y_i$ is the response for case $i$; $x_1, x_2, \ldots, x_{p-1}$ are the predictor variables; $p$ is the number of regression coefficients; $\beta_0, \beta_1, \ldots, \beta_{p-1}$ are the regression coefficients; and $\varepsilon_i$ are the random errors that are assumed to be independently and identically distributed and follow a normal distribution with mean zero and const
...


This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because
it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version
of the standard as published by ASTM is to be considered the official document.
Designation: E3023 − 15 E3023 − 21
Standard Practice for
Probability of Detection Analysis for â Versus a Data
This standard is issued under the fixed designation E3023; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (´) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This practice defines the procedure for performing a statistical analysis on Nondestructive Testing (NDT) â versus a data to
determine the demonstrated probability of detection (POD) for a specific set of examination parameters. Topics covered include
the standard â versus a regression methodology, POD curve formulation, validation techniques, and correct interpretation of
results.
1.2 Units—The values stated in inch-pound units are to be regarded as standard. The values given in parentheses are mathematical
conversions to SI units that are provided for information only and are not considered standard.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility
of the user of this standard to establish appropriate safety, health, and environmental practices and determine the
applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization
established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued
by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E1316 Terminology for Nondestructive Examinations
E1325 Terminology Relating to Design of Experiments
E2586 Practice for Calculating and Using Basic Statistics
E2782 Guide for Measurement Systems Analysis (MSA)
E2862 Practice for Probability of Detection Analysis for Hit/Miss Data
E3080 Practice for Regression Analysis with a Single Predictor Variable
2.2 Department of Defense Document:
MIL-HDBK-1823A Nondestructive Evaluation System Reliability Assessment
3. Terminology
3.1 Definitions of Terms Specific to This Standard:
This practice is under the jurisdiction of ASTM Committee E07 on Nondestructive Testing and is the direct responsibility of Subcommittee E07.10 on
Specialized NDT Methods.
Current edition approved Feb. 15, 2021. Published March 2021. Originally approved in 2015. Last previous edition approved in 2015 as
E3023 – 15. DOI: 10.1520/E3023-21.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards
volume information, refer to the standard’s Document Summary page on the ASTM website.
Available from Standardization Documents Order Desk, DODSSP, Bldg. 4, Section D, 700 Robbins Ave., Philadelphia, PA 19111-5098, http://dodssp.daps.dla.mil.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3.1.1 analyst, n—the person responsible for performing a POD analysis on â versus a data resulting from a POD examination.
3.1.2 decision threshold, $\hat{a}_{\mathrm{dec}}$, n—the value of â above which the signal is interpreted as a find and below which the signal is
interpreted as a miss.
3.1.2.1 Discussion—
A decision threshold is required to create a POD curve. The decision threshold is always greater than or equal to the noise threshold
and is the value of â that corresponds with the flaw size that can be detected with 50 % POD.
3.1.3 demonstrated probability of detection, n—the calculated POD value resulting from the statistical analysis of the â versus a
data.
3.1.4 false call, n—the perceived detection of a discontinuity that is identified as a find during a POD examination when no
discontinuity actually exists at the inspection site.
3.1.4.1 Discussion—
A synonym for “false call” is “false positive.”
3.1.5 noise, n—signal response containing no useful target characterization information.
3.1.6 noise threshold, $\hat{a}_{\mathrm{noise}}$, n—the value of â below which the signal is indistinguishable from noise.
3.1.6.1 Discussion—
The noise threshold is always less than or equal to the decision threshold. The noise threshold is used to determine left censored
data.
3.1.7 probability of detection (POD), n—the fraction of nominal discontinuity sizes expected to be found given their
existence.
3.1.8 regression analysis, n—a statistical procedure used to characterize the association between two or more numerical variables
for prediction of the response variable from the predictor variable. E456, E3080
3.1.8.1 Discussion—
This practice focuses on (but is not limited to) regression analysis with a single predictor variable. Appendix X1 in this standard
includes details on this topic as applied to Probability of Detection. See also Practice E3080 for an overview of linear regression
with a single predictor variable.
3.1.9 saturation threshold, $\hat{a}_{\mathrm{sat}}$, n—the value of â associated with the maximum output of the system or the largest value of â that
the system can record.
3.1.9.1 Discussion—
The saturation threshold is used to determine right censored data.
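As an illustration of how the noise and saturation thresholds defined above might be used to flag censored data, the following is a minimal sketch. The function name, the label strings, and the treatment of readings exactly equal to a threshold are assumptions made for this example. The notation actually required depends on the analysis software.

```python
from typing import List, Tuple


def classify_censoring(
    responses: List[float], a_hat_noise: float, a_hat_sat: float
) -> List[Tuple[float, str]]:
    """Label each response 'left', 'right', or 'exact' (hypothetical helper).

    For censored points the returned value is the bound, not the raw reading.
    """
    if a_hat_noise >= a_hat_sat:
        raise ValueError("noise threshold must lie below the saturation threshold")
    labeled = []
    for a_hat in responses:
        if a_hat < a_hat_noise:
            # Indistinguishable from noise: the response is at most a_hat_noise.
            labeled.append((a_hat_noise, "left"))
        elif a_hat >= a_hat_sat:
            # Saturated; readings equal to the threshold are assumed censored too.
            labeled.append((a_hat_sat, "right"))
        else:
            labeled.append((a_hat, "exact"))
    return labeled


# Illustrative readings with a noise threshold of 1.0 and saturation at 25.0.
labels = classify_censoring([0.4, 3.2, 25.0], a_hat_noise=1.0, a_hat_sat=25.0)
```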
3.2 Symbols:
3.2.1 a—discontinuity size.
3.2.2 â—the measured signal response for a given discontinuity size, a.
3.2.2.1 Discussion—
The measured signal response is assumed to be continuous in nature. Units depend on the NDT inspection system and can be, for
example, scale divisions, number of contiguous illuminated pixels, or millivolts.
3.2.3 $a_p$—the discontinuity size that can be detected with probability p.
3.2.3.1 Discussion—
Each discontinuity size has an independent probability of being detected and corresponding probability of being missed. For
example, being able to detect a specific discontinuity size with probability p does not guarantee that a larger size discontinuity will
be found.
3.2.4 $a_{p/c}$—the discontinuity size that can be detected with probability p with a statistical confidence level of c.
3.2.4.1 Discussion—
According to the formula in MIL-HDBK-1823A, $a_{p/c}$ is a one-sided upper confidence bound on $a_p$. $a_{p/c}$ represents how large the
true $a_p$ could be given the statistical uncertainty associated with limited sample data. Hence $a_{p/c} > a_p$. Note that POD is equal to
p for both $a_{p/c}$ and $a_p$. $a_p$ is based solely on the observed relationship between the â and a data and represents a snapshot in time,
whereas $a_{p/c}$ accounts for the uncertainty associated with limited sample data.
4. Summary of Practice
4.1 In general, the POD examination process is comprised of a specimen set design, study design, examination
administration, statistical analysis of examination data, documentation of analysis results, and specimen set maintenance. This
practice is focused only on and describes, step-by-step, the process for analyzing nondestructive testing â versus a data resulting
from a POD examination, including minimum requirements for validating the resulting POD curve, and documenting the
results.
4.2 This practice also includes definitions and discussions for results of interest (for example, $a_{90/95}$) to provide for correct
interpretation of results.
4.3 Definitions of statistical terminology used in the body of this practice can be found in Annex A1.
4.4 A more general discussion of the POD analysis process can be found in Appendix X1.
4.5 A mathematical overview of the underlying model commonly used with â versus a data resulting from a POD examination
can be found in Practice E3080.
5. Significance and Use
5.1 The POD analysis method described herein is based on well-known and well-established statistical methods. It shall be used
to quantify the demonstrated POD for a specific set of examination parameters and known range of discontinuity sizes under the
following conditions.
5.1.1 The initial response from a nondestructive evaluation inspection system is measurable and can be classified as a continuous
variable.
5.1.2 Discontinuity size is the predictor variable and can be accurately quantified.
5.1.3 The relationship between discontinuity size (a) and measured signal response (â) exists and is best described by a linear
regression model with an error structure that is normally distributed with mean zero and constant variance, $\sigma^2$. (Note that in
linear regression, “linear” means linear with respect to the model coefficients. Though a quadratic model
$\hat{y} = \beta_0 + \beta_1 \cdot x + \beta_2 \cdot x^2$ does not have a linear shape when plotted, for example, it is classified as a linear model in
regression analysis since it is linear with respect to the model coefficients.)
5.2 This practice does not limit the use of a linear regression model with more than one predictor variable or other statistical
models if justified as more appropriate for the â versus a data.
5.3 This practice is not appropriate for data resulting from a POD examination on nondestructive evaluation systems that generate
an initial response that is binary in nature (for example, hit/miss). Practice E2862 is appropriate for systems that generate a
hit/miss-type response (for example, fluorescent penetrant).
5.4 Prior to performing the analysis, it is assumed that the discontinuity of interest is clearly defined; the number and distribution
of induced discontinuity sizes in the POD specimen set is known and well documented; the POD examination administration
procedure (including data collection method) is well designed, well defined, under control, and unbiased (see X1.2.2 for
more detail); the initial inspection system response is measurable and continuous in nature; the inspection system is calibrated; and
the measurement error has been evaluated and deemed acceptable. The analysis results are only valid if the â versus a data are
accurate and precise and the linear model adequately represents the â versus a data.
5.5 The POD analysis method described herein is consistent with the analysis method for â versus a data described in
MIL-HDBK-1823A and is included in several widely utilized POD software packages to perform a POD analysis on â versus a
data. It is also found in statistical software packages that have linear regression analysis capability. This practice requires that the
analyst has access to either POD software or other software with linear regression analysis capability.
6. Procedure
6.1 The POD analysis objective shall be clearly defined by the responsible engineer or by the customer.
6.2 The analyst shall obtain the â versus a data resulting from the POD examination, which shall include at a minimum the
documented known induced discontinuity sizes, the associated measured signal response, and any false calls.
6.3 The analyst shall also obtain specific information about the POD examination, which shall include at a minimum the specimen
standard geometry (for example, flat panels), specimen standard material (for example, Nickel), examination date,
number of inspectors, type of inspection method (for example, Eddy Current Inspection), pertinent information about the
instrument and instructions for use (for example, settings, probe type, scan path), and pertinent comments from the
inspector(s) and test administrator.
6.3.1 In general, the results of an experiment apply to the conditions under which the experiment was conducted. Hence, the POD
analysis results apply to the conditions under which the POD examination was conducted.
6.4 Prior to performing the analysis, the analyst shall conduct a preliminary review of the POD examination procedure to identify
any issues with the administration of the examination. The analyst shall identify and attempt to resolve any issues prior to
conducting the POD analysis. Identified issues and their resolution shall be documented in the report. Examples of examination
administration issues and possible resolutions are outlined in the following subsections.
6.4.1 If problems or interruptions occurred during the POD examination that may bias the results, the POD examination should
be re-administered.
6.4.2 If the examination procedure was poorly designed or executed, or both, the validity of the resulting data is
questionable. In this case, the examination procedure design and execution should be reevaluated. For design guidelines, see
MIL-HDBK-1823A.
6.5 Prior to performing the analysis, the analyst shall conduct a preliminary review of the â versus a data to identify any data
issues. The analyst shall identify and attempt to resolve any issues prior to conducting the POD analysis. Identified issues and their
resolution shall be documented in the report. Examples of data issues and possible resolutions are outlined in the following
subsections.
6.5.1 Any apparent outlying observations shall be reviewed for correctness. If a typo is identified, the typo shall be corrected prior
to performing the analysis. If the value is correct, it shall be retained in the analysis and its influence on the â versus a model shall
be evaluated during the model diagnostic assessment. The analyst should also reference Practice E178.
6.5.2 POD cannot be modeled as a continuous function of discontinuity size if all the measured signal responses are below the
noise threshold or above the saturation threshold. If this occurs, the adequacy of the nondestructive testing system should be
evaluated.
6.6 Only â versus a data for induced discontinuities shall be used in the development of the linear regression model. False call
data shall not be included in the development of the linear model when using standard linear regression analysis methods.
6.7 The analyst in conjunction with the responsible engineer shall determine the value of the noise threshold, $\hat{a}_{\mathrm{noise}}$, and saturation
threshold, $\hat{a}_{\mathrm{sat}}$, prior to performing the analysis.
6.7.1 The value of $\hat{a}_{\mathrm{noise}}$ is determined by performing a noise analysis. A noise analysis is typically accomplished by assessing the
distribution of measured signal responses from sites with no known discontinuity (false calls), measured signal responses
that are not influenced by the size of the discontinuities (noise), or both. Details on performing a noise analysis can be
found in MIL-HDBK-1823A.
6.8 The analyst shall select an appropriate linear regression model to establish the relationship between â and a. Selection of a
linear model may be an iterative process as the significance of the predictor variable(s) and the appropriateness of the selected
model are typically assessed after the model has been developed.
6.8.1 “Linear” refers to linear with respect to the model coefficients. For example, $\hat{y}_i = b_0 + b_1 \cdot (x)$ and $\hat{y}_i = b_0 + b_1 \cdot x_1 + b_2 \cdot \ln(x_2)$ are linear
regression models. (For more detail, see definition in Annex A1 and discussion in Practice E3080 Annex A1.1.)
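To illustrate what “linear with respect to the model coefficients” means in practice, the sketch below fits a model of the assumed form ŷ = b0 + b1·ln(a) by ordinary least squares. The predictor is transformed before fitting, and the coefficients still enter linearly. The helper name and the synthetic, noise-free data are hypothetical and chosen only so that the known coefficients are recovered.

```python
import math
from typing import List, Tuple


def fit_simple_ols(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Least-squares intercept and slope for y = b0 + b1*x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Synthetic example: responses generated exactly from 3 + 2*ln(size).
sizes = [1.0, 2.0, 4.0, 8.0]
responses = [3.0 + 2.0 * math.log(a) for a in sizes]

# The model is nonlinear in size but linear in b0 and b1, so ordinary
# least squares applies once the predictor is replaced by ln(size).
b0, b1 = fit_simple_ols([math.log(a) for a in sizes], responses)
```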
6.8.2 In general, only significant and uncorrelated predictor variables are included in a regression model. If more than one
predictor variable is being considered for inclusion in the model, a preliminary graphical analysis of the response variable against
each predictor variable may help identify which predictor variables appear to influence the response and the type of relationship
(for example, direct, inverse, quadratic). In addition, a preliminary graphical analysis of all possible pairings of predictor variables
shall be performed to verify independence of the predictor variables. When plotted against each other, there should be no apparent
relationship between any two predictor variables.
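A numeric companion to the pairwise plots is sketched below. It computes the Pearson correlation for every pair of candidate predictors. This is an assumption about one convenient screening aid, not a substitute for the graphical analysis required above, and it only detects linear association. The predictor names and values are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def pearson_r(u: List[float], v: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def pairwise_correlations(
    predictors: Dict[str, List[float]]
) -> Dict[Tuple[str, str], float]:
    """Correlation for every unordered pair of predictor columns."""
    return {
        (p, q): pearson_r(predictors[p], predictors[q])
        for p, q in combinations(predictors, 2)
    }


# Hypothetical predictors: depth is an exact multiple of length,
# while angle is constructed to be uncorrelated with length.
candidates = {
    "length": [1.0, 2.0, 3.0, 4.0],
    "depth": [2.0, 4.0, 6.0, 8.0],
    "angle": [1.0, -1.0, -1.0, 1.0],
}
r = pairwise_correlations(candidates)
```

A correlation near ±1, as for the length and depth pair here, suggests that the two predictors should not both be included in the model.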
6.8.3 The appropriateness of a selected model is determined by how well the model fits the observed data and how well the
underlying regression analysis assumptions are met.
6.9 The analyst shall use software that has the appropriate linear regression analysis capabilities to perform a linear regression
analysis on the â versus a data.
6.9.1 If censored data are present, the analyst shall do the following:
6.9.1.1 Include and identify the censored data in the analysis (according to the notation required by the software).
6.9.1.2 Use the method of maximum likelihood to estimate the model coefficients.
6.9.1.3 Verify that convergence was achieved. If convergence is not achieved, the resulting â versus a model shall not be used to
develop a POD curve.
6.9.1.4 Check the number of iterations it took to converge, provided that information on convergence and the number of iterations
it took to converge is included in the analysis software output. If more than 20 iterations were needed to reach convergence,
the model may not be reliable.
6.9.1.5 Include a statement in the report indicating that convergence was achieved and the number of iterations needed to achieve
convergence.
6.9.2 If no censored data are present, the method of maximum likelihood or the method of least squares shall be used.
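To illustrate how censored data enter a maximum likelihood analysis, the sketch below evaluates the log-likelihood that analysis software would maximize, assuming a simple linear model with normally distributed errors. The function name, the record layout, and the numerical values are hypothetical. A non-censored observation contributes a normal density term, while a censored observation contributes the probability of falling beyond its censoring limit. Maximization, convergence checking, and iteration counts are left to the analysis software.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1

def censored_log_likelihood(b0, b1, sigma, data):
    """Log-likelihood for a_hat = b0 + b1*a + error, error ~ N(0, sigma^2).

    Each record is (a, y, kind) with kind "exact", "left", or "right".
    For censored records, y holds the censoring limit, not a measurement.
    """
    total = 0.0
    for a, y, kind in data:
        z = (y - (b0 + b1 * a)) / sigma
        if kind == "exact":
            total += math.log(STD_NORMAL.pdf(z) / sigma)
        elif kind == "left":    # true response lies at or below y
            total += math.log(STD_NORMAL.cdf(z))
        elif kind == "right":   # true response lies above y
            total += math.log(1.0 - STD_NORMAL.cdf(z))
        else:
            raise ValueError(f"unknown censoring kind: {kind}")
    return total

# Hypothetical records: noise threshold 1, saturation threshold 25.
records = [(0.5, 1.0, "left"), (2.0, 8.0, "exact"), (7.0, 25.0, "right")]
ll = censored_log_likelihood(0.0, 4.0, 2.0, records)
print(f"log-likelihood = {ll:.4f}")
```

Dropping the censored records, or substituting the censoring limit as if it were a measured value, changes this likelihood and therefore the estimated coefficients, which is why 6.9.1.1 requires censored data to be included and identified.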
6.10 If included in the analysis software output, the analyst shall assess the significance of the predictor variables in the model.
Only significant predictor variables should be included in the model. (See X1.2 for more detail.)
6.11 Once the â versus a model is estimated, the analyst shall use, at a minimum, the model diagnostic methods listed below to
assess the underlying linear regression analysis assumptions. The methods listed below shall be performed using only non-censored
data. If available, other formal diagnostic methods (noted in X1.2) should be used to assess the linear regression
analysis assumptions.
6.11.1 There are three main underlying assumptions in a linear regression analysis: (1) residuals are normally distributed with
mean 0 and constant variance, σ², (2) the residuals are independent, and (3) the relationship is in fact linear. The residual is
calculated as e_i = y_i – ŷ_i and represents the difference between the observed result, y_i, and the predicted value, ŷ_i, for the i-th case.
In general, the results of a linear regression analysis are not valid unless these assumptions hold. At a minimum, the following
analyses of the residuals shall be performed to verify the assumptions.
6.11.1.1 A histogram of the residuals shall be constructed to assess the normality assumption and centering of the residuals. A
histogram of the residuals should be roughly bell-shaped and symmetric around zero. In general, bell shape and symmetry around
zero are more important than strict normality since traditional estimation procedures are typically only sensitive to large departures
from normality (particularly with respect to skewness).
6.11.1.2 The constant variance and linearity assumptions shall be verified by plotting the residuals (y-axis) against the predicted
values (x-axis). If the residuals fall in a horizontal band centered around zero, with no systematic preference for being positive or
negative, then the assumption of constant variance and a linear relationship holds. (See Fig. X1.2 in Appendix X1.) In general,
meeting the constant variance assumption is more important than meeting the normality assumption.
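The residual analyses in 6.11.1.1 and 6.11.1.2 can be sketched as follows. The data are hypothetical non-censored â versus a values, and the helper names are illustrative. The code computes the residuals from a least squares fit, bins them for a histogram, and lists them against the predicted values so that a horizontal band centered on zero can be looked for.

```python
def fit_line(a, y):
    """Ordinary least squares for y = b0 + b1*a."""
    n = len(a)
    a_bar = sum(a) / n
    y_bar = sum(y) / n
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_ay = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
    b1 = s_ay / s_aa
    return y_bar - b1 * a_bar, b1

def histogram_counts(values, n_bins=5):
    """Count values in n_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against identical values
    counts = [0] * n_bins
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)  # top edge goes in last bin
        counts[k] += 1
    return counts

# Hypothetical non-censored data: discontinuity size a and response a_hat.
a = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
a_hat = [2.4, 4.1, 6.6, 8.2, 10.3, 12.9, 14.2, 16.5]

b0, b1 = fit_line(a, a_hat)
predicted = [b0 + b1 * ai for ai in a]
residuals = [yi - pi for yi, pi in zip(a_hat, predicted)]

counts = histogram_counts(residuals)
print("histogram counts:", counts)
for pred, res in sorted(zip(predicted, residuals)):
    print(f"predicted {pred:7.3f}   residual {res:+7.3f}")
```

With an intercept in the model, least squares residuals always sum to zero, so centering is assessed from the shape of the histogram and the scatter about zero rather than from the residual mean alone.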
6.12 The analyst shall use at a minimum the methods listed below to assess the goodness-of-fit, influential points, and
multicollinearity among predictor variables. If available, more formal methods (noted in Appendix X1) should be used.
6.12.1 A plot of predicted values versus actual values shall be used to assess goodness-of-fit. The plotted points should fall roughly
on the y = x line. Plotted points deviating from the y = x line in a systematic way may be an indication of poor fit.
6.12.2 The analyst shall assess the influence of data that appears to be outlying on the established â versus a model. The histogram
of the residuals and plot of the residuals versus predicted values can help identify outlying values. The influence of a suspected
outlying value shall at a minimum be evaluated by removing the outlying value from the data and re-running the analysis to assess
its influence on the â versus a model. A data point is said to be influential (or have high leverage) if its exclusion from the analysis
has a relatively large effect on the â versus a model. Both analysis results (with and without the outlying data) shall be included
in the report along with a discussion of the impact to the resulting POD curve and confidence bound (if applicable).
6.12.3 If the model includes more than one predictor variable, a graphical analysis shall be performed to verify independence of
the predictor variables. (This step may be done during model selection as described in Appendix X1.)
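The remove-and-refit check in 6.12.2 can be sketched as below. The data are hypothetical, with the last point deliberately placed well off the trend of the others, and the helper name fit_line is illustrative. The model is estimated with and without the suspected outlying value and the two sets of coefficients are compared.

```python
def fit_line(a, y):
    """Ordinary least squares for y = b0 + b1*a."""
    n = len(a)
    a_bar = sum(a) / n
    y_bar = sum(y) / n
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_ay = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
    b1 = s_ay / s_aa
    return y_bar - b1 * a_bar, b1

# Hypothetical data; the response at a = 5 is the suspected outlying value.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
a_hat = [2.0, 4.0, 6.0, 8.0, 20.0]
suspect = 4  # index of the suspected outlying value

full_b0, full_b1 = fit_line(a, a_hat)
reduced_b0, reduced_b1 = fit_line(
    a[:suspect] + a[suspect + 1:],
    a_hat[:suspect] + a_hat[suspect + 1:],
)

print(f"with suspect point:    b0 = {full_b0:.3f}, b1 = {full_b1:.3f}")
print(f"without suspect point: b0 = {reduced_b0:.3f}, b1 = {reduced_b1:.3f}")
```

In this illustration the slope changes by a factor of two when the point is removed, so the point would be judged influential and both sets of results would be reported.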
6.13 The responsible engineer shall determine the value of â_dec that is most appropriate with respect to end use of the POD analysis
results. A value for the decision threshold is required to create a POD curve. The value must be greater than or equal to the value
of the noise threshold. That is, â_dec ≥ â_noise.
6.14 The analyst shall use the decision threshold to determine a POD value for each discontinuity size given the established
relationship between â and a, the formula for which can be found in Appendix X1. The resulting POD values shall be plotted
against discontinuity size to produce a POD Curve.
6.14.1 POD curves tend to be s-shaped when a simple linear regression model is selected.
6.14.2 If more than one predictor variable is included in the model, POD is a response surface rather than a single curve.
6.14.3 The analyst shall determine the most appropriate way to plot the results.
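As an illustration of 6.14, the sketch below computes POD values from an estimated model. Appendix X1 is not reproduced here, so the expression used is an assumption: for a simple linear model with normally distributed errors, POD(a) is taken as the probability that a response with mean b_0 + b_1·a and standard deviation σ exceeds â_dec. The coefficient values, the decision threshold, and the function names are hypothetical, and no confidence bound is included.

```python
from statistics import NormalDist

def pod(a, b0, b1, sigma, a_hat_dec):
    """P(a_hat > a_hat_dec) when a_hat ~ N(b0 + b1*a, sigma^2)."""
    return 1.0 - NormalDist(mu=b0 + b1 * a, sigma=sigma).cdf(a_hat_dec)

def size_at_pod(p, b0, b1, sigma, a_hat_dec):
    """Discontinuity size at which pod() equals p (requires b1 > 0)."""
    z = NormalDist().inv_cdf(p)
    return (a_hat_dec - b0 + z * sigma) / b1

# Hypothetical model estimates and decision threshold.
b0, b1, sigma, a_hat_dec = 0.5, 4.0, 1.5, 6.0

sizes = [0.25 * k for k in range(17)]  # 0.0 to 4.0
curve = [(a, pod(a, b0, b1, sigma, a_hat_dec)) for a in sizes]
for a, p in curve:
    print(f"a = {a:4.2f}   POD = {p:.4f}")

a_50 = size_at_pod(0.50, b0, b1, sigma, a_hat_dec)
a_90 = size_at_pod(0.90, b0, b1, sigma, a_hat_dec)
print(f"a_50 = {a_50:.3f}, a_90 = {a_90:.3f}")
```

The printed values rise from near 0 to near 1 in the s-shape noted in 6.14.1. If the model was fitted on a transformed scale (for example, ln(a)), the same transformation would be applied to the sizes before calling these functions.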
6.15 If a c% level of confidence is specified by the responsible engineer or the customer, the analyst shall put a c% lower
confidence bound on the POD curve by calculating a c% lower confidence bound on the â versus a model fit. Methods for
constructing a confidence bound around a regression fit can be found in MIL-HDBK-1823A as well as statistics textbooks on
linear regression.
6.15.1 If, for example, the objective of the analysis is to determine the discontinuity size that can be detected with 90 %
probability and 95 % confidence, denoted a_90/95, then the analyst shall put a 95 % lower confidence bound on the POD
curve by calculating a 95 % lower confidence bound on the â versus a model fit. The formula for the 95 % lower
confidence bound on the POD curve, which is based on the 95 % lower confidence bound around the regression fit, can be
found in Appendix X1.
6.16 The analyst shall analyze any false call data and shall report the false call rate.
6.16.1 The responsible engineer or the customer shall clearly define what constitutes a false call.
6.16.2 A distributional analysis of false call or noise data, or both, is typically performed to assess the false call rate, a discussion
of which can be found in MIL-HDBK-1823A.
Neter, J, Kutner, M, Nachtsheim, C, Wasserman, W. Applied Linear Statistical Models, The McGraw-Hill Companies.
6.17 Acceptable false call rates shall be determined by the responsible engineer or by the customer.
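A false call analysis of the kind mentioned in 6.16.2 might be sketched as follows. Two assumptions are made only for illustration: a false call is taken to be a noise response at or above â_dec (the actual definition is set by the responsible engineer or the customer per 6.16.1), and the noise is taken to be normally distributed. The noise readings and threshold are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical noise readings taken at locations with no discontinuity.
noise = [0.2, 0.4, 0.3, 0.5, 1.1, 0.3, 0.6, 0.4, 0.2, 0.5]
a_hat_dec = 1.0  # hypothetical decision threshold

# Empirical rate: fraction of noise readings that would be called.
empirical_rate = sum(1 for v in noise if v >= a_hat_dec) / len(noise)

# Distributional estimate: fit a normal distribution to the noise
# (an assumption) and take the tail probability above the threshold.
fitted = NormalDist(mean(noise), stdev(noise))
model_rate = 1.0 - fitted.cdf(a_hat_dec)

print(f"empirical false call rate: {empirical_rate:.3f}")
print(f"normal-model false call rate: {model_rate:.3f}")
```

The two estimates generally differ for small samples. Whether a normal distribution is adequate for the noise data is itself a question for the analyst to examine.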
7. Report
7.1 At a minimum the following information about the POD analysis shall be included in the report.
7.1.1 The specimen standard geometry (for example, flat panels).
7.1.2 The specimen standard material (for example, nickel).
7.1.3 Examination date.
7.1.4 Number of inspectors.
7.1.5 Type of inspection (for example, eddy current).
7.1.6 Pertinent information about the instrument and instructions for use (for example, settings, probe type, scan path).
7.1.7 Any comments from the inspector(s) or test administrator.
7.1.8 The documented known induced discontinuity sizes.
7.1.9 The associated measured signal responses, including information about censored data.
7.1.10 Any false calls.
7.1.11 The linear regression model describing the relationship between the observed â versus a data and confidence bound (if
applicable).
7.1.12 A statement indicating that convergence was achieved and the number of iterations to convergence, if maximum likelihood
estimation was used.
7.1.13 A statement about the model diagnostic methods used and conclusions.
7.1.14 The estimate of the error around the regression fit (calculated as the square root of the mean square error, which is typically
included in the software output).
7.1.15 Summary of the noise analysis and rationale for selection of the decision threshold.
7.1.16 A plot of the resulting POD curve and confidence bound (if applicable).
7.1.17 Specific results of interest as required by the analysis objective (for example, a_90/95).
7.1.18 Any deviations from the POD examination procedure or standard POD analysis.
7.1.18.1 If the POD examination was re-administered, the original results and rationale for re-administration shall be documented
in the report.
7.1.18.2 If a discontinuity is removed from the analysis, the specific discontinuity and rationale for removal shall be documented
in the final report.
7.1.18.3 If the impact of outlying data was assessed, the results shall be included in the report along with an explanation.
7.1.19 Summary of false call analysis, including a definition of what constitutes a false call, the false call rate, and the method
used to estimate the false call rate.
7.1.20 Name of analyst and company responsible for the POD calculation.
8. Keywords
8.1 â versus a; ahat vs. a; eddy current inspection; eddy current POD; linear regression; POD; POD
analysis; probability of detection; regression
ANNEX
(Mandatory Information)
A1. TERMINOLOGY
A1.1 Definitions:
A1.1.1 a_90—the discontinuity size that can be detected with 90 % probability.
A1.1.1.1 Discussion—The value for a_90 resulting from a POD analysis is a single point estimate of the true value based on the
outcome of the POD examination. It represents the typical value and does not account for variability due to sampling or inherent
variability in the inspection system, which is always present.
A1.1.2 a_90/95—the discontinuity size that can be detected with 90 % probability with a statistical confidence level of
95 %.
A1.1.2.1 Discussion—The value for a_90 resulting from a POD analysis is an estimate of the true a_90 based on the outcome of the
POD examination. If the examination were repeated, the outcome is not expected to be exactly the same. Hence the estimate of
a_90 will not be the same. To account for variability due to sampling, a statistical confidence bound with a 95 % level of
confidence is often applied to the estimated value for a_90, resulting in an a_90/95 value. POD is still 90 %. The 95 % refers
to the ability of the statistical method to capture (or bound) the true a_90. That is, if the examination were repeated over and over
under the same conditions, the value for a_90/95 will be larger than the true a_90 95 % of the time. In practice the POD
examination will be conducted once. Using a 95 % confidence level implies a 95 % chance that the a_90/95 value bounds
the true a_90 and a 5 % risk that the true a_90 is actually larger than the a_90/95 value.
A1.1.3 a_90/50—the discontinuity size that can be detected with 90 % probability with a statistical confidence level of
50 %.
A1.1.3.1 Discussion—Using a one-sided 50 % confidence bound implies a 50 % chance that the a_90/50 value bounds the
true a_90 and a 50 % risk that the true a_90 is actually larger than the a_90/50 value. Given this, a_90/50 is really the same as a_90.
A1.1.4 censored data, n—a censored data point is one in which the value is not known exactly.
A1.1.4.1 Discussion—The two most common types of censoring encountered in an â versus a POD analysis are right-censored
and left-censored.
A1.1.4.2 Discussion—A right-censored data point is one in which there is a lower bound y_i for the i-th response. That is, the exact response
value is somewhere in the interval (y_i, ∞). A left-censored data point is one in which there is an upper bound
y_i for the i-th response. That is, the exact response value is somewhere in the interval (–∞, y_i].
In practice, right-censoring occurs when the signal generated by a large flaw saturates the system. For example, suppose that the
maximum amplitude that can be reported by an inspection system is 25. The underlying assumption is that the measured signal
increases as flaw size increases. In some cases, the measured signal from a large flaw may be greater than or equal to 25. If the
measured signal from a large flaw exceeds 25, the response for that flaw is (25, ∞). In other words, the exact measured signal is
some amplitude to the “right” of 25 but cannot be measured
because the inspection system cannot report values beyond 25. Note that the censoring in this case is predetermined by the
limitations of the instrument electronics. Right-censored data is identified in an â versus a POD analysis by the saturation threshold.
Left-censoring occurs in practice when the inspection system cannot distinguish the signal generated by a small flaw from inherent
system noise or material noise, or both. For example, suppose that the noise threshold is 1 division. That is, any signal below 1
division is indistinguishable from noise. If the measured signal from a small flaw falls below 1, the response for that flaw is
recorded as (0,1). In other words, the exact measured signal is some amplitude to the “left” of 1, or within the noise. Note that
the censoring in this case is predetermined by inherent noise in the inspection system. Left-censored data is identified in an â versus
a POD analysis by the noise threshold.
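The censoring rules in the example above can be expressed as a small classification routine. This is a sketch only: the function name, the return layout, and the default thresholds (noise threshold 1, saturation threshold 25, taken from the example) are illustrative, and analysis software will have its own required notation for censored data.

```python
import math

def classify_response(reading, noise_threshold=1.0, saturation=25.0):
    """Return (kind, lower, upper) bounds for one recorded signal response.

    A reading at the saturation threshold is right-censored: the true
    response lies somewhere in (saturation, infinity). A reading below the
    noise threshold is left-censored and recorded as (0, noise_threshold),
    as in the example. Any other reading is treated as exactly known.
    """
    if reading >= saturation:
        return ("right", saturation, math.inf)
    if reading < noise_threshold:
        return ("left", 0.0, noise_threshold)
    return ("exact", reading, reading)

# Hypothetical readings from small, mid-size, and large discontinuities.
for reading in (0.4, 12.5, 25.0):
    kind, lower, upper = classify_response(reading)
    print(f"reading {reading:5.1f} -> {kind:5s} interval ({lower}, {upper})")
```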
A1.1.4.3 Discussion—A left-censored data point is one in which there is an upper bound y_i for the i-th response. That is, the
response value is not known exactly but is known to be some value below y_i. Left-censoring occurs in practice when the inspection
system cannot distinguish the signal generated by a small flaw from inherent system noise or material noise, or both. For example,
suppose that the noise threshold is 1 division. That is, any signal below 1 division is indistinguishable from noise. In some cases,
the measured signal from a small flaw may be less than or equal to 1. If the measured signal from a s
...
