ASTM E3080-17
Standard Practice for Regression Analysis
SIGNIFICANCE AND USE
4.1 Regression analysis is a statistical procedure that studies the statistical relationships between two or more variables (1, 2). In general, one of these variables is designated as a response variable and the rest of the variables are designated as predictor variables. Then the objective of the model is to predict the response from the predictor variables.
4.1.1 This standard considers a numerical response variable and only a single numerical predictor variable.
4.1.2 The regression model consists of: (1) a mathematical function that relates the mean values of the response variable distribution to fixed values of the predictor variable, and (2) a description of the statistical distribution that describes the variability in the response variable at fixed levels of the predictor variable.
4.1.3 The regression procedure utilizes experimental or observational data to estimate the parameters defining a regression model and their precision. Diagnostic procedures are utilized to assess the resulting model fit and can suggest other models for improved prediction performance.
4.1.4 The regression model can be useful for developing process knowledge through description of the variable relationship, in making predictions of future values, and in developing control methods for the process generating values of the variables.
4.2 Section 5 in this standard deals with the simple linear regression model using a straight line mathematical relationship between the two variables where variability of the response variable over the range of values of the predictor variable is described by a normal distribution with constant variance. Appendix X1 provides supplemental information.
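As an illustration of the straight-line model described in 4.2, the following is a minimal Python sketch of an ordinary least-squares fit. The formulas of Section 5 are not reproduced in this excerpt, so the sketch assumes the usual textbook least-squares estimates; the function name fit_simple_linear, the variable names, and the data values are hypothetical and chosen only for illustration.

```python
import math


def fit_simple_linear(x, y):
    """Least-squares fit of y = b0 + b1 * x (hypothetical helper, not part of the standard)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sum_dev_xx = sum((xi - x_bar) ** 2 for xi in x)
    sum_dev_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sum_dev_xx == 0:
        raise ValueError("predictor values must not all be equal")
    b1 = sum_dev_xy / sum_dev_xx          # slope estimate
    b0 = y_bar - b1 * x_bar               # intercept estimate
    # Residual = observed value minus fitted value.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Two parameters are estimated, so n - 2 degrees of freedom remain.
    residual_variance = sum(e ** 2 for e in residuals) / (n - 2)
    return {
        "b0": b0,
        "b1": b1,
        "residuals": residuals,
        "s": math.sqrt(residual_variance),  # residual standard deviation estimate
    }


# Made-up illustrative data, not taken from the standard.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_simple_linear(x, y)
print(fit["b0"], fit["b1"], fit["s"])
```

The divisor n - 2 follows the degrees of freedom definition in 3.1.2: two parameters, the intercept and the slope, are estimated before the residual variance is calculated.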
SCOPE
1.1 This practice covers regression analysis methodology for estimating, evaluating, and using the simple linear regression model to define the statistical relationship between two numerical variables.
1.2 The system of units for this practice is not specified. Dimensional quantities in the practice are presented only as illustrations of calculation methods. The examples are not binding on products or test methods treated.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Standards Content (Sample)
NOTICE: This standard has either been superseded and replaced by a new version or withdrawn.
Contact ASTM International (www.astm.org) for the latest information
Designation: E3080 − 17 An American National Standard
Standard Practice for Regression Analysis¹
This standard is issued under the fixed designation E3080; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (´) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This practice covers regression analysis methodology for estimating, evaluating, and using the simple linear regression model to define the statistical relationship between two numerical variables.
1.2 The system of units for this practice is not specified. Dimensional quantities in the practice are presented only as illustrations of calculation methods. The examples are not binding on products or test methods treated.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:²
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E2586 Practice for Calculating and Using Basic Statistics
3. Terminology
3.1 Definitions—Unless otherwise noted, terms relating to quality and statistics are as defined in Terminology E456.
3.1.1 coefficient of determination, r², n—square of the correlation coefficient.
3.1.2 degrees of freedom, n—the number of independent data points minus the number of parameters that have to be estimated before calculating the variance. E2586
3.1.3 residual, n—observed value minus fitted value, when a model is used.
3.1.4 predictor variable, X, n—a variable used to predict a response variable using a regression model.
3.1.4.1 Discussion—Also called an independent or explanatory variable.
3.1.5 regression analysis, n—a statistical procedure used to characterize the association between two numerical variables for prediction of the response variable from the predictor variable.
3.1.6 response variable, Y, n—a variable predicted from a regression model.
3.1.6.1 Discussion—Also called a dependent variable.
3.1.7 sample correlation coefficient, r, n—a dimensionless measure of association between two variables estimated from the data.
3.1.8 sample covariance, s_xy, n—an estimate of the association of the response variable and predictor variable calculated from the data.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 intercept, n—of a regression model, β_0, the value of the response variable when the predictor variable is zero.
3.2.2 regression model parameter, n—a descriptive constant defining a regression model that is to be estimated.
3.2.3 residual standard deviation, n—of a regression model, σ, the square root of the residual variance.
3.2.4 residual variance, n—of a regression model, σ², the variance of the residuals (see residual).
3.2.5 slope, n—of a regression model, β_1, the incremental change in the response variable due to a unit change in the predictor variable.
3.3 Symbols:
b_0 = intercept estimate (5.2.2)
b_1 = slope estimate (5.2.2)
β_0 = intercept parameter in model (5.1.2)
β_1 = slope parameter in model (5.1.2)
¹ This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics.
Current edition approved Nov. 1, 2017. Published January 2018. Originally approved in 2016. Last previous edition approved in 2016 as E3080 – 16. DOI: 10.1520/E3080-17.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
E = general point estimate of a parameter (5.4.2)
e_i = residual for data point i (5.2.5)
ε = residual parameter in model (5.1.3)
F = F statistic (X1.3.2)
(1, 2). In general, one of these variables is designated as a response variable and the rest of the variables are designated as predictor variables. Then the objective of the model is to
...
This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because
it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version
of the standard as published by ASTM is to be considered the official document.
Designation: E3080 − 16 E3080 − 17 An American National Standard
Standard Practice for Regression Analysis¹
This standard is issued under the fixed designation E3080; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (´) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This practice covers regression analysis methodology for estimating, evaluating, and using the simple linear regression
model to define the statistical relationship between two numerical variables.
1.2 The system of units for this practice is not specified. Dimensional quantities in the practice are presented only as illustrations
of calculation methods. The examples are not binding on products or test methods treated.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility
of the user of this standard to establish appropriate safety, health, and environmental [previously: safety and health] practices and determine the
applicability of regulatory limitations prior to use.
1.4 This international standard was developed in accordance with internationally recognized principles on standardization
established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued
by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:²
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E2282 Guide for Defining the Test Result of a Test Method
E2586 Practice for Calculating and Using Basic Statistics
3. Terminology
3.1 Definitions—Unless otherwise noted, terms relating to quality and statistics are as defined in Terminology E456.
3.1.1 characteristic, n—a property of items in a sample or population which, when measured, counted, or otherwise observed,
helps to distinguish among the items. E2282
3.1.1 coefficient of determination, r², n—square of the correlation coefficient.
3.1.3 confidence interval, n—an interval estimate [L, U] with the statistics L and U as limits for the parameter θ and with
confidence level 1 – α, where Pr(L ≤ θ ≤ U) ≥ 1 – α. E2586
3.1.3.1 Discussion—
The confidence level, 1 – α, reflects the proportion of cases that the confidence interval [L, U] would contain or cover the true
parameter value in a series of repeated random samples under identical conditions. Once L and U are given values, the resulting
confidence interval either does or does not contain it. In this sense “confidence” applies not to the particular interval but only to
the long run proportion of cases when repeating the procedure many times.
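The long-run interpretation in the Discussion above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a two-sided interval for a normal mean with known standard deviation, which is a simpler case than the regression intervals of this practice, and the function name coverage_of_z_interval and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_of_z_interval(true_mean, sigma, n, alpha, trials, seed=0):
    """Fraction of repeated samples whose interval [L, U] covers true_mean.

    Hypothetical helper: assumes normal data with known sigma, so the
    interval is sample mean +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = fmean(sample)
        lower, upper = m - half_width, m + half_width
        # Each individual interval either covers the true mean or it does not.
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


# Illustrative settings: confidence level 1 - alpha = 0.95.
coverage = coverage_of_z_interval(true_mean=5.0, sigma=2.0, n=10, alpha=0.05, trials=4000)
print(coverage)
```

The printed proportion should be close to 0.95, which is the sense in which "confidence" refers to the procedure repeated many times and not to any single interval.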
3.1.4 confidence level, n—the value, 1 – α, of the probability associated with a confidence interval, often expressed as a
percentage. E2586
3.1.4.1 Discussion—
α is generally a small number. Confidence level is often 95 % or 99 %.
¹ This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics.
Current edition approved Nov. 1, 2017 [previously: Nov. 1, 2016]. Published January 2018 [previously: November 2016]. Originally approved in 2016. Last previous edition approved in 2016 as E3080 – 16. DOI: 10.1520/E3080-17 [previously: 10.1520/E3080-16].
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
3.1.5 correlation coefficient, n—for a population, ρ, a dimensionless measure of association between two variables X and Y, equal to the covariance divided by the product of σ_X times σ_Y.
3.1.6 correlation coefficient, n—for a sample, r, the estimate of the parameter ρ from the data.
3.1.7 covariance, n—of a population, cov(X, Y), for two variables, X and Y, the expected value of (X – μ_X)(Y – μ_Y).
3.1.8 covariance, n—of a sample; the estimate of the parameter cov(X,Y) from the data.
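The covariance and correlation definitions above can be sketched in Python as follows. The excerpt does not give the estimating formulas, so this sketch assumes the common n - 1 divisor for the sample covariance; the function names sample_covariance and sample_correlation and the data values are hypothetical.

```python
import math


def sample_covariance(x, y):
    """Estimate cov(X, Y) from paired data, assuming the n - 1 divisor."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Estimate rho as covariance divided by the product of standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sample_covariance(x, y) / (s_x * s_y)


# Made-up data lying exactly on a straight line with positive slope.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
r = sample_correlation(x, y)
r_squared = r ** 2  # coefficient of determination
print(sample_covariance(x, y), r, r_squared)
```

Because the covariance is divided by both standard deviations, the units cancel and r is dimensionless, as the definition of the correlation coefficient states.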
3.1.9 dependent variable, n—a variable to be predicted using an equation.
3.1.2 degrees of freedom, n—the number of independent data points minus the number of parameter
...