Standard Guide for Applying Statistics to Analysis of Corrosion Data

SIGNIFICANCE AND USE
Corrosion test results often show more scatter than many other types of tests because of a variety of factors, including the fact that minor impurities often play a decisive role in controlling corrosion rates. Statistical analysis can be very helpful in allowing investigators to interpret such results, especially in determining when test results differ from one another significantly. This can be a difficult task when a variety of materials are under test, but statistical methods provide a rational approach to this problem.
Modern data reduction programs in combination with computers have allowed sophisticated statistical analyses on data sets with relative ease. This capability permits investigators to determine if associations exist between many variables and, if so, to develop quantitative expressions relating the variables.
Statistical evaluation is a necessary step in the analysis of results from any procedure which provides quantitative information. This analysis allows confidence intervals to be estimated from the measured results.
SCOPE
1.1 This guide presents briefly some generally accepted methods of statistical analyses which are useful in the interpretation of corrosion test results.
1.2 This guide does not cover detailed calculations and methods, but rather covers a range of approaches which have found application in corrosion testing.
1.3 Only those statistical methods that have found wide acceptance in corrosion testing have been considered in this guide.

General Information

Status: Historical
Publication Date: 30-Apr-2004

ASTM G16-95(2004) - Standard Guide for Applying Statistics to Analysis of Corrosion Data
Guide, English language, 14 pages

Standards Content (Sample)


NOTICE: This standard has either been superseded and replaced by a new version or withdrawn.
Contact ASTM International (www.astm.org) for the latest information.
Designation: G16–95 (Reapproved 2004)
Standard Guide for Applying Statistics to Analysis of Corrosion Data
This standard is issued under the fixed designation G16; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This guide presents briefly some generally accepted methods of statistical analyses which are useful in the interpretation of corrosion test results.
1.2 This guide does not cover detailed calculations and methods, but rather covers a range of approaches which have found application in corrosion testing.
1.3 Only those statistical methods that have found wide acceptance in corrosion testing have been considered in this guide.

2. Referenced Documents
2.1 ASTM Standards:
E178 Practice for Dealing With Outlying Observations
E380 Practice for Use of the International System of Units (SI) (the Modernized Metric System)
E691 Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method
G46 Guide for Examination and Evaluation of Pitting Corrosion

3. Significance and Use
3.1 Corrosion test results often show more scatter than many other types of tests because of a variety of factors, including the fact that minor impurities often play a decisive role in controlling corrosion rates. Statistical analysis can be very helpful in allowing investigators to interpret such results, especially in determining when test results differ from one another significantly. This can be a difficult task when a variety of materials are under test, but statistical methods provide a rational approach to this problem.
3.2 Modern data reduction programs in combination with computers have allowed sophisticated statistical analyses on data sets with relative ease. This capability permits investigators to determine if associations exist between many variables and, if so, to develop quantitative expressions relating the variables.
3.3 Statistical evaluation is a necessary step in the analysis of results from any procedure which provides quantitative information. This analysis allows confidence intervals to be estimated from the measured results.

4. Errors
4.1 Distributions: In the measurement of values associated with the corrosion of metals, a variety of factors act to produce measured values that deviate from expected values for the conditions that are present. Usually the factors which contribute to the error of measured values act in a more or less random way so that the average of several values approximates the expected value better than a single measurement. The pattern in which data are scattered is called its distribution, and a variety of distributions are seen in corrosion work.
4.2 Histograms: A bar graph called a histogram may be used to display the scatter of the data. A histogram is constructed by dividing the range of data values into equal intervals on the abscissa axis and then placing a bar over each interval of a height equal to the number of data points within that interval. The number of intervals should be few enough so that almost all intervals contain at least three points, however there should be a sufficient number of intervals to facilitate visualization of the shape and symmetry of the bar heights. Twenty intervals are usually recommended for a histogram. Because so many points are required to construct a histogram, it is unusual to find data sets in corrosion work that lend themselves to this type of analysis.
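
The construction described in 4.2 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the guide: the function names and the corrosion-rate values are hypothetical, and the choice to put the maximum value in the last interval is an assumption about how the interval boundaries are handled.

```python
def histogram_counts(data, n_intervals):
    """Split the data range into equal intervals and count the points in each."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all values are identical; there is no range to divide")
    width = (hi - lo) / n_intervals
    counts = [0] * n_intervals
    for x in data:
        # Assumption: the maximum value is counted in the last interval.
        k = min(int((x - lo) / width), n_intervals - 1)
        counts[k] += 1
    edges = [lo + i * width for i in range(n_intervals + 1)]
    return edges, counts


def intervals_with_enough_points(counts, minimum=3):
    """Number of intervals that hold at least `minimum` data points."""
    return sum(1 for c in counts if c >= minimum)


# Hypothetical corrosion rates (illustrative values only, not measured data).
rates = [1.2, 1.4, 1.5, 1.5, 1.6, 1.7, 1.7, 1.8, 1.8, 1.9,
         1.9, 2.0, 2.0, 2.1, 2.2, 2.2, 2.3, 2.5, 2.6, 2.9]

edges, counts = histogram_counts(rates, n_intervals=4)
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:5.2f} to {right:5.2f} | {'#' * c} ({c})")
print(intervals_with_enough_points(counts), "of", len(counts),
      "intervals hold at least 3 points")
```

With only 20 points, four intervals already leave some bars near the three-point minimum, which illustrates why the guide considers histograms rarely practical for corrosion data sets.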
4.3 Normal Distribution: Many statistical techniques are based on the normal distribution. This distribution is bell-shaped and symmetrical. Use of analysis techniques developed for the normal distribution on data distributed in another manner can lead to grossly erroneous conclusions. Thus, before attempting data analysis, the data should either be verified as being scattered like a normal distribution, or a transformation should be used to obtain a data set which is approximately normally distributed. Transformed data may be analyzed statistically and the results transformed back to give the desired results, although the process of transforming the data back can create problems in terms of not having symmetrical confidence intervals.
4.4 Normal Probability Paper: If the histogram is not confirmatory in terms of the shape of the distribution, the data may be examined further to see if it is normally distributed by constructing a normal probability plot as described as follows (1).
4.4.1 It is easiest to construct a normal probability plot if normal probability paper is available. This paper has one linear axis, and one axis which is arranged to reflect the shape of the cumulative area under the normal distribution. In practice, the "probability" axis has 0.5 or 50% at the center, a number approaching 0 percent at one end, and a number approaching 1.0 or 100% at the other end. The marks are spaced far apart in the center and close together at the ends. A normal probability plot may be constructed as follows with normal probability paper.

NOTE 1: Data that plot approximately on a straight line on the probability plot may be considered to be normally distributed. Deviations from a normal distribution may be recognized by the presence of deviations from a straight line, usually most noticeable at the extreme ends of the data.

4.4.1.1 Number the data points starting at the largest negative value and proceeding to the largest positive value. The numbers of the data points thus obtained are called the ranks of the points.
4.4.1.2 Plot each point on the normal probability paper such that when the data are arranged in order: y(1), y(2), y(3), ..., these values are called the order statistics; the linear axis reflects the value of the data, while the probability axis location is calculated by subtracting 0.5 from the number (rank) of that point and dividing by the total number of points in the data set.

NOTE 2: Occasionally two or more identical values are obtained in a set of results. In this case, each point may be plotted, or a composite point may be located at the average of the plotting positions for all the identical values.

4.4.2 If normal probability paper is not available, the location of each point on the probability plot may be determined as follows:
4.4.2.1 Mark the probability axis using linear graduations from 0.0 to 1.0.
4.4.2.2 For each point, subtract 0.5 from the rank and divide the result by the total number of points in the data set. This is the area to the left of that value under the standardized normal distribution. The cumulative distribution function is the number, always between 0 and 1, that is plotted on the probability axis.
4.4.2.3 The value of the data point defines its location on the other axis of the graph.
4.5 Other Probability Paper: If the histogram is not symmetrical and bell-shaped, or if the probability plot shows nonlinearity, a transformation may be used to obtain a new, transformed data set that may be normally distributed. Although it is sometimes possible to guess at the type of distribution by looking at the histogram, and thus determine the exact transformation to be used, it is usually just as easy to use a computer to calculate a number of different transformations and to check each for the normality of the transformed data. Some transformations based on known non-normal distributions, or that have been found to work in some situations, are listed as follows:

$y = \log x$        $y = \exp x$
$y = \sqrt{x}$      $y = x^{2}$
$y = 1/x$           $y = \sin^{-1}\sqrt{x/n}$

where:
y = transformed datum,
x = original datum, and
n = number of data points.

Time to failure in stress corrosion cracking usually is best fitted with a log x transformation (2, 3).
Once a set of transformed data is found that yields an approximately straight line on a probability plot, the statistical procedures of interest can be carried out on the transformed data. Results, such as predicted data values or confidence intervals, must be transformed back using the reverse transformation.
4.6 Unknown Distribution: If there are insufficient data points, or if for any other reason, the distribution type of the data cannot be determined, then two possibilities exist for analysis:
4.6.1 A distribution type may be hypothesized based on the behavior of similar types of data. If this distribution is not normal, a transformation may be sought which will normalize that particular distribution. See 4.5 above for suggestions. Analysis may then be conducted on the transformed data.
4.6.2 Statistical analysis procedures that do not require any specific data distribution type, known as non-parametric methods, may be used to analyze the data. Non-parametric tests do not use the data as efficiently.
4.7 Extreme Value Analysis: In the case of determining the probability of perforation by a pitting or cracking mechanism, the usual descriptive statistics for the normal distribution are not the most useful. In this case, Guide G46 should be consulted for the procedure (4).
4.8 Significant Digits: Practice E380 should be followed to determine the proper number of significant digits when reporting numerical results.
4.9 Propagation of Variance: If a calculated value is a function of several independent variables and those variables have errors associated with them, the error of the calculated value can be estimated by a propagation of variance technique. See Refs. (5) and (6) for details.
4.10 Mistakes: Mistakes either in carrying out an experiment or in calculations are not a characteristic of the population and can preclude statistical treatment of data or lead to erroneous conclusions if included in the analysis. Sometimes mistakes can be identified by statistical methods by recognizing that the probability of obtaining a particular result is very low.

This guide is under the jurisdiction of ASTM Committee G01 on Corrosion of Metals and is the direct responsibility of Subcommittee G01.05 on Laboratory Corrosion Tests.
Current edition approved May 1, 2004. Published May 2004. Originally approved in 1971. Last previous edition approved in 1999 as G16–95 (1999)ε1.
DOI: 10.1520/G0016-95R04.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
Practice E380: Withdrawn.
The boldface numbers in parentheses refer to the list of references at the end of this guide.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.
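
The plotting-position rule in 4.4.2.2 and the idea in 4.5 of trying several transformations by computer can be sketched as follows. This is a hedged illustration rather than the guide's procedure: the guide judges straightness by eye, whereas this sketch assumes a numerical stand-in, the correlation between the ordered data and the normal quantiles of their plotting positions. The function names and the times to failure are hypothetical.

```python
import math
from statistics import NormalDist


def plotting_positions(n):
    """Probability-axis location for ranks 1..n: (rank - 0.5) / n."""
    return [(rank - 0.5) / n for rank in range(1, n + 1)]


def straightness(values):
    """Correlation between the order statistics and normal quantiles.

    Assumption: values near 1 are taken to mean the points fall close to
    a straight line on a normal probability plot.
    """
    ordered = sorted(values)  # order statistics, smallest first
    n = len(ordered)
    z = [NormalDist().inv_cdf(p) for p in plotting_positions(n)]
    mean_y = sum(ordered) / n
    mean_z = sum(z) / n
    sxy = sum((y - mean_y) * (q - mean_z) for y, q in zip(ordered, z))
    syy = sum((y - mean_y) ** 2 for y in ordered)
    szz = sum((q - mean_z) ** 2 for q in z)
    return sxy / math.sqrt(syy * szz)


# A few of the transformations listed in 4.5 (valid here because x > 0).
TRANSFORMS = {
    "x (none)": lambda x: x,
    "log x": math.log,  # natural log; the base does not change straightness
    "sqrt x": math.sqrt,
    "1/x": lambda x: 1.0 / x,
}

# Hypothetical times to failure in hours (illustrative values only).
times = [12, 15, 19, 24, 31, 40, 55, 78, 120, 210]

for name, transform in TRANSFORMS.items():
    score = straightness([transform(t) for t in times])
    print(f"{name:10s} straightness = {score:.3f}")
```

Whichever transformation is chosen, results computed on the transformed data still have to be transformed back, as 4.5 requires, and the back-transformed confidence intervals will generally not be symmetrical (4.3).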
4.11 Outlying Observations: See Practice E178 for procedures for dealing with outlying observations.

5. Central Measures
5.1 It is accepted practice to employ several independent (replicate) measurements of any experimental quantity to improve the estimate of precision and to reduce the variance of the average value. If it is assumed that the processes operating to create error in the measurement are random in nature and are as likely to overestimate the true unknown value as to underestimate it, then the average value is the best estimate of the unknown value in question. The average value is usually indicated by placing a bar over the symbol representing the measured variable.

NOTE 3: In this standard, the term "mean" is reserved to describe a central measure of a population, while average refers to a sample.

5.2 If processes operate to exaggerate the magnitude of the error either in overestimating or underestimating the correct measurement, then the median value is usually a better estimate.
5.3 If the processes operating to create error affect both the probability and magnitude of the error, then other approaches must be employed to find the best estimation procedure. A qualified statistician should be consulted in this case.
5.4 In corrosion testing, it is generally observed that average values are useful in characterizing corrosion rates. In cases of penetration from pitting and cracking, failure is often defined as the first through penetration and in these cases, average penetration rates or times are of little value. Extreme value analysis has been used in these cases, see Guide G46.
5.5 When the average value is calculated and reported as the only result in experiments when several replicate runs were made, information on the scatter of data is lost.

6. Variability Measures
6.1 Several measures of distribution variability are available which can be useful in estimating confidence intervals and making predictions from the observed data. In the case of normal distribution, a number of procedures are available and can be handled with computer programs. These measures include the following: variance, standard deviation, and coefficient of variation. The ...

... dimensions of variance are square of units. A procedure known as analysis of variance (ANOVA) has been developed for data sets involving several factors at different levels in order to estimate the effects of these factors. (See Section 9.)
6.3 Standard Deviation: Standard deviation, s, is defined as the square root of the variance. It has the property of having the same dimensions as the average value and the original measurements from which it was calculated and is generally used to describe the scatter of the observations.
6.3.1 Standard Deviation of the Average: The standard deviation of an average, $S_{\bar{x}}$, is different from the standard deviation of a single measured value, but the two standard deviations are related as in (Eq 2):

$S_{\bar{x}} = \dfrac{S}{\sqrt{n}}$    (2)

where:
n = the total number of measurements which were used to calculate the average value.

When reporting standard deviation calculations, it is important to note clearly whether the value reported is the standard deviation of the average or of a single value. In either case, the number of measurements should also be reported. The sample estimate of the standard deviation is s.
6.4 Coefficient of Variation: The population coefficient of variation is defined as the standard deviation divided by the mean. The sample coefficient of variation may be calculated as $S/\bar{x}$ and is usually reported in percent. This measure of variability is particularly useful in cases where the size of the errors is proportional to the magnitude of the measured value so that the coefficient of variation is approximately constant over a wide range of values.
6.5 Range: The range is defined as the difference between the maximum and minimum values in a set of replicate data values. The range is non-parametric in nature, that is, its calculation makes no assumption about the distribution of error. In cases when small numbers of replicate values are involved and the data are normally distributed, the range, w, can be used to estimate the standard deviation by the relation-
...
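
The central and variability measures named in Sections 5 and 6 can be collected in one small summary, sketched below. It is an illustration under stated assumptions: the variance definition is not included in this sample, so the sketch assumes the usual sample standard deviation with an n - 1 divisor (Python's statistics.stdev). The function name, dictionary keys and mass-loss values are hypothetical, and the range-based estimate of the standard deviation is omitted because its relationship is cut off in the sample.

```python
import math
from statistics import mean, median, stdev


def summarize(values):
    """Report central and variability measures for replicate measurements."""
    n = len(values)
    average = mean(values)
    s = stdev(values)  # assumption: sample standard deviation, n - 1 divisor
    return {
        "n": n,                                    # always report n (6.3.1)
        "average": average,                        # 5.1
        "median": median(values),                  # 5.2
        "std_dev_single_value": s,                 # 6.3
        "std_dev_of_average": s / math.sqrt(n),    # Eq 2
        "coeff_of_variation_percent": 100.0 * s / average,  # 6.4
        "range": max(values) - min(values),        # 6.5
    }


# Hypothetical replicate mass losses in mg (illustrative values only).
mass_loss = [10.2, 11.1, 9.8, 10.7, 10.4]

for name, value in summarize(mass_loss).items():
    print(f"{name:28s} {value}")
```

Returning both standard deviations together with n follows the reporting advice in 6.3.1, so a reader can tell which of the two a quoted value refers to.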
