ASTM D6582-00(2005)e1
(Guide)Standard Guide for Ranked Set Sampling: Efficient Estimation of a Mean Concentration in Environmental Sampling (Withdrawn 2012)
SIGNIFICANCE AND USE
Ranked set sampling is cost-effective, unbiased, more precise, and more representative of the population than simple random sampling under a variety of conditions (1).
Ranked set sampling (RSS) can be used when:
4.2.1 The population is likely to have stratification in concentrations of contaminant.
4.2.2 There is an auxiliary variable.
4.2.3 The auxiliary variable has strong correlation with the primary variable.
4.2.4 The auxiliary variable is either quick or inexpensive to measure, relative to the primary variable.
This guide provides a ranked set sampling method only under the rule of equal allocation. This guide is intended for those who manage, design, and implement sampling and analysis plans for management of wastes and contaminated media. This guide can be used in conjunction with the DQO process (see Practice D 5792).
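The equal-allocation selection procedure that the guide lays out in steps 5.7.4 through 5.7.9 can be sketched in Python as follows. This is a minimal illustration and not part of the standard: the function name `ranked_set_sample`, the representation of each sampling unit as an `(auxiliary, primary)` pair, and the synthetic population are all assumptions made for the example. In a real study the primary variable would be measured only on the selected units.

```python
import math
import random


def ranked_set_sample(units, m, n, rng):
    """Select units for primary-variable analysis by equal-allocation RSS.

    units: list of (auxiliary, primary) pairs (hypothetical representation).
    m: set size; n: desired number of primary-variable analyses.
    Returns (r, selected), where selected[j][i] is the unit with rank i+1
    taken from set i+1 of replicate j+1.
    """
    r = math.ceil(n / m)                   # step 5.7.4: replicates, rounded up
    drawn = rng.sample(units, m * m * r)   # step 5.7.5: m^2 * r random units
    selected = []
    for j in range(r):
        replicate = drawn[j * m * m:(j + 1) * m * m]
        chosen = []
        for i in range(m):                 # step 5.7.6: m sets of size m
            one_set = replicate[i * m:(i + 1) * m]
            # step 5.7.7: rank by auxiliary value; sorted() is stable, so
            # tied units keep their draw order (an arbitrary tie-break)
            ranked = sorted(one_set, key=lambda u: u[0])
            chosen.append(ranked[i])       # step 5.7.8: rank i from set i
        selected.append(chosen)            # step 5.7.9: repeat for r replicates
    return r, selected


# Hypothetical population: the auxiliary score loosely tracks the primary value.
rng = random.Random(0)
population = []
for _ in range(500):
    primary = rng.gauss(15.0, 4.0)
    population.append((primary + rng.gauss(0.0, 1.0), primary))

r, selected = ranked_set_sample(population, m=3, n=12, rng=rng)
print(r, sum(len(rep) for rep in selected))  # prints "4 12"
```

With m = 3 and n = 12 the sketch draws 36 units and keeps 12 of them for analysis, matching the counts in the guide's illustration.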
SCOPE
1.1 This guide describes ranked set sampling, discusses its relative advantages over simple random sampling, and provides examples of potential applications in environmental sampling.
1.2 Ranked set sampling is useful and cost-effective when there is an auxiliary variable, which can be inexpensively measured relative to the primary variable, and when the auxiliary variable has correlation with the primary variable. The resultant estimation of the mean concentration is unbiased, more precise than simple random sampling, and more representative of the population under a wide variety of conditions.
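The precision claim can be explored with a small Monte Carlo sketch. Everything in it is an assumption made for illustration: the normally distributed synthetic population, the noise level that links the auxiliary variable to the primary variable, and the helper names `draw_unit`, `srs_mean`, and `rss_mean`. It compares the spread of the estimated mean under simple random sampling and under equal-allocation ranked set sampling for the same number of analyses, and it says nothing about any real site.

```python
import random
import statistics


def draw_unit(rng, noise_sd):
    """Return (auxiliary, primary) for one synthetic sampling unit."""
    primary = rng.gauss(15.0, 4.0)
    return primary + rng.gauss(0.0, noise_sd), primary


def srs_mean(rng, n, noise_sd):
    """Mean of n primary values chosen by simple random sampling."""
    return statistics.fmean([draw_unit(rng, noise_sd)[1] for _ in range(n)])


def rss_mean(rng, m, r, noise_sd):
    """Mean of m * r primary values chosen by equal-allocation RSS."""
    values = []
    for _ in range(r):
        for i in range(m):
            one_set = [draw_unit(rng, noise_sd) for _ in range(m)]
            one_set.sort(key=lambda u: u[0])   # rank on the auxiliary only
            values.append(one_set[i][1])       # analyze rank i+1 from set i+1
    return statistics.fmean(values)


rng = random.Random(42)
m, r, trials = 3, 4, 2000
srs = [srs_mean(rng, m * r, 1.0) for _ in range(trials)]
rss = [rss_mean(rng, m, r, 1.0) for _ in range(trials)]
print("SRS variance of the mean:", statistics.variance(srs))
print("RSS variance of the mean:", statistics.variance(rss))
```

Under these assumptions the RSS variance should come out smaller than the SRS variance. Raising `noise_sd` weakens the correlation between the two variables and should narrow the gap.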
This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
WITHDRAWN RATIONALE
This guide describes ranked set sampling, discusses its relative advantages over simple random sampling, and provides examples of potential applications in environmental sampling.
Formerly under the jurisdiction of Committee D34 on Waste Management, this guide was withdrawn without replacement in January 2012 because of limited use by industry.
Standards Content (Sample)
NOTICE: This standard has either been superseded and replaced by a new version or withdrawn.
Contact ASTM International (www.astm.org) for the latest information
Designation: D6582-00 (Reapproved 2005)ε1
Standard Guide for Ranked Set Sampling: Efficient Estimation of a Mean Concentration in Environmental Sampling
This standard is issued under the fixed designation D6582; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
ε1 NOTE: Editorial changes were made in 5.1, 5.12.7 and Table 2 in August 2006.
This guide is under the jurisdiction of ASTM Committee D34 on Waste Management and is the direct responsibility of Subcommittee D34.01.01 on Planning for Sampling. Current edition approved July 1, 2005. Published August 2005. Originally approved in 2000. Last previous edition approved in 2000 as D6582-00. DOI: 10.1520/D6582-00R05E01.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.
1. Scope
1.1 This guide describes ranked set sampling, discusses its relative advantages over simple random sampling, and provides examples of potential applications in environmental sampling.
1.2 Ranked set sampling is useful and cost-effective when there is an auxiliary variable, which can be inexpensively measured relative to the primary variable, and when the auxiliary variable has correlation with the primary variable. The resultant estimation of the mean concentration is unbiased, more precise than simple random sampling, and more representative of the population under a wide variety of conditions.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
2. Referenced Documents
2.1 ASTM Standards:
D5792 Practice for Generation of Environmental Data Related to Waste Management Activities: Development of Data Quality Objectives
D6044 Guide for Representative Sampling for Management of Waste and Contaminated Media
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
3. Terminology
3.1 Definitions:
3.1.1 auxiliary variable, n: the secondary characteristic or measurement of interest.
3.1.1.1 Discussion: In ranked set sampling, information contained in an auxiliary variable is useful for ranking the samples. This ranking may mimic the rankings of the samples with respect to the values of the primary variable when there is correlation between the auxiliary variable and the primary variable. Auxiliary information may include visual inspection, inexpensive quick measurement, knowledge of operational history, previous site data, or any other similar information.
3.1.2 data quality objectives (DQO) process, n: a quality management tool based on the scientific method and developed by the U.S. Environmental Protection Agency (EPA) to facilitate the planning of environmental data collection activities. (D5792)
3.1.3 equal allocation, n: this occurs when the number of sets in ranked set sampling is an integer multiple of the size of the set.
3.1.4 primary variable, n: the primary characteristic or measurement of interest.
3.1.5 ranked set sampling, n: a sampling method in which samples are ranked by the use of auxiliary information on the samples and only a subset of the samples are selected for the measurement of the primary variable.
3.1.6 representative sample, n: a sample collected in such a manner that it reflects one or more characteristics of interest (as defined by the project objectives) of a population from which it is collected. (D6044)
3.1.6.1 Discussion: A representative sample can be a single sample, a collection of samples, or one or more composite samples. A single sample can be representative only when the population is highly homogeneous. (D6044)
4. Significance and Use
4.1 Ranked set sampling is cost-effective, unbiased, more precise and more representative of the population than simple random sampling under a variety of conditions (1).
The boldface numbers in parentheses refer to the list of references at the end of this standard.
4.2 Ranked set sampling (RSS) can be used when:
4.2.1 The population is likely to have stratification in concentrations of contaminant.
4.2.2 There is an auxiliary variable.
4.2.3 The auxiliary variable has strong correlation with the primary variable.
4.2.4 The auxiliary variable is either quick or inexpensive to measure, relative to the primary variable.
4.3 This guide provides a ranked set sampling method only under the rule of equal allocation. This guide is intended for those who manage, design, and implement sampling and analysis plans for management of wastes and contaminated media. This guide can be used in conjunction with the DQO process (see Practice D5792).
5. Ranked Set Sampling (RSS)
5.1 Environmental sampling typically requires the identification of the locations where the samples are to be collected. Subsequent analyses of these samples to quantify the characteristics of interest allow inference on the population mean concentration from the sample data.
5.2 A simple random sampling (SRS) approach is one sampling design that can be used. In this case, a set of random samples is identified and collected from a population and all of these samples are analyzed (for the primary variable).
5.3 Ranked set sampling (RSS) is similar to SRS in the identification and collection of the samples, but only a subset of the samples is selected for analysis. The selection is done by ranking the samples using auxiliary information on the samples and selecting a subset based on the rankings of the samples.
5.4 As can be seen from the steps described below, RSS is in fact a “stratified random sampling at the sample level,” meaning that stratification of the population is induced after sampling and no construction of the strata is needed before sampling. Increased precision of stratified random sampling in the estimation of the population mean, relative to SRS, is well known, especially when the population is stratified by concentrations.
5.5 Increased precision of RSS relative to SRS means that the same precision can be achieved with fewer samples analyzed for the primary variable under RSS. RSS is therefore more cost-effective than SRS. When the objective is to minimize sampling and analytical costs, the number of samples can be determined so that RSS has precision equal to that of SRS at a lower cost.
5.6 The actual steps to conduct RSS are given below.
5.7 Steps in Ranked Set Sampling (RSS):
5.7.1 Determine the total number of sample analyses (n) agreed to by the stakeholders. A planning process, such as a data quality objectives (DQO) process (Practice D5792), may be used to determine this number.
5.7.2 Determine the primary variable and the auxiliary variable of interest.
5.7.3 Determine the size of the set, m. Study the auxiliary measurement and determine its capability in ranking the samples. For example, if the auxiliary measurement is visual inspection, its capability in ranking the samples may be somewhat limited. Namely, it may be capable of ranking 3–4 samples, but may have difficulty in ranking more than 5 or 6 samples based on visual inspection; thus, the preferred size of the set (m) in ranked set sampling is about 3 or 4. On the other hand, an instrument-based quick-test may be capable of a larger m (see 5.14 for ranking criteria).
5.7.4 Calculate the needed number of replicates, r (the number of times the ranked sets are to be repeated). Divide n by m and round up to a whole number to obtain the needed r. Namely, r = n/m, rounded up to a whole number.
5.7.5 Randomly select a total of m² r samples from the population, for example, by simple random sampling design, and randomly divide them into r replicates, with m² (m times m) samples in each replicate.
NOTE 1: In practice, the m² r samples may not be taken all at once. More often, m random samples may be taken from a geographical sub-area of the population and are then ranked according to the auxiliary variable. This is repeated m times to obtain the first replicate of m² (m times m) samples. This entire process is repeated r times to obtain the needed r replicates.
5.7.6 Start with the first replicate of m² (m times m) samples. Arrange these samples into m sets of size m (an m by m matrix).
5.7.7 For each of the m sets in this replicate, rank the samples within each set by using the auxiliary measurement on the samples. When the observations on the auxiliary variable cannot be distinguished from each other, these observations are called “ties.” Ties can be broken arbitrarily (namely, arbitrarily assigning one rank to one sample and a succeeding rank to the other).
5.7.8 Select samples for the measurement on the primary variable as follows. In set i, select and measure the sample with rank i, i = 1, 2, ..., m. Completion of this step leads to a total of m samples to be analyzed for the primary variable, out of a total of m² samples collected.
5.7.9 Repeat steps 5.7.6 through 5.7.8 r times to obtain a total of m × r = n samples to be analyzed and measured for the primary variable.
5.8 Since the number of sets (m) in step 5.7.6 equals the size of the set (m), this is called equal allocation. RSS under unequal allocation tends to have additional gains in precision, relative to equal allocation, but this gain is, in general, not large compared to the gain against SRS, and is not covered in this guide.
5.9 The value of n can be the total number of samples that the budget can afford to analyze.
5.10 The rounding up in step 5.7.4 may cause the total number of analyses for the primary variable to exceed n. When this is the case, there are two options:
5.10.1 Obtain buy-in from the stakeholders to accept the slightly higher total number of sample analyses, or
5.10.2 Try different values of m and r to get the total number of analyses as close to n as possible.
5.11 Estimation of Mean and Standard Error of the Mean:
5.11.1 In 5.7, if n = 12, m = 3, and r = 4, the data on the primary variable obtained from the steps in that section may be summarized as in Table 1. The true mean concentration of the characteristic of interest is estimated by the arithmetic sample mean of the measured samples. For the hypothetical example in Table 1 (and assuming normal distribution of the data), the mean (M) is estimated as follows:
TABLE 1 Sample Values on Primary Variable
Set   Replicate   Value
1     1           X_{11}
1     2           X_{12}
1     3           X_{13}
1     4           X_{14}
2     1           X_{21}
2     2           X_{22}
2     3           X_{23}
2     4           X_{24}
3     1           X_{31}
3     2           X_{32}
3     3           X_{33}
3     4           X_{34}

$$M = (X_{11} + X_{12} + X_{13} + X_{14} + X_{21} + X_{22} + \cdots + X_{34})/12 \tag{1}$$

The standard error of the mean ($S_M$) is estimated as follows:

$$S_M = \left\{ \left[ (X_{11} - X_{1.})^2 + (X_{12} - X_{1.})^2 + (X_{13} - X_{1.})^2 + (X_{14} - X_{1.})^2 + (X_{21} - X_{2.})^2 + (X_{22} - X_{2.})^2 + (X_{23} - X_{2.})^2 + \cdots + (X_{34} - X_{3.})^2 \right] / \left[ m^2 r (r - 1) \right] \right\}^{1/2} \tag{2}$$

where:
$X_{ij}$ = the value of the primary variable from the i-th set and the j-th replicate, and
$X_{i.}$ = the average of set i.
NOTE 2: The numerator of Eq 2 represents the squared differences between a value and its set average.
5.11.2 Given these estimates, inference about the population mean concentration can be made from the sample data (with some typical assumptions about the underlying statistical distribution of the data). This includes the use of confidence limits to estimate the population mean.
5.12 An Illustration of RSS:
5.12.1 An illustration of the steps in 5.7 and 5.11 is given in 5.12.2.1 through 5.12.2.9. The objective of this example is to estimate the mean of total petroleum hydrocarbon (TPH) concentration in the soil of a 1-acre site, down to the depth of one inch from the surface. Assume that the stakeholders agree that, due to cost and other considerations, a total of 12 analyses are the limit. Further assume that coloration on the surface of the soil is positively correlated with TPH concentration, with darker color indicating higher concentration.
5.12.2 The steps to carry out ranked set sampling are as follows:
5.12.2.1 n = 12, the desired total number of analyses for the primary variable.
5.12.2.2 The primary variable = TPH concentration and the auxiliary variable = soil coloration, where coloration is observed in-situ
...
5.12.2.6 Take the first replicate of 9 samples and arrange them into a 3 by 3 matrix; each row is called a set with set size of 3 (three samples in that set).
5.12.2.7 Rank the three samples within each of the three sets (namely, assigning ranks to the samples) according to the ranking of the soil coloration, giving a rank of 1 for the lightest coloration and a rank of 3 for the darkest coloration.
5.12.2.8 In the first set, select the sample with rank 1 for the measurement of the primary variable. In set 2, select the sample with rank 2 for the measurement of the primary variable. And so forth for the third set. After this step, a total of m = 3 samples have been chosen for the measurement of the primary variable.
5.12.2.9 Repeat steps 5.12.2.6 through 5.12.2.8 four times to obtain a total of m × r = 3 × 4 = 12 samples.
5.12.3 After steps 5.12.2.1 through 5.12.2.5, the 36 samples to be taken from the population may appear graphically as in Fig. 1. The samples in Fig. 1 are arbitrarily numbered from 1 through 36.
5.12.4 After steps 5.12.2.1 through 5.12.2.9, the rankings on the auxiliary variable and the measured values on the primary variable may appear as in Table 2. Each row of three samples can be called a cluster, and they are so designated in Table 2. These clusters can be marked in Fig. 1.
5.12.5 Table 3 is a summary of the data on the primary variable in Table 2. Note that the sample values of the primary variable and the bold-faced rank data in Table 2 happen to have the same ordering in all the replicates, except replicate 4, implying good correlation between the auxiliary variable and the primary variable.
5.12.6 When the data of the primary variable in Table 3 follow a normal distribution, the sample mean (M) and standard error of the mean ($S_M$) can be calculated as follows:

$$M = (9 + 10 + 12 + 15 + 15 + 16 + 20 + 14 + 17 + 18 + 23 + 20)/12 = 15.75 \tag{3}$$

$$S_M = \left\{ \left[ (9 - 11.5)^2 + (10 - 11.5)^2 + (12 - 11.5)^2 + (15 - 11.5)^2 + (15 - 16.25)^2 + (16 - 16.25)^2 + (20 - 16.25)^2 + (14 - 16.25)^2 + (17 - 19.5)^2 + (18 - 19.5)^2 + (23 - 19.5)^2 + (20 - 19.5)^2 \right] / \left[ 3^2 (4)(4 - 1) \right] \right\}^{1/2} = (62.75/108)^{1/2} = 0.76 \tag{4}$$

5.12.7 One- or two-sided confidence limits (CL) can be calculated from the sample mean and sample standard error of
...
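The arithmetic in Eq 3 and Eq 4 can be checked with a short Python sketch. The twelve values are the ones that appear in those equations, grouped by set as in Eq 4. The function name `rss_mean_and_se` and the list-of-rows layout are assumptions made for this example, and the standard-error divisor m² r (r − 1) follows Eq 2.

```python
import math


def rss_mean_and_se(values_by_set):
    """values_by_set[i][j] = primary value from set i+1, replicate j+1."""
    m = len(values_by_set)          # set size (equal allocation: m sets)
    r = len(values_by_set[0])       # number of replicates
    all_values = [x for row in values_by_set for x in row]
    mean = sum(all_values) / (m * r)
    # Sum of squared differences between each value and its set average.
    ss = 0.0
    for row in values_by_set:
        set_avg = sum(row) / r
        ss += sum((x - set_avg) ** 2 for x in row)
    se = math.sqrt(ss / (m ** 2 * r * (r - 1)))
    return mean, ss, se


# Values as they appear in Eq 4, one row per set (m = 3, r = 4).
values_by_set = [
    [9, 10, 12, 15],
    [15, 16, 20, 14],
    [17, 18, 23, 20],
]
mean, ss, se = rss_mean_and_se(values_by_set)
print(mean, ss, round(se, 2))  # prints "15.75 62.75 0.76"
```

For these values the divisor evaluates to 3² × 4 × 3 = 108, and the square root of 62.75/108 reproduces the reported standard error of 0.76.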