ASTM E2139-05(2018)
Standard Test Method for Same-Different Test
SIGNIFICANCE AND USE
5.1 This overall difference test method is used when the test objective is to determine whether a sensory difference exists or does not exist between two samples. It is also known as the simple difference test.
5.2 The test is appropriate in situations where samples have extreme intensities, give rapid sensory fatigue, have long lingering flavors, or cannot be consumed in large quantities, or a combination thereof.
5.3 The test is also appropriate for situations where the stimulus sites are limited to two (for example, two hands, each side of the face, two ears).
5.4 The test provides a measure of the bias where judges perceive two same products to be different.
5.5 The test has the advantage of being a simple and intuitive task.
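As a rough illustration of 5.4, the bias can be read off the tallied responses as the share of assessors who answered "different" to a matched pair. The sketch below also reports the observed difference between the two "different" proportions. The function name and the counts are hypothetical and are not taken from the standard.

```python
def bias_and_difference(matched_different, matched_total,
                        unmatched_different, unmatched_total):
    """Return (bias estimate, observed difference in proportions).

    bias estimate: share of 'different' responses among matched pairs
    (A/A or B/B), i.e. judges calling two same products different.
    observed difference: share of 'different' responses among unmatched
    pairs (A/B or B/A) minus the bias estimate.
    """
    p_matched = matched_different / matched_total
    p_unmatched = unmatched_different / unmatched_total
    return p_matched, p_unmatched - p_matched


# Hypothetical counts for illustration only.
bias, observed_difference = bias_and_difference(12, 42, 30, 42)
print(f"bias = {bias:.3f}, observed difference = {observed_difference:.3f}")
```

These are descriptive summaries only; whether the difference is statistically significant is a separate question answered by the hypothesis test the method prescribes.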
SCOPE
1.1 This test method describes a procedure for comparing two products.
1.2 This test method does not describe the Thurstonian modeling approach to this test.
1.3 This test method is sometimes referred to as the simple-difference test.
1.4 A same-different test determines whether two products are perceived to be the same or different overall.
1.5 The procedure of the test described in this test method consists of presenting a single pair of samples to each assessor. The presentation of multiple pairs would require different statistical treatment and it is outside of the scope of this test method.
1.6 This test method is not attribute-specific, unlike the directional difference test.
1.7 This test method is not intended to determine the magnitude of the difference; however, statistical methods may be used to estimate the size of the difference.
1.8 This test method may be chosen over the triangle or duo-trio tests where sensory fatigue or carry-over are a concern, or where a simpler task is needed.
1.9 This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.10 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Designation: E2139 − 05 (Reapproved 2018)
This standard is issued under the fixed designation E2139; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
2. Referenced Documents
2.1 ASTM Standards:
E253 Terminology Relating to Sensory Evaluation of Materials and Products
E456 Terminology Relating to Quality and Statistics
E1871 Guide for Serving Protocol for Sensory Evaluation of Foods and Beverages
2.2 ASTM Publications:²
Manual 26 Sensory Testing Methods, 2nd Edition
STP 758 Guidelines for the Selection and Training of Sensory Panel Members
STP 913 Guidelines for Physical Requirements for Sensory Evaluation Laboratories
2.3 ISO Standard:³
ISO 5495 Sensory Analysis—Methodology—Paired Comparison
¹ This test method is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory. Current edition approved Aug. 1, 2018. Published August 2018. Originally approved in 2005. Last previous edition approved in 2011 as E2139–05 (2011). DOI: 10.1520/E2139-05R18.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036, http://www.ansi.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3. Terminology
3.1 For definition of terms relating to sensory analysis, see Terminology E253, and for terms relating to statistics, see Terminology E456.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 α (alpha) risk—probability of concluding that a perceptible difference exists when, in reality, one does not (also known as Type I Error or significance level).
3.2.2 β (beta) risk—probability of concluding that no perceptible difference exists when, in reality, one does (also known as Type II Error).
3.2.3 chi-square test—statistical test used to test hypotheses on frequency counts and proportions.
3.2.4 ∆ (delta)—test sensitivity parameter established prior to testing and used along with the selected values of α, β, and an estimated value of p₁ to determine the number of assessors needed in a study. Delta (∆) is the minimum difference in proportions that the researcher wants to detect, where the difference is ∆ = p₂ − p₁. ∆ is not a standard measure of sensory difference. The same value of ∆ may correspond to different sensory differences for different values of p₁ (see 9.5 for an example).
3.2.5 Fisher's Exact Test (FET)—statistical test of the equality of two independent binomial proportions.
3.2.6 p₁—proportion of assessors in the population who would respond different to the matched sample pair. Based on experience with using the same-different test and possibly with the same type of products, the user may have a priori knowledge about the value of p₁.
3.2.7 p₂—proportion of assessors in the population who would respond different to the unmatched sample pair.
3.2.8 power 1-β (beta) risk—probability of concluding that a perceptible difference exists when, in reality, one of size ∆ does.
3.2.9 product—material to be evaluated.
3.2.10 sample—unit of product prepared, presented, and evaluated in the test.
3.2.11 sensitivity—term used to summarize the performance characteristics of this test. The sensitivity of the test is defined by the four values selected for α, β, p₁, and ∆.
4. Summary of Test Method
4.1 Clearly define the test objective in writing.
4.2 Choose the number of assessors based on the sensitivity desired for the test. The sensitivity of the test is in part related to two competing risks: the risk of declaring a difference when there is none (that is, α-risk), and the risk of not declaring a difference when there is one (that is, β-risk). Acceptable values of α and β vary depending on the test objective. The values should be agreed upon by all parties affected by the results of the test.
4.3 The two products of interest (A and B) are selected. Assessors are presented with one of four possible pairs of samples: A/A, B/B, A/B, and B/A. The total number of same pairs (A/A and B/B) usually equals the total number of different pairs (A/B and B/A). The assessor's task is to categorize the given pair of samples as same or different.
4.4 The data are summarized in a two-by-two table where the columns show the type of pair received (same or different) and the rows show the assessor's response (same or different). A Fisher's Exact Test (FET) is used to determine whether the samples are perceptibly different. Other statistical methods that approximate the FET can sometimes be used.
6. Apparatus
6.1 Carry out the test under conditions that prevent contact between assessors until the evaluations have been completed, for example, booths that comply with STP 913.
6.2 For food and beverage tests, sample preparation and serving sizes should comply with Practice E1871, or see Refs (1) or (2).⁴
⁴ The boldface numbers in parentheses refer to the list of references at the end of this standard.
7. Definition of Hypotheses
7.1 This test can be characterized by a two-by-two table of probabilities according to the sample pair that the assessors in the population would receive and their responses, as follows:
                               Assessor Would Receive
                          Matched Pair      Unmatched Pair
                          (AA or BB)        (AB or BA)
Assessor's   Same:        1 − p₁            1 − p₂
Response     Different:   p₁                p₂ (= p₁ + ∆)
             Total:       1                 1
where p₁ and p₂ are the probabilities of responding different for those who would receive the matched pairs and the unmatched pairs, respectively.
7.2 To determine whether the samples are perceptibly different with a given sensitivity, the following one-sided statistical hypothesis is tested:
H₀: p₁ = p₂
Hₐ: p₁ < p₂
7.3 The hypothesis test can be expressed in terms of the minimum detectable difference ∆ (H₀: ∆ = 0 versus Hₐ: ∆ > 0). Delta (∆) will equal 0 and p₁ will equal p₂ if there is no detectable difference between the samples. This test addresses whether or not ∆ is greater than 0. Thus, the hypothesis is one-sided because it is not of interest in this test to consider that responding different to the matched pair could be more likely than responding different to the unmatched pair.
8. Assessors
8.1 All assessors must be familiar with the mechanics of the same-different test (the format, the task, and the procedure of evaluation). Greater test sensitivity, if needed, may be achieved through selection of assessors who demonstrate above average individual sensitivity (see STP 758).
8.2 In order to perform this test, assessors do not require special sensory training on the samples in question. For example, they do not need to be able to recognize any specific attribute.
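The analysis summarized in 4.4 and 7.2 can be sketched as a one-sided Fisher's Exact Test on the observed two-by-two table of counts. The code below is a minimal standard-library illustration, not the standard's own procedure or tables. It assumes the usual hypergeometric form of the test, and the function name and counts are hypothetical.

```python
from math import comb


def fisher_exact_one_sided(diff_unmatched, n_unmatched, diff_matched, n_matched):
    """One-sided Fisher's Exact Test p-value for H0: p1 = p2 versus Ha: p1 < p2.

    diff_unmatched / n_unmatched: 'different' responses and total assessors
    who received an unmatched pair (A/B or B/A).
    diff_matched / n_matched: the same counts for matched pairs (A/A or B/B).

    With all table margins fixed, the number of 'different' responses in the
    unmatched column is hypergeometric under H0. The p-value is the
    probability of a count at least as large as the one observed.
    """
    total = n_unmatched + n_matched
    total_diff = diff_unmatched + diff_matched
    denominator = comb(total, total_diff)
    largest = min(n_unmatched, total_diff)
    tail = sum(
        comb(n_unmatched, k) * comb(n_matched, total_diff - k)
        for k in range(diff_unmatched, largest + 1)
    )
    return tail / denominator


# Hypothetical results: 30 of 42 unmatched pairs and 12 of 42 matched pairs
# were judged 'different'.
alpha = 0.05
p_value = fisher_exact_one_sided(30, 42, 12, 42)
print(f"p-value = {p_value:.6f}")
print("perceptibly different" if p_value <= alpha else "no difference declared")
```

A statistics library would normally be used in practice; the explicit sum is shown only to make the one-sided alternative visible.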
8.3 The assessors must be sampled from a homogeneous population that is well-defined. The population must be chosen on the basis of the test objective. Defining characteristics of the population can be, for example, training level, gender, experience with the product, and so forth.
9. Number of Assessors
9.1 Choose all the sensitivity parameters that are needed to choose the number of assessors for the test. Choose the α-risk and the β-risk. Based on experience, choose the expected value for p₁. Choose ∆, p₂ − p₁, the minimum difference in proportions that the researcher wants to detect. The most commonly used values for α-risk, β-risk, p₁ and ∆ are α = 0.05, β = 0.20, p₁ = 0.3, and ∆ = 0.3. These values can be adjusted on a case-by-case basis to reflect the sensitivity desired versus the number of assessors.
9.2 Having defined the required sensitivity (α-risk, β-risk, p₁, and ∆), determine the corresponding sample size from Table A1.1 (see Ref (3)). This is done by first finding the section of the table with a p₁ value corresponding to the proportion of assessors in the population who would respond different to the matched sample pair. Second, locate the total sample size from the intersection of the desired α, p₂ (or ∆), and β values. In the case of the most commonly used values listed in 9.1, Table A1.1 indicates that 84 assessors are needed. The sample size n is based on the number of same and different samples being equal. The sample sizes listed are the total sample size rounded up to the nearest number evenly divisible by 4 since there are four possible combinations of the samples. To determine the number of same and different pairs to prepare, divide n by two.
9.3 If the user has no prior experience with the same-different test and has no specific expectation for the value of p₁, then two options are available. Either use p₁ = 0.3 and proceed as indicated in 9.2, or use the last section of Table A1.1. This section gives sample sizes that are the largest required, given α, β, and ∆, regardless of p₁.
9.4 Often in practice, the number of assessors is determined by practical conditions (for example, duration of the experiment, number of available assessors, quantity of product, and so forth). However, increasing the number of assessors increases the likelihood of detecting small differences. Thus, one should expect to use larger numbers of assessors when trying to demonstrate that products are similar compared to when one is trying to demonstrate that they are different.
9.4.1 When the number of assessors is fixed, the power of the test (1-β) may be calculated by establishing a value for p₁, ...
... 0.70 − 0.50 = 0.20 and β = 0.10 or 90 % power. The number of assessors needed in this case is 224 (Table A1.1).
10. Procedure
10.1 Determine the number of assessors needed for the test as well as the population that they should represent (for example, assessors selected for a specific sensory sensitivity).
10.2 It is critical to the validity of the test that assessors cannot identify the samples from the way in which they are presented. One should avoid any subtle differences in temperature or appearance, especially color, caused by factors such as the time sequence of preparation. It may be possible to mask color differences using light filters, subdued illumination or colored vessels. Prepare samples out of sight and in an identical manner: same apparatus, same vessels, same quantities of product (see Practice E1871). The samples may be prepared in advance; however, this may not be possible for all types of products. It is essential that the samples cannot be recognized from the way they are presented.
10.3 Prepare serving order worksheet and ballot in advance of the test to ensure a balanced order of sample presentation of the two products, A and B. One of four possible pairs (A/A, B/B, A/B, and B/A) is assigned to each assessor. Make sure this assignment is done randomly. Design the test so that the number of same pairs equals the number of different pairs. The presentation order of the different pairs should be balanced as much as possible. Serving order worksheets should also include the identification of the samples for each set.
10.4 Prepare the response ballots in a way consistent with the product you are evaluating. For example, in a taste test, give the following instructions: (1) you will receive two samples. They may be the same or different; (2) evaluate the samples from left to right; and (3) determine whether they are the same or different.
10.4.1 The researcher can choose to add an instruction to the ballot indicating whether the assessor may re-evaluate the samples or not.
10.4.2 The ballot should also identify the assessor and date of test, as well as a ballot number that must be related to the sample set identification on the worksheet.
10.4.3 A section soliciting comments may be included following the initial forced-choice question.
10.4.4 The example of a ballot is provided in Fig. X2.2.
10.5 When possible, present both samples at the same time, along with the response ballot. In some instances, the samples may be pres
...
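The rounding rule in 9.2 and the randomized, balanced assignment in 10.3 can be sketched together as follows. This is an illustrative standard-library sketch, not part of the standard. The function names, the `seed` parameter, and the worksheet layout (assessor number paired with a sample pair) are assumptions made for the example.

```python
import random
from collections import Counter

PAIR_TYPES = ("A/A", "B/B", "A/B", "B/A")


def round_up_to_multiple_of_4(n):
    """Round a total sample size up to the nearest number divisible by 4."""
    return -(-n // 4) * 4


def serving_plan(n_assessors, seed=None):
    """Return a list of (assessor number, pair) with balanced, shuffled pairs.

    Each of the four pair types is used equally often, so the number of same
    pairs (A/A, B/B) equals the number of different pairs (A/B, B/A), and the
    two presentation orders of the different pair are balanced.
    """
    total = round_up_to_multiple_of_4(n_assessors)
    per_type = total // 4
    pairs = [pair for pair in PAIR_TYPES for _ in range(per_type)]
    rng = random.Random(seed)  # fixed seed only to make the worksheet reproducible
    rng.shuffle(pairs)
    return list(enumerate(pairs, start=1))


# 84 is the total quoted in 9.2 for the most commonly used sensitivity values.
plan = serving_plan(84, seed=1)
counts = Counter(pair for _, pair in plan)
print(counts)
print("same pairs:", counts["A/A"] + counts["B/B"],
      "different pairs:", counts["A/B"] + counts["B/A"])
```

With 84 assessors this yields 21 of each pair type, that is, 42 same pairs and 42 different pairs, matching the "divide n by two" step in 9.2.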