Standard Test Method for Same-Different Test

SIGNIFICANCE AND USE
This overall difference test method is used when the test objective is to determine whether a sensory difference exists or does not exist between two samples. It is also known as the simple difference test.
The test is appropriate in situations where samples have extreme intensities, give rapid sensory fatigue, have long lingering flavors, or cannot be consumed in large quantities, or a combination thereof.
The test is also appropriate for situations where the stimulus sites are limited to two (for example, two hands, each side of the face, two ears).
The test provides a measure of the bias where judges perceive two same products to be different.
The test has the advantage of being a simple and intuitive task.
SCOPE
1.1 This test method describes a procedure for comparing two products.
1.2 This test method does not describe the Thurstonian modeling approach to this test.
1.3 This test method is sometimes referred to as the simple-difference test.
1.4 A same-different test determines whether two products are perceived to be the same or different overall.
1.5 The procedure of the test described in this test method consists of presenting a single pair of samples to each assessor. The presentation of multiple pairs would require different statistical treatment and it is outside of the scope of this test method.
1.6 This test method is not attribute-specific, unlike the directional difference test.
1.7 This test method is not intended to determine the magnitude of the difference; however, statistical methods may be used to estimate the size of the difference.
1.8 This test method may be chosen over the triangle or duo-trio tests where sensory fatigue or carry-over are a concern, or where a simpler task is needed.
1.9 This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
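The single-pair presentation in 1.5 can be illustrated with a small serving-plan sketch. This is not part of the standard: the function name `assign_pairs` is hypothetical. Requiring an equal count of each of the four pairs is an assumption that is stricter than the method's requirement that same pairs equal different pairs. The panel size of 84 is the figure quoted later in the document for the most commonly used sensitivity values.

```python
import random
from collections import Counter

# The four possible pairs of products A and B, one pair per assessor.
PAIRS = ("A/A", "B/B", "A/B", "B/A")


def assign_pairs(n_assessors, seed=None):
    """Return a randomized serving plan with one pair per assessor.

    Assumption (illustrative): each of the four pairs is used equally often,
    so n_assessors must be divisible by 4.
    """
    if n_assessors % 4 != 0:
        raise ValueError("n_assessors must be divisible by 4")
    plan = list(PAIRS) * (n_assessors // 4)
    # A seeded generator makes the worksheet reproducible.
    random.Random(seed).shuffle(plan)
    return plan


plan = assign_pairs(84, seed=1)
counts = Counter(plan)
for assessor, pair in enumerate(plan[:5], start=1):
    print(f"assessor {assessor}: {pair}")
print(dict(counts))
```

Shuffling a pre-balanced list keeps the number of same and different pairs equal by construction, which simple independent random draws would not guarantee.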

General Information

Status
Historical
Publication Date
31-Jul-2011

Standard
ASTM E2139-05(2011) - Standard Test Method for Same-Different Test
English language
13 pages

Standards Content (Sample)


NOTICE: This standard has either been superseded and replaced by a new version or withdrawn.
Contact ASTM International (www.astm.org) for the latest information
Designation: E2139 − 05 (Reapproved 2011)
Standard Test Method for
Same-Different Test
This standard is issued under the fixed designation E2139; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
This test method is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory. Current edition approved Aug. 1, 2011. Published August 2011. Originally approved in 2005. Last previous edition approved in 2005 as E2139–05. DOI: 10.1520/E2139-05R11.

1. Scope
1.1 This test method describes a procedure for comparing two products.
1.2 This test method does not describe the Thurstonian modeling approach to this test.
1.3 This test method is sometimes referred to as the simple-difference test.
1.4 A same-different test determines whether two products are perceived to be the same or different overall.
1.5 The procedure of the test described in this test method consists of presenting a single pair of samples to each assessor. The presentation of multiple pairs would require different statistical treatment and it is outside of the scope of this test method.
1.6 This test method is not attribute-specific, unlike the directional difference test.
1.7 This test method is not intended to determine the magnitude of the difference; however, statistical methods may be used to estimate the size of the difference.
1.8 This test method may be chosen over the triangle or duo-trio tests where sensory fatigue or carry-over are a concern, or where a simpler task is needed.
1.9 This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents
2.1 ASTM Standards:
E253 Terminology Relating to Sensory Evaluation of Materials and Products
E456 Terminology Relating to Quality and Statistics
E1871 Guide for Serving Protocol for Sensory Evaluation of Foods and Beverages
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
2.2 ASTM Publications:
Manual 26 Sensory Testing Methods, 2nd Edition
STP 758 Guidelines for the Selection and Training of Sensory Panel Members
STP 913 Guidelines for Physical Requirements for Sensory Evaluation Laboratories
2.3 ISO Standard:
ISO 5495 Sensory Analysis—Methodology—Paired Comparison
Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036, http://www.ansi.org.

3. Terminology
3.1 For definition of terms relating to sensory analysis, see Terminology E253, and for terms relating to statistics, see Terminology E456.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 α (alpha) risk: probability of concluding that a perceptible difference exists when, in reality, one does not (also known as Type I Error or significance level).
3.2.2 β (beta) risk: probability of concluding that no perceptible difference exists when, in reality, one does (also known as Type II Error).
3.2.3 chi-square test: statistical test used to test hypotheses on frequency counts and proportions.
3.2.4 ∆ (delta): test sensitivity parameter established prior to testing and used along with the selected values of α, β, and an estimated value of p₁ to determine the number of assessors needed in a study. Delta (∆) is the minimum difference in proportions that the researcher wants to detect, where the difference is ∆ = p₂ − p₁. ∆ is not a standard measure of sensory difference. The same value of ∆ may correspond to different sensory differences for different values of p₁ (see 9.5 for an example).
3.2.5 Fisher’s Exact Test (FET): statistical test of the equality of two independent binomial proportions.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3.2.6 p₁: proportion of assessors in the population who would respond different to the matched sample pair. Based on experience with using the same-different test and possibly with the same type of products, the user may have a priori knowledge about the value of p₁.
3.2.7 p₂: proportion of assessors in the population who would respond different to the unmatched sample pair.
3.2.8 power, 1 − β (beta) risk: probability of concluding that a perceptible difference exists when, in reality, one of size ∆ does.
3.2.9 product: material to be evaluated.
3.2.10 sample: unit of product prepared, presented, and evaluated in the test.
3.2.11 sensitivity: term used to summarize the performance characteristics of this test. The sensitivity of the test is defined by the four values selected for α, β, p₁, and ∆.

4. Summary of Test Method
4.1 Clearly define the test objective in writing.
4.2 Choose the number of assessors based on the sensitivity desired for the test. The sensitivity of the test is in part related to two competing risks: the risk of declaring a difference when there is none (that is, α-risk), and the risk of not declaring a difference when there is one (that is, β-risk). Acceptable values of α and β vary depending on the test objective. The values should be agreed upon by all parties affected by the results of the test.
4.3 The two products of interest (A and B) are selected. Assessors are presented with one of four possible pairs of samples: A/A, B/B, A/B, and B/A. The total number of same pairs (A/A and B/B) usually equals the total number of different pairs (A/B and B/A). The assessor’s task is to categorize the given pair of samples as same or different.
4.4 The data are summarized in a two-by-two table where the columns show the type of pair received (same or different) and the rows show the assessor’s response (same or different). A Fisher’s Exact Test (FET) is used to determine whether the samples are perceptibly different. Other statistical methods that approximate the FET can sometimes be used.

5. Significance and Use
5.1 This overall difference test method is used when the test objective is to determine whether a sensory difference exists or does not exist between two samples. It is also known as the simple difference test.
5.2 The test is appropriate in situations where samples have extreme intensities, give rapid sensory fatigue, have long lingering flavors, or cannot be consumed in large quantities, or a combination thereof.
5.3 The test is also appropriate for situations where the stimulus sites are limited to two (for example, two hands, each side of the face, two ears).
5.4 The test provides a measure of the bias where judges perceive two same products to be different.
5.5 The test has the advantage of being a simple and intuitive task.

6. Apparatus
6.1 Carry out the test under conditions that prevent contact between assessors until the evaluations have been completed, for example, booths that comply with STP 913.
6.2 For food and beverage tests, sample preparation and serving sizes should comply with Practice E1871, or see Refs (1) or (2).
The boldface numbers in parentheses refer to the list of references at the end of this standard.

7. Definition of Hypotheses
7.1 This test can be characterized by a two-by-two table of probabilities according to the sample pair that the assessors in the population would receive and their responses, as follows:

                              Assessor Would Receive
                              Matched Pair    Unmatched Pair
                              (AA or BB)      (AB or BA)
    Assessor’s   Same:        1 − p₁          1 − p₂
    Response     Different:   p₁              p₂ (= p₁ + ∆)
                 Total:       1               1

where p₁ and p₂ are the probabilities of responding different for those who would receive the matched pairs and the unmatched pairs, respectively.
7.2 To determine whether the samples are perceptibly different with a given sensitivity, the following one-sided statistical hypothesis is tested:
    H₀: p₁ = p₂
    Hₐ: p₁ < p₂
7.3 The hypothesis test can be expressed in terms of the minimum detectable difference ∆ (H₀: ∆ = 0 versus Hₐ: ∆ > 0). Delta (∆) will equal 0 and p₁ will equal p₂ if there is no detectable difference between the samples. This test addresses whether or not ∆ is greater than 0. Thus, the hypothesis is one-sided because it is not of interest in this test to consider that responding different to the matched pair could be more likely than responding different to the unmatched pair.

8. Assessors
8.1 All assessors must be familiar with the mechanics of the same-different test (the format, the task, and the procedure of evaluation). Greater test sensitivity, if needed, may be achieved through selection of assessors who demonstrate above average individual sensitivity (see STP 758).
8.2 In order to perform this test, assessors do not require special sensory training on the samples in question. For example, they do not need to be able to recognize any specific attribute.
8.3 The assessors must be sampled from a homogeneous population that is well-defined. The population must be chosen on the basis of the test objective. Defining characteristics of the population can be, for example, training level, gender, experience with the product, and so forth.
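The one-sided comparison of the matched-pair and unmatched-pair "different" proportions set out in Section 7 can be illustrated with a short calculation. The following Python sketch is not part of the standard. It computes a one-sided Fisher's Exact Test p-value directly from the hypergeometric distribution using only the standard library. The function name `fisher_one_sided` and the response counts are hypothetical, chosen only for illustration.

```python
from math import comb


def fisher_one_sided(diff_matched, n_matched, diff_unmatched, n_unmatched):
    """One-sided Fisher's Exact Test p-value for H0: p1 = p2 vs Ha: p1 < p2.

    diff_matched / diff_unmatched: number of "different" responses among
    assessors who received a matched / unmatched pair.
    """
    total_diff = diff_matched + diff_unmatched
    n_total = n_matched + n_unmatched
    # Conditional on the table margins, the count of "different" responses
    # in the unmatched group is hypergeometric under H0. Sum the upper tail.
    upper = min(total_diff, n_unmatched)
    tail = sum(
        comb(n_unmatched, k) * comb(n_matched, total_diff - k)
        for k in range(diff_unmatched, upper + 1)
    )
    return tail / comb(n_total, total_diff)


# Hypothetical counts (not from the standard): 84 assessors, 42 per pair type.
diff_matched, n_matched = 12, 42      # said "different" to A/A or B/B
diff_unmatched, n_unmatched = 25, 42  # said "different" to A/B or B/A

p1_hat = diff_matched / n_matched
p2_hat = diff_unmatched / n_unmatched
p_value = fisher_one_sided(diff_matched, n_matched, diff_unmatched, n_unmatched)

print(f"estimated p1 = {p1_hat:.3f}, p2 = {p2_hat:.3f}, delta = {p2_hat - p1_hat:.3f}")
print(f"one-sided FET p-value = {p_value:.4f}")
print("perceptible difference at alpha = 0.05:", p_value <= 0.05)
```

The estimated p1 is the share of assessors who called a matched pair different, which is the response bias the method is designed to measure. A statistics library routine for Fisher's Exact Test could be substituted where one is available.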
9. Number of Assessors
9.1 Choose all the sensitivity parameters that are needed to choose the number of assessors for the test. Choose the α-risk and the β-risk. Based on experience, choose the expected value for p₁. Choose ∆, p₂ − p₁, the minimum difference in proportions that the researcher wants to detect. The most commonly used values for α-risk, β-risk, p₁ and ∆ are α = 0.05, β = 0.20, p₁ = 0.3, and ∆ = 0.3. These values can be adjusted on a case-by-case basis to reflect the sensitivity desired versus the number of assessors.
9.2 Having defined the required sensitivity (α-risk, β-risk, p₁, and ∆), determine the corresponding sample size from Table A1.1 (see Ref (9)). This is done by first finding the section of the table with a p₁ value corresponding to the proportion of assessors in the population who would respond different to the matched sample pair. Second, locate the total sample size from the intersection of the desired α, p₂ (or ∆), and β values. In the case of the most commonly used values listed in 9.1, Table A1.1 indicates that 84 assessors are needed. The sample size n is based on the number of same and different samples being equal. The sample sizes listed are the total sample size rounded up to the nearest number evenly divisible by 4 since there are four possible combinations of the samples. To determine the number of same and different pairs to prepare, divide n by two.
9.3 If the user has no prior experience with the same-different test and has no specific expectation for the value of p₁, then two options are available. Either use p₁ = 0.3 and proceed as indicated in 9.2, or use the last section of Table A1.1. This section gives sample sizes that are the largest required, given α, β, and ∆, regardless of p₁.
9.4 Often in practice, the number of assessors is determined by practical conditions (for example, duration of the experiment, number of available assessors, quantity of product, and so forth). However, increasing the number of assessors increases the likelihood of detecting small differences. Thus, one should expect to use larger numbers of assessors when trying to demonstrate that products are similar compared to when one is trying to demonstrate that they are different.
9.4.1 When the number of assessors is fixed, the power of the test (1 − β) may be calculated by establishing a value for p₁, defining the required sensitivity for α-risk and the ∆, locating the number of assessors nearest the fixed amount, and then following up the column to the listed β-risk.
9.5 If a researcher wants to be 90% certain of detecting response proportions of p₂ = 60% versus the expected p₁ = 40% with an α-risk of 5%, then ∆ = 0.60 − 0.40 = 0.20 ... versus the expected p₁ = 50% with an α-risk of 5%, then ∆ = 0.70 − 0.50 = 0.20 and β = 0.10 or 90% power. The number of assessors needed in this case is 224 (Table A1.1).

10. Procedure
10.1 Determine the number of assessors needed for the test as well as the population that they should represent (for example, assessors selected for a specific sensory sensitivity).
10.2 It is critical to the validity of the test that assessors cannot identify the samples from the way in which they are presented. One should avoid any subtle differences in temperature or appearance, especially color, caused by factors such as the time sequence of preparation. It may be possible to mask color differences using light filters, subdued illumination or colored vessels. Prepare samples out of sight and in an identical manner: same apparatus, same vessels, same quantities of product (see Practice E1871). The samples may be prepared in advance; however, this may not be possible for all types of products. It is essential that the samples cannot be recognized from the way they are presented.
10.3 Prepare serving order worksheet and ballot in advance of the test to ensure a balanced order of sample presentation of the two products, A and B. One of four possible pairs (A/A, B/B, A/B, and B/A) is assigned to each assessor. Make sure this assignment is done randomly. Design the test so that the number of same pairs equals the number of different pairs. The presentation order of the different pairs should be balanced as much as possible. Serving order worksheets should also include the identification of the samples for each set.
10.4 Prepare the response ballots in a way consistent with the product you are evaluating. For example, in a taste test, give the following instructions: (1) you will receive two samples. They may be the same or different; (2) evaluate the samples from left to right; and (3) determine whether they are the same or different.
10.4.1 The researcher can choose to add an instruction to the ballot indicating whether the assessor may re-evaluate the samples or not.
10.4.2 The ballot should also identify the assessor and date of test, as well as a ballot number that must be related to the sample set identification on the worksheet.
10.4.3 A section soliciting comments may be included following the initial forced-choice question.
10.4.4 An example of a ballot is provided in Fig. X2.2.
10.5 When possible, present both samples at the same time, along with the response ballot. In some instances, the samples may be presented sequentially if required by the type of product or the way they need to be presented, or both. This may
...
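For the fixed-panel case in 9.4.1, the power of the test can also be approximated by direct enumeration rather than read from a table. The following Python sketch is not part of the standard and is not the method used to build Table A1.1, whose derivation is not given in this excerpt, so its results need not match the tabulated sample sizes. It assumes equal numbers of matched and unmatched pairs and a one-sided Fisher's Exact Test. The function names are hypothetical.

```python
from math import comb


def fet_p_value(d1, d2, m):
    """One-sided FET p-value (Ha: p1 < p2) for two groups of size m.

    d1, d2: "different" responses in the matched and unmatched groups.
    """
    t = d1 + d2
    tail = sum(comb(m, k) * comb(m, t - k) for k in range(d2, min(t, m) + 1))
    return tail / comb(2 * m, t)


def binom_pmf(k, m, p):
    """Probability of k "different" responses out of m at proportion p."""
    return comb(m, k) * p**k * (1 - p) ** (m - k)


def exact_power(m, p1, p2, alpha=0.05):
    """Probability that the one-sided FET rejects H0 with m assessors per group."""
    power = 0.0
    for d1 in range(m + 1):
        for d2 in range(m + 1):
            # Add the probability of every outcome that leads to rejection.
            if fet_p_value(d1, d2, m) <= alpha:
                power += binom_pmf(d1, m, p1) * binom_pmf(d2, m, p2)
    return power


# Commonly used values from 9.1: alpha = 0.05, p1 = 0.3, delta = 0.3 (p2 = 0.6),
# with the 84 assessors quoted in 9.2 split evenly between pair types.
m = 84 // 2
power = exact_power(m, p1=0.3, p2=0.6, alpha=0.05)
print(f"power with {2 * m} assessors: {power:.3f}")
```

Calling `exact_power` over a range of group sizes gives a rough picture of how the number of assessors trades off against β-risk for a chosen p₁ and ∆.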
