ASTM E2943-15(2021)
Standard Guide for Two-Sample Acceptance and Preference Testing With Consumers
SIGNIFICANCE AND USE
5.1 Acceptance and preference are the key measurements taken in consumer product testing as either a new product idea is developed into testable prototypes or existing products are evaluated for potential improvements, cost reductions, or other business reasons. Developing products that are preferred overall, or liked as well as, or better, on average, compared to a standard or a competitor, among a defined target consumer group, is usually the main goal of the product development process. Thus, it is necessary to test the consumer acceptability or the preference of a product or prototype compared to other prototypes or potential products, a standard product, or other products in the market. The researcher, with input from her/his stakeholders, has the responsibility to choose appropriate comparison products and scaling or test methods to evaluate them. In the case of a new-to-the-world product, there may or may not be a relevant product for comparison. In this case, a benchmark score or rating may be used to determine acceptability. A product or prototype that is acceptable to the target consumer is one that meets a minimum criterion for liking, and a product that is preferred over an existing product has the potential to be chosen more often than the less-preferred product by the consumer in the marketplace, when all other factors are equal.
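The two acceptance questions raised in 5.1 (is a product liked at least as well as a comparison product on average, and does it meet a benchmark score) can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the ratings are invented nine-point hedonic scores from respondents who each rated both samples, the benchmark value of 6.5 is hypothetical, and the paired Student's t statistic is one possible analysis, not a procedure prescribed by this guide. The statistic still has to be compared with a critical value from a t table for the chosen α.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(ratings_a, ratings_b):
    """Return (mean difference, t statistic, degrees of freedom).

    Each position in the two lists holds the same respondent's
    liking rating for sample A and sample B.
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("each respondent must rate both samples")
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs), mean(diffs) / standard_error, n - 1


def meets_benchmark(ratings, benchmark):
    """True when the mean liking score reaches the benchmark score."""
    return mean(ratings) >= benchmark


# Hypothetical nine-point hedonic ratings (1 = lowest, 9 = highest liking).
sample_a = [7, 8, 6, 7, 9, 6]
sample_b = [6, 7, 6, 5, 8, 6]

mean_diff, t_stat, dof = paired_t_statistic(sample_a, sample_b)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, df = {dof}")
print("A meets benchmark:", meets_benchmark(sample_a, 6.5))
```

Six respondents are far too few for a real consumer test; the small lists only keep the arithmetic easy to follow.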
5.2 The external validity (the extent to which the results of a study can be generalized) of both acceptance and preference measures to manage decision risk at all stages of the development cycle is dependent on the ability of the researcher to generalize the results from the respondent sample to the target population at large. This depends both upon the sample of respondents and the way the test is constructed. Within the context of a single test, acceptance measures tell the relative hedonic status of the two samples, quantitatively, as well as where on the hedonic continuum each of the samples falls, that is, “disliked,”...
SCOPE
1.1 This guide covers acceptance and preference measures when each is used in an unbranded, two-sample, product test. Each measure, acceptance, and preference, may be used alone or together in a single test or separated by time. This guide covers how to establish a product’s hedonic or choice status based on sensory attributes alone, rather than brand, positioning, imagery, packaging, pricing, emotional-cultural responses, or other nonsensory aspects of the product. The most commonly used measures of acceptance and preference will be covered, that is, product liking overall as measured by the nine-point hedonic scale and preference measured by choice, either two-alternative forced choice or two-alternative with a “no preference” option.
1.2 Three of the biggest challenges in measuring a product’s hedonic (overall liking or acceptability) or choice status (preference selection) are determining how many respondents and who to include in the respondent sample, setting up the questioning sequence, and interpreting the data to make product decisions.
1.3 This guide covers:
1.3.1 Definition of each type of measure,
1.3.2 Discussion of the advantages and disadvantages of each,
1.3.3 When to use each,
1.3.4 Practical considerations in test execution,
1.3.5 Risks associated with each,
1.3.6 Relationship between the two when administered in the same test, and
1.3.7 Recommended interpretations of results for product decisions.
1.4 The intended audience for this guide is the sensory consumer professional or marketing research professional (“the researcher”) who is designing, executing, and interpreting data from product tests with acceptance or choice measures, or both.
1.5 Only two-sample product tests will be covered in this guide. However, the issues and recommended practices raised in this guide often apply to multi-sample tests as well. Detailed coverage of execution tactics, optional types of s...
Standards Content (Sample)
This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the
Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Designation: E2943 − 15 (Reapproved 2021)
Standard Guide for Two-Sample Acceptance and Preference Testing With Consumers
This standard is issued under the fixed designation E2943; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
INTRODUCTION
This guide is intended to be used by sensory consumer and marketing research professionals (referred to as the “researcher” or “research professional”) as an aid to understanding issues associated with and to conducting two-sample acceptance and preference tests with consumers. This guide includes a general summary of considerations and practices for conducting hedonic tests followed by specific considerations and practices for both acceptance and preference testing, including pros and cons of each method. Final sections consider the incorporation of both acceptance and preference testing into the research plan and discuss potential lack of linkage in output/results between them. A flowchart summarizing these methods and references for further reading are also included.
1. Scope
1.1 This guide covers acceptance and preference measures when each is used in an unbranded, two-sample, product test. Each measure, acceptance, and preference, may be used alone or together in a single test or separated by time. This guide covers how to establish a product’s hedonic or choice status based on sensory attributes alone, rather than brand, positioning, imagery, packaging, pricing, emotional-cultural responses, or other nonsensory aspects of the product. The most commonly used measures of acceptance and preference will be covered, that is, product liking overall as measured by the nine-point hedonic scale and preference measured by choice, either two-alternative forced choice or two-alternative with a “no preference” option.
1.2 Three of the biggest challenges in measuring a product’s hedonic (overall liking or acceptability) or choice status (preference selection) are determining how many respondents and who to include in the respondent sample, setting up the questioning sequence, and interpreting the data to make product decisions.
1.3 This guide covers:
1.3.1 Definition of each type of measure,
1.3.2 Discussion of the advantages and disadvantages of each,
1.3.3 When to use each,
1.3.4 Practical considerations in test execution,
1.3.5 Risks associated with each,
1.3.6 Relationship between the two when administered in the same test, and
1.3.7 Recommended interpretations of results for product decisions.
1.4 The intended audience for this guide is the sensory consumer professional or marketing research professional (“the researcher”) who is designing, executing, and interpreting data from product tests with acceptance or choice measures, or both.
1.5 Only two-sample product tests will be covered in this guide. However, the issues and recommended practices raised in this guide often apply to multi-sample tests as well. Detailed coverage of execution tactics, optional types of scales, various approaches to data analysis, and extensive discussions of the reliability and validity of these measures are all outside of the scope of this guide.
1.6 Units—The values stated in SI units are to be regarded as the standard. No other units of measurement are included in this standard.
1.7 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.8 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
This guide is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory.
Current edition approved Jan. 1, 2021. Published April 2021. Originally approved in 2014. Last previous edition approved in 2015 as E2943 – 15. DOI: 10.1520/E2943-15R21.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
2. Referenced Documents
2.1 ASTM Standards:
E253 Terminology Relating to Sensory Evaluation of Materials and Products
E456 Terminology Relating to Quality and Statistics
E1871 Guide for Serving Protocol for Sensory Evaluation of Foods and Beverages
E1958 Guide for Sensory Claim Substantiation
E2263 Test Method for Paired Preference Test
E2299 Guide for Sensory Evaluation of Products by Children and Minors
3. Terminology
3.1 Definitions:
3.1.1 For definitions of terms relating to sensory analysis, see Terminology E253.
3.1.2 For terms relating to statistics, see Terminology E456.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 α (alpha) risk, n—probability of concluding that a difference in liking or preference exists, when, in reality, one does not.
3.2.1.1 Discussion—Also known as Type I error or significance level.
3.2.2 β (beta) risk, n—probability of concluding that no difference in liking or preference exists, when, in reality, one does.
3.2.2.1 Discussion—Also known as Type II error.
3.2.3 hedonic continuum, n—hypothesized underlying continuous dimension measured by acceptance scales.
3.2.3.1 Discussion—It is presumed to run from strong disliking through a neutral region and onto strong liking.
3.2.4 labeled affective magnitude scale, n—labeled magnitude scale (LMS) is a hybrid scaling technique using a verbally labeled line with quasi-logarithmic spacing between each label and the scale consists of a vertical line, which is marked with verbal anchors describing different intensities (for example, “weak,” “strong”).
3.2.4.1 Discussion—Typically, subjects are instructed to place a mark on the line where their perceived intensity of sensation lies, with the upper limit of the scale being the strongest imaginable sensation (1).
3.2.5 Likert scale, n—attitude scales that can be constructed in an “agree-disagree” format (2).
3.2.5.1 Discussion—The Likert-type scale calls for a graded response to each statement. The response is usually expressed in terms of the following five categories: strongly agree (SA), agree (A), undecided (U), disagree (D), and strongly disagree (SD). The individual statements are either clearly favorable or clearly unfavorable (2 and 3).
3.2.6 P_max, n—used in forced choice preference measures; a test sensitivity parameter established before testing and used along with the selected values of α and β to determine the number of respondents needed in a study.
3.2.6.1 Discussion—P_max is the proportion of common responses that the researcher wants the test to be able to detect with a probability of 1 – β. For example, if a researcher wants to have a 90 % confidence level of detecting a 60:40 split in preference, then P_max = 60 % and β = 0.10.
3.2.7 risk, n—possible consequences to the researcher’s client when the test leads to an incorrect conclusion.
3.2.7.1 Discussion—Risk around decisions made based on research test results can be grouped into two types, loosely called a “false positive” (when the test detects a difference that does not exist) and a “false negative” when the study does not detect a true difference. In the case of a false positive, the company spends development time and resources on an alternative that does not deliver the intended effect. In the case of a false negative, the product developer or the company will miss a product opportunity and waste resources developing alternatives.
3.2.8 sequential monadic, adj—refers to the presentation or ordering in which respondents evaluate products or stimuli.
3.2.8.1 Discussion—In a sequential monadic test, the respondent is presented with one product at a time to evaluate.
3.2.9 sign test, n—statistical hypothesis test that can be used to compare two samples or a sample with a standard.
3.2.9.1 Discussion—No assumption is made about the shape or parameters of the population frequency distribution with the sign test and only the sign of the difference is considered.
3.2.10 student’s t test, n—statistical hypothesis test used to compare the means of two samples or a sample mean to a standard value.
3.2.10.1 Discussion—It is appropriate when the measure of interest is normally distributed in small samples and, more generally, for continuous, unbounded, symmetric measurements when the sample size is larger. Assumptions include no ties in the data.
3.2.11 Type I error, n—see alpha risk.
3.2.12 Type II error, n—see beta risk.
3.2.13 Wilcoxon-Mann-Whitney test, WMW, n—rank-based independent sampling alternative to the student’s t-test that is appropriate when the data are measured on a common continuous scale that is not normally distributed.
3.2.13.1 Discussion—In these situations, it can be more efficient (increased statistical power to find a difference at a given sample size) than a student’s t-test. Like the student’s t-test, it requires the assumption that the data have no ties.
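The way P_max, α, and β jointly determine the number of respondents (3.2.6 and 3.2.6.1) can be sketched numerically. The example below is a rough illustration and not the calculation this guide prescribes: it assumes a normal approximation to the binomial distribution, a null hypothesis of a 50:50 split, and α = 0.05 (the source example gives only P_max = 60 % and β = 0.10). The function name `respondents_needed` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def respondents_needed(p_max, alpha=0.05, beta=0.10, two_sided=True):
    """Approximate respondents for a forced-choice preference test.

    Uses the normal approximation to the binomial with a 50:50 null
    hypothesis; p_max is the preference proportion to be detected
    with probability 1 - beta.
    """
    if not 0.5 < p_max < 1.0:
        raise ValueError("p_max must lie strictly between 0.5 and 1")
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(1 - beta)
    sd_null = 0.5                            # sqrt(0.5 * 0.5) under the null
    sd_alt = math.sqrt(p_max * (1 - p_max))  # under the alternative
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p_max - 0.5)) ** 2
    return math.ceil(n)


# The 60:40 example from 3.2.6.1, with the assumed alpha of 0.05.
print(respondents_needed(0.60, alpha=0.05, beta=0.10))
print(respondents_needed(0.60, alpha=0.05, beta=0.10, two_sided=False))
```

Because the approximation ignores the discreteness of the binomial distribution, exact tables or software may give slightly different counts. The sketch mainly shows that a smaller gap between P_max and 50 %, or smaller α and β, raises the required number of respondents.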
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
The boldface numbers in parentheses refer to a list of references at the end of this standard.
4. Summary of Guide
4.1 This guide covers the similarities and differences between acceptance and preference measures when used alone and together in a two-sample test (see Fig. 1). The two measures provide different information about respondents’ subjective responses to products and should be deployed to meet different research or business objectives. Acceptance measures are recommended when there is a need to obtain information on intensity of liking/disliking and determine the relative hedonic status of two products. Preference measures are recommended when there is a need to obtain information on choice behavior or determine an ordinal relationship between two products. Correct sampling of respondents is critical in both types of test. The researcher shall carefully prepare the research learning plan and thoroughly review the pros and cons of the specific research design chosen (that is, measuring acceptance, measuring preference, measuring both) against the decision risks associated with each measurement. Acceptance and preference measures, while imperfect, continue to be extremely useful in managing the risk in developing and delivering new products to the marketplace.
5. Significance and Use
5.1 Acceptance and preference are the key measurements taken in consumer product testing as either a new product idea is developed into testable prototypes or existing products are evaluated for potential improvements, cost reductions, or other business reasons. Developing products that are preferred overall, or liked as well as, or better, on average, compared to a standard or a competitor, among a defined target consumer group, is usually the main goal of the product development process. Thus, it is necessary to test the consumer acceptability or the preference of a product or prototype compared to other prototypes or potential products, a standard product, or other products in the market. The researcher, with input from her/his stakeholders, has the responsibility to choose appropriate comparison products and scaling or test methods to evaluate them. In the case of a new-to-the-world product, there may or may not be a relevant product for comparison. In this case, a benchmark score or rating may be used to determine acceptability. A product or prototype that is acceptable to the target consumer is one that meets a minimum criterion for liking, and a product that is preferred over an existing product has the potential to be chosen more often than the less-preferred product by the consumer in the marketplace, when all other factors are equal.
5.2 ... identification, control, measurement, and tracking of variables that may influence results across tests (for example, production location, sample age, and storage conditions) are the responsibility of the researcher.
5.3 While measures of acceptance and preference are both subjective responses to products, and can be somewhat related, they provide different information. A product may be “acceptable” but still not be preferred by the consumer over other alternatives, and conversely, a product may be preferred over another but still not be acceptable to the consumer. These two terms, therefore, should not be used interchangeably. When a bipolar hedonic scale with multipoint options is used, the researcher should specifically refer to “liking,” “acceptance,” or “hedonic ratings.” When preference measures are used, the researcher should refer to “preference,” “product selection,” or “choice.” Research professionals themselves should be precise in their usage of the terms “acceptance” and “liking,” to refer only to scaling of liking. These researchers should use the terms “preference” and “choice” to refer to two (“Prefer A” or “Prefer B”) or three-choice (“Prefer A” or “Prefer B” or “No Preference”) response options given in a preference test. In addition to having different meanings, the two measures also do not always provide similar results. This guide will cover the similarities and differences in information each provides, some guidelines around implementation, and interpretation of findings. This guide will thus give users an understanding of the issues at hand when planning, designing, implementing, and interpreting results from acceptance and preference tests with consumers.
5.4 While both measures are commonly used to provide information for product development decisions and evaluating a product’s competitive status, it is important to remember that pricing, positioning, competitive options, product availability, and other marketplace factors also impact a product’s success.
6. Hedonic Testing—Steps in Planning and Conducting an Acceptance or Preference Test
6.1 Decide on the Key Question to be Answered: Liking or Choice or Both—Before planning and implementing a test, the ...
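The two-choice and three-choice response options described in 5.3 lend themselves to a simple tally followed by a sign-type test (3.2.9). The sketch below is illustrative only. The response data are invented, and setting aside the “No Preference” responses before testing is just one possible convention assumed here, not a rule taken from this guide. The test itself is an exact two-sided binomial test against a 50:50 split, and the function names are hypothetical.

```python
from collections import Counter
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k of n choices under a 50:50 null."""
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    # Probability of a count at least this extreme in the smaller tail.
    one_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * one_tail)


def analyze_preference(responses):
    """Tally responses and test Prefer A against Prefer B.

    Assumption: "No Preference" answers are reported but excluded
    from the binomial test.
    """
    counts = Counter(responses)
    a, b = counts["Prefer A"], counts["Prefer B"]
    n_tested = a + b
    return {
        "Prefer A": a,
        "Prefer B": b,
        "No Preference": counts["No Preference"],
        "n_tested": n_tested,
        "p_value": binomial_two_sided_p(a, n_tested),
    }


# Hypothetical three-choice data from 120 respondents.
responses = ["Prefer A"] * 60 + ["Prefer B"] * 40 + ["No Preference"] * 20
print(analyze_preference(responses))
```

Reporting the “No Preference” count alongside the test result keeps that information visible even though it does not enter the p-value under this convention.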