ASTM F2889-11(2020)
Standard Practice for Assessing Language Proficiency
SIGNIFICANCE AND USE
4.1 Intended Use:
4.1.1 This practice is intended to serve the language test developer, test provider, and language test user communities in their ability to provide useful, timely, reliable, and reproducible tests of language proficiency for general communication purposes. This practice expands the testing capacity of the United States by leveraging commercial and existing government test development and delivery capability through standardization of these processes. This practice is intended to be used by contract officers, program managers, supervisors, managers, and commanders. It is also intended to be used by test developers, those who select and evaluate tests, and users of test scores.
4.1.2 Furthermore, the intent of this practice is to encourage the use of expert teams to assist contracting officers, contracting officer representatives, test developers, and contractors/vendors in meeting the testing needs being addressed. Users of this practice are encouraged to focus on meeting testing needs and not to interpret this practice as limiting innovation in any way.
4.2 Compliance with the Practice:
4.2.1 Compliance with this practice requires adherence to all sections of this practice. Exceptions are allowed only in specific cases in which a particular section of this practice does not apply to the type or intended use of a test. Exceptions shall be documented and justified to the satisfaction of the customer. Nothing in this practice should be construed as contradicting existing federal and state laws nor allowing for deviation from established U.S. Government policies on testing.
SCOPE
1.1 Purpose—This practice describes best practices for the development and use of language tests in the modalities of speaking, listening, reading, and writing for assessing ability in accordance with the Interagency Language Roundtable (ILR) scale. This practice focuses on testing language proficiency in use of language for communicative purposes.
1.2 Limitations—This practice is not intended to address testing and test development in the following specialized areas: Translation, Interpretation, Audio Translation, Transcription, other job-specific language performance tests, or Diagnostic Assessment.
1.2.1 Tests developed under this practice should not be used to address any of the above excluded purposes (for example, diagnostics).
1.3 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Standards Content (Sample)
Designation: F2889 – 11 (Reapproved 2020)
Standard Practice for Assessing Language Proficiency¹
This standard is issued under the fixed designation F2889; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 Purpose—This practice describes best practices for the development and use of language tests in the modalities of speaking, listening, reading, and writing for assessing ability in accordance with the Interagency Language Roundtable (ILR)² scale. This practice focuses on testing language proficiency in use of language for communicative purposes.
1.2 Limitations—This practice is not intended to address testing and test development in the following specialized areas: Translation, Interpretation, Audio Translation, Transcription, other job-specific language performance tests, or Diagnostic Assessment.
1.2.1 Tests developed under this practice should not be used to address any of the above excluded purposes (for example, diagnostics).
1.3 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
2. Referenced Documents
2.1 ASTM Standards:³
F1562 Guide for Use-Oriented Foreign Language Instruction
F2089 Practice for Language Interpreting
F2575 Guide for Quality Assurance in Translation
3. Terminology
3.1 Definitions:
3.1.1 achievement test, n—an instrument designed to measure what a person has learned within or up to a given time based on a sampling of what has been covered in the syllabus.
3.1.2 adaptive test, n—form of individually tailored testing in which test items are selected from an item bank where test items are stored in rank order with respect to their item difficulty and presented to test takers during the test on the basis of their responses to previous items, until it is determined that sufficient information regarding test takers’ abilities has been collected. The opposite of a fixed-form test.
3.1.3 authentic texts, n—texts not created for language learning purposes that are taken from newspapers, magazines, etc., and tapes of natural speech taken from ordinary radio or television programs, etc.
3.1.4 calibration, n—the process of determining the scale of a test or tests.
3.1.4.1 Discussion—Calibration may involve anchoring items from different tests to a common difficulty scale (the theta scale). When a test is constructed from calibrated items then scores on the test indicate the candidates’ ability, that is, their location on the theta scale.
3.1.5 cognitive lab, n—a method for eliciting feedback from examinees with regard to test items.
3.1.5.1 Discussion—Small numbers of examinees take the test, or subsets of the items on the test, and provide extensive feedback on the items by speaking their thought processes aloud as they take the test, answering questionnaires about the items, being interviewed by researchers, or other methods intended to obtain in-depth information about items. These examinees should be similar to the examinees for whom the test is intended. For tests scored by raters, similar techniques are used with raters to obtain information on rubric functioning.
3.1.6 computer adaptive test, n—a test administered by a computer in which the difficulty level of the next item to be presented to test takers is estimated on the basis of their responses to previous items and adapted to match their abilities.
3.1.7 construct, n—the knowledge, skill or ability that is being tested.
3.1.7.1 Discussion—The construct provides the basis for a given test or test task and for interpreting scores derived from
...
¹ This practice is under the jurisdiction of ASTM Committee F43 on Language Services and Products and is the direct responsibility of Subcommittee F43.04 on Language Testing. Current edition approved April 1, 2020. Published April 2020. Originally approved in 2005. Last previous edition approved in 2011 as F2889 – 11. DOI: 10.1520/F2889-11R20.
² Interagency Language Roundtable, Language Skill Level Descriptors (http://www.govtilr.org/Skills/ILRscale1.htm).
³ For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Documen...