Standard Practice for Assessing Language Proficiency

SIGNIFICANCE AND USE
Intended Use:
This practice is intended to serve the language test developer, test provider, and language test user communities by supporting their ability to provide useful, timely, reliable, and reproducible tests of language proficiency for general communication purposes. This practice expands the testing capacity of the United States by leveraging commercial and existing government test development and delivery capability through standardization of these processes. This practice is intended to be used by contracting officers, program managers, supervisors, managers, and commanders. It is also intended to be used by test developers, those who select and evaluate tests, and users of test scores.
Furthermore, the intent of this practice is to encourage the use of expert teams to assist contracting officers, contracting officer representatives, test developers, and contractors/vendors in meeting the testing needs being addressed. Users of this practice are encouraged to focus on meeting testing needs and not to interpret this practice as limiting innovation in any way.
Compliance with the Practice:
Compliance with this practice requires adherence to all sections of this practice. Exceptions are allowed only in specific cases in which a particular section of this practice does not apply to the type or intended use of a test. Exceptions shall be documented and justified to the satisfaction of the customer. Nothing in this practice should be construed as contradicting existing federal and state laws or as allowing deviation from established U.S. Government policies on testing.
SCOPE
1.1 Purpose—This practice describes best practices for the development and use of language tests in the modalities of speaking, listening, reading, and writing for assessing ability according to the Interagency Language Roundtable (ILR) scale. This practice focuses on testing language proficiency in use of language for communicative purposes.
1.2 Limitations—This practice is not intended to address testing and test development in the following specialized areas: Translation, Interpretation, Audio Translation, Transcription, other job-specific language performance tests, or Diagnostic Assessment.
1.2.1 Tests developed under this practice should not be used to address any of the above excluded purposes (for example, diagnostics).

General Information

Status: Historical
Publication Date: 30-Apr-2011
Language: English
Pages: 24
Standards Content (Sample)

NOTICE: This standard has either been superseded and replaced by a new version or withdrawn. Contact ASTM International (www.astm.org) for the latest information.
Designation: F2889 − 11

Standard Practice for Assessing Language Proficiency (1)

This standard is issued under the fixed designation F2889; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (´) indicates an editorial change since the last revision or reapproval.
1. Scope

1.1 Purpose—This practice describes best practices for the development and use of language tests in the modalities of speaking, listening, reading, and writing for assessing ability according to the Interagency Language Roundtable (ILR) (2) scale. This practice focuses on testing language proficiency in use of language for communicative purposes.

1.2 Limitations—This practice is not intended to address testing and test development in the following specialized areas: Translation, Interpretation, Audio Translation, Transcription, other job-specific language performance tests, or Diagnostic Assessment.

1.2.1 Tests developed under this practice should not be used to address any of the above excluded purposes (for example, diagnostics).

2. Referenced Documents

2.1 ASTM Standards: (3)
F1562 Guide for Use-Oriented Foreign Language Instruction
F2089 Practice for Language Interpreting
F2575 Guide for Quality Assurance in Translation

3. Terminology

3.1 Definitions:

3.1.1 achievement test, n—an instrument designed to measure what a person has learned within or up to a given time based on a sampling of what has been covered in the syllabus.

3.1.2 adaptive test, n—form of individually tailored testing in which test items are selected from an item bank where test items are stored in rank order with respect to their item difficulty and presented to test takers during the test on the basis of their responses to previous items, until it is determined that sufficient information regarding test takers' abilities has been collected. The opposite of a fixed-form test.

3.1.3 authentic texts, n—texts not created for language learning purposes that are taken from newspapers, magazines, etc., and tapes of natural speech taken from ordinary radio or television programs, etc.

3.1.4 calibration, n—the process of determining the scale of a test or tests.

3.1.4.1 Discussion—Calibration may involve anchoring items from different tests to a common difficulty scale (the theta scale). When a test is constructed from calibrated items, scores on the test indicate the candidates' ability, i.e., their location on the theta scale.
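The practice itself does not prescribe a measurement model, but one common choice for this kind of calibration (an illustrative assumption here, not a requirement of F2889) is the one-parameter Rasch model, under which the probability that examinee i with ability \theta_i answers item j with difficulty b_j correctly is

\[ P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{e^{\theta_i - b_j}}{1 + e^{\theta_i - b_j}} \]

Anchoring items from different tests then amounts to estimating all of the b_j on a single common theta scale, so that a candidate's estimated \theta_i can be read as a location on that scale.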
3.1.5 cognitive lab, n—a method for eliciting feedback from examinees with regard to test items.

3.1.5.1 Discussion—Small numbers of examinees take the test, or subsets of the items on the test, and provide extensive feedback on the items by speaking their thought processes aloud as they take the test, answering questionnaires about the items, being interviewed by researchers, or other methods intended to obtain in-depth information about items. These examinees should be similar to the examinees for whom the test is intended. For tests scored by raters, similar techniques are used with raters to obtain information on rubric functioning.

3.1.6 computer adaptive test, n—a test administered by a computer in which the difficulty level of the next item to be presented to test takers is estimated on the basis of their responses to previous items and adapted to match their abilities.
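As an illustration of the adaptive logic described in 3.1.2 and 3.1.6 (a minimal sketch, not part of the standard: the item bank, the nearest-difficulty selection rule, and the shrinking-step ability update are all assumptions of this example), a computer adaptive test loop might look like:

import math
import random

# Hypothetical item bank: each entry maps an item ID to a Rasch difficulty (b)
# on the theta scale. A real bank would come from a calibration study (3.1.4).
ITEM_BANK = {
    "item01": -2.0, "item02": -1.2, "item03": -0.6, "item04": -0.2,
    "item05": 0.0, "item06": 0.4, "item07": 0.9, "item08": 1.4,
    "item09": 1.9, "item10": 2.5,
}

def p_correct(theta, b):
    # Rasch model: probability of a correct response given ability theta.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def run_cat(answer_item, theta=0.0, max_items=8, step=0.6):
    # Adaptive loop: present the unused item whose difficulty is closest to
    # the current ability estimate, then nudge the estimate up or down by a
    # shrinking step (a crude stand-in for maximum-likelihood re-estimation).
    remaining = dict(ITEM_BANK)
    for _ in range(max_items):
        item_id = min(remaining, key=lambda i: abs(remaining[i] - theta))
        b = remaining.pop(item_id)
        theta += step if answer_item(item_id, b) else -step
        step *= 0.8  # smaller adjustments as information accumulates
    return theta

# Usage: simulate an examinee whose true ability is 0.8 on the theta scale.
random.seed(1)
estimate = run_cat(lambda item_id, b: random.random() < p_correct(0.8, b))
print(f"estimated theta: {estimate:.2f}")

A production system would typically stop when the standard error of the theta estimate falls below a threshold, which corresponds to the "sufficient information" criterion in 3.1.2, rather than after a fixed number of items.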
3.1.7 construct, n—the knowledge, skill or ability that is being tested.

3.1.7.1 Discussion—The construct provides the basis for a given test or test task and for interpreting scores derived from this task.

3.1.8 constructed response, adj—a type of item or test task that requires test takers to respond to a series of open-ended questions by writing, speaking, or doing something rather than choose answers from a ready-made list.

3.1.8.1 Discussion—The most commonly used types of constructed-response items include fill-in, short-answer, and performance assessment.

(1) This practice is under the jurisdiction of ASTM Committee F43 on Language Services and Products and is the direct responsibility of Subcommittee F43.04 on Language Testing. Current edition approved May 1, 2011. Published June 2011. DOI: 10.1520/F2889-11.
(2) Interagency Language Roundtable, Language Skill Level Descriptors (http://www.govtilr.org/Skills/ILRscale1.htm).
(3) For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States

3.1.9 ...
