ASTM E1947-98(2022)
Standard Specification for Analytical Data Interchange Protocol for Chromatographic Data
ABSTRACT
This specification covers an analytical data interchange protocol for chromatographic data representation and a software vehicle to effect the transfer of chromatographic data between instrument data systems. This protocol, which is designed to benefit users of analytical instruments and increase laboratory productivity and efficiency, provides a standardized format for the creation of raw data files or results files with the ".cdf" extension. The contents of the file include typical header information like instrument, column, detector, and operator description followed by raw or processed data, or both. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol. This protocol is intended to (1) transfer data between various vendors' instrument systems, (2) provide LIMS communications, (3) link data to document processing applications, (4) link data to spreadsheet applications, and (5) archive analytical data, or a combination thereof.
SCOPE
1.1 This specification covers a standardized format for chromatographic data representation and a software vehicle to effect the transfer of chromatographic data between instrument data systems. This specification provides a protocol designed to benefit users of analytical instruments and increase laboratory productivity and efficiency.
1.2 The protocol in this specification provides a standardized format for the creation of raw data files or results files. This standard format has the extension “.cdf” (derived from NetCDF). The contents of the file include typical header information like instrument, column, detector, and operator description followed by raw or processed data, or both. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol.
1.3 The software transfer vehicle used for the protocol in this specification is NetCDF, which was developed by the Unidata Program and is funded by the Division of Atmospheric Sciences of the National Science Foundation.
1.4 The protocol in this specification is intended to (1) transfer data between various vendors' instrument systems, (2) provide LIMS communications, (3) link data to document processing applications, (4) link data to spreadsheet applications, and (5) archive analytical data, or a combination thereof. The protocol is a consistent, vendor-independent data format that facilitates the analytical data interchange for these activities.
1.5 The protocol consists of:
1.5.1 This specification on chromatographic data, which gives the full definitions for each one of the generic chromatographic data elements used in implementation of the protocol. It defines the analytical information categories, which are a convenient way for sorting analytical data elements to make them easier to standardize.
1.5.2 Guide E1948 on chromatographic data, which gives the full details on how to implement the content of the protocol using the public-domain NetCDF data interchange system. It includes a brief introduction to using NetCDF. It is intended for software implementors, not those wanting to understand the definitions of data in a chromatographic dataset.
1.5.3 NetCDF User’s Guide.
1.6 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
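As a rough illustration of 1.2 and 1.3, a reader might sanity-check that a ".cdf" file is a NetCDF container before handing it to a NetCDF library. The sketch below looks only at the leading bytes of the file. The function names sniff_container and sniff_file are hypothetical, and the sketch assumes that files written under this protocol use one of the NetCDF classic-family layouts, which this specification does not state.

```python
# Leading bytes of NetCDF classic-family files: b"CDF" plus one version byte.
CLASSIC_VARIANTS = {1: "classic", 2: "64-bit offset", 5: "CDF-5"}

# Signature at the start of HDF5 files (netCDF-4 files are stored as HDF5).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_container(header: bytes) -> str:
    """Classify a file from its first bytes (hypothetical helper)."""
    if len(header) >= 4 and header[:3] == b"CDF":
        variant = CLASSIC_VARIANTS.get(header[3])
        if variant is not None:
            return "NetCDF " + variant
    if header[:8] == HDF5_SIGNATURE:
        return "HDF5 container"
    return "unrecognized"


def sniff_file(path: str) -> str:
    """Read the first eight bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

A check like this only identifies the container; it says nothing about whether the data elements required by this specification are present.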
Standards Content (Sample)
Designation: E1947 − 98 (Reapproved 2022)
Standard Specification for Analytical Data Interchange Protocol for Chromatographic Data
This standard is issued under the fixed designation E1947; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
2. Referenced Documents
2.1 ASTM Standards:
E1948 Guide for Analytical Data Interchange Protocol for Chromatographic Data
2.2 Other Standard:
NetCDF User’s Guide
2.3 ISO Standards:
ISO 2014-1976 (E) Writing of Calendar Dates in All-Numeric Form
ISO 3307-1975 (E) Information Interchange—Representations of Time of the Day
ISO 4031-1978 (E) Information Interchange—Representations of Local Time Differentials
3. Terminology
3.1 Definitions for Administrative Information Class—These definitions are for those data elements that are implemented in the protocol. See Table 1.
This specification is under the jurisdiction of ASTM E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of E13.15 on Analytical Data. Current edition approved Nov. 1, 2022. Published November 2022. Originally approved in 1998. Last previous edition approved in 2014 as E1947 – 98 (2014). DOI: 10.1520/E1947-98R22.
For more information on the NetCDF standard, contact Unidata at http://www.unidata.ucar.edu.
For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
Available from Russell K. Rew, Unidata Program Center, University Corporation for Atmospheric Research, P. O. Box 3000, Boulder, CO 80307-3000, http://www2.ucar.edu.
Available from International Organization for Standardization (ISO), ISO Central Secretariat, Chemin de Blandonnet 8, CP 401, 1214 Vernier, Geneva, Switzerland, https://www.iso.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
TABLE 1 Administrative Information Class
NOTE 1—Particular analytical information categories (C1, C2, C3, C4, or C5) are assigned to each data element under the Category column. The meaning of this category assignment is explained in Section 5.
NOTE 2—The Required column indicates whether a data element is required, and if required, for which categories. For example, M1234 indicates that that particular data element is required for any dataset that includes information from Category 1, 2, 3, or 4. M4 indicates that a data element is only required for Category 4 datasets.
NOTE 3—Unless otherwise specified, data elements are generally recorded to be their actual test values, instead of the nominal values that were used at the initiation of a test.
Data Element Name               Datatype   Category   Required
dataset-completeness            string     C1         M12345
protocol-template-revision      string     C1         M12345
netCDF-revision                 string     C1         M12345
languages                       string     C5         . . .
administrative-comments         string     C1 or C2   . . .
dataset-origin                  string     C1         M5
dataset-owner                   string     C1         . . .
dataset-date-time-stamp         string     C1         . . .
injection-date-time-stamp       string     C1         M12345
experiment-title                string     C1         . . .
operator-name                   string     C1         M5
separation-experiment-type      string     C1         . . .
company-method-name             string     C1         . . .
company-method-ID               string     C1         . . .
pre-experiment-program-name     string     C5         . . .
post-experiment-program-name    string     C5         . . .
source-file-reference           string     C5         M5
error-log                       string     C5         . . .
3.1.1 administrative-comments—comments about the dataset identification of the experiment. This free text field is for anything in this information class that is not covered by the other data elements in this class.
3.1.2 company-method-ID—internal method ID of the sample analysis method used by the company.
3.1.3 company-method-name—internal method name of the sample analysis method used by the company.
3.1.4 dataset-completeness—indicates which analytical information categories are contained in the dataset. The string should exactly list the category values, as appropriate, as one or more of the following “C1+C2+C3+C4+C5,” in a string separated by plus (+) signs. This data element is used to check for completeness of the analytical dataset being transferred.
3.1.5 dataset-date-time-stamp—indicates the absolute time of dataset creation relative to Greenwich Mean Time. Expressed as the synthetic datetime given in the form: YYYYMMDDhhmmss±ffff.
3.1.5.1 Discussion—This is a synthesis of ISO 2014-1976 (E), ISO 3307-1975 (E), and ISO 4031-1978 (E), which compensates for local time variations.
3.1.5.2 Discussion—The time differential factor (ffff) expresses the hours and minutes between local time and the Coordinated Universal Time (UTC or Greenwich Mean Time, as disseminated by time signals), as defined in ISO 3307-1975 (E). The time differential factor (ffff) is represented by a four-digit number preceded by a plus (+) or a minus (-) sign, indicating the number of hours and minutes that local time differs from the UTC. Local times vary throughout the world from UTC by as much as -1200 hours (west of the Greenwich Meridian) and by as much as +1300 hours (east of the Greenwich Meridian). When the time differential factor equals zero, this indicates a zero hour, zero minute, and zero second difference from Greenwich Mean Time.
3.1.5.3 Discussion—An example of a value for this date element would be: 1991,08,01,12:30:23-0500 or 19910801123023-0500. In human terms this is 12:30 PM on August 1, 1991 in New York City. Note that the -0500 hours is 5 full hours behind Greenwich Mean Time. The ISO standards permit the use of separators as shown, if they are required to facilitate human understanding. However, separators are not required and consequently shall not be used to separate date and time for interchange among data processing systems.
3.1.5.4 Discussion—The numerical value for the month of the year is used, because this eliminates problems with the different month abbreviations used in different human languages.
3.1.6 dataset-origin—name of the organization, address, telephone number, electronic mail nodes, and names of individual contributors, including operator(s), and any other information as appropriate. This is where the dataset originated.
3.1.7 dataset-owner—name of the owner of a proprietary dataset. The person or organization named here is responsible for this field’s accuracy. Copyrighted data should be indicated here.
3.1.8 error-log—information that serves as a log for failures of any type, such as instrument control, data acquisition, data processing or others.
3.1.9 experiment-title—user-readable, meaningful name for the experiment or test that is given by the scientist.
3.1.10 injection-date-time-stamp—indicates the absolute time of sample injection relative to Greenwich Mean Time. Expressed as the synthetic datetime given in the form: YYYYMMDDhhmmss±ffff. See dataset-date-time-stamp for details of the ISO standard definition of a date-time-stamp.
3.1.11 languages—optional list of natural (human) languages and programming languages delineated for processing by language tools.
3.1.11.1 ISO-639-language—indicates a language symbol and country code from Annex B and D of the ISO-639 Standard.
3.1.11.2 other-language—indicates the languages and dialect using a user-readable name; applies only for those languages and dialects not covered by ISO 639 (such as programming language).
3.1.12 NetCDF-revision—current revision level of the NetCDF data interchange system software being used for data transfer.
3.1.13 operator-name—name of the person who ran the experiment or test that generated the current dataset.
3.1.14 post-test-program-name—name of the program or subroutine that is run after the analytical test is finished.
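As an illustration of the date-time-stamp form described in 3.1.5 and 3.1.10, the following Python sketch parses and formats the compact interchange form (no separators). The function names parse_stamp and format_stamp are hypothetical and not part of this specification. The sketch assumes that ffff is read as hhmm and that offsets outside the -1200 to +1300 range mentioned in 3.1.5.2 should be rejected.

```python
import re
from datetime import datetime, timedelta

# Compact interchange form: 14 digits, a sign, and a four-digit differential.
_STAMP = re.compile(r"\d{14}[+-]\d{4}")

# Range of local time differentials described in 3.1.5.2 (assumed inclusive).
_MIN_OFFSET = timedelta(hours=-12)
_MAX_OFFSET = timedelta(hours=13)


def parse_stamp(stamp: str) -> datetime:
    """Parse YYYYMMDDhhmmss+ffff / -ffff into a timezone-aware datetime."""
    if _STAMP.fullmatch(stamp) is None:
        # Separators (commas, colons) are rejected for interchange.
        raise ValueError("not in compact date-time-stamp form: %r" % stamp)
    moment = datetime.strptime(stamp, "%Y%m%d%H%M%S%z")
    if not _MIN_OFFSET <= moment.utcoffset() <= _MAX_OFFSET:
        raise ValueError("time differential out of range: %r" % stamp)
    return moment


def format_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime in the compact interchange form."""
    if moment.utcoffset() is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.strftime("%Y%m%d%H%M%S%z")
```

With the example value from 3.1.5.3, parse_stamp("19910801123023-0500") yields a local time of 12:30:23 with a differential of minus five hours, which corresponds to 17:30:23 UTC; the separator form 1991,08,01,12:30:23-0500 is rejected by this sketch.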
3.1.15 pre-test-program-name—name of the program or subroutine that is run before the analytical test is finished.
3.1.16 protocol-template-revision—revision level of the template being used by implementors. This needs to be included to tell users which revision of E1947 should be referenced for the exact definitions of terms and data elements used in a particular dataset.
3.1.17 separation-experiment-type—name of the separation experiment type. Select one of the types shown in the following list. The full name should be spelled out, rather than just referencing the number. This requirement is to increase the readability of the datasets.
3.1.17.1 Discussion—Users are advised to be as specific as possible, although for simplicity, users should at least put “gas chromatography” for GC or “liquid chromatography” for LC to differentiate between these two most commonly used techniques.
Separation Experiment Types
Gas Chromatography
Gas Liquid Chromatography
Gas Solid Chromatography
Liquid Chromatography
Normal Phase Liquid Chromatography
Reversed Phase Liquid Chromatography
Ion Exchange Liquid Chromatography
Size Exclusion Liquid Chromatography
Ion Pair Liquid Chromatography
Other
Other Chromatography
Supercritical Fluid Chromatography
Thin Layer Chromatography
Field Flow Fractionation
Capillary Zone Electrophoresis
3.1.18 source-file-reference—adequate information to locate the original dataset. This information makes the dataset self-referenced for easier viewing and provides internal documentation for GLP-compliant systems.
3.1.18.1 Discussion—This data element should include the complete filename, including node name of the computer system. For UNIX this should include the full path name. For VAX/VMS this should include the node-name, device-name, directory-name, and file-name. The version number of the file (if applicable) should also be included. For personal computer networks this needs to be the server name and directory path.
3.1.18.2 Discussion—If the source file was a library file, this data element should contain the library name and serial nu
...
TABLE 2 Sample-Description Information Class
Data Element Name               Datatype         Category   Required
sample-ID-comments              string           C5         . . .
sample-ID                       string           C1         . . .
sample-name                     string           C1         . . .
sample-type                     string           C1         . . .
sample-injection-volume         floating-point   C3         . . .
sample-amount                   floating-point   C3         . . .
TABLE 3 Detection-Method Information Class
Data Element Name               Datatype         Category   Required
detection-method-table-name     string           C1         . . .
detection-method-comments       string           C1         . . .
detection-method-name           string           C1         . . .
detector-name                   string           C1         . . .
detector-maximum-value          floating-point   C1         M1
detector-minimum-value          floating-point   C1         M1
detector-unit                   string           C1         M1
3.2.4 sample-injection-volume—volume of sample injected, with a unit of microliters.
3.2.5 sample-name—user-assigned name of the sample.
3.2.6 sample-type—indicates whether the sample is a standard, unknown, control, or blank.
3.3 Definitions for Detection-Method Information Class—This information class holds the information needed to set up the detection system for an experiment. Data element names assume a multi-channel system. The first implementation applies to a single-channel system only. Table 3 shows only the column headers for a detection method for a single sample.
3.3.1 detection-method-comments—users’ comments about detector method that is not contained in any other data element.
3.3.2 detection-method-name—name of this detection-method actually used. This name is included for archiving and retrieval purposes.
3.3.3 detection-method-table-name—name of this detection method table. This name is global to this table. It is included for reference by the sequence information table and other tables.
3.3.4 detector-maximum-value—maximum output value of the detector as transformed by the analog-to-digital converter, given in detector-unit. In other words, it is the maximum possible raw data value (which is not necessarily actual maximum value in the raw data array). It is required for scaling
...
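The Required codes explained in Note 2 of Table 1 (for example M12345, M5, or the M1 entries in Table 3) can be combined with the dataset-completeness string of 3.1.4 to check a dataset for missing elements. The Python sketch below is one possible reading, not a definitive implementation. The names parse_completeness, is_required, missing_elements, and REQUIRED_CODES are hypothetical, the table holds only a few rows copied from Tables 1 and 3, and the sketch assumes that an entry of ". . ." means the element is never required.

```python
# A few rows copied from Tables 1 and 3: element name -> Required code.
REQUIRED_CODES = {
    "dataset-completeness": "M12345",
    "injection-date-time-stamp": "M12345",
    "dataset-origin": "M5",
    "operator-name": "M5",
    "detector-unit": "M1",
    "experiment-title": ". . .",  # assumed to mean "not required"
}


def parse_completeness(value: str) -> set:
    """Turn a string such as 'C1+C2' into the set of category numbers {1, 2}."""
    categories = set()
    for token in value.split("+"):
        token = token.strip()
        if len(token) != 2 or token[0] != "C" or token[1] not in "12345":
            raise ValueError("unexpected category token: %r" % token)
        categories.add(int(token[1]))
    return categories


def is_required(code: str, categories: set) -> bool:
    """True if the code names at least one category present in the dataset."""
    if not code.startswith("M"):
        return False
    return any(int(digit) in categories for digit in code[1:])


def missing_elements(dataset: dict) -> list:
    """List required element names that are absent from the dataset."""
    categories = parse_completeness(dataset["dataset-completeness"])
    return sorted(
        name
        for name, code in REQUIRED_CODES.items()
        if is_required(code, categories) and name not in dataset
    )
```

For a dataset that declares "C1+C2" and carries only dataset-completeness and injection-date-time-stamp, this sketch reports detector-unit as missing, because its M1 code applies to any dataset containing Category 1 information.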