ASTM E2077-00(2016)
Standard Specification for Analytical Data Interchange Protocol for Mass Spectrometric Data
ABSTRACT
This specification covers an analytical data interchange protocol for mass spectrometric data representation and a software vehicle to effect the transfer of mass spectrometric data between instrument data systems. This specification does not provide for the storage of data acquired simultaneous to and integrated with the mass spectrometric data, but on other detectors. The protocol, which is designed to benefit users of analytical instruments and increase laboratory productivity and efficiency, provides a standardized format for the creation of raw data files, library spectrum files or results files. Such a file, which has a ".cdf" extension, contains typical header information like instrument, sample, and acquisition method description, followed by raw, library, or processed data. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol. This protocol is intended to perform the following functions: (1) transfer data between various vendors' instrument systems; (2) provide Laboratory Information Management Systems (LIMS) communications; (3) link data to document processing applications; (4) link data to spreadsheet applications; and (5) archive analytical data, or a combination thereof.
SCOPE
1.1 This specification covers a standardized format for mass spectrometric data representation and a software vehicle to effect the transfer of mass spectrometric data between instrument data systems. This specification provides a protocol designed to benefit users of analytical instruments and increase laboratory productivity and efficiency.
1.2 The protocol in this specification provides a standardized format for the creation of raw data files, library spectrum files or results files. This standard format has the extension “.cdf” (derived from NetCDF). The contents of the file include typical header information like instrument, sample, and acquisition method description, followed by raw, library or processed data. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol.
1.3 This specification does not provide for the storage of data acquired simultaneous to and integrated with the mass spectrometric data, but on other detectors; for example attached to the mass spectrometer's liquid or gas chromatographic system. Related Specification E1947 and Guide E1948 describe the storage of 2-dimensional chromatographic data.
1.4 The software transfer vehicle used for the protocol in this specification is NetCDF, which was developed by the Unidata Program and is funded by the Division of Atmospheric Sciences of the National Science Foundation (footnote 2).
1.5 The protocol in this specification is intended to (1) transfer data between various vendors' instrument systems, (2) provide Laboratory Information Management Systems (LIMS) communications, (3) link data to document processing applications, (4) link data to spreadsheet applications, and (5) archive analytical data, or a combination thereof. The protocol is a consistent, vendor independent data format that facilitates the analytical data interchange for these activities.
1.6 The protocol consists of:
1.6.1 This specification on mass spectrometric data, which gives the full definitions for each one of the generic mass spectrometric data elements used in implementation of the protocol. It defines the analytical information categories, which are a convenient way for sorting analytical data elements to make them easier to standardize.
1.6.2 Guide E2078 on mass spectrometric data, which gives the full details on how to implement the content of the protocol using the public-domain NetCDF data interchange system. It includes a brief introduction to using NetCDF and describes an API (Application Programming Interface) that is intended to be incorporated into application programs to read or write NetCDF files. I...
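Because the transfer vehicle is NetCDF, a receiving program can run a cheap sanity check on a ".cdf" file before handing it to a NetCDF library. The following is a minimal sketch in Python, assuming the file was written in the netCDF classic format family, whose files begin with the bytes "CDF" followed by a one-byte version number. The function name netcdf_classic_variant is hypothetical and is not part of this specification, of Guide E2078, or of the NetCDF API.

```python
# Version byte that follows the b"CDF" magic in netCDF classic-family files.
_NETCDF_VERSIONS = {
    1: "classic",
    2: "64-bit offset",
    5: "64-bit data (CDF-5)",
}


def netcdf_classic_variant(path):
    """Return the classic-family variant name, or None if not recognised.

    Only the first four bytes are inspected; nothing is verified about
    the data elements stored in the file.
    """
    with open(path, "rb") as fh:
        header = fh.read(4)
    if len(header) < 4 or header[:3] != b"CDF":
        return None
    return _NETCDF_VERSIONS.get(header[3])
```

A return value of None only means that the file does not start with a classic-family signature; it does not by itself show that the file is unusable.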
Standards Content (Sample)
This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the
Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Designation: E2077 − 00 (Reapproved 2016)
Standard Specification for Analytical Data Interchange Protocol for Mass Spectrometric Data
This standard is issued under the fixed designation E2077; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This specification covers a standardized format for mass spectrometric data representation and a software vehicle to effect the transfer of mass spectrometric data between instrument data systems. This specification provides a protocol designed to benefit users of analytical instruments and increase laboratory productivity and efficiency.
1.2 The protocol in this specification provides a standardized format for the creation of raw data files, library spectrum files or results files. This standard format has the extension “.cdf” (derived from NetCDF). The contents of the file include typical header information like instrument, sample, and acquisition method description, followed by raw, library or processed data. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol.
1.3 This specification does not provide for the storage of data acquired simultaneous to and integrated with the mass spectrometric data, but on other detectors; for example attached to the mass spectrometer’s liquid or gas chromatographic system. Related Specification E1947 and Guide E1948 describe the storage of 2-dimensional chromatographic data.
1.4 The software transfer vehicle used for the protocol in this specification is NetCDF, which was developed by the Unidata Program and is funded by the Division of Atmospheric Sciences of the National Science Foundation (footnote 2).
1.5 The protocol in this specification is intended to (1) transfer data between various vendors’ instrument systems, (2) provide Laboratory Information Management Systems (LIMS) communications, (3) link data to document processing applications, (4) link data to spreadsheet applications, and (5) archive analytical data, or a combination thereof. The protocol is a consistent, vendor independent data format that facilitates the analytical data interchange for these activities.
1.6 The protocol consists of:
1.6.1 This specification on mass spectrometric data, which gives the full definitions for each one of the generic mass spectrometric data elements used in implementation of the protocol. It defines the analytical information categories, which are a convenient way for sorting analytical data elements to make them easier to standardize.
1.6.2 Guide E2078 on mass spectrometric data, which gives the full details on how to implement the content of the protocol using the public-domain NetCDF data interchange system. It includes a brief introduction to using NetCDF and describes an API (Application Programming Interface) that is intended to be incorporated into application programs to read or write NetCDF files. It is intended for software implementors, not those wanting to understand the definitions of data in a mass spectrometric dataset.
1.6.3 NetCDF User’s Guide.

2. Referenced Documents
2.1 ASTM Standards:
E1947 Specification for Analytical Data Interchange Protocol for Chromatographic Data
E1948 Guide for Analytical Data Interchange Protocol for Chromatographic Data
E2078 Guide for Analytical Data Interchange Protocol for Mass Spectrometric Data
2.2 Other Standards:
EIA 232
IEEE 488
IEEE 802
Occupational Safety and Health Administration (OSHA)
Standards-29 CFR part 1910
NetCDF User’s Guide
2.3 ISO Standards: (footnote 8)
ISO 639:1988 Code for the representation of names of languages
ISO 8601:1988 Data elements and interchange formats (First edition published 1988-06-15; with Technical Corrigendum 1 published 1991-05-01)
ISO 9000 Quality Management Systems
ISO/IEC 8802

Footnotes:
1 This specification is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of Subcommittee E13.15 on Analytical Data. Current edition approved April 1, 2016. Published May 2016. Originally approved in 2000. Last previous edition approved in 2010 as E2077–00 (2010). DOI: 10.1520/E2077-00R16.
2 For more information on the NetCDF standard, contact Unidata at www.unidata.ucar.edu.
3 For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
4 Available from Electronic Industries Alliance (EIA), 2500 Wilson Blvd., Arlington, VA 22201.
5 Available from Institute of Electrical and Electronics Engineers, Inc. (IEEE), 445 Hoes Ln., Piscataway, NJ 08854-4141, http://www.ieee.org.
6 Available from Occupational Safety and Health Administration (OSHA), 200 Constitution Ave., Washington, DC 20210, http://www.osha.gov.
7 Available from Russell K. Rew, Unidata Program Center, University Corporation for Atmospheric Research, P.O. Box 3000, Boulder, CO 80307-3000, http://www.unidata.ucar.edu/.
8 Available from International Organization for Standardization (ISO), ISO Central Secretariat, BIBC II, Chemin de Blandonnet 8, CP 401, 1214 Vernier, Geneva, Switzerland, http://www.iso.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

3. Terminology
3.1 Analytical Information Classes—The Mass Spectrometry Information Model categorizes mass spectrometric information into a number of information “classes.” There is not a direct mapping of these classes into the implementation categories described further below. The implementation categories describe the information hierarchy; the classes describe the contents within the hierarchy. The model presented here only partially addresses these classes. In particular, the last two (Processed Results and Component Quantitation Results) are not described at all. Only Implementation Category 1 is required for compliance within this specification. Information about the other implementation categories is provided for historical interest. The classes defined here are:
3.1.1 Administrative—information for administrative tracking of experiments.
3.1.2 Instrument-ID—information about the instrument that generally does not change from experiment to experiment.
3.1.3 Sample Description—information describing the sample and its history, handling and processing.
3.1.4 Test Method—all information used to generate the raw data and processed results. This includes instrument control, detection, calibration, data processing and quantitation methods.
3.1.5 Raw Data—the data as stored in the data file, along with any parameters needed to describe it.
3.1.6 Processed Results—processing information and values derived from the raw data.
3.1.7 Component Quantitation Results—individual quantitation results for components in a complex mixture.
3.2 Definitions for Administrative Information Class—These definitions are for those data elements that are implemented in the protocol. See Table 1.

TABLE 1 Administrative Information Class
NOTE 1—Particular analytical information categories (C1, C2, C3, C4, or C5) are assigned to each data element under the Category column. The meaning of this category assignment is explained in Section 5.
NOTE 2—The Required column indicates whether a data element is required, and if required, for which categories. For example, M1234 indicates that that particular data element is required for any dataset that includes information from Category 1, 2, 3, or 4. M4 indicates that a data element is only required for Category 4 datasets.
NOTE 3—Unless otherwise specified, data elements are generally recorded to be their actual test values, instead of the nominal values that were used at the initiation of a test.
NOTE 4—A table is not to be interpreted as a table of keywords. The software implementation is independent of the data element names used here, and is in fact quite different. Likewise, the data types given are not an implementation representation, but a description of the form of the data element name. That is, a data element labeled as floating point may, for example, be implemented as a double precision floating point number; in this document, it is sufficient to note it as floating point without reference to precision.

Data Element Name             Datatype         Category  Required
dataset-completeness          string           C1        M12345
protocol-template-revision    string           C1        M12345
netcdf-revision               string           C1        M12345
languages                     string           C1 or C5  . . .
administrative-comments       string           C1 or C2  . . .
dataset-origin                string           C1        M4
dataset-owner                 string           C1        . . .
dataset-date-time-stamp       string           C1        M1234
injection-date-time-stamp     string           C1        M1234
experiment-title              string           C1        . . .
experiment-cross-references   string array[n]  C3 or C4
operator-name                 string           C1        M4
experiment-type               string           C1 or C4  . . .
pre-experiment-program-name   string           C2 or C5  . . .
post-experiment-program-name  string           C2 or C5  . . .
number-of-times-processed     integer          C5
number-of-times-calibrated    integer          C5
calibration-history           string array[n]  C5
source-file-reference         string           C5        M4
source-file-format            string           C5
source-file-date-time-stamp   string           C5        M4
external-file-references      string array[n]  C5
error-log                     string           C5

3.2.1 administrative-comments—comments about the dataset identification of the experiment. This free text field is for anything in this information class that is not covered by the other data elements in this class.
3.2.2 calibration-history—an audit trail of file names and data sets which records the calibration history; used for Good Laboratory Practice (GLP) compliance.
3.2.3 dataset-completeness—indicates which analytical information categories are contained in the dataset. The string should exactly list the category values, as appropriate, as one or more of the following “C1+C2+C3+C4+C5,” in a string separated by plus (+) signs. This data element is used to check for completeness of the analytical dataset being transferred.
3.2.4 dataset-date-time-stamp—indicates the absolute time of dataset creation relative to Greenwich Mean Time. Expressed as the synthetic datetime given in the form: YYYYMMDDhhmmss±ffff.
3.2.4.1 Discussion—This is a synthesis of ISO 8601:1988, which compensates for local time variations.
3.2.4.2 Discussion—The YYYYMMDDhhmmss expresses
the local time, and time differential factor (ffff) expresses the hours and minutes between local time and the Coordinated Universal Time (UTC or Greenwich Mean Time, as disseminated by time signals), as defined in ISO 8601:1988. The time differential factor (ffff) is represented by a four-digit number preceded by a plus (+) or a minus (−) sign, indicating the number of hours and minutes that local time differs from the UTC. Local times vary throughout the world from UTC by as much as −1200 h (west of the Greenwich Meridian) and by as much as +1300 h (east of the Greenwich Meridian). When the time differential factor equals zero, this indicates a zero hour, zero minute, and zero second difference from Greenwich Mean Time.
3.2.4.3 Discussion—An example of a value for a datetime would be: 1991,08,01,12:30:23-0500 or 19910801123023-0500. In human terms this is 23 s past 12:30 PM on August 1, 1991 in New York City. Note that the −0500 h is 5 full hours time behind Greenwich Mean Time. The ISO standard permits the use of separators as shown, if they are required to facilitate human understanding. However, separators are not required and consequently shall not be used to separate date and time for interchange among data processing systems.
3.2.4.4 Discussion—The numerical value for the month of the year is used, because this eliminates problems with the different month abbreviations used in different human languages.
3.2.5 dataset-origin—name of the organization, address, telephone number, electronic mail nodes, and names of individual contributors, including operator(s), and any other information as appropriate. This is where the dataset originated.
3.2.6 dataset-owner—name of the owner of a proprietary dataset. The person or organization named here is responsible for this field’s accuracy. Copyrighted data should be indicated here.
3.2.7 error-log—information that serves as a log for failures of any type, such as instrument control, data acquisition, data processing or others.
3.2.8 experiment-cross-references—an array of strings which reference other related experi...
3.2.10.2 Discussion—A required Raw Data Information parameter, the number of scans, is used to define the shape of the data in the file, that is, to differentiate between single and multiple spectrum files. Another parameter, the scan number, is used to determine whether multiple scan files have an order or relatedness between scans.
3.2.10.3 Discussion—Some instruments are capable of mixed mode data acquisition, for example, alternating positive/negative EI (Electron Ionisation) or CI (Chemical Ionisation) scans. In order to keep this interchange standard as simple as possible, each scan mode must be treated as a separate data set regardless of how the data are actually stored in the source data file. Alternating positive/negative EI data, for example, will generate two interchange files (possibly simultaneously, depending on the implementation); one for the positive EI scans and one for the negative EI scans. These files may be made mutually cross-referential using their “external-file-references” fields.
3.2.11 external-file-references—an array of strings listing file names referred to from within the raw data file. These could include, for example, tune parameter, method, calibration, reference, sequence, or other files. NetCDF files produced in parallel (such as paired files containing alternating EI/CI scans) should be cross-referenced here.
3.2.12 injection-date-time-stamp—indicates the absolute time of sample injection relative to Greenwich Mean Time. Expressed as the synthetic datetime given in the form: YYYYMMDDhhmmss±ffff. See dataset-date-time-stamp for details of the ISO standard definition of a date-time-stamp.
3.2.13 languages—optional list of natural (human) languages and programming languages delineated for processing by language tools.
3.2.13.1 ISO-639-language—indicates a language symbol and country code from Annex B and D of ISO 639:1988.
3.2.13.2 other-language—indicates the languages and dialect using a user-readable name; applies only for those languages and dialects not covered by ISO 639:1988 (such as
...
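The dataset-completeness string (3.2.3) and the Required column of Table 1 (Note 2) together determine which data elements a dataset must carry. The following is a minimal sketch in Python of that reading, under stated assumptions: the function names and the TABLE_1_EXCERPT dictionary are hypothetical illustrations, the dictionary copies only a few rows of Table 1, and a code such as M1234 is taken to mean "required if the dataset includes any of Categories 1 to 4", as Note 2 describes. It is not the implementation described in Guide E2078.

```python
import re

VALID_CATEGORIES = ("C1", "C2", "C3", "C4", "C5")

# A few rows of Table 1 (data element name -> Required column), for illustration.
TABLE_1_EXCERPT = {
    "dataset-completeness": "M12345",
    "dataset-origin": "M4",
    "dataset-owner": ". . .",
    "dataset-date-time-stamp": "M1234",
    "operator-name": "M4",
    "source-file-reference": "M4",
}


def parse_completeness(value):
    """Turn a string such as "C1+C2" into the set of category numbers {1, 2}."""
    parts = value.strip().split("+")
    for part in parts:
        if part not in VALID_CATEGORIES:
            raise ValueError(f"unexpected category {part!r} in {value!r}")
    return {int(part[1]) for part in parts}


def is_required(required_code, categories):
    """Interpret a Required-column code such as "M1234", "M4" or ". . .".

    An element is treated as required when the dataset includes at least
    one of the categories named in the code; ". . ." or an empty cell is
    treated as "not required".
    """
    code = required_code.replace(" ", "")
    if code in ("", "..."):
        return False
    if re.fullmatch(r"M[1-5]+", code) is None:
        raise ValueError(f"unrecognised Required code {required_code!r}")
    return any(int(digit) in categories for digit in code[1:])


def required_elements(completeness):
    """List the excerpted data elements required for a completeness string."""
    categories = parse_completeness(completeness)
    return sorted(
        name
        for name, code in TABLE_1_EXCERPT.items()
        if is_required(code, categories)
    )
```

For a dataset declaring "C1+C2", this sketch reports dataset-completeness and dataset-date-time-stamp from the excerpt; the M4 elements are added only once C4 appears in the string.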
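The date-time-stamp form YYYYMMDDhhmmss±ffff used by dataset-date-time-stamp (3.2.4) and injection-date-time-stamp (3.2.12) can be read with standard library tools. The following is a minimal sketch in Python, assuming the stamp is written without separators and with an ASCII plus or hyphen-minus before the four-digit time differential factor, as in the example value 19910801123023-0500. The function names parse_stamp and format_stamp are hypothetical and are not defined by this specification.

```python
import re
from datetime import datetime, timedelta, timezone

# 14 digits of local time, a sign, then hours and minutes of the differential.
_STAMP = re.compile(r"(\d{14})([+-])(\d{2})(\d{2})")


def parse_stamp(text):
    """Parse "YYYYMMDDhhmmss±ffff" into a timezone-aware datetime."""
    match = _STAMP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a YYYYMMDDhhmmss±ffff stamp: {text!r}")
    local = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    sign = 1 if match.group(2) == "+" else -1
    offset = sign * timedelta(
        hours=int(match.group(3)), minutes=int(match.group(4))
    )
    # The digits give local time; the differential says how far it is from UTC.
    return local.replace(tzinfo=timezone(offset))


def format_stamp(moment):
    """Format a timezone-aware datetime back into the interchange form."""
    if moment.tzinfo is None:
        raise ValueError("a time differential factor needs an aware datetime")
    return moment.strftime("%Y%m%d%H%M%S%z")
```

Applied to the example in 3.2.4.3, parse_stamp("19910801123023-0500") gives a local time of 12:30:23 with a differential of minus five hours, which corresponds to 17:30:23 UTC on the same day.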