ASTM E2078-00(2016)
Standard Guide for Analytical Data Interchange Protocol for Mass Spectrometric Data
SIGNIFICANCE AND USE
7.1 General Coding Guidelines—The NetCDF libraries are supplied to developers as source code. End users receive the libraries in compiled binary form as part of a vendor's application.
7.1.1 Developers setting out to write a program to convert their data files to the Mass Spectrometric Data Protocol should consider using the NetCDF utilities ncgen and ncdump. After developers create the NetCDF file they should use the ncdump program to generate the ASCII representation of the data file, and examine it to ensure the data are being correctly put into the file.
7.2 Make Files for NetCDF Libraries and Utilities—In general the compilation is straightforward. The make files were modified after they were received from the Unidata Corporation, because they did not compile the first time on PCs. The changes needed to get the Unidata distribution to run on DOS are (1) rename the file MAKEFILE to UNIX.MK, and (2) rename MSOFT.MK to MAKEFILE, and then run NMAKE. The default switches in the Unidata distribution use the switches for the floating point coprocessor and Microsoft Windows options.
7.2.1 The protocol kit contains some complete makefile examples for Microsoft C V6.0 running on DOS. The Microsoft C V6.0 compiler manual should be consulted for the exact meaning of the compiler and linker options.
7.2.2 The VMS and SunOS compilation instructions are in directories for those operating systems.
7.3 NetCDF Library Build Order—The NetCDF libraries must be built in a specific order. The correct order to build the NetCDF directories is:
UTIL
XDR
SRC
NCDUMP
NCGEN
NCTEST
7.3.1 The UTIL and XDR makefiles work as distributed using NMAKE with Microsoft C V6.0.
SCOPE
1.1 This guide covers the implementation of the Mass Spectrometric Data Protocol in analytical software applications. Implementation of this protocol requires:
1.1.1 Specification E2077, which contains the full set of data definitions. The mass spectrometric data protocol is not based upon any specific implementation; it is designed to be independent of any particular implementation so that implementations can change as technology evolves. The protocol is implemented in categories to speed its acceptance through actual use.
1.1.2 Specification E2077 contains a full description of the contents of the data communications protocol, including the analytical information categories with data elements and their attributes for most aspects of mass spectrometric tests.
1.2 The analytical information categories are a practical convenience for breaking down the standardization process into smaller, more manageable pieces. It is easier for developers to build consensus and produce working systems based on smaller information sets, without the burden and complexity of the hundreds of data elements contained in all the categories. The categories also assist vendors and end users in using the guide in their computing environments.
1.3 The network common data format (NetCDF) data interchange system is the container used to communicate data between applications in a way that is independent of both computer architectures and end-user applications. In essence, it is a special type of application designed for data interchange.
1.4 The common data language (CDL) template for mass spectrometry is a language specification of the mass spectrometry dataset being interchanged. With the use of the NetCDF utilities, this human-readable template can be used to generate an equivalent binary file and the software subroutine calls needed for input and output of data in analytical applications.
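As a rough illustration of the human-readable CDL template described in 1.4, the following C sketch prints a minimal template with the three sections a CDL file contains (dimensions, variables, data). The function name write_cdl_skeleton and every dataset, dimension, variable, and attribute name in it are hypothetical and are not taken from Specification E2077. CDL files conventionally begin with the lowercase keyword netcdf, which is what the sketch emits.

```c
#include <stdio.h>

/* Hypothetical helper: writes a minimal CDL skeleton to `out`.
 * All names written into the template are illustrative only.
 * Returns 0 on success, -1 on invalid arguments. */
int write_cdl_skeleton(FILE *out, const char *dataset_name, int n_points)
{
    int i;

    if (out == NULL || dataset_name == NULL || n_points < 1) {
        return -1;
    }

    /* Header: keyword followed by the dataset name. */
    fprintf(out, "netcdf %s {\n", dataset_name);

    /* Section indicators end with a colon; statements end with a semicolon. */
    fprintf(out, "dimensions:\n");
    fprintf(out, "\tn_points = %d ;\n", n_points);

    fprintf(out, "variables:\n");
    fprintf(out, "\tfloat example_values(n_points) ;\n");
    fprintf(out, "\t\texample_values:units = \"arbitrary\" ;\n");
    /* A global attribute has no variable name before the colon. */
    fprintf(out, "\t:example_revision = \"0.1\" ; // global attribute, kept as a string\n");

    fprintf(out, "data:\n");
    fprintf(out, "\texample_values = ");
    for (i = 0; i < n_points; i++) {
        fprintf(out, "%s%d.0", (i == 0) ? "" : ", ", i + 1);
    }
    fprintf(out, " ;\n}\n");
    return 0;
}
```

Calling write_cdl_skeleton(stdout, "demo", 3) prints a small template. Saving that output to a file and passing it through ncgen and then ncdump is one way to try the round trip suggested in 7.1.1.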
Standards Content (Sample)
This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.
Designation: E2078 − 00 (Reapproved 2016)
Standard Guide for Analytical Data Interchange Protocol for Mass Spectrometric Data
This standard is issued under the fixed designation E2078; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This guide covers the implementation of the Mass Spectrometric Data Protocol in analytical software applications. Implementation of this protocol requires:
1.1.1 Specification E2077, which contains the full set of data definitions. The mass spectrometric data protocol is not based upon any specific implementation; it is designed to be independent of any particular implementation so that implementations can change as technology evolves. The protocol is implemented in categories to speed its acceptance through actual use.
1.1.2 Specification E2077 contains a full description of the contents of the data communications protocol, including the analytical information categories with data elements and their attributes for most aspects of mass spectrometric tests.
1.2 The analytical information categories are a practical convenience for breaking down the standardization process into smaller, more manageable pieces. It is easier for developers to build consensus and produce working systems based on smaller information sets, without the burden and complexity of the hundreds of data elements contained in all the categories. The categories also assist vendors and end users in using the guide in their computing environments.
1.3 The network common data format (NetCDF) data interchange system is the container used to communicate data between applications in a way that is independent of both computer architectures and end-user applications. In essence, it is a special type of application designed for data interchange.
1.4 The common data language (CDL) template for mass spectrometry is a language specification of the mass spectrometry dataset being interchanged. With the use of the NetCDF utilities, this human-readable template can be used to generate an equivalent binary file and the software subroutine calls needed for input and output of data in analytical applications.
2. Referenced Documents
2.1 ASTM Standards:
E2077 Specification for Analytical Data Interchange Protocol for Mass Spectrometric Data
2.2 Other Standard:
NetCDF User’s Guide
2.3 ISO Standards:
8601:1988 Data elements and interchange formats, (First edition published 1988-06-15; with Technical Corrigendum 1 published 1991-05-01)
Footnotes:
1 This guide is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of Subcommittee E13.15 on Analytical Data. Current edition approved April 1, 2016. Published June 2016. Originally approved in 2000. Last previous edition approved in 2010 as E2078 – 00 (2010). DOI: 10.1520/E2078-00R16.
2 For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
3 Available from Russell K. Rew, Unidata Program Center, University Corporation for Atmospheric Research, P.O. Box 3000, Boulder, CO 80307-3000, http://www.unidata.ucar.edu/.
4 Available from International Organization for Standardization (ISO), ISO Central Secretariat, BIBC II, Chemin de Blandonnet 8, CP 401, 1214 Vernier, Geneva, Switzerland, http://www.iso.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3. List of Contents and Use
3.1 NetCDF Toolkit—The protocol is an application programming interface (API) layered on top of the public domain NetCDF toolkit. NetCDF is a set of tools that facilitate reading or writing platform-independent, self-describing data files. All data in a NetCDF file is written using the external data representation (XDR). XDR was developed by Sun Microsystems and is used for platform-independent file systems for all workstations and personal computers. Each NetCDF data element is self-describing: it has a name, type, and dimensionality. A NetCDF file contains three parts: a dimensions section, which defines the names and size of all dimensions used to describe variables; a variables section, which defines the names, data types, dimensionality, and attributes for all variables used in the file; and finally, a data section, which contains the actual values assigned to the variables. Attributes are numbers or strings which augment the description of variables or the file as a whole.
3.1.1 For example, a variable “x_axis_values” might contain an array of numbers representing the abscissa of a two-dimensional data set. It would have a dimension, possibly named “x_axis_size,” which would specify the number of
abscissa points. The variable might have some descriptive attributes, such as “units” (with a value of “Seconds,” perhaps), “scale_factor” (with a value of 1000.0, specifying that all stored abscissa values should be multiplied by 1000.0 to get the actual value), or “long_name” (with value “Time”, which might be used to label the abscissa when drawing a plot).
3.1.2 The NetCDF toolkit has been placed in the public domain by the Unidata Program Center, a non-profit software support organization for the University Corporation for Atmospheric Research. The Unidata Program Center is funded by the National Science Foundation, National Center for Atmospheric Research, and other organizations and provides ongoing development and support of NetCDF and related tools.
3.1.3 The NetCDF version currently supported in this implementation is 2.3.2.
3.2 Data Structures—Each of the analytical information class tables in the specification document has a corresponding data structure; however, not every field in each table has a corresponding data element in a structure, and the data structures may have elements that do not appear in any class table. Most of these differences are due to details of the implementation which could not be hidden.
3.2.1 The data structures provide the mapping between the attribute name and data type described in the specification and the field and actual data type in the file. The actual NetCDF dimension, variable, and attribute names are hidden from the API level. These names in fact are irrelevant for application programs; it is the data structure which provides the information interchange between the application and the file.
3.2.2 Each data structure and its mapping to an analytical information class are described in detail later in this guide.
3.2.3 Application Programming Interface Functions:
3.2.3.1 The application programming interface provides programmatic access to the contents of the files. Mass spectral data occurs in three forms: global information, which relates to the contents of the entire file, information which describes each part of a multi-component instrument, and information which changes on a scan-by-scan basis for spectra and library entries. API functions are provided for opening a file for reading or writing; closing a file; reading and writing global, per-component instrument, and per-scan spectral and library information; initializing and clearing data structure contents; and a few miscellaneous utility functions. Each of these functions is described in detail in a later section of this guide.
3.2.4 Enumerated Sets—Many of the attributes listed in the Analytical Data Interchange Protocol for Mass Spectrometric Data specification have an enumerated set of associated values. The attribute may take only one value from that restricted set. In the implementation, each such attribute is defined as a formal C type, and the allowed values are defined as an enumerated set of that formal type. Each enumerated value is associated with a unique string literal, and it is these string literals, not the enumeration values, which are written to or read from the file. This practice both enforces the use of the proper enumeration values and follows the NetCDF dictum that files be self-describing. If the enumeration values were written instead of the strings, then some lookup mechanism would be required external to the NetCDF file to translate the number into something meaningful.
4. Conventions
4.1 The format convention adopted in this guide is as follows:
(1) Normal text is presented in this font (Times New Roman).
(2) API symbols (functions, formal types, etc.) are presented in boldface Helvetica font.
(3) Parameters to API functions are presented in italic Helvetica font.
(4) Example code is presented in normal Helvetica font.
4.2 Other Conventions—All indices begin at zero (C convention). In several data structures, a scan_no or inst_no element must be loaded before reading or writing. This identifies the scan or instrument component number for which data will be read or written. In all cases, scan or instrument component numbers begin at zero.
4.2.1 All date/time stamps are formatted using the ISO standard 8601 format referenced in the specification. An API utility function is provided for conversion between date/time information in numeric form and ISO-8601 string format (see ms_convert_date(), below).
5. Mass Spectrometric Data Protocol Distribution Kit
5.1 It is intended that potential users of this implementation can obtain a complete NetCDF and API distribution kit from various instrument vendors’ websites. Information on how to obtain the kit will be posted on the ASTM website (www.astm.org) under Committee E01.25.
5.2 The Analytical Data Interchange Protocol for Mass Spectrometric Data distribution kit contains:
5.2.1 Software—NetCDF distribution kit from Unidata (with the modified makefile needed to make the kit compile out of the box).
5.2.2 NetCDF User’s Guide—supplied by Unidata Program Center.
5.2.3 Specification E2077.
5.2.4 Guide E2078.
6. Hardware and Software
6.1 This section describes the hardware and software configurations used for testing. In general, the NetCDF system puts very few requirements on the hardware because most routines are left on disk. Only routines being used at any particular time are kept in memory. Any limitations found were typically those not imposed by NetCDF but ones imposed by the operating system or environment.
6.1.1 Hardware (Personal Computers)—The personal computer system hardware originally used for testing was:
6.1.1.1 Intel 80286 processor,
6.1.1.2 640K minimum,
6.1.1.3 Monochrome, EGA, VGA graphics,
6.1.1.4 20 megabyte minimum, 80 megabyte hard-disk is typical, and
6.1.1.5 A mouse (optional).
6.1.1.6 NetCDF works well on AT-class machines and higher. NetCDF does not have the items in 6.1.1.1 – 6.1.1.5 as requirements. These are just the minimum, base-level systems that were used.
6.1.2 Software—NetCDF runs on MS-DOS, OS/2, Macintosh, Windows 95, and Windows NT operating systems for personal computers. NetCDF was originally ported from UNIX to DOS running on an IBM-PS/2 Model 80. It was recently ported to the Macintosh OS. NetCDF is written in the C programming language, and there are FORTRAN jackets available for applications that want to use FORTRAN calls. The personal computer software originally employed for testing and developing NetCDF applications was:
6.1.2.1 Microsoft DOS V3.3 or above,
6.1.2.2 Microsoft C Compiler V6.0,
6.1.2.3 Microsoft Windows V3.0,
6.1.2.4 Microsoft Windows SDK, and
6.1.2.5 NetCDF Version 2.0.1.
6.1.3 Workstations and Servers—NetCDF runs easily on UNIX workstations such as Sun 3, Sun 4, VAXstations, DECstation 3100, VAXstation II running ULTRIX or VMS, and IBM RS/6000. There are no particular hardware requirements for workstation class machines, since all workstations have the minimum hardware outlined for personal computers in 6.1.1.
7. Significance and Use
7.1 General Coding Guidelines—The NetCDF libraries are supplied to developers as source code. End users receive the libraries in compiled binary form as part of a vendor’s application.
7.1.1 Developers setting out to write a program to convert their data files to the Mass Spectrometric Data Protocol should consider using the NetCDF utilities ncgen and ncdump. After developers create the NetCDF file they should use the ncdump program to generate the ASCII representation of the data file, and examine it to ensure the data are being correctly put into the file.
7.2 Make Files for NetCDF Libraries and Utilities—In general the compilation is straightforward. The make files were modified after they were received from the Unidata Corporation, because they did not compile the first time on PCs. The changes needed to get the Unidata distribution to run on DOS are (1) rename the file MAKEFILE to UNIX.MK, and (2) rename MSOFT.MK to MAKEFILE, and then run NMAKE. The default switches in the Unidata distribution use the switches for the floating point coprocessor and Microsoft Windows options.
7.2.1 The protocol kit contains some complete makefile examples for Microsoft C V6.0 running on DOS. The Microsoft C V6.0 compiler manual should be consulted for the exact meaning of the compiler and linker options.
7.2.2 The VMS and SunOS compilation instructions are in directories for those operating systems.
7.3 NetCDF Library Build Order—The NetCDF libraries must be built in a specific order. The correct order to build the NetCDF directories is:
UTIL
XDR
SRC
NCDUMP
NCGEN
NCTEST
7.3.1 The UTIL and XDR makefiles work as distributed using NMAKE with Microsoft C V6.0.
8. CDL Template Structure
8.1 A NetCDF template is built from CDL statements and is structured into three sections: (1) dimension declarations, (2) variable declarations, and (3) the data section.
8.2 A few points of clarification about the CDL language are given here to facilitate its understanding. For more in-depth information on CDL, please consult the NetCDF User’s Guide.
8.2.1 A NetCDF template starts with the word “NetCDF” followed by the dataset name.
8.2.2 CDL comments are indicated by two forward slash characters (//).
8.2.3 Section indicators (dimensions:, variables:, and data:) end with a colon character (:). These are the only tokens that end with a colon character.
8.2.4 Statements within sections end with the semicolon character (;).
8.2.5 Variable names beginning with numbers must be preceded by an underline character (_). Otherwise the ncgen parser will flag an error.
8.2.5.1 Underline characters were chosen for this protocol over hyphen characters, because some compilers may interpret hyphens as subtraction operators. The feature of CDL that allows implicit numerical datatyping of attributes is not being used in the first version of the template. Instead, all floating point attributes are being handled as strings. This forces programmers to explicitly type variables, thereby encouraging more deliberate programming styles. For example:
:aia_template_revision = "0.8"; //M12345
:netcdf_revision = "2.0.1"; //M12345
Consult the NetCDF User’s Guide for more complete information on CDL syntax and usage.
8.2.6 Underline characters only can be used as separators between words within variable names, like:
...
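Section 3.2.4 describes enumerated attributes that are stored in the file as string literals rather than as numeric codes. A minimal C sketch of that mapping follows. The type example_polarity_t, its values, the strings, and both function names are invented for illustration and do not reproduce the definitions in Specification E2077 or the distribution kit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enumerated attribute; names are illustrative only. */
typedef enum {
    POLARITY_UNKNOWN = 0,
    POLARITY_POSITIVE,
    POLARITY_NEGATIVE,
    POLARITY_COUNT            /* number of valid values, not a value itself */
} example_polarity_t;

/* One unique string literal per enumerated value, in enum order. */
static const char *const polarity_strings[POLARITY_COUNT] = {
    "Unknown",
    "Positive",
    "Negative"
};

/* Returns the string that would be written to the file, or NULL if the
 * value is outside the enumerated set. */
const char *polarity_to_string(example_polarity_t value)
{
    if ((int)value < 0 || (int)value >= (int)POLARITY_COUNT) {
        return NULL;
    }
    return polarity_strings[value];
}

/* Parses a string read from the file. Returns 0 and stores the value in
 * *out on success, or -1 if the string is not in the enumerated set. */
int polarity_from_string(const char *text, example_polarity_t *out)
{
    int i;

    if (text == NULL || out == NULL) {
        return -1;
    }
    for (i = 0; i < (int)POLARITY_COUNT; i++) {
        if (strcmp(text, polarity_strings[i]) == 0) {
            *out = (example_polarity_t)i;
            return 0;
        }
    }
    return -1;
}
```

Because only the strings reach the file, a reader that does not share the C header can still interpret the attribute, which is the self-describing property the guide refers to.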
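Section 4.2.1 mentions conversion between numeric date/time values and ISO 8601 strings. The sketch below shows one way such a conversion could look in standard C. The helper format_iso8601_utc is hypothetical and is not the kit's ms_convert_date(), whose signature is not given in this excerpt. The YYYYMMDDhhmmss+0000 layout is an assumed ISO 8601 basic form, not one confirmed by the text above.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: formats `t` as a UTC time stamp in the assumed
 * layout YYYYMMDDhhmmss+0000 (19 characters plus the terminating NUL).
 * Returns 0 on success, -1 if the buffer is missing or too small or the
 * time cannot be converted. */
int format_iso8601_utc(time_t t, char *buf, size_t size)
{
    struct tm *utc;

    if (buf == NULL || size == 0) {
        return -1;
    }
    utc = gmtime(&t);          /* broken-down time in UTC */
    if (utc == NULL) {
        return -1;
    }
    /* strftime returns 0 when the result does not fit in the buffer. */
    if (strftime(buf, size, "%Y%m%d%H%M%S+0000", utc) == 0) {
        return -1;
    }
    return 0;
}
```

A caller would pass a buffer of at least 20 bytes. Producing a local-time offset instead of +0000 would need extra handling that is left out of this sketch.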