ISO/TR 3985:2021
Biotechnology — Data publication — Preliminary considerations and concepts
This document reviews best practices that: a) respect the existing standardization efforts of life sciences research communities; b) normalize key aspects of data description particularly at the level of the biology being studied (and shared) across the life sciences communities; c) ensure that data are “findable” and useable by other researchers; and d) provide guidance and metrics for assessing the applicability of a particular data sharing plan. This document is applicable to domains in life sciences including biotechnology, genomics (including massively parallel nucleotide sequencing, metagenomics, epigenomics and functional genomics), transcriptomics, translatomics, proteomics, metabolomics, lipidomics, glycomics, enzymology, immunochemistry, life science imaging, synthetic biology, systems biology, systems medicine and related fields.
Biotechnologie — Publication de données — Considérations et concepts préliminaires
Standards Content (Sample)
TECHNICAL REPORT
ISO/TR 3985
First edition
2021-05
Biotechnology — Data publication — Preliminary considerations and concepts
Biotechnologie — Publication de données — Considérations et concepts préliminaires
Reference number
© ISO 2021
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Principles
5.1 General
5.2 Current technologies, approaches and their flaws
5.3 Standards and best practices to facilitate data sharing and reuse
5.3.1 Maximizing value to the payer
5.3.2 Data findability
5.3.3 Data machine and human interpretability
5.3.4 Using accepted controlled vocabularies and naming conventions
5.3.5 Biological annotation technology domain independence
5.3.6 Data locatability using multiple queries
5.4 Additional desirable attributes
5.4.1 Data linkage to a published and openly accessible document describing the experimental system
5.4.2 Data format linkage to a published and openly accessible document describing the format
5.4.3 Existing information technology
5.4.4 Development of tools and best practices for creating web friendly and search engine crawlable data documents
5.5 Essential considerations
5.5.1 Common annotation across multiple data sources
5.5.2 Keyword template
5.5.3 Embedding ontological descriptions
5.5.4 Pseudo-documents
6 Major challenges
6.1 General
6.2 Domain
6.3 Regionalization
6.4 Proprietary data
6.5 Large number of existing bio-ontologies, controlled vocabularies and terminologies
6.6 Large number of existing data repositories and corresponding domain specific data formats
6.7 Large number of funding agencies (e.g. national, educational, philanthropic, commercial)
7 Examples of existing national and regional standards or requirements for data sharing or publication
7.1 General
7.2 USA
7.3 Canada
7.4 European Union
7.5 Germany
7.6 China
7.7 United Kingdom
7.8 India
7.9 Japan
8 Existing legal requirements for data protection
8.1 USA
8.2 European Union
9 Timing of data publication
10 Costs of data publication
11 Archival data
12 Validation and verification of compliance
13 Affected stakeholder categories
Annex A (informative) Searchability of scientific content on the web
Annex B (informative) Example enhanced annotation of text documents
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see https://www.iso.org/directives-and-policies.html).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 276, Biotechnology.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The explosion of life sciences data (big data) has created a need to digitally locate data from diverse
biological assays, obtained in a wide range of laboratories, and from a wide range of experimental
protocols. To be able to extract value from big data, it is necessary that the data are “findable”, and
that the biology measured in the assay is described in a way that it can be located and interpreted.
Data producer’s use of a consistent method to describe the biology that t
...