Genomics informatics - Description rules for genomic data for genetic detection products and services

This document specifies requirements on the category definition and quality assessment of genomic data, including the content structure, attribute and description rules of data format, and the compilation rules of data format. This document applies to all the genomic data used for human genetic detection products and services. This document applies to genomic data processing and analysis, and to the quality evaluation/assessment of genomic data.

Informatique génomique — Règles de description des données génomiques pour les produits et services de détection génétique

General Information

Status: Published
Publication Date: 08-May-2023
Current Stage: 6060 - International Standard published
Start Date: 09-May-2023
Due Date: 25-Oct-2024
Completion Date: 09-May-2023
Ref Project

Overview

ISO/TS 8392:2023 - "Genomics informatics - Description rules for genomic data for genetic detection products and services" is a technical specification that defines how genomic data used in human genetic detection products and services should be described, formatted and quality-assessed. The standard specifies requirements for category definition, content structure, attribute and description rules for data formats, and compilation rules for data archiving and metadata. It applies to genomic data processing, analysis and quality evaluation/assessment.

Key topics and technical requirements

  • Data classification: Differentiates between unstructured and structured genomic data and prescribes distinct description methods for each (data format & archive catalogue for unstructured data; metadata and data element codes for structured data).
  • Identifier rules: Requires alphanumeric data identifiers (DI) and version identifiers (VI), typically using a two-level identifier structure to uniquely identify data sets and versions.
  • Data format attributes: Defines a set of attributes for data formats (e.g., identifier, name, edition, registration authority, application scope, representation of allowed values) and recommends a logical, unique naming convention for data elements.
  • Metadata structure: Specifies metadata element classes and attributes to improve semantic clarity and enable consistent downstream processing.
  • Code requirements: Provides rules on code structure, length and format:
    • Prefer concise, user-friendly structures that harmonize with existing standards.
    • Recommend equal-length codes over variable-length; use numeric, alphabetic or alphanumeric characters.
    • Avoid ambiguous characters (e.g., “1” vs “I”); maintain consistent case, font and format.
    • Guidance for layered/hierarchical codes and code lists, with examples in annexes.
  • Compatibility and data cleaning: Advises processes for mapping and cleaning data when exchanging between differing description rules, including filling missing structured data, resolving logical inconsistencies and semi-automated cleaning.
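
The code requirements listed above lend themselves to a mechanical check. What follows is a minimal Python sketch and not part of ISO/TS 8392:2023: the function name `check_code_list`, the set of ambiguous letters and the upper-case convention are all assumptions chosen for illustration.

```python
# Letters that are easily confused with the digits 1 and 0.
# This set is an illustrative assumption, not a list taken from the standard.
AMBIGUOUS_LETTERS = set("IlO")


def check_code_list(codes):
    """Return a list of human-readable problems found in a list of codes."""
    problems = []
    # Equal-length codes are recommended over variable-length codes.
    if len({len(code) for code in codes}) > 1:
        problems.append("codes are not of equal length")
    if len(set(codes)) != len(codes):
        problems.append("codes are not unique")
    for code in codes:
        # Numeric, alphabetic or alphanumeric characters only.
        if not (code.isascii() and code.isalnum()):
            problems.append(f"{code!r}: not alphanumeric")
        if any(ch in AMBIGUOUS_LETTERS for ch in code):
            problems.append(f"{code!r}: contains an ambiguous character")
        # Consistent case: this sketch assumes an upper-case convention.
        letters = [ch for ch in code if ch.isalpha()]
        if letters and not all(ch.isupper() for ch in letters):
            problems.append(f"{code!r}: not consistently upper case")
    return problems
```

An empty result means the list passed every check in this sketch; a real implementation would take its rules from the code list definition in use.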

Practical applications

  • Standardizing genomic data representation for genetic testing products, enabling consistent storage, retrieval and exchange.
  • Improving data quality and integrity for clinical genomics, bioinformatics pipelines and research databases.
  • Facilitating interoperability among laboratories, clinical decision support systems, regulatory submissions and third‑party services.
  • Supporting repeatable versioning and provenance for genomic datasets used in diagnostics and precision medicine.
  • Enabling reliable archiving and audit trails for genomic data in labs and service providers.

Who should use this standard

  • Clinical and research laboratories producing genomic test results
  • Bioinformatics teams designing data models and pipelines
  • Developers of genomic data management systems and clinical informatics platforms
  • Data stewards, registrars and quality managers in genomics services
  • Regulatory and compliance personnel evaluating genomic data practices

Related standards and compatibility

ISO/TS 8392:2023 was prepared by Technical Committee ISO/TC 215 (Health informatics), Subcommittee SC 1 (Genomics informatics), and emphasizes harmonization with existing description rules. It supports compatibility across data exchange scenarios by prescribing metadata, identifier and code conventions and by outlining data-cleaning practices for interoperability.

Keywords: genomics informatics, genomic data description, metadata standards, data identifier, data quality, genetic detection products, ISO/TS 8392:2023, code structure, data format.

Technical specification
ISO/TS 8392:2023 - Genomics informatics - Description rules for genomic data for genetic detection products and services
Released: 9.05.2023
English language
17 pages

Standards Content (Sample)


TECHNICAL SPECIFICATION ISO/TS 8392
First edition 2023-05
Genomics informatics - Description rules for genomic data for genetic detection products and services
Informatique génomique - Règles de description des données génomiques pour les produits et services de détection génétique
Reference number
© ISO 2023
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Data format attribute and description rules
5 Composition and rules of genomic data description
5.1 Identifier
5.2 Data format
5.3 Data archiving catalogue
5.4 Metadata
6 Core elements and rules for the description of genomic data
6.1 Identifier
6.2 Name
7 Requirement of code
7.1 Code structure
7.2 Code length
7.3 Code type and format
7.4 Code list naming
8 Compatibility with other rules
Annex A (informative)
Annex B (informative) Examples of Identifier: Two-level structure DI_V1
Annex C (informative) Examples of Metadata Code
Annex D (informative) Code length calculation
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 215, Health informatics, Subcommittee
SC 1, Genomics informatics.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The decreasing cost of sequencing and the increasingly in-depth study of genomics have led to the
generation of more and more genomic data, but the quality of that data is not optimal. At the data
level, there is a lack of data integrity, and medical information suffers from semantic
inconsistency. These problems have caused great obstacles to downstream applications.
Standardization of data is a prerequisite for data asset management, data storage and data
applications; it can improve the storage of genomic data and broaden the use of genomic data in
precision medicine.
This document is based on the actual situation of industry data production, combined with the needs of
upstream and downstream industry users. It also takes into account the use made by stakeholders and
user friendliness for all common types of genomic data. Solving the problem of data scope and semantic
unification can enhance the data association ability, ensure information exchange, improve data flow,
improve the data quality from the aspects of data integrity and data validity, and lay a good foundation
for subsequent data storage, data application and data sharing.
TECHNICAL SPECIFICATION ISO/TS 8392:2023(E)
Genomics informatics — Description rules for genomic
data for genetic detection products and services
1 Scope
This document specifies requirements on the category definition and quality assessment of genomic
data, including the content structure, attribute and description rules of data format, and the compilation
rules of data format.
This document applies to all the genomic data used for human genetic detection products and services.
This document applies to genomic data processing and analysis, and to the quality evaluation/
assessment of genomic data.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
alignment-sequence code
continuous coding of objects in the same series, and reserving of extended space
3.2
code
representation of a piece of information such as a letter, word or phrase in another form, usually briefer
3.3
code structure
representation of the composition and length of a complete code
3.4
equal length code
coding system in which all coding objects have the same length
3.5
data identifier
DI
identifier that uniquely distinguishes one set of data from all others
3.6
layer code
hierarchical code consisting of membership order of coded objects
3.7
sequential code
code that represents objects in the natural order of Arabic numerals or letters
3.8
variable-length code
coding system in which the codes do not all have the same length
3.9
version identifier
VI
unique number assigned to identify a version of submitted genomic data
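
The terms equal length code (3.4), sequential code (3.7) and alignment-sequence code (3.1) can be illustrated together. The Python sketch below is an illustration only and does not reproduce the calculation in Annex D, which is not shown here: the function names, the zero-padded decimal form and the `reserve` parameter are assumptions.

```python
def min_code_length(n_objects, alphabet_size):
    """Smallest length L such that alphabet_size ** L >= n_objects."""
    if n_objects < 1 or alphabet_size < 2:
        raise ValueError("need at least one object and two symbols")
    length, capacity = 1, alphabet_size
    while capacity < n_objects:
        capacity *= alphabet_size
        length += 1
    return length


def sequential_codes(n_objects, reserve=0):
    """Equal-length, zero-padded decimal codes 1..n_objects.

    `reserve` widens the code so that this many further objects can be
    added later without changing the code length (hypothetical parameter).
    """
    # Codes run from 1 to n_objects + reserve, so the width must be able
    # to represent n_objects + reserve + 1 distinct values (0 included).
    width = min_code_length(n_objects + reserve + 1, 10)
    return [f"{i:0{width}d}" for i in range(1, n_objects + 1)]
```

For example, `sequential_codes(9)` yields one-digit codes, while `sequential_codes(9, reserve=1)` yields two-digit codes from "01" to "09", leaving room for extension.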
4 Data format attribute and description rules
Genomic data can be classified as unstructured data and structured data.
The unstructured data should be described by data format, illustration of data format and archiving
catalogue.
The structured data should be described by metadata and data element code.
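
The two description methods in Clause 4 can be pictured as two record types. This is a minimal Python sketch under stated assumptions: the class names, field names and the `missing_parts` helper are hypothetical and do not come from the standard, and the actual attributes are those listed in Annex A.

```python
from dataclasses import dataclass, field


@dataclass
class UnstructuredDescription:
    """Unstructured data: format, illustration of format, archiving catalogue."""
    data_format: str
    format_illustration: str
    archiving_catalogue: list = field(default_factory=list)


@dataclass
class StructuredDescription:
    """Structured data: metadata and data element codes."""
    metadata: dict = field(default_factory=dict)
    data_element_codes: dict = field(default_factory=dict)


def missing_parts(description):
    """Names of the descriptive parts that are still empty."""
    return [name for name, value in vars(description).items() if not value]
```

A call such as `missing_parts(UnstructuredDescription("FASTQ", "", []))` reports which parts of a description still need to be supplied.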
5 Composition and rules of genomic data description
5.1 Identifier
The description of genomic data should include data format, data attribute and metadata.
5.2 Data format
Data attribute elements for the description of data format are classified into 11 attributes in five
categories, as shown in Table A.1. According to their universality, they comprise data element common
attributes and data element specific attributes.
5.3 Data archiving catalogue
Data elements for the data archiving catalogue are classified into 11 attributes in five categories,
as shown in Table A.2. According to their universality, they comprise data element common attributes
and data element specific attributes.
5.4 Metadata
Data elements for metadata description are classified into 14 attributes in five categories, as shown
in Table A.3. According to their universality, they comprise data element common attributes and data
element specific attributes.
6 Core elements and rules for the description of genomic data
6.1 Identifier
Identifier shall use alphanumeric code. The structure may be considered a two-level structure,
including DI and VI.
The structure of a data identifier is shown in Figure 1. Data identifier examples are shown in Annex B,
such as sequence information (see Table B.1) and bioinformatic analysis (see Table B.2).
Figure 1 — Structure of data identifier
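
Since Figure 1 is not reproduced here, the two-level structure can only be sketched under assumptions. The Python example below assumes a hypothetical textual layout `<DI>_V<number>`, loosely suggested by the "DI_V1" wording in the title of Annex B; the separator, the pattern and the names `DataIdentifier` and `parse_identifier` are illustrative and not taken from the standard.

```python
import re
from typing import NamedTuple

# Hypothetical layout: an alphanumeric DI, then "_V", then a numeric VI.
_IDENTIFIER = re.compile(r"(?P<di>[A-Za-z0-9]+)_V(?P<vi>[0-9]+)")


class DataIdentifier(NamedTuple):
    di: str  # data identifier: distinguishes one set of data from all others
    vi: int  # version identifier: identifies a version of the submitted data

    def __str__(self):
        return f"{self.di}_V{self.vi}"


def parse_identifier(text):
    """Split an identifier string into its DI and VI levels."""
    match = _IDENTIFIER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a two-level identifier: {text!r}")
    return DataIdentifier(match["di"], int(match["vi"]))
```

Keeping the two levels as separate fields lets a new version of the same data set be issued by changing only `vi`, while `di` stays stable.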
6.2 Name
6.2.1 The data format name shall be unique and in the form of strings with letters and numbers. The
naming of data elements should use a certain logical structure and general terminology.
6.2.2 A complete data element name shall consist of object class term, property term, representation
term and (qualifier term).
— A data element has one and only one object class term. If there is one object in an omics data element
catalogue, it may be omitted as appropriate;
— A data element has one and only one property term. Property term is an essential component of any
data element name. Other terms may be abbreviated as appropriate when the expression of the data
element concept is complete, accurate, and unambiguous;
— A data element has a unique representation term. Redundant words can be removed from the name
when there are duplicates or partial repetitions of the representation term and the property term;
— Qualifier term is optional and is given from particular professional fields.
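
The naming rules in 6.2.2 can be sketched as a small helper. This Python example is an illustration under assumptions: the function name, the space separator, the term order (object class, property, representation, qualifier, following the order in which 6.2.2 lists them) and the sample terms are all hypothetical.

```python
def data_element_name(property_term, representation_term,
                      object_class=None, qualifier=None):
    """Assemble a data element name from its component terms."""
    # The property term is the one component that is always required.
    if not property_term:
        raise ValueError("the property term is essential and cannot be empty")
    # Drop words of the representation term that merely repeat words
    # already present in the property term.
    seen = {word.lower() for word in property_term.split()}
    representation = " ".join(
        word for word in representation_term.split()
        if word.lower() not in seen
    )
    # Object class and qualifier are optional in this sketch.
    parts = [object_class, property_term, representation, qualifier]
    return " ".join(part for part in parts if part)
```

For instance, a property term "sample identifier" combined with the representation term "identifier" yields "sample identifier" rather than repeating the word.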
7 Requirement of code
7.1 Code structure
7.1.1 The structure design shall meet the following requirements:
— The structure of the code shall be concise and avoid carrying too much information;
— The structure shall accord with the basic method of information processing and harmonize with
relevant standard structures;
— When adding, deleting or modifying one part of the code, the structure shall be unbroken;
— The code shall use user-friendly symbols.
7.1.2 The description for the data element code structure shall conform with the following
requirements:
— The type of code, the structure of the code and the coding method shall be clearly described;
— When the c
...


Frequently Asked Questions

ISO/TS 8392:2023 is a technical specification published by the International Organization for Standardization (ISO). Its full title is "Genomics informatics - Description rules for genomic data for genetic detection products and services". It specifies requirements on the category definition and quality assessment of genomic data, including the content structure, attribute and description rules of data format, and the compilation rules of data format. It applies to all genomic data used for human genetic detection products and services, to genomic data processing and analysis, and to the quality evaluation/assessment of genomic data.


ISO/TS 8392:2023 is classified under the following ICS (International Classification for Standards) categories: 35.240.80 - IT applications in health care technology. The ICS classification helps identify the subject area and facilitates finding related standards.


The article discusses the ISO/TS 8392:2023 standard which provides guidelines for the description and quality assessment of genomic data used in genetic detection products and services. It sets requirements for the structure, attributes, and formatting rules of genomic data. The standard applies to all genomic data used in human genetic detection, including data processing, analysis, and quality evaluation.
