ISO/IEC 23092-1:2025
Information technology — Genomic information representation — Part 1: Transport and storage of genomic information
This document specifies data formats for both transport and storage of genomic information, including the conversion process.
Technologie de l'information — Représentation des informations génomiques — Partie 1: Transport et stockage des informations génomiques
Standards Content (Sample)
International Standard ISO/IEC 23092-1
Third edition, 2025-01
© ISO/IEC 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Conventions
4.1 Operators and functions
4.1.1 Arithmetic operators
4.1.2 Logical operators
4.1.3 Relational operators
4.1.4 Bitwise operators
4.1.5 Assignment operators
4.1.6 String/Character functions and operator
4.1.7 Data structure function and operator
4.1.8 Mathematical functions
4.1.9 Array operation functions
4.2 Syntax and semantics
4.2.1 Method of specifying syntax in tabular form
4.2.2 Bit ordering
4.2.3 Specification of syntax functions
4.2.4 Processes
5 Structure of coded genomic data
5.1 Genomic sequencing data record
5.2 Genomic annotation data records
5.3 Data classes
5.4 Access units
5.5 Datasets
5.6 Annotation data tile
5.7 Annotation tables
5.8 Annotation access units
5.9 Selective access
6 Data format
6.1 Format structure
6.1.1 General
6.1.2 Box order
6.2 Syntax for representation
6.3 Output data unit
6.4 Data structures common to file format and transport format
6.4.1 File header
6.4.2 Dataset group
6.4.3 Dataset
6.4.4 Access unit
6.4.5 Block
6.4.6 Annotation Table
6.4.7 Attribute Group
6.4.8 Annotation access unit
6.4.9 AAU block
6.5 Data structures specific to file format
6.5.1 General
6.5.2 Indexing
6.5.3 Descriptor stream
6.5.4 Offset
6.6 Data structures specific to transport format
6.6.1 General
6.6.2 Data streams
6.6.3 Dataset mapping table list
6.6.4 Dataset mapping table
6.6.5 Packet
6.7 Reference procedures to convert transport format to file format
6.7.1 Procedure for genomic sequencing data
6.7.2 Procedure for genomic annotation data
7 String indexing technologies
7.1 Master string index
7.1.1 General
7.1.2 Syntax
7.1.3 Master String Index Header
7.1.4 String index
7.1.5 Compressed string index
...