Document management -- Digital file format recommendations for long-term storage

This document gives guidelines for selecting the most appropriate file format(s) for the storage, usability, and exchange of data with a long-term management objective. It is applicable to the selection of file formats to be used to store electronic documents. It provides guidance that takes into account: — the durability of documents in a readable form; — fidelity to the original and data integrity; — interoperability, i.e. independence from creation applications, information systems and rendition platforms; — compliance with relevant laws and regulations; — compliance with format specifications; — reducing costs by reducing the number of conversions/migrations over time. This document is applicable to all office activities (e.g. text processing, spreadsheets, presentations), email and static web pages, as well as all types of electronic components, including images, video and sound. It does not apply to database formats.

Gestion électronique -- Recommandations de format de fichier numérique pour le stockage à long terme

General Information

Status: Published
Publication Date: 04-Nov-2018
Current Stage: 6060 - International Standard published
Start Date: 28-Sep-2018
Completion Date: 05-Nov-2018

Technical report
ISO/TR 22299:2018 - Document management -- Digital file format recommendations for long-term storage
English language
12 pages

Standards Content (sample)

TECHNICAL REPORT ISO/TR 22299
First edition
2018-11
Document management -- Digital file format recommendations for long-term storage
Gestion électronique -- Recommandations de format de fichier numérique pour le stockage à long terme
Reference number: ISO/TR 22299:2018(E)
ISO 2018
COPYRIGHT PROTECTED DOCUMENT
© ISO 2018

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Basic selection criteria according to the content type description
4.1 General
4.2 Selection methodology
4.2.1 File format description
4.2.2 Long-term availability of readers or players
4.2.3 File presentation stability
4.2.4 Software and/or operating system migrations
4.2.5 File format selection
5 File formats
5.1 General
5.2 Coded text
5.3 Vector graphics
5.3.1 2D graphics
5.3.2 3D graphics
5.3.3 Technical drawings
5.4 Images
5.5 Sound
5.5.1 Linear formats for sound files
5.5.2 Lossless compression formats
5.5.3 Lossy compression formats
5.5.4 Container formats
5.6 Video
5.6.1 General
5.6.2 Coding
5.6.3 Digitalization
5.6.4 Compression
5.6.5 Video container formats
5.7 Office automation
5.8 Formats suitable for preservation
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.

This document was prepared by Technical Committee ISO/TC 171, Document management applications, Subcommittee SC 2, Document file formats, EDMS systems and authenticity of information.

Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.
Introduction

The document management industry is heavily reliant on standardized file formats for both long-term storage and interoperability purposes.

Effective document management often requires the selection of an appropriate storage file format and, where needed, conversion between the native digital document format and the selected storage file format.

This document provides information and guidelines on file formats to assist in their selection.
1 Scope

This document gives guidelines for selecting the most appropriate file format(s) for the storage, usability, and exchange of data with a long-term management objective.

It is applicable to the selection of file formats to be used to store electronic documents. It provides guidance that takes into account:
— the durability of documents in a readable form;
— fidelity to the original and data integrity;
— interoperability, i.e. independence from creation applications, information systems and rendition platforms;
— compliance with relevant laws and regulations;
— compliance with format specifications;
— reducing costs by reducing the number of conversions/migrations over time.

This document is applicable to all office activities (e.g. text processing, spreadsheets, presentations), email and static web pages, as well as all types of electronic components, including images, video and sound.

It does not apply to database formats.
2 Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 12651-1, Electronic document management — Vocabulary — Part 1: Electronic document imaging

3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 12651-1 apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
4 Basic selection criteria according to the content type description
4.1 General
The following criteria can be considered when selecting a file format:

— the file format functionality, i.e. the type of content it is able to support (e.g. text only, enhanced text with images or style sheets, images, video, sound);
— whether the file format specification is available as an open standard;
— whether the file format can be used in the intended application;
— the metadata that can be incorporated into the file;
— the likelihood that a reader or a player will still be available on a long-term basis;
— whether the file format has widespread support by the industry and vendors.
4.2 Selection methodology
4.2.1 File format description

The unconstrained availability of a file format specification is essential for the development of software products, now and in the future, that are capable of correctly representing the content of files of this type.

End users should seek assurance of the openness (free availability) of a file format specification before using this format for long-term storage. If file format specifications are not freely available, the file format is not recommended for long-term retention and should be used only after a comprehensive risk analysis.

The file format should be available as an open standard, which has been developed and is maintained by an authoritative, neutral standardization body with no copyright restrictions or fees for use.

Electronic content can be stored in a document management environment so that software and/or users can use the content. There are standardized and non-standardized file formats that can be considered. Non-standardized formats should only be used with caution and only if the file format is fully documented. Examples of standardized file formats include JPEG and PDF (and the PDF subsets). Non-standardized but widespread formats include TIFF, which is a proprietary format. The choice between standardized and non-standardized formats should be considered by the end-user organization and is dependent on other aspects of the document management system. For example, a document may be received in PDF format, but then its pages may be extracted into TIFF or JPEG for further processing, such as data extraction.

4.2.2 Long-term availability of readers or players

From a long-term storage/archival perspective, the organization should always take into account the potential need to migrate and/or convert existing formats. As technology continues to mature and expand, file formats are updated as required; the PDF subsets that are now available are one example. As a result, formats that are in use today may need to be updated to ensure that the information they contain remains usable in the future.

An organization may need to maintain the originals of documents that contain essential information for authenticity and integrity, such as digital signatures, seals or timestamps, recognizing that migration/conversion to another format could invalidate those elements. The organization should recognize the existence of different use cases for file formats and take this into account when selecting long-term file formats.
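
The reason a conversion can invalidate a signature or seal can be illustrated with a short Python sketch. It assumes that SHA-256 stands in for the digest such a mechanism would cover; the two byte strings are hypothetical stand-ins for an original file and its migrated copy, not real documents.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the exact byte sequence."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins for a signed original and its migrated copy.
original = b"%PDF-1.4 signed contract, original byte stream"
migrated = b"%PDF-1.7 signed contract, re-saved byte stream"

# Digest recorded at the time the original was signed or sealed.
recorded = digest(original)

# The untouched original still matches; the migrated copy does not,
# even though a human reader might see the same content.
original_intact = digest(original) == recorded
migrated_matches = digest(migrated) == recorded

print(original_intact, migrated_matches)
```

Because any change to the stored bytes changes the digest, keeping the original alongside the migrated copy is what preserves the ability to verify it later.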

It is also important to take into consideration that a tool or application has to be available that can properly open and display the contents of the file. These “readers” should be kept up-to-date so that they are able to function in the current operating environment. In cases where non-standardized formats are used, it is important that the organization is able to maintain a reader to open/read the files. As technologies change and expand (e.g. a new subset of PDF), the organization should verify that the reader is not only able to open/display new files, but also legacy files.
There are three strategies for managing reader applications:
— porting the existing software to new operating systems;
— developing new software for new operating systems;
— emulation supporting the continuous usage of old software in new computing environments.

The first two options are suitable for widely adopted file formats. Porting is considered when the corresponding cost is relatively low. Developing new software allows the user to add new functionalities and to improve usability.
4.2.3 File presentation stability

Content retained for legal or records management purposes should be stored using tamper-resistant file formats.

Files should not depend on external resources that could be modified or become unavailable in the future.

Files should not contain embedded code (e.g. macros) or other features that could change the representation of the file content.
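
As an illustration of screening for embedded code, the following Python sketch checks one specific case only. It assumes the file is a ZIP-based Office Open XML package, in which VBA macros are conventionally stored in a part named vbaProject.bin. The helper names and the in-memory test packages are hypothetical, and the check does not cover other kinds of active content (for example scripts in other formats).

```python
import io
import zipfile


def has_vba_project(data: bytes) -> bool:
    """Return True if a ZIP-based office file contains a vbaProject.bin part."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return False  # not a ZIP package, so this check does not apply
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in archive.namelist()
        )


def make_package(part_names):
    """Build a tiny in-memory ZIP with the given (hypothetical) part names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in part_names:
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


plain = make_package(["[Content_Types].xml", "word/document.xml"])
with_macros = make_package(
    ["[Content_Types].xml", "word/document.xml", "word/vbaProject.bin"]
)

print(has_vba_project(plain), has_vba_project(with_macros))
```

A check of this kind could run at ingest so that files carrying macros are flagged for conversion or review before they enter long-term storage.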
Enhanced text is characterized by the fact that:
— letters can be presented using different fonts;
— images can be represented using different file formats.

A reader may only support a reduced set of fonts. If there is a need to use one or more fonts in addition to those of that reduced set, the additional fonts should be embedded inside the file. Since this can increase the file size, it can be preferable to only use the fonts that are supported in the reduced set. Where different fonts are supported by the reader, it is preferable to allow only embedded fonts, in order to avoid external dependencies.

A reader may only support a reduced set of image formats. It may support additional formats using external readers. However, the availability of these external readers should be demonstrated in the same way as those of the text readers.
4.2.4 Software and/or operating system migrations

Tests should be performed to provide assurance of the fidelity of the rendering when:

— porting the existing software to new operating systems;
— developing new software for new operating systems;

— emulation supporting continuous usage of old software in new computing environments.

4.2.5 File format selection

Different file formats may be considered where the content to be stored is coded text, enhanced text, 2D graphics, 3D graphics, images, sound or video. These formats are addressed in Clause 5.

Consideration should be given to the following criteria when selecting a file format:

— any intellectual property associated with the use of the format;
— available software tools for reading and writing the format;
— long-term access to the technical specification(s) defining the format;
— certification and/or compliance related to the format.
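
The criteria above can be recorded as a simple checklist. The Python sketch below is one possible way to do so, not a method defined by this document: the class, field names and outcome labels are hypothetical, and only the first rule (no freely available specification means a risk analysis is needed) comes directly from 4.2.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatAssessment:
    name: str
    open_specification: bool    # specification freely available (see 4.2.1)
    ip_restrictions: bool       # intellectual property constraints on use
    tools_available: bool       # software exists to read and write the format
    compliance_checkable: bool  # certification or conformance checking exists


def recommend(a: FormatAssessment) -> str:
    """Map an assessment to a hypothetical outcome label."""
    if not a.open_specification:
        return "risk analysis required"
    if a.ip_restrictions or not a.tools_available:
        return "use with caution"
    if not a.compliance_checkable:
        return "acceptable"
    return "preferred"


# Placeholder assessments; the values are illustrative, not findings.
candidates = [
    FormatAssessment("format-a", True, False, True, True),
    FormatAssessment("format-b", False, False, True, True),
]
for candidate in candidates:
    print(candidate.name, "->", recommend(candidate))
```

An organization would replace the labels and the ordering of the rules with its own policy; the point is only that each criterion is answered explicitly for each candidate format.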
5 File formats
5.1 General

To reduce the volume of information to be processed, it is important to consider compressing the data (e.g. images, sound and video) while preserving the required quality and usability (e.g. evaluating the sound quality for the listener). When analogue materials are digitized, or digital recordings are kept, for the purposes of long-term preservation, any lossy compression process should be avoided. Only a few of the numerous compression methods are identified below. It is important to understand that the same format name may be shared by a family of sub-formats with different compression characteristics.
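
The difference between lossless and lossy processing can be shown with a small Python sketch. It uses zlib as an example of a lossless method and a deliberately crude quantization (dropping the four low bits of each sample) as a toy stand-in for a lossy one; the sample data is synthetic and the quantization is not any real codec.

```python
import zlib

# Synthetic 8-bit sample data (hypothetical, for illustration only).
samples = bytes((i * 7) % 256 for i in range(1024))

# Lossless: decompression restores every byte of the input.
restored = zlib.decompress(zlib.compress(samples))

# Lossy (toy example): discard the 4 low bits of each sample.
quantized = bytes(b & 0xF0 for b in samples)

lossless_ok = restored == samples
lossy_ok = quantized == samples
max_error = max(abs(a - b) for a, b in zip(samples, quantized))

print(lossless_ok, lossy_ok, max_error)
```

The discarded bits cannot be recovered from the quantized data, which is why lossy processing is a one-way decision for material intended for long-term preservation.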

5.2 Coded text

A plain text file contains only characters and special symbols. Different encodings can be used. See ISO/IEC 646, ISO 1073 (all pa
...
