Document management -- Electronic content/document management (CDM) data interchange format

ISO 22938:2017 defines the interchange format for content/document management (CDM) data and all associated resources.

Gestion de documents -- Format d'échange de données pour la gestion de documents/du contenu électronique

General Information

Status: Published
Publication Date: 25-Jun-2017
Current Stage: 6060 - International Standard published
Start Date: 18-May-2017
Completion Date: 26-Jun-2017

ISO 22938:2017 - Document management -- Electronic content/document management (CDM) data interchange format
English language
13 pages

Standards Content (sample)

INTERNATIONAL STANDARD ISO 22938
Second edition
2017-06
Document management — Electronic content/document management (CDM) data interchange format
Gestion de documents — Format d’échange de données pour la gestion de documents/du contenu électronique
Reference number: ISO 22938:2017(E)
ISO 2017
COPYRIGHT PROTECTED DOCUMENT
© ISO 2017, Published in Switzerland

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 XML-based data interchange format with OPC-based packaging
5.1 General
5.2 Use of XML and OPC for content/document management data
5.2.1 Overview of OPC structure
5.2.2 Content/document management (CDM) — Specific OPC structure
5.2.3 Content/document management (CDM) — Specific relationships
5.2.4 Overview of XML structure
5.2.5 Content/document management (CDM) — Specific XML structure
5.3 Representing CDM data — Example
5.4 Representing CDM data and associated content using the OPC package — Example
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation on the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO’s adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT) see the following URL: www.iso.org/iso/foreword.html.

This document was prepared by Technical Committee ISO/TC 171, Document management applications, Subcommittee SC 2, Document file formats, EDMS systems and authenticity of information.

This second edition cancels and replaces the first edition (ISO 22938:2008), which has been technically revised.
Introduction

This document specifies a consistent interchange format for data contained in electronic content/document management (CDM) systems, including documents, their associated resources, and retrieval index values that are stored in, or managed by, these technologies. Such a standard should facilitate the exact interchange of CDM data, i.e. the standard should not require that the data be irreversibly modified or packaged within a format that does not allow the reconstruction of the original data. Therefore, this document avoids choosing one particular data format and anointing it as the interchange standard for CDM. Rather, this document specifies a common markup format, based on the XML (eXtensible Markup Language), which encapsulates all forms of CDM data. A DTD (document type definition) describes the XML markup used for CDM data transfer. The XML format is a W3C (World Wide Web Consortium) standard, adopted in February 1998. XML is extensible, so that additional CDM formats may be easily specified by appropriately updating the DTD.

The purpose of this document is to define standards for information interchange in a way that benefits both the consumers and vendors of content/document management systems. Some possible benefits are as follows:

a) document information can be exported from one standards-compliant CDM system and afterwards imported to another standards-compliant CDM system;

b) disparate CDM systems within an enterprise (due to autonomous selection, replacement, or merger/acquisition) will be able to exchange or consolidate CDM information.

To this end, the standards are defined with the goal of striking a balance between being either too restrictive or too general. They should be broad enough to encompass all common CDM information types and all common uses of CDM systems, as well as ones that might be expected in the future. On the other hand, the standards should be restrictive enough so that CDM vendors do not have inordinate difficulty complying with the standards.
INTERNATIONAL STANDARD ISO 22938:2017(E)
Document management — Electronic content/document
management (CDM) data interchange format
1 Scope

This document defines the interchange format for content/document management (CDM) data and all associated resources.
2 Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 29500-2:2012, Information technology — Document description and processing languages — Office Open XML file formats — Part 2: Open Packaging Conventions

Berners-Lee T., Fielding R. and Masinter L. RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. The Internet Society, 2005 [viewed 2017-05-15]. Available from: http://www.ietf.org/rfc/rfc3986.txt
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

— IEC Electropedia: available at http://www.electropedia.org/
— ISO Online browsing platform: available at http://www.iso.org/obp
3.1
document
discrete unit or collection of content
3.2
rendition
electronic encoding of a document (3.1)
3.3
packages
collection containing rendition(s) (3.2) and related metadata
4 Abbreviated terms
CDM content/document management
DTD document type definition
W3C World Wide Web Consortium
XML eXtensible Markup Language
5 XML-based data interchange format with OPC-based packaging
5.1 General

The document interchange format for electronic documents is an application of XML. XML is an extensible, flexible, platform-independent format, and has been adopted by the W3C as a standard (officially a “recommendation” in W3C terminology).

The primary use of this document is to exchange data between diverse document management systems that do not already have an exchange methodology in place. This document is considered to be the foundational platform from which other XML-based exchange standards are developed, ensuring a common framework throughout the document management industry. The use of the ZIP-based Open Packaging Conventions (OPC) to group the document interchange format XML, the content it describes, and related resources into a single standardized archive file allows the interchange of documents among CDM systems without the risk of the related parts becoming separated or out of sync.

5.2 Use of XML and OPC for content/document management data
5.2.1 Overview of OPC structure

The document interchange format for electronic documents utilizes the packaging format described in ISO/IEC 29500-2 (“OPC”). This is a ZIP-based format containing data files (“Parts”) and metadata describing relationships between these parts.
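
Because an OPC package is a ZIP archive, its parts can be inspected with ordinary ZIP tooling. The following Python sketch is illustrative only and is not part of the standard: the helper name list_opc_parts, the sample part names under renditions/, and the placeholder file contents are all assumptions made for the example.

```python
import io
import zipfile


def list_opc_parts(package_bytes: bytes) -> list[str]:
    """Return the stored item names of a ZIP-based package as '/'-prefixed part names."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        # ZIP item names carry no leading slash; OPC part names do.
        return sorted("/" + name for name in zf.namelist() if not name.endswith("/"))


# Build a tiny in-memory ZIP to demonstrate (all contents are placeholders).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("metadata.xml", "<cdm_interchange/>")
    zf.writestr("_rels/.rels", "<Relationships/>")
    zf.writestr("renditions/report.pdf", b"placeholder bytes, not a real PDF")

parts = list_opc_parts(buf.getvalue())
print(parts)
```

A real package also has to satisfy the remaining OPC requirements of ISO/IEC 29500-2, which this sketch does not check.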
5.2.2 Content/document management (CDM) — Specific OPC structure

A document of the format specified in this document which implements OPC packaging shall be an OPC package, as specified in ISO/IEC 29500-2. In addition to the requirements specified in ISO/IEC 29500-2, the package shall contain the OPC parts shown in Table 1.
Table 1 — OPC parts
Logical name: /metadata.xml
Description: XML metadata content/document management structure (as specified in 5.2.4)
Content type: application/vnd.documentmanagement-metadata+xml

Logical name: /_rels/.rels
Description: XML representation of relationships between Parts included in the package as specified in 5.2.3.
Content type: application/vnd.openxmlformats-package.relationships+xml

Logical name: Other parts
Description: Renditions of content as specified in 5.2.5, f).
Content type: Appropriate to content

The content types of OPC Parts contained in the package shall be mapped to package data as defined in ISO/IEC 29500-2:2012, 10.1.2, which includes mapping of the content type of most types of data stored in the package to the data in a Content Types stream with the logical name [Content_Types].xml included in the package as specified in ISO/IEC 29500-2:2012, 10.2.6.
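
The parts of Table 1 and the Content Types stream can be assembled as sketched below in Python. This is a minimal illustration under stated assumptions, not a conforming implementation: the function name build_package, the renditions/a.pdf part name and the application/pdf default are hypothetical, and the stream is written under the name OPC uses for it, [Content_Types].xml. The two content types for /metadata.xml and /_rels/.rels are taken from Table 1.

```python
import io
import zipfile

# Content Types stream: maps parts to content types by extension or by part name.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="pdf" ContentType="application/pdf"/>
  <Override PartName="/metadata.xml" ContentType="application/vnd.documentmanagement-metadata+xml"/>
</Types>
"""


def build_package(metadata_xml: str, rels_xml: str, renditions: dict[str, bytes]) -> bytes:
    """Write the Table 1 parts plus the Content Types stream into one ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", rels_xml)
        zf.writestr("metadata.xml", metadata_xml)
        for name, data in renditions.items():  # "Other parts": the renditions
            zf.writestr(name, data)
    return buf.getvalue()


# Placeholder inputs; real packages carry complete XML documents here.
pkg = build_package("<cdm_interchange/>", "<Relationships/>", {"renditions/a.pdf": b"placeholder"})
with zipfile.ZipFile(io.BytesIO(pkg)) as zf:
    names = zf.namelist()
print(names)
```

Each rendition extension used in a package would need its own Default or Override entry; only pdf is declared in this sketch.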
5.2.3 Content/document management (CDM) — Specific relationships

A document of the format specified in this document which implements the OPC packaging described in 5.2 shall include a Relationships part as specified in ISO/IEC 29500-2:2012, 9.3.1. The Relationships part shall include, at a minimum, a Relationship identifying the document interchange format XML, with the relationship type identified as http://placeholder_uri/documentmanagement-metadata.
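
A Relationships part with the single required Relationship might be produced and read back as in the Python sketch below. The helper names build_rels and find_metadata_target and the identifier rId1 are hypothetical. The relationship type URI is written without internal whitespace, on the assumption that it contains none, and the namespace is the one OPC defines for package relationships.

```python
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type from 5.2.3 (assumed to contain no internal whitespace).
CDM_REL_TYPE = "http://placeholder_uri/documentmanagement-metadata"


def build_rels(target: str = "/metadata.xml", rel_id: str = "rId1") -> str:
    """Serialize a Relationships part pointing at the interchange XML."""
    ET.register_namespace("", RELS_NS)  # emit the namespace as the default one
    root = ET.Element(f"{{{RELS_NS}}}Relationships")
    ET.SubElement(
        root,
        f"{{{RELS_NS}}}Relationship",
        {"Id": rel_id, "Type": CDM_REL_TYPE, "Target": target},
    )
    return ET.tostring(root, encoding="unicode")


def find_metadata_target(rels_xml: str):
    """Return the Target of the CDM metadata relationship, or None if absent."""
    root = ET.fromstring(rels_xml)
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Type") == CDM_REL_TYPE:
            return rel.get("Target")
    return None


rels_xml = build_rels()
target = find_metadata_target(rels_xml)
print(target)
```

A consuming application would typically start from this relationship to locate the interchange XML rather than relying on a fixed file name.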

5.2.4 Overview of XML structure

XML consists of markup and data. The markup consists of (usually paired) tags called elements, which may contain descriptive data called attributes. The data are the non-markup content residing between element pairs. The elements can be nested, so that one element may contain sub-elements, which can in turn contain sub-sub-elements, etc.

This document defines the elements, element structure, and element attributes suitably, so that the various forms of CDM data, resources, index values, etc., can be clearly and unambiguously described and included as data. The model which describes this is an XML Schema. The precise schema is the essential content of this document.
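
The terms used above (elements, attributes, data, nesting) can be seen in a short Python sketch that walks a small fragment. The fragment borrows element names from this document, but its shape is an assumption made for illustration; in particular, whether coll_id and doc_id are attributes or child elements is decided by the schema, not by this example. The walk helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the attribute/child-element split is assumed.
SAMPLE = """<cdm_collection coll_id="C1">
  <cdm_doc doc_id="D1">
    <title>Quarterly report</title>
  </cdm_doc>
</cdm_collection>"""


def walk(element, depth=0):
    """Yield (depth, tag, attributes, text) for an element and all its descendants."""
    text = (element.text or "").strip()  # the data between the element's tags
    yield depth, element.tag, dict(element.attrib), text
    for child in element:  # nested sub-elements
        yield from walk(child, depth + 1)


rows = list(walk(ET.fromstring(SAMPLE)))
for depth, tag, attrs, text in rows:
    print("  " * depth + tag, attrs, repr(text))
```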
5.2.5 Content/document management (CDM) — Specific XML structure

The XML structure of a CDM is described in an XML Schema Definition (XSD) below. The elements used in that XSD and their meanings are the following.

a) cdm_interchange
This is the root node of the interchange XML. It consists of an identifier to uniquely identify the interchange operation (interchange_id), the action that a CDM system should execute when processing the interchange XML (cdm_action), information about the creation of the interchange package (creator, vendor, creation_date, creation_time), and a set of document collections (cdm_collection). The creation_time value should be a string in ISO 8601 format.

b) cdm_collection
This is the collection of documents contained in the package. It consists of a collection identifier (coll_id), a name (coll_name), a set of index values for the collection (index_set), and a set of documents (cdm_doc).

c) cdm_doc
This is the element representing a document contained in a document collection. It consists of a unique document identifier (doc_id), a document type (type), a document title (title), a set of index values for the document (index_set), and the content that comprises the actual document data (doc_content). It shall contain an index_set of metadata and a doc_content element, which contains the method used to encode or provide explicit external reference to the data.

d) index_set
This element contains metadata related to a document or document collection. It consists of a set of fields (index_field) or a record (index_record). The index_set shall contain at least one index_field for each cdm_doc, with the attributes of index_name, index_description and index_content.

e) index_field
This element references index_name, index_description, and index_content elements. Any index_set element shall contain at least one index_field element.

f) index_record
This element organizes multiple index_field entries into a logical group.

g) doc_content
This element defines the document contents being transmitted as part of the cdm_interchange operation. Each doc_content shall contain one or more renditions.
h) rendition
This element defines the renditions, if any, and their attributes. Rendition includes the document content (content) and resources needed to use the content (rsrc_data) elements. These elements are used to provide a mechanism to define the access_method, encoding and compression for each rendition. The access_method is required, and the encoding and compression attributes are optional. Supported values of access_method include Base64, URI, and MIME.

When using OPC to package the CDM data XML, content, and rsrc_data, the access_method for renditions included in the OPC package shall be URI, and the encoding shall be set to the relative URI (as specified in RFC 3986:2005, 4.2) of the content or rsrc_data within the OPC package as specified in ISO/IEC 29500-2:2012, A.3. For such renditions, the compression attribute should not be included by producing applications, and may be ignored by consuming applications.

i) rsrc_data
This element encloses CDM resource data within each rendition. Examples of resource data are bitmaps and fonts that are needed to render the contained document. It provides information defining the method to be used to access the resource (access_method), the type of the resource file (filetype), the encoding used to store the resource (encoding), and any method used to compress the resource in the package (compression). Examples of filetype could be TIFF, PDF, PDF/A, JPEG, JPEG2000 and RTF. It is recommended to use only IANA-registered mimetypes.

j) annotations
This element encloses the annotation-related information for a rendition. The annotation is expressed as a stream of knowledge that would be defined by the vendor. Some vendors have highlight information, while others might have blobs, bitmaps or data files. The knowledge content of the annotation would be vendor-specific. It provides information defining the method to be used to access the annotations (access_method), the type of the annotation file (filetype), the encoding used to store the annotation (encoding), and any method used to compress the annotation in the package (compression).

k) content
This element provides information defining the method to be used to access the content (access_method), the type of the content file (filetype), the encoding used to store the content (encoding), and any method used to compress the content in the package (compression). Encoding is the base64 representation of the document rendition data based on the value of the access_method attribute.

l) index_name
This element provides for a name to be associated with the index element record attributes.

m) record attributes
This element provides a name and description for the index record.

n) index_description
This element allows a description containing unconstrained text to be associated with the index for documentation or information purposes.

o) index_content
This element contains the value for the index.
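
The element list above can be made concrete with a small Python sketch that builds one cdm_interchange tree holding a single document with two renditions: one carried inline as Base64 and one referring to an OPC part by relative URI. This is only an illustration under several assumptions. The governing schema is not reproduced in this excerpt, so the split between attributes and child elements (for example, interchange_id as an attribute and index_name as a child element) is a guess, and every sample value, the cdm_action value "import", and the helper names build_interchange and rendition_bytes are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_interchange(doc_bytes: bytes, packaged_uri: str) -> ET.Element:
    """Build a small cdm_interchange tree with one document and two renditions."""
    created = datetime(2017, 6, 25, 12, 0, 0, tzinfo=timezone.utc)  # fixed sample time
    root = ET.Element("cdm_interchange", {
        "interchange_id": "X-0001",            # hypothetical identifier
        "cdm_action": "import",                # hypothetical action value
        "creator": "example-user",
        "vendor": "example-vendor",
        "creation_date": created.date().isoformat(),
        "creation_time": created.isoformat(),  # ISO 8601 string
    })
    coll = ET.SubElement(root, "cdm_collection", {"coll_id": "C1", "coll_name": "Invoices"})
    doc = ET.SubElement(coll, "cdm_doc", {"doc_id": "D1", "type": "invoice", "title": "Invoice 42"})

    # Each cdm_doc carries an index_set with at least one index_field.
    field = ET.SubElement(ET.SubElement(doc, "index_set"), "index_field")
    ET.SubElement(field, "index_name").text = "customer"
    ET.SubElement(field, "index_description").text = "Customer name"
    ET.SubElement(field, "index_content").text = "ACME"

    doc_content = ET.SubElement(doc, "doc_content")
    # Inline rendition: encoding holds the Base64 form of the bytes.
    inline = ET.SubElement(doc_content, "rendition")
    ET.SubElement(inline, "content", {
        "access_method": "Base64",
        "filetype": "application/pdf",
        "encoding": base64.b64encode(doc_bytes).decode("ascii"),
    })
    # Packaged rendition: encoding holds the relative URI of the OPC part,
    # and no compression attribute is written.
    packaged = ET.SubElement(doc_content, "rendition")
    ET.SubElement(packaged, "content", {
        "access_method": "URI",
        "filetype": "application/pdf",
        "encoding": packaged_uri,
    })
    return root


def rendition_bytes(content: ET.Element, read_part) -> bytes:
    """Recover rendition bytes; read_part maps a relative URI to part bytes."""
    method = content.get("access_method")
    if method == "Base64":
        return base64.b64decode(content.get("encoding"))
    if method == "URI":
        return read_part(content.get("encoding"))
    raise ValueError(f"unsupported access_method: {method}")


# A dict stands in for the OPC package; the bytes are placeholders, not a real PDF.
parts = {"renditions/invoice-42.pdf": b"sample rendition bytes"}
root = build_interchange(b"sample rendition bytes", "renditions/invoice-42.pdf")
contents = root.findall(".//rendition/content")
recovered = [rendition_bytes(c, parts.__getitem__) for c in contents]
print(recovered)
```

Both renditions yield the same bytes here, which reflects the stated goal of exact interchange: the consumer reconstructs the original data regardless of the access_method chosen by the producer.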

The schema used for CDM data interchange is below. Schemas for other XML parts included in CDM packages using OPC packaging are specified in ISO/IEC 29500-2:2012, Annex D.

This schema is intended to provide the framework/mechanism to exchange data between diverse systems in the absence of a specific schema. Organizations that do not have an implementation-specific model of this schema shall use this model for specific information exchange between diverse document management systems.

To create an application-specific instance of this schema, users shall use this schema as the framework, or model, ensuring the appropriate level of information exchange between diverse document management systems.

...
