Management of terminology resources — Terminological markup framework

This document specifies a framework for representing data recorded in terminological data collections (TDCs). This framework includes a metamodel and methods for describing specific terminological markup languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for implementing constraints in a TML are defined, but not the specific constraints for individual TMLs. This document is designed to support the development and use of computer applications for terminological data and the exchange of such data between different applications. This document also defines the conditions that allow the data expressed in one TML to be mapped onto another TML.

Gestion des ressources terminologiques — Plate-forme pour le balisage de terminologies informatisées

Upravljanje terminoloških virov - Ogrodje za označevanje terminologije

General Information

Status: Published
Publication Date: 09-Dec-2025
Current Stage: 6060 - International Standard published
Start Date: 10-Dec-2025
Due Date: 23-Jun-2026
Completion Date: 10-Dec-2025

Relations

Standard
ISO 16642:2025 - Management of terminology resources — Terminological markup framework. Released: 10. 12. 2025
English language
18 pages
Draft
ISO/DIS 16642:2025
English language
21 pages

Standards Content (Sample)


International Standard
ISO 16642
Third edition
2025-12
Management of terminology resources — Terminological markup framework
Gestion des ressources terminologiques — Plate-forme pour le balisage de terminologies informatisées
Reference number
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Modular approach
5 Generic model for describing terminological data
5.1 Principles
5.2 Metamodel
5.3 Example
6 Requirements for compliance to TMF
7 Interchange and interoperability
8 Representing languages
9 Defining a TML
9.1 Requirements
9.2 Defining interoperability conditions
10 Implementing an XML-serialized TML
10.1 General
10.2 Implementing the metamodel
10.3 Anchoring data categories
10.3.1 General
10.3.2 Styles and vocabulary
10.4 Constraints on datatypes
10.5 Implementing annotations
10.6 Implementing brackets
Annex A (informative) Conformance of terminological data to TMF: example scenario
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee
SC 3, Management of terminology resources.
This third edition cancels and replaces the second edition (ISO 16642:2017), which has been technically
revised.
The main changes are as follows:
— The scope is no longer restricted to representing terminological data in XML format. Terminological
markup languages can be serialized in any format, but they are exemplified in this document in XML.
— DatCatInfo PIDs have been updated.
— Terms and definitions have been updated according to the latest International Standards.
— Examples have been updated to reflect ISO 30042:2019.
— Annex A has been significantly revised in order to show two TMF-compliant XML-based serialization
examples of the same terminological data.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
Terminological data are collected, managed and stored in a wide variety of systems, typically database
management systems, ranging from personal computer applications for individual users to large terminology
database systems operated by major companies and governmental agencies. Terminology databases
comprise various types of information, called “data categories”, and can adopt different structural models.
However, terminological data often need to be shared and reused in a number of applications, and this
sharing is facilitated when the data adhere to a common model. To facilitate co-operation and to prevent
duplicate work, it is important to develop standards and guidelines for creating and using terminological
data collections (TDCs) as well as for sharing and exchanging data.
This document presents a modular approach for analysing existing TDCs and designing new ones. It also
provides a framework for defining terminological markup languages (TMLs) that are interoperable.
This document refers to DatCatInfo, an example of an available data category repository (DCR). DatCatInfo
is an online database of information about the types of data that can be included in TDCs and other language
resources. It is available at www.datcatinfo.net.

International Standard ISO 16642:2025(en)
Management of terminology resources — Terminological
markup framework
1 Scope
This document specifies a framework for representing data recorded in terminological data collections
(TDCs). This framework includes a metamodel and methods for describing specific terminological markup
languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for
implementing constraints in a TML are defined, but not the specific constraints for individual TMLs.
This document is designed to support the development and use of computer applications for terminological
data and the exchange of such data between different applications. This document also defines the conditions
that allow the data expressed in one TML to be mapped onto another TML.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work and terminology science — Vocabulary
ISO 12616-1, Terminology work in support of multilingual communication — Part 1: Fundamentals of
translation-oriented terminography
ISO 26162 (all parts), Management of terminology resources — Terminology databases
ISO 30042, Management of terminology resources — TermBase eXchange (TBX)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
basic information unit
information unit (3.13) attached to a component (3.3) of the metamodel (3.15) that can be expressed by means
of a single data category (3.7)
EXAMPLE Term, note.
3.2
complementary information
CI
information supplementary to that described in concept entries (3.5) and shared across the terminological
data collection (3.23)
EXAMPLE Domain hierarchies, bibliographic references, references to text corpora.
3.3
component
elementary description unit of a metamodel (3.15) to which data categories (3.7) can be associated to form a
data model
3.4
compound information unit
information unit (3.13) attached to a component (3.3) of the metamodel (3.15) that can be expressed by means
of several grouped data categories (3.7)
EXAMPLE IDLangTgtDtyp, transacGrp.
3.5
concept entry
CE
terminological entry
entry
part of a terminological data collection (3.23) which contains the terminological data related to one concept
3.6
conceptual domain
permissible content of a data category (3.7)
EXAMPLE In a terminology database, the data category /part of speech/ can have a conceptual domain consisting
of the values /noun/, /verb/, /adjective/, /adverb/.
Note 1 to entry: The permissible content can be closed, as in the example, or subject to formal restrictions such as
dates, or free text such as the conceptual domain of /definition/. Although the latter type is not formally restricted, it
is nevertheless subject to adherence to the requirements of its data category specification (3.10), i.e. it contains a true
definition and not a note, example or some other piece of information.
[SOURCE: ISO 12620-1:2022, 3.1]
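The three kinds of permissible content described in Note 1 (closed value sets, formally restricted content such as dates, and free text) can be illustrated with a small validation sketch. This is only an illustration, not part of the framework: the `CONCEPTUAL_DOMAINS` table, the function names and the choice of ISO 8601 calendar dates for a /date/ data category are assumptions made for the example.

```python
from datetime import date


def make_closed_domain(values):
    """Return a checker for a closed conceptual domain (a picklist)."""
    allowed = frozenset(values)
    return lambda content: content in allowed


def is_iso_date(content):
    """Formally restricted content: here assumed to be an ISO 8601 calendar date."""
    try:
        date.fromisoformat(content)
        return True
    except ValueError:
        return False


def is_free_text(content):
    """Free text: any non-empty string. Whether the text is a true definition
    cannot be checked mechanically and is left to the data category specification."""
    return isinstance(content, str) and content.strip() != ""


# Hypothetical mapping from data category names to their conceptual domains.
CONCEPTUAL_DOMAINS = {
    "part of speech": make_closed_domain(["noun", "verb", "adjective", "adverb"]),
    "date": is_iso_date,
    "definition": is_free_text,
}


def is_permissible(data_category, content):
    """Check content against the conceptual domain of the given data category."""
    return CONCEPTUAL_DOMAINS[data_category](content)
```

A call such as `is_permissible("part of speech", "noun")` checks membership in the closed set, while the /definition/ check can only confirm that some text is present.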
3.7
data category
DC
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes (e.g.
/part of speech/).
3.8
data category repository
DCR
digital collection of data category specifications (3.10)
EXAMPLE DatCatInfo, a DCR for language resources (see Reference [6]).
Note 1 to entry: Data category repositories are used as references when specifying language resources.
[SOURCE: ISO 12620-1:2022, 3.6]

3.9
data category selection
DCS
DC selection
set of data category specifications (3.10) selected from a data category repository (3.8)
Note 1 to entry: A data category selection can represent the data categories (3.7) used within a research discipline or
a specific application or project.
[SOURCE: ISO 12620-1:2022, 3.7, modified — “chosen” replaced by “selected”; “DCS” made the second
preferred term.]
3.10
data category specification
DC specification
complete descriptive record of a data category (3.7)
[SOURCE: ISO 12620-1:2022, 3.5]
3.11
expansion tree
structured group of serialized elements that implement a level of the metamodel (3.15) in a given
terminological markup language (3.24)
3.12
global information
GI
technical and administrative information applying to an entire terminological data collection (3.23)
EXAMPLE Title of the terminological data collection, revision history, owner or copyright information.
3.13
information unit
IU
elementary piece of information attached to a component (3.3) of the metamodel (3.15)
3.14
language section
LS
part of a concept entry (3.5) containing information related to one language
3.15
metamodel
model that specifies one or more other models
[SOURCE: ISO/IEC TR 19583-24:2025, 3.2.19]
3.16
object language
language being described
3.17
PID
persistent identifier
unique identifier that ensures permanent access for a digital object by providing access to it independently
of its physical location or current ownership
[SOURCE: ISO 24619:2011, 3.2.4, modified — Note 1 to entry deleted.]

3.18
serialization
process of translating data structures or object states into a format that can be stored or transmitted and
reconstructed later
[SOURCE: ISO/TS 23494-1:2023, 3.16]
3.19
serialization format
data storage format for storing, transmitting, and reconstructing data structures or object states
[SOURCE: ISO/IEC TR 19583-24:2025, 3.2.41, modified — “object state” replaced by “object states”.]
3.20
style
specification for the implementation of a data category (3.7) in any serialization format (3.19)
3.21
term component section
TCS
part of a term section (3.22) containing linguistic information about the parts of a term
[SOURCE: ISO 26162-1:2019, 3.2.10, modified — “components” replaced by “parts”.]
3.22
term section
TS
part of a language section (3.14) containing information about a term
[SOURCE: ISO 26162-1:2019, 3.2.9]
3.23
terminological data collection
TDC
resource consisting of concept entries (3.5) with associated metadata and documentary information
EXAMPLE A TBX document instance, ISO 1087:2019.
3.24
terminological markup language
TML
serialization format (3.19) for representing a terminological data collection (3.23) conforming to the
constraints of the metamodel (3.15)
3.25
vocabulary
set of strings used to implement a data category (3.7) according to a style (3.20)
3.26
working language
language used to describe objects
4 Modular approach
Terminological markup framework (TMF) consists of two levels of abstraction:
— The first (and most abstract) level is the metamodel level. The metamodel level supports analysis, design
and exchange at a very general level, i.e. it is independent of any specific implementation or software. The
metamodel shall be shared by all TDCs that are compliant with TMF.
— The second level is the data model level, which adds the necessary data categories for representing
specific TDCs.
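The two levels can be sketched in code. In the hedged Python sketch below, the class structure stands for the metamodel level: it follows the containment relations given in Clause 3 (a TDC consists of concept entries, which contain language sections, which contain term sections, which contain term component sections). The `info` dictionaries stand for the data model level, where data categories are attached to each component. All class, field and key names, including the `"language"` key, are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TermComponentSection:
    info: dict = field(default_factory=dict)  # data category name -> value


@dataclass
class TermSection:
    info: dict = field(default_factory=dict)
    term_component_sections: list = field(default_factory=list)


@dataclass
class LanguageSection:
    info: dict = field(default_factory=dict)
    term_sections: list = field(default_factory=list)


@dataclass
class ConceptEntry:
    info: dict = field(default_factory=dict)
    language_sections: list = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:
    global_information: dict = field(default_factory=dict)
    complementary_information: dict = field(default_factory=dict)
    concept_entries: list = field(default_factory=list)


def terms_by_language(tdc):
    """Walk the component hierarchy and group terms by language."""
    result = {}
    for entry in tdc.concept_entries:
        for language_section in entry.language_sections:
            language = language_section.info.get("language")
            for term_section in language_section.term_sections:
                result.setdefault(language, []).append(term_section.info.get("term"))
    return result


# A tiny data model instance: one concept entry with two language sections.
example_tdc = TerminologicalDataCollection(
    global_information={"title": "Sample collection"},
    concept_entries=[
        ConceptEntry(
            info={"subject field": "zoology"},
            language_sections=[
                LanguageSection(
                    info={"language": "en"},
                    term_sections=[TermSection(info={"term": "dog", "part of speech": "noun"})],
                ),
                LanguageSection(
                    info={"language": "fr"},
                    term_sections=[TermSection(info={"term": "chien", "part of speech": "noun"})],
                ),
            ],
        )
    ],
)
```

Because the hierarchy is fixed while the dictionaries are open, two collections with different data categories can still be traversed by the same function.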
The implementation of a data model in any serialization format is called a TML. In this document, TMLs are
exemplified in XML format (see Reference [7]). TMLs can be described on the basis of a limited number of
characteristics, namely:
— the structural organization of the metamodel (i.e. the expansion trees) expressed by the TML;
— the specific data categories used by the TML and how they relate to the metamodel;
— the way in which these data categories can be serialized and anchored on the expansion trees of the TML,
e.g. the XML style of the data categories;
— the vocabularies used by the TML to express those various informational objects as, for example, XML
elements and attributes according to the corresponding XML styles.
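The role of styles and vocabularies can be illustrated with a short sketch in which the same data are serialized by two different dialects. It assumes two invented dialects: one anchors /part of speech/ as a child element, the other as an attribute. The element and attribute names (`termSection`, `partOfSpeech`, `ts`, `t`, `pos`) are hypothetical and are not taken from any existing TML.

```python
import xml.etree.ElementTree as ET

# Each hypothetical dialect names the element that implements the term section
# and gives, per data category, a style ("element" or "attribute") and the
# vocabulary string used to realize it.
DIALECT_A = {
    "term_section": "termSection",
    "styles": {
        "term": ("element", "term"),
        "part of speech": ("element", "partOfSpeech"),
    },
}
DIALECT_B = {
    "term_section": "ts",
    "styles": {
        "term": ("element", "t"),
        "part of speech": ("attribute", "pos"),
    },
}


def serialize_term_section(dialect, data):
    """Serialize data category values for one term section using the given dialect."""
    root = ET.Element(dialect["term_section"])
    for data_category, value in data.items():
        style, name = dialect["styles"][data_category]
        if style == "element":
            ET.SubElement(root, name).text = value
        elif style == "attribute":
            root.set(name, value)
        else:
            raise ValueError(f"unknown style: {style}")
    return ET.tostring(root, encoding="unicode")


DATA = {"term": "dog", "part of speech": "noun"}
xml_a = serialize_term_section(DIALECT_A, DATA)
xml_b = serialize_term_section(DIALECT_B, DATA)
```

Both outputs carry the same data categories and values; only the style and vocabulary differ, which is why a mapping between the two dialects can be defined.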
Figure 1 represents the information required to fully specify a TML:
— the metamodel which describes the basic hierarchy of components to which any TML shall conform;
— a set of data category specifications from a data category repository (DCR), which can form the basis for
defining a data category selection (DCS) for the TML;
— the dialectal specification (dialect) which includes the various elements needed to represent a given
TML in a serialization format. These elements comprise expansion trees and data category instantiation
styles, together with their corresponding vocabularies.

Figure 1 — Various knowledge sources involved in the description of a TML
An example of a DCR providing sample data category specifications for language resources is available at
Reference [6]. Where possible, data categories documented in existing DCRs should be used for a TML. If
no suitable data category is available in existing DCRs, the implementers of the TML should propose the
creation of the required data category specification within existing or new DCRs.

5 Generic model for describing terminological data
5.1 Principles
This clause describes a class of XML document structures which can be used to represent a wide range of
terminological data formats and provides a framework for representing these document structures in XML.
Each type of document structure is described by means of a three-tiered information structure that
describes:
— a metamodel, which comprises a hierarchy of components;
— information units, which can be associated with each component of the metamodel;
— annotations, which can be used to qualify properties associated with a given information unit.
Information units can be basic or compound. A basic information unit encapsulates information that can be
expressed by means of a single data category. A compound information unit encapsulates information that is
expressed by means of several grouped data categories that, taken together, express a coherent information
unit.
EXAMPLE A compound information unit can be used to represent the fact that a transaction can be a combination
of a transaction type (such as modification), the person who performed it, and the date when it was performed.
Basic information units, whether they are directly attached to a component or placed within a compound
information unit, can take two non-exclusive types of value:
— an atomic value corresponding either to a simple type (in the sense of XML schemas) such as a number,
string, element of a picklist, or to a mixed content type in the case of annotated text;
— a reference to a component in order to express a relation between it and the current component.
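The distinction between basic and compound information units, and between atomic and reference values, can be sketched as a small set of Python types. This is an illustrative sketch only: the class names, the data category labels and the identifier `"c42"` are assumptions, while the transaction values reuse the example of a modification performed by a person on a date.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class ComponentRef:
    """A value that points at another component instead of holding content."""
    target_id: str


AtomicValue = Union[str, int, float]


@dataclass
class BasicInformationUnit:
    """Information expressed by means of a single data category."""
    data_category: str
    value: Union[AtomicValue, ComponentRef]


@dataclass
class CompoundInformationUnit:
    """Several basic units grouped into one coherent unit."""
    name: str
    units: list = field(default_factory=list)

    def data_categories(self):
        return [unit.data_category for unit in self.units]


# A transaction: its type, the person who performed it, and the date.
transaction = CompoundInformationUnit(
    name="transaction",
    units=[
        BasicInformationUnit("transaction type", "modification"),
        BasicInformationUnit("responsibility", "ABC"),
        BasicInformationUnit("date", "2024-04-04"),
    ],
)

# A basic unit whose value is a reference to another component
# (the data category label and target identifier are hypothetical).
related = BasicInformationUnit("related concept", ComponentRef("c42"))
```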
Information units can be abstractly represented as feature-value structures. For instance, the following
markup sample
XYZ
can be modelled as a basic information unit in the following feature-value structure:

[owner = XYZ]
In addition, the following TermBase eXchange (TBX) markup sample (see Reference [9])


modification
ABC
2024-04-04

can be modelled in a feature-value structure as shown in Figure 2.
Figure 2 — Feature-value structure
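The mapping from markup to feature-value structures can be sketched in Python. The markup of the two samples and the content of Figure 2 did not survive in this text, so the XML below is a reconstruction under stated assumptions: the element and attribute names are modelled on TBX conventions and are hypothetical, and the nested dictionary is one plausible shape for the feature-value structure rather than the one shown in the figure.

```python
import xml.etree.ElementTree as ET

# Assumed markup for the basic information unit [owner = XYZ].
BASIC_SAMPLE = "<owner>XYZ</owner>"

# Assumed TBX-like markup for the compound information unit.
COMPOUND_SAMPLE = """
<transacGrp>
  <transac type="transactionType">modification</transac>
  <transacNote type="responsibility">ABC</transacNote>
  <date>2024-04-04</date>
</transacGrp>
"""


def to_feature_value(element):
    """Convert an element into a (feature, value) pair.

    A leaf element becomes a basic unit whose feature name is taken from its
    "type" attribute when present, otherwise from its tag. An element with
    children becomes a compound unit whose value is a nested dictionary.
    """
    children = list(element)
    if not children:
        feature = element.get("type", element.tag)
        return feature, (element.text or "").strip()
    return element.tag, dict(to_feature_value(child) for child in children)


basic = to_feature_value(ET.fromstring(BASIC_SAMPLE))
compound = to_feature_value(ET.fromstring(COMPOUND_SAMPLE))
```

Here `basic` is the pair `("owner", "XYZ")`, and `compound` pairs the group name with a dictionary holding the transaction type, the responsible person and the date.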
There can also be a need to associate additional information with the content of a data category; this is
achieved through annotations. A typical example is a definition in which the genus and/or diffe
...


SLOVENIAN STANDARD
01-June-2025
Upravljanje terminoloških virov - Ogrodje za označevanje terminologije
Management of terminology resources — Terminological markup framework
[French title missing]
This Slovenian standard is identical to: ISO/DIS 16642
ICS:
01.020 Terminologija (načela in koordinacija) / Terminology (principles and coordination)
35.240.30 Uporabniške rešitve IT v informatiki, dokumentiranju in založništvu / IT applications in information, documentation and publishing
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.

DRAFT International Standard
ISO/DIS 16642
ISO/TC 37/SC 3
Secretariat: DIN
Management of terminology resources — Terminological markup framework
ICS: 01.020; 35.240.30
Voting begins on: 2025-02-03
Voting terminates on: 2025-04-28
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENTS AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
This document is circulated as received from the committee secretariat.
Reference number ISO/DIS 16642:2025(en)
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Modular approach
5 Generic model for describing terminological data
5.1 Principles
5.2 The metamodel
5.3 Example
6 Requirements for compliance to TMF
7 Interchange and interoperability
8 Representing languages
9 Defining a TML
9.1 Steps
9.2 Defining interoperability conditions
10 Implementing an XML-serialized TML
10.1 General
10.2 Implementing the metamodel
10.3 Anchoring data categories
10.3.1 General
10.3.2 Styles and vocabulary
10.4 Constraints on datatypes
10.5 Implementing annotations
10.6 Implementing brackets
Annex A (informative) Conformance of terminological data to TMF: example scenario
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee
SC 3, Management of terminology resources.
This third edition cancels and replaces the second edition (ISO 16642:2017), which has been editorially and
technically revised.
The main changes are as follows:
— The scope is no longer restricted to representing terminological data in XML format. Terminological
markup languages can be serialized in any format, but they are exemplified in this document in XML.
— DatCatInfo PIDs have been updated.
— Terms and definitions have been updated according to the latest ISO standards.
— Examples have been updated to reflect ISO 30042:2019.
— Annex A has been significantly revised in order to show two TMF-compliant XML-based serialization
examples of the same terminological data.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
Terminological data are collected, managed and stored in a wide variety of systems, typically database
management systems, ranging from personal computer applications for individual
users to large terminology database systems operated by major companies and governmental agencies.
Terminology databases comprise various types of information, called data categories, and can adopt
different structural models. However, terminological data often need to be shared and reused in a number
of applications, and this sharing is facilitated when the data adhere to a common model. To facilitate
co-operation and to prevent duplicate work, it is important to develop standards and guidelines for creating
and using terminological data collections (TDCs) as well as for sharing and exchanging data.
This document presents a modular approach for analyzing existing TDCs and designing new ones. It also
provides a framework for defining terminological markup languages (TMLs) that are interoperable.
This document refers to DatCatInfo, an example of an available data category repository (DCR).
DatCatInfo is an online database of information about the types of data that can be included in TDCs and
other language resources. It is available at www.datcatinfo.net.

DRAFT International Standard ISO/DIS 16642:2025(en)
Management of terminology resources — Terminological
markup framework
1 Scope
This document specifies a framework for representing data recorded in terminological data collections
(TDCs). This framework includes a metamodel and methods for describing specific terminological markup
languages (TMLs), exemplified in this document in eXtensible Markup Language (XML). The mechanisms for
implementing constraints in a TML are defined, but not the specific constraints for individual TMLs.
This document is designed to support the development and use of computer applications for terminological
data and the exchange of such data between different applications. This document also defines the conditions
that allow the data expressed in one TML to be mapped onto another TML.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work and terminology science — Vocabulary
ISO 12616-1, Terminology work in support of multilingual communication — Part 1: Fundamentals of
translation-oriented terminography
ISO 26162 (all parts), Management of terminology resources — Terminology databases
ISO 30042, Management of terminology resources — TermBase eXchange (TBX)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
basic information unit
information unit (3.13) attached to a component (3.3) of the metamodel that can be expressed by means of a
single data category (3.7)
3.2
complementary information
CI
information supplementary to that described in concept entries (3.5) and shared across the terminological
data collection (3.21)
EXAMPLE Domain hierarchies, institution descriptions, bibliographic references and references to text corpora
are typical examples of complementary information.

3.3
component
elementary description unit of a metamodel to which data categories (3.7) can be associated to form a data model
3.4
compound information unit
information unit (3.13) attached to a component (3.3) of the metamodel that is expressed by means of several
grouped data categories (3.7)
3.5
concept entry
CE
terminological entry
entry
part of a terminological data collection (3.21) which contains the terminological data related to one concept
[SOURCE: ISO 30042:2019, 3.5]
3.6
conceptual domain
permissible content of a data category (3.7)
EXAMPLE In a terminology database, the data category /part of speech/ can have a conceptual domain consisting
of the values /noun/, /verb/, /adjective/, /adverb/.
Note 1 to entry: The permissible content can be closed, as in the example, or subject to formal restrictions such as
dates, or free text such as the conceptual domain of /definition/. Although the latter type is not formally restricted,
it is nevertheless subject to adherence to the requirements of its data category specification, i.e., it contains a true
definition and not a note, example or some other piece of information.
[SOURCE: ISO 12620-1:2022, 3.1]
3.7
data category
DC
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes (e.g.
/part of speech/).
[SOURCE: ISO 30042:2019, 3.8, modified — The admitted term “DC” has been added.]
3.8
data category repository
DCR
digital collection of data category specifications (3.10)
EXAMPLE DatCatInfo, a DCR for language resources (see Reference [5]).
Note 1 to entry: Data category repositories are used as references when specifying language resources.
[SOURCE: ISO 12620-1:2022, 3.6]

3.9
data category selection
DC selection
DCS
set of data category specifications (3.10) selected from a data category repository (3.8)
Note 1 to entry: A data category selection can represent the data categories (3.7) used within a research discipline or
a specific application or project.
[SOURCE: ISO 12620-1:2022, 3.7]
3.10
data category specification
DC specification
complete descriptive record of a data category (3.7)
[SOURCE: ISO 12620-1:2022, 3.5]
3.11
expansion tree
structured group of serialized elements that implement a level of the metamodel in a given terminological
markup language (3.22)
3.12
global information
GI
technical and administrative information applying to an entire terminological data collection (3.21)
EXAMPLE Title of the terminological data collection, revision history, owner or copyright information.
3.13
information unit
IU
elementary piece of information attached to a component (3.3) of the metamodel
3.14
language section
LS
part of a concept entry (3.5) containing information related to one language
3.15
object language
language being described
3.16
persistent identifier
PID
unique Uniform Resource Identifier (URI) that assures permanent access to a digital object by providing
access to it independently of its physical location or current ownership
3.18
style
specification for the implementation of a data category (3.7) in any serialization format
3.19
term component section
TCS
part of a term section (3.20) containing linguistic information about the components (3.3) of a term
[SOURCE: ISO 26162-1:2019, 3.2.10]

3.20
term section
TS
part of a language section (3.14) containing information about a term
[SOURCE: ISO 26162-1:2019, 3.2.9]
3.21
terminological data collection
TDC
resource consisting of concept entries (3.5) with associated metadata and documentary information
3.22
terminological markup language
TML
serialization format for representing a terminological data collection (3.21) conforming to the constraints of
the metamodel
3.23
vocabulary
set of strings used to implement a data category (3.7) according to a style (3.18)
3.24
working language
language used to describe objects
4 Modular approach
The terminological markup framework (TMF) consists of two levels of abstraction. The first (and most abstract)
level is the metamodel level. The metamodel level supports analysis, design and exchange at a very general
level, i.e., it is independent of any specific implementation or software. The metamodel shall be shared by all
TDCs that are compliant with TMF. The second level is the data model level, which adds the necessary data
categories for representing specific TDCs.
The implementation of a data model in any serialization format is called a terminological markup language
(TML). In this document, TMLs are exemplified in XML format (see Reference [6]). TMLs can be described on
the basis of a limited number of characteristics, namely:
— the structural organization of the metamodel (i.e., the expansion trees) expressed by the TML;
— the specific data categories used by the TML and how they relate to the metamodel;
— the way in which these data categories can be serialized and anchored on the expansion trees of the TML,
e.g., the XML style of the data categories;
— the vocabularies used by the TML to express those various informational objects as, for example, XML
elements and attributes according to the corresponding XML styles.
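The last two characteristics, XML styles and vocabularies, can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the Style class, the DIALECT mapping, the anchor function, the data category /term/ and the XML names ts, term and pos are not prescribed by this document.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """How one data category is serialized in XML (hypothetical)."""
    kind: str  # "element" or "attribute"
    name: str  # vocabulary string used as the XML name


# Hypothetical dialect: data category name -> XML style and vocabulary.
DIALECT = {
    "term": Style("element", "term"),
    "part of speech": Style("attribute", "pos"),
}


def anchor(parent: ET.Element, data_category: str, value: str) -> None:
    """Attach one data category value to a node of an expansion tree."""
    style = DIALECT[data_category]
    if style.kind == "attribute":
        parent.set(style.name, value)
    else:
        ET.SubElement(parent, style.name).text = value


# "ts" is an assumed element name standing for a term section.
term_section = ET.Element("ts")
anchor(term_section, "part of speech", "noun")
anchor(term_section, "term", "mouse")
xml_text = ET.tostring(term_section, encoding="unicode")
# xml_text == '<ts pos="noun"><term>mouse</term></ts>'
```

Changing only the DIALECT mapping yields a different serialization of the same data, which is the sense in which styles and vocabularies are kept separate from the data categories themselves.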
Figure 1 represents the information required to fully specify a TML:
— the metamodel which describes the basic hierarchy of components to which any TML shall conform;
— a set of data category specifications from a data category repository (DCR), which can form the basis for
defining a data category selection (DCS) for the TML;
— the dialectal specification (dialect) which includes the various elements needed to represent a given
TML in a serialization format. These elements comprise expansion trees and data category instantiation
styles, together with their corresponding vocabularies.

Figure 1 — Various knowledge sources involved in the description of a TML
An example of a DCR providing sample data category specifications for language resources is available at
Reference [5]. Where possible, data categories documented in existing DCRs should be used for a TML. If
no suitable data category is available in existing DCRs, the implementers of the TML should propose the
creation of the required data category specification within existing or new DCRs.
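The recommendation above can be sketched as a simple selection step. This is a minimal illustration under stated assumptions: the DCR is modelled as an in-memory dictionary with placeholder specification texts, and the function name select_data_categories and the data category /project code/ are hypothetical.

```python
# Hypothetical, in-memory stand-in for a data category repository (DCR):
# data category name -> data category specification (placeholder text only).
DCR = {
    "part of speech": "specification text (placeholder)",
    "definition": "specification text (placeholder)",
    "subject field": "specification text (placeholder)",
}


def select_data_categories(dcr, wanted):
    """Split the wanted names into a DCS and the names missing from the DCR."""
    dcs = {name: dcr[name] for name in wanted if name in dcr}
    missing = [name for name in wanted if name not in dcr]
    return dcs, missing


# /project code/ is an assumed data category that the DCR does not contain.
dcs, to_propose = select_data_categories(
    DCR, ["part of speech", "definition", "project code"]
)
# dcs holds the specifications reused from the DCR;
# to_propose lists the data categories to propose for creation in a DCR.
```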
5 Generic model for describing terminological data
5.1 Principles
This clause describes a class of XML document structures which can be used to represent a wide range of
terminological data formats and provides a framework for representing these document structures in XML.
Each type of document structure is described by means of a three-tiered information structure that
describes:
— a metamodel, which comprises a hierarchy of components;
— information units, which can be associated with each component of the metamodel;
— annotations, which can be used to qualify properties associated with a given information unit.
Information units can be basic or compound. A basic information unit encapsulates information that can be
expressed by means of a single data category. A compound information unit encapsulates information that is
expressed by means of several grouped data categories that, taken together, express a coherent information unit.
EXAMPLE A compound information unit can be used to represent the fact that a transaction can be a combination
of a transaction type (such as modification), the person who performed it, and the date when it was performed.
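The transaction example can be sketched as a data structure in Python. The class names, the data category names /responsibility/ and /date/, and the sample values are assumptions made for illustration; only the grouping of a transaction type, a person and a date comes from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BasicInformationUnit:
    """Information expressed by means of a single data category."""
    data_category: str
    value: str


@dataclass
class CompoundInformationUnit:
    """Several grouped data categories forming one coherent unit."""
    name: str  # hypothetical label for the group
    members: List[BasicInformationUnit] = field(default_factory=list)


@dataclass
class Component:
    """A component of the metamodel with its attached information units."""
    name: str
    information_units: List[
        Union[BasicInformationUnit, CompoundInformationUnit]
    ] = field(default_factory=list)


transaction = CompoundInformationUnit(
    "transaction",
    [
        BasicInformationUnit("transaction type", "modification"),
        BasicInformationUnit("responsibility", "example person"),
        BasicInformationUnit("date", "2025-12-09"),
    ],
)
entry = Component("concept entry", [transaction])
```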
Basic information units, whether they are directly attached to a component or are placed within a compound
information unit, can take two non-exclusive types of value:
— an atomic value corresponding either to a simple type (in the sense of XML schemas) such as a number,
string, element of a picklist, etc., or to a mixed content type in the case of a
...
