SIST ISO 26162-1:2021
Management of terminology resources -- Terminology databases -- Part 1: Design
This document specifies general, i.e. implementation- and use-case-independent terminology database design principles to enable maximum efficiency and quality in terminology work. Thus, this document supports creating, processing, and using high quality terminology. The intended audiences of this document are terminologists, translators, interpreters, technical communicators, language planners, subject field experts, and terminology management system developers.
This document describes a maximum approach, i.e. terminology database design for distributed, multilingual terminology management. It can also be used for designing smaller solutions.
Systèmes de gestion de la terminologie, de la connaissance et du contenu -- Bases de données terminologiques -- Partie 1: Conception
Upravljanje terminoloških virov - Terminološke baze podatkov - 1. del: Zasnova
SLOVENIAN STANDARD
SIST ISO 26162-1:2021
1 March 2021
Supersedes:
SIST ISO 26162:2013
Upravljanje terminoloških virov - Terminološke baze podatkov - 1. del: Zasnova
Management of terminology resources -- Terminology databases -- Part 1: Design
Systèmes de gestion de la terminologie, de la connaissance et du contenu -- Bases de données terminologiques -- Partie 1: Conception
This Slovenian standard is identical to: ISO 26162-1:2019
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
SIST ISO 26162-1:2021 en
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard, in whole or in part, is not permitted.
INTERNATIONAL STANDARD ISO 26162-1
First edition
2019-11
Management of terminology resources -- Terminology databases -- Part 1: Design
Gestion des ressources terminologiques -- Bases de données terminologiques -- Partie 1: Conception
Reference number: ISO 26162-1:2019(E)
© ISO 2019
COPYRIGHT PROTECTED DOCUMENT
© ISO 2019
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Concepts
3.2 Terminology databases
4 Terminology database design
4.1 General
4.2 Terminological metamodel
4.3 Data categories
4.3.1 General
4.3.2 Types of data categories
4.3.3 Shared resources
4.3.4 Concept relations
4.4 Concept entries
4.4.1 Concept orientation
4.4.2 Language
4.4.3 Dependency and repeatability of data categories
4.4.4 Data granularity
4.4.5 Data elementarity
4.4.6 Data-modeling variation
4.5 Roles
Annex A (informative) Terminology database excerpt based on the terminological metamodel -- Example
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 3, Management of terminology resources.
This first edition of ISO 26162-1, together with ISO 26162-2, cancels and replaces ISO 26162:2012, which
has been technically revised.
The main changes compared to the previous edition are as follows:
— the document has been split into parts. The first part focuses on terminology database
design, the second part on the development of terminology management systems;
— all references to generic software design principles and specific use cases have been removed.
A list of all parts of the ISO 26162 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Terminologies are the totality of concepts in given subject fields represented by terms and other
designations and described by using additional terminological data. In general, these data are organized
in structured terminology databases and are usually manipulated in specific software applications
called terminology management systems. Terminology databases usually vary with regard to their
underlying data models and consist of different sets of data categories, while terminology management
systems generally differ depending on their functionality and the platform they are designed for.
The ISO 26162 series gives guidance on designing terminology databases and on essential terminology
management system features. The series can also be used to evaluate the conformance and suitability
of terminology databases and terminology management systems.
INTERNATIONAL STANDARD ISO 26162-1:2019(E)
Management of terminology resources -- Terminology databases -- Part 1: Design
1 Scope
This document specifies general, i.e. implementation- and use-case-independent terminology database
design principles to enable maximum efficiency and quality in terminology work. Thus, this document
supports creating, processing, and using high quality terminology. The intended audiences of this
document are terminologists, translators, interpreters, technical communicators, language planners,
subject field experts, and terminology management system developers.
This document describes a maximum approach, i.e. terminology database design for distributed,
multilingual terminology management. It can also be used for designing smaller solutions.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work — Vocabulary
ISO 12620, Management of terminology resources — Data category specifications
ISO 16642:2017, Computer applications in terminology — Terminological markup framework
ISO 23185, Assessment and benchmarking of terminological resources — General concepts, principles and
requirements
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1 Concepts
3.1.1
object
anything perceivable or conceivable
Note 1 to entry: Objects can be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion
ratio, a project plan) or imagined (e.g. a unicorn, a scientific hypothesis).
Note 2 to entry: Objects can undergo changes which cause conceptual or designation change.
[SOURCE: ISO 1087:2019, 3.1.1, modified — Note 2 to entry added.]
3.1.2
concept
unit of knowledge created by a unique combination of characteristics
Note 1 to entry: Concepts are not necessarily bound to particular natural languages. They are, however,
influenced by the social or cultural background which often leads to different categorizations.
Note 2 to entry: Due to their dynamic nature, concepts are also defined as units of thinking (see ISO 704:2009, 5.1
and DIN 2342:2011-08, 4.1).
[SOURCE: ISO 1087:2019, 3.2.7, modified — former Note 2 to entry removed and replaced by a new
Note 2 to entry.]
3.1.3
designation
designator
representation of a concept (3.1.2) by a sign which denotes it in a domain or subject
Note 1 to entry: A designation can be linguistic or non-linguistic. It can consist of various types of characters, but
also punctuation marks such as hyphens and parentheses, governed by domain-, subject-, or language-specific
conventions.
Note 2 to entry: A designation may be a term (3.1.4) including appellations, a proper name, or a symbol.
[SOURCE: ISO 1087:2019, 3.4.1]
3.1.4
term
designation (3.1.3) that represents a general concept by linguistic means
EXAMPLE "laser printer", "planet", "pacemaker", "chemical compound", "¾ time", "Influenza A virus", "oil
painting".
Note 1 to entry: Terms may be partly or wholly verbal.
[SOURCE: ISO 1087:2019, 3.4.2]
3.2 Terminology databases
3.2.1
terminology database
termbase
database comprising a terminological data collection (3.2.4)
[SOURCE: ISO 30042:2019, 3.28, modified — admitted term "terminology database" made preferred
term and preferred term "termbase" made admitted term.]
3.2.2
data model
graphical and/or lexical representation of data, specifying their properties, structure, and inter-
relationships
[SOURCE: ISO/IEC 11179-1:2015, 3.2.7]
3.2.3
terminological metamodel
data model (3.2.2) that describes the basis for designing and implementing terminological data
collections (3.2.4)
3.2.4
terminological data collection
TDC
resource consisting of concept entries (3.2.7) with associated metadata and documentary information
[SOURCE: ISO 16642:2017, 3.21, modified — "terminological entries" replaced by "concept entries".]
3.2.5
global information
GI
technical and administrative information applying to the entire terminological data collection (3.2.4)
EXAMPLE The title of the terminological data collection, revision history, owner or copyright information.
[SOURCE: ISO 16642:2017, 3.11, modified — "Note 1 to entry" replaced by "EXAMPLE"; "For example,"
removed in the example.]
3.2.6
complementary information
CI
information supplementary to that described in concept entries (3.2.7) and shared across the
terminological data collection (3.2.4)
EXAMPLE Domain hierarchies, institution descriptions, bibliographic references, and references to text
corpora.
[SOURCE: ISO 16642:2017, 3.2, modified — "terminological entries" replaced by "concept entries"
within definition; "Note 1 to entry" replaced by "EXAMPLE"; "are typical examples of complementary
information" removed in the example.]
3.2.7
concept entry
CE
terminological entry
part of a terminological data collection (3.2.4) which contains the terminological data related to one
concept (3.1.2)
[SOURCE: ISO 16642:2017, 3.22, modified — "concept entry" and acronym "CE" added as preferred
terms; preferred term "terminological entry" made admitted term; preferred term "TE" removed;
Note 1 to entry removed.]
3.2.8
language section
LS
part of a concept entry (3.2.7) containing information related to one language
[SOURCE: ISO 16642:2017, 3.13, modified — "terminological entry" replaced by "concept entry"; Note 1
to entry removed.]
3.2.9
term section
TS
part of a language section (3.2.8) containing information about a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.20, modified — "giving" replaced by "containing".]
3.2.10
term component section
TCS
part of a term section (3.2.9) containing linguistic information about the components of a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.19, modified — "giving" replaced by "containing".]
3.2.11
data category
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes
(e.g. /part of speech/).
[SOURCE: ISO 12620:2019, 3.2, modified — preferred term "DC" removed.]
3.2.12
repeatability
principle whereby a data category (3.2.11) can be repeated within a database definition and whereby it
can also be combined with other data categories
3.2.13
concept orientation
principle whereby a concept entry (3.2.7) describes a single concept (3.1.2)
Note 1 to entry: When two or more different concepts are represented by the same designation (in the same
language), this designation is considered a homograph. Such concepts are documented in separate concept
entries.
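As a minimal illustration of concept orientation, the following Python sketch stores two concepts that share the designation "mouse" in separate concept entries. The entry identifiers, the subject fields, and the `find_entries` helper are hypothetical and are not taken from this document.

```python
# Hypothetical concept entries: one entry per concept, even when the
# designation is the same (homographs).
concept_entries = {
    "CE-001": {"subject_field": "zoology", "terms": {"en": ["mouse"]}},
    "CE-002": {
        "subject_field": "computer hardware",
        "terms": {"en": ["mouse", "computer mouse"]},
    },
}


def find_entries(designation, language):
    """Return the IDs of all concept entries that contain the designation."""
    return sorted(
        entry_id
        for entry_id, entry in concept_entries.items()
        if designation in entry["terms"].get(language, [])
    )


# A search for a homograph returns several entries, one per concept.
homograph_hits = find_entries("mouse", "en")
```

A term-oriented layout would instead list both meanings under a single "mouse" record, which makes it harder to attach a definition or a subject field to exactly one concept.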
3.2.14
term autonomy
principle whereby all terms (3.1.4) in a concept entry (3.2.7) are considered independent sub-units and
can be described using the same set of data categories (3.2.11)
3.2.15
data granularity
degree of data precision
EXAMPLE The set of individual data categories /part of speech/, /grammatical gender/, and /grammatical
number/ provides for greater data granularity than does the single data category /grammar/.
3.2.16
data elementarity
principle whereby a data field contains only one data element
EXAMPLE Including both a full form and an abbreviation of a term in the same data field would
be a violation of data elementarity.
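The examples given for data granularity (3.2.15) and data elementarity (3.2.16) can be sketched in Python. The record layout and the sample terms are hypothetical; only the data category names come from the examples.

```python
# Low granularity: one catch-all field mixes several kinds of information.
coarse = {"term": "Haus", "grammar": "noun, neuter, singular"}

# Higher granularity: separate data categories, each queryable on its own.
granular = {
    "term": "Haus",
    "part of speech": "noun",
    "grammatical gender": "neuter",
    "grammatical number": "singular",
}

# Violates data elementarity: full form and abbreviation share one field.
non_elementary = {"term": "light-emitting diode (LED)"}

# Respects data elementarity: each term is recorded in its own field.
elementary = [
    {"term": "light-emitting diode", "term type": "full form"},
    {"term": "LED", "term type": "abbreviation"},
]

# With granular data, filtering by a single property needs no text parsing.
neuter_terms = [
    record["term"]
    for record in [granular]
    if record["grammatical gender"] == "neuter"
]
```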
3.2.17
data-modeling variation
variation in data models (3.2.2) describing the same information
4 Terminology database design
4.1 General
Terminology database design requires a deep understanding of terminology theory and terminology
work. In this sense, and to achieve high quality results, the following shall be used:
— established terms and definitions as specified in ISO 1087;
— principles and methods as specified in ISO 704;
— data-modeling criteria as specified in ISO 16642 and ISO 12620;
— usability metrics as specified in ISO 23185.
Terminology databases have a logical structure that is reflected in a fundamental hierarchical data model
(as described in 4.2) containing various levels at which data categories (see 4.3) can be anchored. This
data-modeling approach provides the necessary flexibility, since the design of a terminology database
is always subject to specific work profiles (terminology work, technical communication, translation,
etc.) and to organizational needs (freelancers, translation agencies, company or organization in-house
departments, etc.). Thus, in the very early design process, a long-term and detailed management plan
shall be defined, taking into consideration all possible user groups, as well as organizational and
technical issues in order to avoid the need for substantial, time consuming and costly changes after
concluding the design process.
4.2 Terminological metamodel
Terminology databases shall comply with the terminological metamodel (or a subset thereof) defined
in ISO 16642:2017 (see Figure 1 and Figure 2). The essence of the metamodel constitutes the principle
of concept orientation (see 4.4.1), i.e.:
— in a terminology database, an entry (concept entry = CE) describes a concept that can be further
described using:
— additional sublevels for instantiating languages and language-specific information (language
section = LS), further subdivided by:
— terms and term-specific information (term section = TS);
— individual words of a multiword term or one of the components of a single-word term, such as a
morpheme (term component section = TCS).
Furthermore, the metamodel provides high-level containers that allow for documenting:
— global information (GI) that applies to the complete terminology database (name of the terminology
database, institution or individual originating the file, copyright information, history, etc.);
— complementary information (CI) such as complete bibliographical or administrative information,
binary data, picklist values or references to text corpora that are referenced from concept entries.
The above-mentioned levels of the terminological metamodel can be schematized as shown in Figure 1.
Figure 1 — Terminological metamodel — Schematic view (adapted from ISO 16642:2017)
The metamodel levels, their relationships and cardinalities can also be expressed using UML (Unified
Modeling Language) as shown in Figure 2.
Cardinalities
1.1 = Shall occur exactly once.
1.* = Shall occur one or more times.
0.1 = May occur zero or one time.
0.* = May occur zero or more times.
Figure 2 — Terminological metamodel — UML diagram (adapted from ISO 16642:2017)
An example of a terminology database excerpt compliant with the terminological metamodel is given in
Annex A.
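The nesting of the metamodel levels can be sketched with Python dataclasses. This is a minimal sketch and not the normative model of ISO 16642: the class and attribute names are illustrative assumptions, data categories are reduced to plain dictionaries, and the cardinalities of Figure 2 are not enforced.

```python
from dataclasses import dataclass, field


@dataclass
class TermComponentSection:  # TCS
    component: str
    data: dict = field(default_factory=dict)


@dataclass
class TermSection:  # TS
    term: str
    data: dict = field(default_factory=dict)
    components: list = field(default_factory=list)


@dataclass
class LanguageSection:  # LS
    language: str
    data: dict = field(default_factory=dict)
    term_sections: list = field(default_factory=list)


@dataclass
class ConceptEntry:  # CE
    entry_id: str
    data: dict = field(default_factory=dict)
    language_sections: list = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:  # TDC
    global_information: dict = field(default_factory=dict)         # GI
    complementary_information: dict = field(default_factory=dict)  # CI
    concept_entries: list = field(default_factory=list)


# Hypothetical sample data: one concept, two languages.
entry = ConceptEntry(
    entry_id="CE-001",
    data={"subject field": "printing"},
    language_sections=[
        LanguageSection("en", term_sections=[
            TermSection(
                "laser printer",
                {"part of speech": "noun"},
                [TermComponentSection("laser"), TermComponentSection("printer")],
            ),
        ]),
        LanguageSection("de", term_sections=[TermSection("Laserdrucker")]),
    ],
)
tdc = TerminologicalDataCollection(
    global_information={"title": "Sample termbase"},
    concept_entries=[entry],
)
```

Keeping a `data` dictionary at every level mirrors the idea that data categories can be anchored at the concept, language, term, or term component level.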
4.3 Data categories
4.3.1 General
Concept entries are made up of specific units of information, such as terms, definitions and contexts, and
each class of these units of information is identified by a data category. Over recent decades terminology
practitioners have gathered, standardized, classified, and published relevant data categories that are
stored in recognized data category repositories, such as DatCatInfo (see Reference [11]). For instance,
/grammatical gender/ is a data category that would typically be used as a field name, and "feminine",
"masculine", "neuter", etc. would occur as values of this field. Of course, terminology database designers
can create and use their data categories according to their specific needs. However, care shall be taken
when creating new data categories or when adapting the names for the data categories proposed by
recognized data category repositories to avoid confusion or misinterpretation.
ISO 10241-1:2011 and ISO 12616:2002 include a list of common data categories used to document the
following units of information:
— terms (in any desired language);
— term-specific information (such as grammatical attributes, term types, register, status, contexts, etc.);
— language-specific information (such as definitions, notes, etc.);
— concept-specific information (such as subject fields, definitions, examples, notes, graphics, etc.);
— administrative and bibliographical information:
— identifiers of various sorts, such as to identify products or projects with which the terms are
associated;
— dates, names of people who created or modified the concept entry or parts of it;
— entry status, for example "submitted", "working", "approved";
— sources of terms, definitions, contexts, notes, etc.
Designers who need new data categories that are not included in recognized data category repositories
shall adhere to the requirements for creating, documenting, harmonizing, and maintaining data
category specifications as defined in ISO 12620.
4.3.2 Types of data categories
4.3.2.1 General
Terminology databases typically contain several different types of data categories depending on the
kind of information they contain. Data category types can be grouped in three ways, namely:
— open and closed data categories;
— mandatory, optional, system-generated, and default data categories;
— read-write, read-only, and hidden data categories.
Careful selection of data category types ensures higher quality content in a terminology database.
4.3.2.2 Open and closed data categories
Open data categories can contain any text. For instance, /term/ is considered open, because the actual
term that can be recorded in the corresponding data field is unpredictable. Other examples of open data
categories are /definition/ or /context/.
In contrast, closed data categories can only contain one or more of a finite set of permissible values.
When documenting terms in the German language, for instance, /grammatical gender/ may only
take the values "masculine", "feminine", or "neuter". Apart from /grammatical gender/, typical
representatives of closed data categories are /part of speech/, /term type/, /geographical usage/,
/administrative status/ or /subject field/.
Values themselves can constitute data categories such as /masculine/, for example, with the Boolean
values "yes|no" or "true|false".
The use of picklists ensures that only predefined values can be selected, thus preventing the insertion of
inadmissible values or misspelled or variant forms. For instance, left to their own devices, users might
type "masculine", "masc.", or simply "m." for a masculine noun. Providing a uniform representation of
these values ensures consistency throughout the terminology database, which is important for ensuring
the performance of searching, filtering, data exchange, and other data management tasks.
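A picklist for a closed data category can be sketched as a Python enumeration. The variant-to-value mapping and the `normalize_gender` helper are hypothetical; a real system would normally present the picklist in the user interface instead of accepting free text.

```python
from enum import Enum


class GrammaticalGender(Enum):
    """Permissible values of the closed data category /grammatical gender/."""
    MASCULINE = "masculine"
    FEMININE = "feminine"
    NEUTER = "neuter"


# Hypothetical mapping of free-text variants to permissible values.
_VARIANTS = {
    "masc.": GrammaticalGender.MASCULINE,
    "m.": GrammaticalGender.MASCULINE,
    "fem.": GrammaticalGender.FEMININE,
    "f.": GrammaticalGender.FEMININE,
    "n.": GrammaticalGender.NEUTER,
}


def normalize_gender(raw):
    """Map user input to a picklist value, or raise ValueError."""
    key = raw.strip().lower()
    if key in _VARIANTS:
        return _VARIANTS[key]
    # Enum lookup by value raises ValueError for inadmissible values.
    return GrammaticalGender(key)
```

Storing only the enumeration value means that a filter on "masculine" finds every masculine noun, regardless of what the user originally typed.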
4.3.2.3 Mandatory, optional, system-generated, and default data categories
Data categories can be either mandatory or optional. Typically, a terminology management system does
not allow users to save concept entries if a mandatory field is empty. At least one language and one term
in that language shall be required for each concept entry, unless the concept has yet to be named or it is
represented by a diagram or a node in the concept diagram.
Other typical mandatory fields include information about the subject field or part of speech, both of
which can be essential for differentiating homographs. However, designating certain data categories to
be mandatory can sometimes be problematic. For instance, it can take considerable time and effort to
find definitions and contexts. There can be even more complex conditions, such as forcing a source for a
definition or the grammatical gender in case of nouns for a specific language section.
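The save-time checks described above can be sketched as follows. The dictionary layout, the `unnamed_concept` flag, and the choice of /subject field/ as a mandatory data category are assumptions made for illustration.

```python
MANDATORY_ENTRY_FIELDS = ("subject field",)  # assumed project-specific choice


def validation_errors(entry):
    """Return a list of problems that would block saving the concept entry."""
    errors = [
        f"missing mandatory field: {name}"
        for name in MANDATORY_ENTRY_FIELDS
        if not entry.get(name)
    ]
    # At least one language with at least one term is required, unless the
    # concept is explicitly flagged as not yet named (hypothetical flag).
    has_term = any(terms for terms in entry.get("terms", {}).values())
    if not has_term and not entry.get("unnamed_concept", False):
        errors.append("at least one language with one term is required")
    return errors


ok = {"subject field": "printing", "terms": {"en": ["laser printer"]}}
bad = {"terms": {"en": []}}
```

Returning a list of messages instead of raising on the first problem lets a user interface show everything that is missing at once.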
System-generated data fields are not filled in manually by the user; they are populated by the system.
For instance, many terminology management systems automatically assign entry numbers to concept
entries, as well as creation and modification dates, and the names of the users who created or modified
the concept entry.
Part of terminology database design involves making decisions about the levels within the concept
entry that will be documented with such administrative information. It is generally insufficient to
record such administrative information only for the concept entry as a whole, because different people
can be responsible for different parts of the concept entry, especially for different language sections.
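A minimal sketch of system-generated administrative data recorded at more than one level of the concept entry. The field names, the counter-based entry numbers, and the `stamp` helper are hypothetical.

```python
from datetime import datetime, timezone
from itertools import count

_entry_numbers = count(1)  # hypothetical source of entry numbers


def stamp(user):
    """Administrative data filled in by the system, never typed by the user."""
    return {"created_by": user, "created_at": datetime.now(timezone.utc)}


def new_entry(user):
    """Create a concept entry with a system-assigned number and stamp."""
    return {
        "entry_number": next(_entry_numbers),
        "admin": stamp(user),
        "languages": {},
    }


def add_language_section(entry, language, user):
    # Each language section carries its own administrative data, because
    # different people can be responsible for different languages.
    entry["languages"][language] = {"admin": stamp(user), "terms": []}


entry = new_entry("alice")
add_language_section(entry, "en", "alice")
add_language_section(entry, "de", "bob")
```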
It is also possible to set default values for certain data categories. For instance, if a user is going to
document terms associated with a single project or source text, it can be convenient to allow the user to
pre-set the values for the corresponding data categories /project subset/ or /source/ so that all concept
entries created automatically contain those values.
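Pre-set default values can be sketched as a dictionary merge. The `create_entry` helper and the sample values are hypothetical.

```python
# Defaults pre-set by the user for the current working session.
session_defaults = {
    "project subset": "Printer manual 2.0",
    "source": "User guide, rev. B",
}


def create_entry(term, language, defaults, **overrides):
    """Create an entry that starts from the defaults; explicit values win."""
    entry = {**defaults, **overrides}
    entry["terms"] = {language: [term]}
    return entry


first = create_entry("laser printer", "en", session_defaults)
second = create_entry("toner", "en", session_defaults, source="Service manual")
```

Because the defaults are copied into each new entry, later changes to `session_defaults` do not alter entries that were already created.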
4.3.2.4 Read-write, read-only, and hidden data categories
In a terminology database, it shall be possible to set different access levels for users, depending on
the user's needs and role. A field that is visible and can be modified is a "read-write" field. A field that
is visible but cannot be modified is a "r
...
SLOVENSKI STANDARD
SIST ISO 26162-1:2021
01-marec-2021
Nadomešča:
SIST ISO 26162:2013
Upravljanje terminoloških virov - Terminološke baze podatkov - 1. del: Zasnova
Management of terminology resources -- Terminology databases -- Part 1: Design
Systèmes de gestion de la terminologie, de la connaissance et du contenu -- Bases de
données terminologiques -- Partie 1: Conception
Ta slovenski standard je istoveten z: ISO 26162-1:2019
ICS:
01.020 Terminologija (načela in Terminology (principles and
koordinacija) coordination)
35.240.30 Uporabniške rešitve IT v IT applications in information,
informatiki, dokumentiranju in documentation and
založništvu publishing
SIST ISO 26162-1:2021 en
2003-01.Slovenski inštitut za standardizacijo. Razmnoževanje celote ali delov tega standarda ni dovoljeno.
---------------------- Page: 1 ----------------------
SIST ISO 26162-1:2021
---------------------- Page: 2 ----------------------
SIST ISO 26162-1:2021
INTERNATIONAL ISO
STANDARD 26162-1
First edition
2019-11
Management of terminology
resources — Terminology
databases —
Part 1:
Design
Gestion des ressources terminologiques — Bases de données
terminologiques —
Partie 1: Conception
Reference number
ISO 26162-1:2019(E)
©
ISO 2019
---------------------- Page: 3 ----------------------
SIST ISO 26162-1:2021
ISO 26162-1:2019(E)
COPYRIGHT PROTECTED DOCUMENT
© ISO 2019
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO 2019 – All rights reserved
---------------------- Page: 4 ----------------------
SIST ISO 26162-1:2021
ISO 26162-1:2019(E)
Contents Page
Foreword .iv
Introduction .v
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
3.1 Concepts . 1
3.2 Terminology databases . 2
4 Terminology database design . 4
4.1 General . 4
4.2 Terminological metamodel . 5
4.3 Data categories . 7
4.3.1 General. 7
4.3.2 Types of data categories . 8
4.3.3 Shared resources . 9
4.3.4 Concept relations .10
4.4 Concept entries .11
4.4.1 Concept orientation . .11
4.4.2 Language .11
4.4.3 Dependency and repeatability of data categories .12
4.4.4 Data granularity .12
4.4.5 Data elementarity .13
4.4.6 Data-modeling variation .13
4.5 Roles .14
Annex A (informative) Terminology database excerpt based on the terminological
metamodel — Example .15
Bibliography .19
© ISO 2019 – All rights reserved iii
---------------------- Page: 5 ----------------------
SIST ISO 26162-1:2021
ISO 26162-1:2019(E)
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www .iso .org/ directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www .iso .org/ patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www .iso .org/
iso/ foreword .html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 3, Management of terminology resources.
This first edition of ISO 26162-1, together with ISO 26162-2, cancels and replaces ISO 26162:2012, which
has been technically revised.
The main changes compared to the previous edition are as follows:
— the document has been split into parts. The first part is focusing on the design of terminology
database design, the second part on the development of terminology management systems;
— all references to generic software design principles and specific use cases have been removed.
A list of all parts of the ISO 26162 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Terminologies are the totality of concepts in given subject fields represented by terms and other
designations and described by using additional terminological data. In general, these data are organized
in structured terminology databases and are usually manipulated in specific software applications
called terminology management systems. Terminology databases usually vary with regard to their
underlying data models and consist of different sets of data categories, while terminology management
systems generally differ depending on their functionality and the platform they are designed for.
The ISO 26162 series gives guidance on designing terminology databases and on essential terminology
management system features. The series can also be used to evaluate the conformance and suitability
of terminology databases and terminology management systems.
INTERNATIONAL STANDARD ISO 26162-1:2019(E)
Management of terminology resources — Terminology
databases —
Part 1:
Design
1 Scope
This document specifies general, i.e. implementation- and use-case-independent terminology database
design principles to enable maximum efficiency and quality in terminology work. Thus, this document
supports creating, processing, and using high quality terminology. The intended audiences of this
document are terminologists, translators, interpreters, technical communicators, language planners,
subject field experts, and terminology management system developers.
This document describes a maximum approach, i.e. terminology database design for distributed,
multilingual terminology management. It can also be used for designing smaller solutions.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work — Vocabulary
ISO 12620, Management of terminology resources — Data category specifications
ISO 16642:2017, Computer applications in terminology — Terminological markup framework
ISO 23185, Assessment and benchmarking of terminological resources — General concepts, principles and
requirements
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1 Concepts
3.1.1
object
anything perceivable or conceivable
Note 1 to entry: Objects can be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion
ratio, a project plan) or imagined (e.g. a unicorn, a scientific hypothesis).
Note 2 to entry: Objects can undergo changes which cause conceptual or designation change.
[SOURCE: ISO 1087:2019, 3.1.1, modified — Note 2 to entry added.]
3.1.2
concept
unit of knowledge created by a unique combination of characteristics
Note 1 to entry: Concepts are not necessarily bound to particular natural languages. They are, however,
influenced by the social or cultural background which often leads to different categorizations.
Note 2 to entry: Due to their dynamic nature, concepts are also defined as units of thinking (see ISO 704:2009, 5.1
and DIN 2342:2011-08, 4.1).
[SOURCE: ISO 1087:2019, 3.2.7, modified — former Note 2 to entry removed and replaced by a new
Note 2 to entry.]
3.1.3
designation
designator
representation of a concept (3.1.2) by a sign which denotes it in a domain or subject
Note 1 to entry: A designation can be linguistic or non-linguistic. It can consist of various types of characters, but
also punctuation marks such as hyphens and parentheses, governed by domain-, subject-, or language-specific
conventions.
Note 2 to entry: A designation may be a term (3.1.4) including appellations, a proper name, or a symbol.
[SOURCE: ISO 1087:2019, 3.4.1]
3.1.4
term
designation (3.1.3) that represents a general concept by linguistic means
EXAMPLE "laser printer", "planet", "pacemaker", "chemical compound", "¾ time", "Influenza A virus", "oil
painting".
Note 1 to entry: Terms may be partly or wholly verbal.
[SOURCE: ISO 1087:2019, 3.4.2]
3.2 Terminology databases
3.2.1
terminology database
termbase
database comprising a terminological data collection (3.2.4)
[SOURCE: ISO 30042:2019, 3.28, modified — admitted term "terminology database" made preferred
term and preferred term "termbase" made admitted term.]
3.2.2
data model
graphical and/or lexical representation of data, specifying their properties, structure, and inter-
relationships
[SOURCE: ISO/IEC 11179-1:2015, 3.2.7]
3.2.3
terminological metamodel
data model (3.2.2) that describes the basis for designing and implementing terminological data
collections (3.2.4)
3.2.4
terminological data collection
TDC
resource consisting of concept entries (3.2.7) with associated metadata and documentary information
[SOURCE: ISO 16642:2017, 3.21, modified — "terminological entries" replaced by "concept entries".]
3.2.5
global information
GI
technical and administrative information applying to the entire terminological data collection (3.2.4)
EXAMPLE The title of the terminological data collection, revision history, owner or copyright information.
[SOURCE: ISO 16642:2017, 3.11, modified — "Note 1 to entry" replaced by "EXAMPLE"; "For example,"
removed in the example.]
3.2.6
complementary information
CI
information supplementary to that described in concept entries (3.2.7) and shared across the
terminological data collection (3.2.4)
EXAMPLE Domain hierarchies, institution descriptions, bibliographic references, and references to text
corpora.
[SOURCE: ISO 16642:2017, 3.2, modified — "terminological entries" replaced by "concept entries"
within definition; "Note 1 to entry" replaced by "EXAMPLE"; "are typical examples of complementary
information" removed in the example.]
3.2.7
concept entry
CE
terminological entry
part of a terminological data collection (3.2.4) which contains the terminological data related to one
concept (3.1.2)
[SOURCE: ISO 16642:2017, 3.22, modified — "concept entry" and acronym "CE" added as preferred
terms; preferred term "terminological entry" made admitted term; preferred term "TE" removed;
Note 1 to entry removed.]
3.2.8
language section
LS
part of a concept entry (3.2.7) containing information related to one language
[SOURCE: ISO 16642:2017, 3.13, modified — "terminological entry" replaced by "concept entry"; Note 1
to entry removed.]
3.2.9
term section
TS
part of a language section (3.2.8) containing information about a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.20, modified — "giving" replaced by "containing".]
3.2.10
term component section
TCS
part of a term section (3.2.9) containing linguistic information about the components of a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.19, modified — "giving" replaced by "containing".]
3.2.11
data category
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes
(e.g. /part of speech/).
[SOURCE: ISO 12620:2019, 3.2, modified — preferred term "DC" removed.]
3.2.12
repeatability
principle whereby a data category (3.2.11) can be repeated within a database definition and whereby it
can also be combined with other data categories
3.2.13
concept orientation
principle whereby a concept entry (3.2.7) describes a single concept (3.1.2)
Note 1 to entry: When two or more different concepts are represented by the same designation (in the same
language), this designation is considered a homograph. Such concepts are documented in separate concept
entries.
3.2.14
term autonomy
principle whereby all terms (3.1.4) in a concept entry (3.2.7) are considered independent sub-units and
can be described using the same set of data categories (3.2.11)
3.2.15
data granularity
degree of data precision
EXAMPLE The set of individual data categories /part of speech/, /grammatical gender/, and /grammatical
number/ provides for greater data granularity than does the single data category /grammar/.
3.2.16
data elementarity
principle whereby a data field contains only one data element
EXAMPLE Including both a full form and an abbreviation of a term in the same data field would
be a violation of data elementarity.
3.2.17
data-modeling variation
variation in data models (3.2.2) describing the same information
4 Terminology database design
4.1 General
Terminology database design requires a deep understanding of terminology theory and terminology
work. In this sense, and to achieve high quality results, the following shall be used:
— established terms and definitions as specified in ISO 1087;
— principles and methods as specified in ISO 704;
— data-modeling criteria as specified in ISO 16642 and ISO 12620;
— usability metrics as specified in ISO 23185.
Terminology databases have a logical structure that is reflected in a fundamental hierarchical data model
(as described in 4.2) containing various levels at which data categories (see 4.3) can be anchored. This
data-modeling approach provides the necessary flexibility, since the design of a terminology database
is always subject to specific work profiles (terminology work, technical communication, translation,
etc.) and to organizational needs (freelancers, translation agencies, company or organization in-house
departments, etc.). Thus, in the very early design process, a long-term and detailed management plan
shall be defined, taking into consideration all possible user groups, as well as organizational and
technical issues in order to avoid the need for substantial, time consuming and costly changes after
concluding the design process.
4.2 Terminological metamodel
Terminology databases shall comply with the terminological metamodel (or a subset thereof) defined
in ISO 16642:2017 (see Figure 1 and Figure 2). The essence of the metamodel is the principle
of concept orientation (see 4.4.1), i.e.:
— in a terminology database, an entry (concept entry = CE) describes a concept that can be further
described using:
    — additional sublevels for instantiating languages and language-specific information (language
    section = LS), further subdivided by:
        — terms and term-specific information (term section = TS);
        — individual words of a multiword term or one of the components of a single-word term, such as a
        morpheme (term component section = TCS).
Furthermore, the metamodel provides high-level containers that allow for documenting:
— global information (GI) that applies to the complete terminology database (name of the terminology
database, institution or individual originating the file, copyright information, history, etc.);
— complementary information (CI) such as complete bibliographical or administrative information,
binary data, picklist values or references to text corpora that are referenced from concept entries.
The above-mentioned levels of the terminological metamodel can be schematized as shown in Figure 1.
Figure 1 — Terminological metamodel — Schematic view (adapted from ISO 16642:2017)
The metamodel levels, their relationships and cardinalities can also be expressed using UML (Unified
Modeling Language) as shown in Figure 2.
Cardinalities
1..1 = Shall occur exactly once.
1..* = Shall occur one or more times.
0..1 = May occur zero or one time.
0..* = May occur zero or more times.
Figure 2 — Terminological metamodel — UML diagram (adapted from ISO 16642:2017)
An example of a terminology database excerpt compliant with the terminological metamodel is given in
Annex A.
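As an informal illustration, the nesting of the metamodel levels can be sketched with Python data classes. The class names, field names and sample values below are assumptions made for this sketch only; they are not defined by ISO 16642:2017 or by this document, and the cardinalities of Figure 2 are deliberately not enforced.

```python
from dataclasses import dataclass, field


@dataclass
class TermComponentSection:  # TCS: one component of a term
    component: str


@dataclass
class TermSection:  # TS: one term plus term-specific information
    term: str
    components: list[TermComponentSection] = field(default_factory=list)


@dataclass
class LanguageSection:  # LS: everything recorded for one language
    language: str
    term_sections: list[TermSection] = field(default_factory=list)


@dataclass
class ConceptEntry:  # CE: describes exactly one concept
    entry_id: str
    language_sections: list[LanguageSection] = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:  # TDC: the whole collection
    global_information: dict[str, str]  # GI
    complementary_information: dict[str, str] = field(default_factory=dict)  # CI
    concept_entries: list[ConceptEntry] = field(default_factory=list)


def terms_by_language(entry: ConceptEntry) -> dict[str, list[str]]:
    """Collect the terms of one concept entry, grouped by language."""
    return {
        ls.language: [ts.term for ts in ls.term_sections]
        for ls in entry.language_sections
    }


# Hypothetical sample data: one concept, two language sections.
entry = ConceptEntry(
    entry_id="CE-1",
    language_sections=[
        LanguageSection("en", [TermSection("laser printer")]),
        LanguageSection("de", [TermSection("Laserdrucker")]),
    ],
)
tdc = TerminologicalDataCollection({"title": "Sample termbase"}, concept_entries=[entry])
```

Because each level only holds a list of the level below it, a data category can be attached at the level where it applies by adding a field to the corresponding class.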
4.3 Data categories
4.3.1 General
Concept entries are made up of specific units of information, such as terms, definitions and contexts, and
each class of these units of information is identified by a data category. Over recent decades terminology
practitioners have gathered, standardized, classified, and published relevant data categories that are
stored in recognized data category repositories, such as DatCatInfo (see Reference [11]). For instance,
/grammatical gender/ is a data category that would typically be used as a field name, and "feminine",
"masculine", "neuter", etc. would occur as values of this field. Of course, terminology database designers
can create and use their own data categories according to their specific needs. However, care shall be taken
when creating new data categories or when adapting the names for the data categories proposed by
recognized data category repositories to avoid confusion or misinterpretation.
ISO 10241-1:2011 and ISO 12616:2002 include a list of common data categories used to document the
following units of information:
— terms (in any desired language);
— term-specific information (such as grammatical attributes, term types, register, status, contexts, etc.);
— language-specific information (such as definitions, notes, etc.);
— concept-specific information (such as subject fields, definitions, examples, notes, graphics, etc.);
— administrative and bibliographical information:
    — identifiers of various sorts, such as to identify products or projects with which the terms are
    associated;
    — dates, names of people who created or modified the concept entry or parts of it;
    — entry status, for example "submitted", "working", "approved";
    — sources of terms, definitions, contexts, notes, etc.
Designers who need new data categories that are not included in recognized data category repositories
shall adhere to the requirements for creating, documenting, harmonizing, and maintaining data
category specifications as defined in ISO 12620.
4.3.2 Types of data categories
4.3.2.1 General
Terminology databases typically contain several different types of data categories depending on the
kind of information they contain. Data categories can be classified in three ways, namely:
— open and closed data categories;
— mandatory, optional, system-generated, and default data categories;
— read-write, read-only, and hidden data categories.
Careful selection of data category types ensures higher quality content in a terminology database.
4.3.2.2 Open and closed data categories
Open data categories can contain any text. For instance, /term/ is considered open, because the actual
term that can be recorded in the corresponding data field is unpredictable. Other examples of open data
categories are /definition/ or /context/.
In contrast, closed data categories can only contain one or more of a finite set of permissible values.
When documenting terms in the German language, for instance, /grammatical gender/ may only
take the values "masculine", "feminine", or "neuter". Apart from /grammatical gender/, typical
representatives of closed data categories are /part of speech/, /term type/, /geographical usage/,
/administrative status/ or /subject field/.
Values themselves can constitute data categories such as /masculine/, for example, with the Boolean
values "yes|no" or "true|false".
The use of picklists ensures that only predefined values can be selected, thus preventing the insertion of
inadmissible values or misspelled or variant forms. For instance, left to their own devices, users might
type "masculine", "masc.", or simply "m." for a masculine noun. Providing a uniform representation of
these values ensures consistency throughout the terminology database, which is important for ensuring
the performance of searching, filtering, data exchange, and other data management tasks.
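A closed data category backed by a picklist can be sketched as an enumeration. The sketch below is illustrative only; the names GrammaticalGender and parse_gender are assumptions and do not come from this document or from any particular terminology management system.

```python
from enum import Enum


class GrammaticalGender(Enum):
    """Permissible values (picklist) for /grammatical gender/."""
    MASCULINE = "masculine"
    FEMININE = "feminine"
    NEUTER = "neuter"


def parse_gender(value: str) -> GrammaticalGender:
    """Accept only picklist values; reject variants such as "masc." or "m."."""
    try:
        return GrammaticalGender(value)
    except ValueError:
        allowed = ", ".join(g.value for g in GrammaticalGender)
        raise ValueError(
            f"{value!r} is not a permissible value; choose one of: {allowed}"
        ) from None
```

Rejecting "masc." at input time, rather than storing it and normalizing later, keeps a single representation per value in the database, which is what makes searching and filtering reliable.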
4.3.2.3 Mandatory, optional, system-generated, and default data categories
Data categories can be either mandatory or optional. Typically, a terminology management system does
not allow users to save concept entries if a mandatory field is empty. At least one language and one term
in that language shall be required for each concept entry, unless the concept has yet to be named or it is
represented by a diagram or a node in the concept diagram.
Other typical mandatory fields include information about the subject field or part of speech, both of
which can be essential for differentiating homographs. However, designating certain data categories to
be mandatory can sometimes be problematic. For instance, it can take considerable time and effort to
find definitions and contexts. There can be even more complex conditions, such as forcing a source for a
definition or the grammatical gender in case of nouns for a specific language section.
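A save-time check for mandatory fields could look like the following minimal sketch. The dictionary layout, the function name and the choice of /subject field/ as the only configured mandatory data category are assumptions made for illustration; the exception for concepts that have not yet been named is not modelled here.

```python
def missing_mandatory_fields(
    entry: dict, mandatory: tuple[str, ...] = ("subject field",)
) -> list[str]:
    """Return the problems that would block saving; an empty list means "can be saved"."""
    # Configured mandatory data categories that are absent or empty.
    problems = [name for name in mandatory if not entry.get(name)]
    # "languages" maps a language code to the list of terms recorded for it.
    languages = entry.get("languages", {})
    if not any(terms for terms in languages.values()):
        problems.append("at least one term in at least one language")
    return problems
```

Returning the list of problems, instead of a plain true or false, allows the user interface to tell the user exactly which fields still need to be filled in.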
System-generated data fields are not filled in manually by the user; they are populated by the system.
For instance, many terminology management systems automatically assign entry numbers to concept
entries, as well as creation and modification dates, and the names of the users who created or modified
the concept entry.
Part of terminology database design involves making decisions about the levels within the concept
entry that will be documented with such administrative information. It is generally insufficient to
record such administrative information only for the concept entry as a whole, because different people
can be responsible for different parts of the concept entry, especially for different language sections.
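Recording system-generated administrative information at more than one level can be sketched as follows. All class names, field names and user names are hypothetical, and the in-memory counter merely stands in for whatever mechanism a real system uses to assign entry numbers.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count
from typing import Optional

_entry_numbers = count(1)  # hypothetical source of entry numbers


@dataclass
class AdminInfo:
    """Administrative data filled in by the system, never typed by the user."""
    created_by: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    modified_by: Optional[str] = None
    modified_at: Optional[datetime] = None

    def touch(self, user: str) -> None:
        """Record who modified this level and when."""
        self.modified_by = user
        self.modified_at = datetime.now(timezone.utc)


@dataclass
class LanguageSection:
    language: str
    admin: AdminInfo  # administrative data for this language section only


@dataclass
class ConceptEntry:
    admin: AdminInfo  # administrative data for the entry as a whole
    entry_number: int = field(default_factory=lambda: next(_entry_numbers))
    language_sections: list[LanguageSection] = field(default_factory=list)


entry = ConceptEntry(admin=AdminInfo(created_by="alice"))
german = LanguageSection("de", AdminInfo(created_by="bob"))
entry.language_sections.append(german)
german.admin.touch("carol")  # only the German section is marked as modified
```

Keeping a separate AdminInfo per level means that a change to the German language section does not overwrite the record of who created or last modified the concept entry itself.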
It is also possible to set default values for certain data categories. For instance, if a user is going to
document terms associated with a single project or source text, it can be convenient to allow the user to
pre-set the values for the corresponding data categories /project subset/ or /source/ so that all concept
entries created automatically contain those values.
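Pre-set default values can be sketched as a dictionary that is merged into every new concept entry. The function name, the dictionary layout and the sample values are assumptions made for illustration only.

```python
def new_entry(defaults: dict[str, str], values: dict[str, str]) -> dict[str, str]:
    """Create an entry from the pre-set defaults; explicitly entered values win."""
    return {**defaults, **values}


# Hypothetical defaults chosen by the user at the start of a working session.
session_defaults = {
    "project subset": "hypothetical-project",
    "source": "hypothetical-source-text",
}

first = new_entry(session_defaults, {"term": "laser printer"})
second = new_entry(session_defaults, {"term": "pacemaker", "source": "another-source"})
```

In this sketch the defaults are only a starting point: the second entry overrides /source/ without changing the defaults used for later entries.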
4.3.2.4 Read-write, read-only, and hidden data categories
In a terminology database, it shall be possible to set different access levels for users, depending on
the user's needs and role. A field that is visible and can be modified is a "read-write" field. A field that
is visible but cannot be modified is a "read-only" field. Fields that are not needed by som
...
ISO 26162-1:2019(E)
Introduction
Terminologies are the totality of concepts in given subject fields represented by terms and other
designations and described by using additional terminological data. In general, these data are organized
in structured terminology databases and are usually manipulated in specific software applications
called terminology management systems. Terminology databases usually vary with regard to their
underlying data models and consist of different sets of data categories, while terminology management
systems generally differ depending on their functionality and the platform they are designed for.
The ISO 26162 series gives guidance on designing terminology databases and on essential terminology
management system features. The series can also be used to evaluate the conformance and suitability
of terminology databases and terminology management systems.
© ISO 2019 – All rights reserved v
---------------------- Page: 5 ----------------------
INTERNATIONAL STANDARD ISO 26162-1:2019(E)
Management of terminology resources — Terminology
databases —
Part 1:
Design
1 Scope
This document specifies general, i.e. implementation- and use-case-independent terminology database
design principles to enable maximum efficiency and quality in terminology work. Thus, this document
supports creating, processing, and using high quality terminology. The intended audiences of this
document are terminologists, translators, interpreters, technical communicators, language planners,
subject field experts, and terminology management system developers.
This document describes a maximum approach, i.e. terminology database design for distributed,
multilingual terminology management. It can also be used for designing smaller solutions.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work — Vocabulary
ISO 12620, Management of terminology resources — Data category specifications
ISO 16642:2017, Computer applications in terminology — Terminological markup framework
ISO 23185, Assessment and benchmarking of terminological resources — General concepts, principles and
requirements
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https:// www .iso .org/ obp
— IEC Electropedia: available at http:// www .electropedia .org/
3.1 Concepts
3.1.1
object
anything perceivable or conceivable
Note 1 to entry: Objects can be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion
ratio, a project plan) or imagined (e.g. a unicorn, a scientific hypothesis).
© ISO 2019 – All rights reserved 1
---------------------- Page: 6 ----------------------
ISO 26162-1:2019(E)
Note 2 to entry: Objects can undergo changes which cause conceptual or designation change.
[SOURCE: ISO 1087:2019, 3.1.1, modified — Note 2 to entry added.]
3.1.2
concept
unit of knowledge created by a unique combination of characteristics
Note 1 to entry: Concepts are not necessarily bound to particular natural languages. They are, however,
influenced by the social or cultural background which often leads to different categorizations.
Note 2 to entry: Due to their dynamic nature, concepts are also defined as units of thinking (see ISO 704:2009, 5.1
and DIN 2342:2011-08, 4.1).
[SOURCE: ISO 1087:2019, 3.2.7, modified — former Note 2 to entry removed and replaced by a new
Note 2 to entry.]
3.1.3
designation
designator
representation of a concept (3.1.2) by a sign which denotes it in a domain or subject
Note 1 to entry: A designation can be linguistic or non-linguistic. It can consist of various types of characters, but
also punctuation marks such as hyphens and parentheses, governed by domain-, subject-, or language-specific
conventions.
Note 2 to entry: A designation may be a term (3.1.4) including appellations, a proper name, or a symbol.
[SOURCE: ISO 1087:2019, 3.4.1]
3.1.4
term
designation (3.1.3) that represents a general concept by linguistic means
EXAMPLE "laser printer", "planet", "pacemaker", "chemical compound", "¾ time", "Influenza A virus", "oil
painting".
Note 1 to entry: Terms may be partly or wholly verbal.
[SOURCE: ISO 1087:2019, 3.4.2]
3.2 Terminology databases
3.2.1
terminology database
termbase
database comprising a terminological data collection (3.2.4)
[SOURCE: ISO 30042:2019, 3.28, modified — admitted term "terminology database" made preferred
term and preferred term "termbase" made admitted term.]
3.2.2
data model
graphical and/or lexical representation of data, specifying their properties, structure, and inter-
relationships
[SOURCE: ISO/IEC 11179-1:2015, 3.2.7]
3.2.3
terminological metamodel
data model (3.2.2) that describes the basis for designing and implementing terminological data
collections (3.2.4)
2 © ISO 2019 – All rights reserved
---------------------- Page: 7 ----------------------
ISO 26162-1:2019(E)
3.2.4
terminological data collection
TDC
resource consisting of concept entries (3.2.7) with associated metadata and documentary information
[SOURCE: ISO 16642:2017, 3.21, modified — "terminological entries" replaced by "concept entries".]
3.2.5
global information
GI
technical and administrative information applying to the entire terminological data collection (3.2.4)
EXAMPLE The title of the terminological data collection, revision history, owner or copyright information.
[SOURCE: ISO 16642:2017, 3.11, modified — "Note 1 to entry" replaced by "EXAMPLE"; "For example,"
removed in the example.]
3.2.6
complementary information
CI
information supplementary to that described in concept entries (3.2.7) and shared across the
terminological data collection (3.2.4)
EXAMPLE Domain hierarchies, institution descriptions, bibliographic references, and references to text
corpora.
[SOURCE: ISO 16642:2017, 3.2, modified — "terminological entries" replaced by "concept entries"
within definition; "Note 1 to entry" replaced by "EXAMPLE"; "are typical examples of complementary
information" removed in the example.]
3.2.7
concept entry
CE
terminological entry
part of a terminological data collection (3.2.4) which contains the terminological data related to one
concept (3.1.2)
[SOURCE: ISO 16642:2017, 3.22, modified — "concept entry" and acronym "CE" added as preferred
terms; preferred term "terminological entry" made admitted term; preferred term "TE" removed;
Note 1 to entry removed.]
3.2.8
language section
LS
part of a concept entry (3.2.7) containing information related to one language
[SOURCE: ISO 16642:2017, 3.13, modified — "terminological entry" replaced by "concept entry"; Note 1
to entry removed.]
3.2.9
term section
TS
part of a language section (3.2.8) containing information about a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.20, modified — "giving" replaced by "containing".]
3.2.10
term component section
TCS
part of a term section (3.2.9) containing linguistic information about the components of a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.19, modified — "giving" replaced by "containing".]
3.2.11
data category
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes
(e.g. /part of speech/).
[SOURCE: ISO 12620:2019, 3.2, modified — preferred term "DC" removed.]
3.2.12
repeatability
principle whereby a data category (3.2.11) can be repeated within a database definition and whereby it
can also be combined with other data categories
3.2.13
concept orientation
principle whereby a concept entry (3.2.7) describes a single concept (3.1.2)
Note 1 to entry: When two or more different concepts are represented by the same designation (in the same
language), this designation is considered a homograph. Such concepts are documented in separate concept
entries.
3.2.14
term autonomy
principle whereby all terms (3.1.4) in a concept entry (3.2.7) are considered independent sub-units and
can be described using the same set of data categories (3.2.11)
3.2.15
data granularity
degree of data precision
EXAMPLE The set of individual data categories /part of speech/, /grammatical gender/, and /grammatical
number/ provides for greater data granularity than does the single data category /grammar/.
3.2.16
data elementarity
principle whereby a data field contains only one data element
EXAMPLE Including both a full form and an abbreviation of a term in the same data field would
be a violation of data elementarity.
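The difference between a coarse and a granular design can be illustrated with a minimal Python sketch. The record layout, the helper name `filter_by`, and the sample values are assumptions made for illustration only; they are not part of this document.

```python
# Coarse design: one /grammar/ field mixes three pieces of information.
coarse = [{"term": "Maus", "grammar": "noun, feminine, singular"}]

# Granular design: one data category per piece of information, one value per field.
granular = [
    {
        "term": "Maus",
        "part of speech": "noun",
        "grammatical gender": "feminine",
        "grammatical number": "singular",
    }
]


def filter_by(records, category, value):
    """Return the records whose field `category` equals `value` exactly."""
    return [r for r in records if r.get(category) == value]


# Exact filtering works on the granular records ...
feminine_terms = filter_by(granular, "grammatical gender", "feminine")

# ... but not on the coarse ones, where the value is buried in free text.
coarse_hits = filter_by(coarse, "grammar", "feminine")
```

With the granular layout, a query on /grammatical gender/ is a simple equality test; with the coarse layout the same query would need text parsing.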
3.2.17
data-modeling variation
variation in data models (3.2.2) describing the same information
4 Terminology database design
4.1 General
Terminology database design requires a deep understanding of terminology theory and terminology
work. Accordingly, to achieve high-quality results, the following shall be used:
— established terms and definitions as specified in ISO 1087;
— principles and methods as specified in ISO 704;
— data-modeling criteria as specified in ISO 16642 and ISO 12620;
— usability metrics as specified in ISO 23185.
Terminology databases have a logical structure that is reflected in a fundamental hierarchical data model
(as described in 4.2) containing various levels at which data categories (see 4.3) can be anchored. This
data-modeling approach provides the necessary flexibility, since the design of a terminology database
is always subject to specific work profiles (terminology work, technical communication, translation,
etc.) and to organizational needs (freelancers, translation agencies, company or organization in-house
departments, etc.). Thus, at the very start of the design process, a long-term and detailed management
plan shall be defined, taking into consideration all possible user groups as well as organizational and
technical issues, in order to avoid substantial, time-consuming and costly changes after the design
process has concluded.
4.2 Terminological metamodel
Terminology databases shall comply with the terminological metamodel (or a subset thereof) defined
in ISO 16642:2017 (see Figure 1 and Figure 2). The essence of the metamodel is the principle
of concept orientation (see 4.4.1), i.e.:
— in a terminology database, an entry (concept entry = CE) describes a concept that can be further
described using:
— additional sublevels for instantiating languages and language-specific information (language
section = LS), further subdivided by:
— terms and term-specific information (term section = TS);
— individual words of a multiword term or one of the components of a single-word term, such as a
morpheme (term component section = TCS).
Furthermore, the metamodel provides high-level containers that allow for documenting:
— global information (GI) that applies to the complete terminology database (name of the terminology
database, institution or individual originating the file, copyright information, history, etc.);
— complementary information (CI) such as complete bibliographical or administrative information,
binary data, picklist values or references to text corpora that are referenced from concept entries.
The above-mentioned levels of the terminological metamodel can be schematized as shown in Figure 1.
Figure 1 — Terminological metamodel — Schematic view (adapted from ISO 16642:2017)
The metamodel levels, their relationships and cardinalities can also be expressed using UML (Unified
Modeling Language) as shown in Figure 2.
Cardinalities
1..1 = Shall occur exactly once.
1..* = Shall occur one or more times.
0..1 = May occur zero or one time.
0..* = May occur zero or more times.
Figure 2 — Terminological metamodel — UML diagram (adapted from ISO 16642:2017)
An example of a terminology database excerpt compliant with the terminological metamodel is given in
Annex A.
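The nesting of the metamodel levels can be sketched in Python as follows. This is an illustrative sketch only: the class and attribute names are assumptions, free-form dictionaries stand in for data categories anchored at each level, and the cardinalities of Figure 2 are not enforced.

```python
from dataclasses import dataclass, field


@dataclass
class TermComponentSection:
    """TCS: linguistic information about one component of a term."""
    component: str
    data: dict[str, str] = field(default_factory=dict)


@dataclass
class TermSection:
    """TS: one term plus term-specific information."""
    term: str
    data: dict[str, str] = field(default_factory=dict)
    components: list[TermComponentSection] = field(default_factory=list)


@dataclass
class LanguageSection:
    """LS: information related to one language within a concept entry."""
    language: str
    term_sections: list[TermSection] = field(default_factory=list)
    data: dict[str, str] = field(default_factory=dict)


@dataclass
class ConceptEntry:
    """CE: the terminological data related to one concept."""
    entry_id: str
    data: dict[str, str] = field(default_factory=dict)
    language_sections: list[LanguageSection] = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:
    """TDC: concept entries plus global (GI) and complementary (CI) information."""
    global_info: dict[str, str]
    complementary_info: dict[str, str] = field(default_factory=dict)
    entries: list[ConceptEntry] = field(default_factory=list)


def terms_by_language(entry: ConceptEntry) -> dict[str, list[str]]:
    """Collect the terms of one concept entry, grouped by language."""
    return {
        ls.language: [ts.term for ts in ls.term_sections]
        for ls in entry.language_sections
    }


# Hypothetical sample data: one concept with an English and a German term.
entry = ConceptEntry(
    entry_id="CE-1",
    data={"subject field": "printing"},
    language_sections=[
        LanguageSection(
            "en",
            [
                TermSection(
                    "laser printer",
                    {"part of speech": "noun"},
                    [TermComponentSection("laser"), TermComponentSection("printer")],
                )
            ],
        ),
        LanguageSection(
            "de",
            [TermSection("Laserdrucker", {"grammatical gender": "masculine"})],
        ),
    ],
)
tdc = TerminologicalDataCollection(
    global_info={"title": "Demo terminology database"},
    entries=[entry],
)
```

Because every term lives in its own term section under a language section, both terms can carry their own data categories while still belonging to the same concept entry.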
4.3 Data categories
4.3.1 General
Concept entries are made up of specific units of information, such as terms, definitions and contexts, and
each class of these units of information is identified by a data category. Over recent decades terminology
practitioners have gathered, standardized, classified, and published relevant data categories that are
stored in recognized data category repositories, such as DatCatInfo (see Reference [11]). For instance,
/grammatical gender/ is a data category that would typically be used as a field name, and "feminine",
"masculine", "neuter", etc. would occur as values of this field. Terminology database designers
can also create and use their own data categories according to their specific needs. However, care shall be taken
when creating new data categories or when adapting the names for the data categories proposed by
recognized data category repositories to avoid confusion or misinterpretation.
ISO 10241-1:2011 and ISO 12616:2002 include a list of common data categories used to document the
following units of information:
— terms (in any desired language);
— term-specific information (such as grammatical attributes, term types, register, status, contexts, etc.);
— language-specific information (such as definitions, notes, etc.);
— concept-specific information (such as subject fields, definitions, examples, notes, graphics, etc.);
— administrative and bibliographical information:
— identifiers of various sorts, such as to identify products or projects with which the terms are
associated;
— dates, names of people who created or modified the concept entry or parts of it;
— entry status, for example "submitted", "working", "approved";
— sources of terms, definitions, contexts, notes, etc.
Designers who need new data categories that are not included in recognized data category repositories
shall adhere to the requirements for creating, documenting, harmonizing, and maintaining data
category specifications as defined in ISO 12620.
4.3.2 Types of data categories
4.3.2.1 General
Terminology databases typically contain several different types of data categories depending on the
kind of information they contain. Data category types can be grouped along three dimensions, namely:
— open and closed data categories;
— mandatory, optional, system-generated, and default data categories;
— read-write, read-only, and hidden data categories.
Careful selection of data category types ensures higher quality content in a terminology database.
4.3.2.2 Open and closed data categories
Open data categories can contain any text. For instance, /term/ is considered open, because the actual
term that can be recorded in the corresponding data field is unpredictable. Other examples of open data
categories are /definition/ or /context/.
In contrast, closed data categories can only contain one or more of a finite set of permissible values.
When documenting terms in the German language, for instance, /grammatical gender/ may only
take the values "masculine", "feminine", or "neuter". Apart from /grammatical gender/, typical
representatives of closed data categories are /part of speech/, /term type/, /geographical usage/,
/administrative status/ or /subject field/.
Values themselves can also be modeled as data categories: /masculine/, for example, can take the
Boolean values "yes|no" or "true|false".
The use of picklists ensures that only predefined values can be selected, thus preventing the insertion of
inadmissible values or misspelled or variant forms. For instance, left to their own devices, users might
type "masculine", "masc.", or simply "m." for a masculine noun. Providing a uniform representation of
these values ensures consistency throughout the terminology database, which is important for ensuring
the performance of searching, filtering, data exchange, and other data management tasks.
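A picklist check for closed data categories might look like the following Python sketch. The `PICKLISTS` table and the `validate` helper are hypothetical names chosen for illustration; the permissible values are the German /grammatical gender/ values mentioned above.

```python
# Hypothetical picklists: closed data categories mapped to their permissible values.
PICKLISTS = {
    "grammatical gender": ("masculine", "feminine", "neuter"),
}


def validate(category: str, value: str) -> str:
    """Return `value` if it is permitted for `category`, else raise ValueError.

    Categories without a picklist are treated as open and accept any text.
    """
    allowed = PICKLISTS.get(category)
    if allowed is not None and value not in allowed:
        raise ValueError(
            f"{value!r} is not a permissible value for /{category}/; "
            f"expected one of {allowed}"
        )
    return value


accepted = validate("grammatical gender", "masculine")   # closed category, valid value
free_text = validate("definition", "device that prints")  # open category, any text

try:
    validate("grammatical gender", "masc.")  # variant form is rejected
    rejected = False
except ValueError:
    rejected = True
```

Rejecting "masc." at entry time is what keeps later searching and filtering on /grammatical gender/ reliable.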
4.3.2.3 Mandatory, optional, system-generated, and default data categories
Data categories can be either mandatory or optional. Typically, a terminology management system does
not allow users to save concept entries if a mandatory field is empty. At least one language and one term
in that language shall be required for each concept entry, unless the concept has yet to be named or it is
represented by a diagram or a node in the concept diagram.
Other typical mandatory fields include information about the subject field or part of speech, both of
which can be essential for differentiating homographs. However, designating certain data categories to
be mandatory can sometimes be problematic. For instance, it can take considerable time and effort to
find definitions and contexts. Conditions can also be more complex, such as requiring a source for
every definition, or requiring the grammatical gender for nouns in a specific language section.
System-generated data fields are not filled in manually by the user; they are populated by the system.
For instance, many terminology management systems automatically assign entry numbers to concept
entries, as well as creation and modification dates, and the names of the users who created or modified
the concept entry.
Part of terminology database design involves making decisions about the levels within the concept
entry that will be documented with such administrative information. It is generally insufficient to
record such administrative information only for the concept entry as a whole, because different people
can be responsible for different parts of the concept entry, especially for different language sections.
It is also possible to set default values for certain data categories. For instance, if a user is going to
document terms associated with a single project or source text, it can be convenient to allow the user to
pre-set the values for the corresponding data categories /project subset/ or /source/ so that all concept
entries created automatically contain those values.
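The interplay of mandatory, system-generated, and default data categories can be sketched in Python as follows. The function `create_entry`, the flat dictionary layout, and the field names are assumptions for illustration; the sketch also ignores the exception for concepts that have not yet been named.

```python
from datetime import datetime, timezone
from itertools import count

# System-generated entry numbers: assigned by the system, never typed by the user.
_entry_numbers = count(1)


def create_entry(language, term, created_by, defaults=None):
    """Create a simplified concept entry as a flat dictionary.

    Mandatory: a language and one term in that language.
    Default: values pre-set by the user, e.g. /project subset/ or /source/.
    System-generated: entry number, creation date, creating user.
    """
    if not language or not term:
        raise ValueError("a language and a term in that language are mandatory")
    entry = dict(defaults or {})  # start from the pre-set default values
    entry.update(
        {
            "language": language,
            "term": term,
            # System-generated values are written last so defaults cannot override them.
            "entry number": next(_entry_numbers),
            "creation date": datetime.now(timezone.utc).isoformat(),
            "created by": created_by,
        }
    )
    return entry


project_defaults = {"project subset": "Printer manual", "source": "User guide v2"}
first = create_entry("en", "laser printer", "alice", project_defaults)
second = create_entry("en", "oil painting", "alice", project_defaults)
```

Both entries carry the pre-set /project subset/ and /source/ values without the user retyping them, and each receives its own entry number.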
4.3.2.4 Read-write, read-only, and hidden data categories
In a terminology database, it shall be possible to set different access levels for users, depending on
the user's needs and role. A field that is visible and can be modified is a "read-write" field. A field that
is visible but cannot be modified is a "read-only" field. Fields that are not needed by some users are
usually hidden.
For instance, it can be desirable for a lead terminologist to have read-write access to all fields, especially
to the concept entry status field, to set the corresponding value ("proposed", "under review", "approved",
"deprecated", etc.). In general, this person is also responsible for assigning authorization levels to other
users. Users editing concept entries for specific languages should only have read-write access to the
fields in their language sections. Product developers, technical communicators, translators, and service
and marketing staff can have read-write access to fields necessary to allow them to provide feedback
to the lead terminologists, and read-only rights to the remaining fields. It is also frequently desirable to
hide administrative fields such as "date" and "responsibility" to ensure that users have a less cluttered
view of the data.
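Field-level access can be sketched in Python as follows. The role names, field names, and the rule that unlisted fields are hidden are illustrative assumptions, not requirements of this document.

```python
from enum import Enum


class Access(Enum):
    READ_WRITE = "read-write"
    READ_ONLY = "read-only"
    HIDDEN = "hidden"


# Hypothetical access table: role -> field -> access level.
ROLE_ACCESS = {
    "lead terminologist": {
        "term": Access.READ_WRITE,
        "entry status": Access.READ_WRITE,
        "date": Access.READ_WRITE,
    },
    "translator": {
        "term": Access.READ_ONLY,
        "entry status": Access.READ_ONLY,
        "feedback": Access.READ_WRITE,
        "date": Access.HIDDEN,
    },
}


def access_for(role: str, field_name: str) -> Access:
    """Look up the access level; fields not listed for a role are hidden."""
    return ROLE_ACCESS.get(role, {}).get(field_name, Access.HIDDEN)


def can_edit(role: str, field_name: str) -> bool:
    """True only for read-write fields."""
    return access_for(role, field_name) is Access.READ_WRITE


def visible_fields(role: str, entry: dict) -> dict:
    """Return only the fields of `entry` that the role is allowed to see."""
    return {
        name: value
        for name, value in entry.items()
        if access_for(role, name) is not Access.HIDDEN
    }


entry = {"term": "laser printer", "entry status": "approved", "date": "2019-08-07"}
translator_view = visible_fields("translator", entry)
```

Hiding the administrative /date/ field for translators gives them the less cluttered view described above, while the lead terminologist still sees and edits every field.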
4.3.3 Shared resources
Some data point to resources that reside outside the concept entry, such as figures, audio, video,
websites, full sets of bibliographical and personal data, references to text corpora, or other concept
entries. These resources are called shared resources, because any one resource can be referenced from
many concept entries.
For example, a field for figures is typically available at the concept lev
...
SLOVENIAN STANDARD
oSIST ISO/FDIS 26162-1:2019
1 October 2019
Upravljanje terminoloških virov - Terminološke baze podatkov - 1. del: Zasnova
Management of terminology resources -- Terminology databases -- Part 1: Design
Systèmes de gestion de la terminologie, de la connaissance et du contenu -- Bases de données terminologiques -- Partie 1: Conception
This Slovenian standard is identical to: ISO/FDIS 26162-1:2019
ICS:
01.020 Terminology (principles and coordination)
35.240.30 IT applications in information, documentation and publishing
oSIST ISO/FDIS 26162-1:2019 en,fr,de
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
FINAL DRAFT INTERNATIONAL STANDARD ISO/FDIS 26162-1
ISO/TC 37/SC 3
Secretariat: DIN
Voting begins on: 2019-08-07
Voting terminates on: 2019-10-02
Management of terminology resources — Terminology databases — Part 1: Design
Gestion des ressources terminologiques — Bases de données terminologiques — Partie 1: Conception
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT
PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND
USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF
THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number: ISO/FDIS 26162-1:2019(E)
© ISO 2019
COPYRIGHT PROTECTED DOCUMENT
© ISO 2019
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Concepts
3.2 Terminology databases
4 Terminology database design
4.1 General
4.2 Terminological metamodel
4.3 Data categories
4.3.1 General
4.3.2 Types of data categories
4.3.3 Shared resources
4.3.4 Concept relations
4.4 Concept entries
4.4.1 Concept orientation
4.4.2 Language
4.4.3 Dependency and repeatability of data categories
4.4.4 Data granularity
4.4.5 Data elementarity
4.4.6 Data-modeling variation
4.5 Roles
Annex A (informative) Terminology database excerpt based on the terminological metamodel — Example
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 3, Management of terminology resources.
This first edition of ISO 26162-1, together with ISO 26162-2, cancels and replaces ISO 26162:2012, which
has been technically revised.
The main changes compared to the previous edition are as follows:
— the document has been split into parts. The first part focuses on terminology database design,
the second part on the development of terminology management systems;
— all references to generic software design principles and specific use cases have been removed.
A list of all parts of the ISO 26162 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Terminologies are the totality of concepts in given subject fields represented by terms and other
designations and described by using additional terminological data. In general, these data are organized
in structured terminology databases and are usually manipulated in specific software applications
called terminology management systems. Terminology databases usually vary with regard to their
underlying data models and consist of different sets of data categories, while terminology management
systems generally differ depending on their functionality and the platform they are designed for.
The ISO 26162 series gives guidance on designing terminology databases and on essential terminology
management system features. The series can also be used to evaluate the conformance and suitability
of terminology databases and terminology management systems.
FINAL DRAFT INTERNATIONAL STANDARD ISO/FDIS 26162-1:2019(E)
Management of terminology resources — Terminology
databases —
Part 1:
Design
1 Scope
This document specifies general, i.e. implementation- and use-case-independent terminology database
design principles to enable maximum efficiency and quality in terminology work. Thus, this document
supports creating, processing, and using high quality terminology. The intended audiences of this
document are terminologists, translators, interpreters, technical communicators, language planners,
subject field experts, and terminology management system developers.
This document describes a maximum approach, i.e. terminology database design for distributed,
multilingual terminology management. It can also be used for designing smaller solutions.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 704, Terminology work — Principles and methods
ISO 1087, Terminology work — Vocabulary
ISO 12620, Management of terminology resources — Data category specifications
ISO 16642:2017, Computer applications in terminology — Terminological markup framework
ISO 23185, Assessment and benchmarking of terminological resources — General concepts, principles and
requirements
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 1087 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1 Concepts
3.1.1
object
anything perceivable or conceivable
Note 1 to entry: Objects can be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion
ratio, a project plan) or imagined (e.g. a unicorn, a scientific hypothesis).
Note 2 to entry: Objects can undergo changes which cause conceptual or designation change.
[SOURCE: ISO 1087:—1), 3.1.1, modified — Note 2 to entry added.]
3.1.2
concept
unit of knowledge created by a unique combination of characteristics
Note 1 to entry: Concepts are not necessarily bound to particular natural languages. They are, however,
influenced by the social or cultural background which often leads to different categorizations.
Note 2 to entry: Due to their dynamic nature, concepts are also defined as units of thinking (see ISO 704:2009, 5.1
and DIN 2342:2011-08, 4.1).
[SOURCE: ISO 1087:—1), 3.2.7, modified — former Note 2 to entry removed and replaced by a new Note 2
to entry.]
3.1.3
designation
designator
representation of a concept (3.1.2) by a sign which denotes it in a domain or subject
Note 1 to entry: A designation can be linguistic or non-linguistic. It can consist of various types of characters, but
also punctuation marks such as hyphens and parentheses, governed by domain-, subject-, or language-specific
conventions.
Note 2 to entry: A designation may be a term (3.1.4) including appellations, a proper name, or a symbol.
[SOURCE: ISO 1087:—1), 3.4.1]
3.1.4
term
designation (3.1.3) that represents a general concept by linguistic means
EXAMPLE "laser printer", "planet", "pacemaker", "chemical compound", "¾ time", "Influenza A virus", "oil
painting".
Note 1 to entry: Terms may be partly or wholly verbal.
[SOURCE: ISO 1087:—1), 3.4.2]
3.2 Terminology databases
3.2.1
terminology database
termbase
database comprising a terminological data collection (3.2.4)
[SOURCE: ISO 30042:2019, 3.28, modified — admitted term "terminology database" made preferred
term and preferred term "termbase" made admitted term.]
3.2.2
data model
graphical and/or lexical representation of data, specifying their properties, structure, and inter-
relationships
[SOURCE: ISO/IEC 11179-1:2015, 3.2.7]
1) Under preparation. Stage at the time of publication: ISO/FDIS 1087:2019.
3.2.3
terminological metamodel
data model (3.2.2) that describes the basis for designing and implementing terminological data
collections (3.2.4)
3.2.4
terminological data collection
TDC
resource consisting of concept entries (3.2.7) with associated metadata and documentary information
[SOURCE: ISO 16642:2017, 3.21, modified — "terminological entries" replaced by "concept entries".]
3.2.5
global information
GI
technical and administrative information applying to the entire terminological data collection (3.2.4)
EXAMPLE The title of the terminological data collection, revision history, owner or copyright information.
[SOURCE: ISO 16642:2017, 3.11, modified — "Note 1 to entry" replaced by "EXAMPLE"; "For example,"
removed in the example.]
3.2.6
complementary information
CI
information supplementary to that described in concept entries (3.2.7) and shared across the
terminological data collection (3.2.4)
EXAMPLE Domain hierarchies, institution descriptions, bibliographic references, and references to text
corpora.
[SOURCE: ISO 16642:2017, 3.2, modified — "terminological entries" replaced by "concept entries"
within definition; "Note 1 to entry" replaced by "EXAMPLE", "are typical examples of complementary
information" removed in the example.]
3.2.7
concept entry
CE
terminological entry
part of a terminological data collection (3.2.4) which contains the terminological data related to one
concept (3.1.2)
[SOURCE: ISO 16642:2017, 3.22, modified — "concept entry" and acronym "CE" added as preferred
terms; preferred term "terminological entry" made admitted term; preferred term "TE" removed;
Note 1 to entry removed.]
3.2.8
language section
LS
part of a concept entry (3.2.7) containing information related to one language
[SOURCE: ISO 16642:2017, 3.13, modified — "terminological entry" replaced by "concept entry"; Note 1
to entry removed.]
3.2.9
term section
TS
part of a language section (3.2.8) containing information about a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.20, modified — "giving" replaced by "containing".]
© ISO 2019 – All rights reserved 3
---------------------- Page: 11 ----------------------
oSIST ISO/FDIS 26162-1:2019
ISO/FDIS 26162-1:2019(E)
3.2.10
term component section
TCS
part of a term section (3.2.9) containing linguistic information about the components of a term (3.1.4)
[SOURCE: ISO 16642:2017, 3.19, modified — "giving" replaced by "containing.]
3.2.11
data category
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text, such as in this document, data category names are enclosed in forward slashes
(e.g. /part of speech/).
[SOURCE: ISO 12620:2019, 3.2, modified — preferred term "DC" removed".]
3.2.12
repeatability
principle whereby a data category (3.2.11) can be repeated within a database definition and whereby it
can also be combined with other data categories
3.2.13
concept orientation
principle whereby a concept entry (3.2.7) describes a single concept (3.1.2)
Note 1 to entry: When two or more different concepts are represented by the same designation (in the same
language), this designation is considered a homograph. Such concepts are documented in separate concept
entries.
3.2.14
term autonomy
principle whereby all terms (3.1.4) in a concept entry (3.2.7) are considered independent sub-units and
can be described using the same set of data categories (3.2.11)
3.2.15
data granularity
degree of data precision
EXAMPLE The set of individual data categories /part of speech/, /grammatical gender/, and /grammatical
number/ provides for greater data granularity than does the single data category /grammar/.
3.2.16
data elementarity
principle whereby a data field contains only one data element
EXAMPLE For example, including both a full form and an abbreviation of a term in the same data field would
be a violation of data elementarity.
3.2.17
data-modeling variation
variation in data models (3.2.2) describing the same information
4 © ISO 2019 – All rights reserved
---------------------- Page: 12 ----------------------
oSIST ISO/FDIS 26162-1:2019
ISO/FDIS 26162-1:2019(E)
4 Terminology database design
4.1 General
Terminology database design requires a deep understanding of terminology theory and terminology
work. In this sense, and to achieve high quality results, the following shall be used:
— established terms and definitions as specified in ISO 1087;
— principles and methods as specified in ISO 704;
— data-modeling criteria as specified in ISO 16642 and ISO 12620;
— usability metrics as specified in ISO 23185.
Terminology databases have a logical structure that is reflected in a fundamental hierarchical data model
(as described in 4.2) containing various levels at which data categories (see 4.3) can be anchored. This
data-modeling approach provides the necessary flexibility, since the design of a terminology database
is always subject to specific work profiles (terminology work, technical communication, translation,
etc.) and to organizational needs (freelancers, translation agencies, company or organization in-house
departments, etc.). Thus, in the very early design process, a long-term and detailed management plan
shall be defined, taking into consideration all possible user groups, as well as organizational and
technical issues in order to avoid the need for substantial, time consuming and costly changes after
concluding the design process.
4.2 Terminological metamodel
Terminology databases shall comply with the terminological metamodel (or a subset thereof) defined
in ISO 16642:2017 (see Figure 1 and Figure 2). The essence of the metamodel constitutes the principle
of concept orientation (see 4.4.1), i.e.:
— in a terminology database, an entry (concept entry = CE) describes a concept that can be further
described using:
— additional sublevels for instantiating languages and language-specific information (language
section = LS), further subdivided by:
— terms and term-specific information (term section = TS);
— individual words of a multiword term or one of the components of a single-word term, such as a
morpheme (term component section = TCS).
Furthermore, the metamodel provides high-level containers that allow for documenting:
— global information (GI) that applies to the complete terminology database (name of the terminology
database, institution or individual originating the file, copyright information, history, etc.);
— complementary information (CI) such as complete bibliographical or administrative information,
binary data, picklist values or references to text corpora that are referenced from concept entries.
The above-mentioned levels of the terminological metamodel can be schematized as shown in Figure 1.
Figure 1 — Terminological metamodel — Schematic view (adapted from ISO 16642:2017)
The metamodel levels, their relationships and cardinalities can also be expressed using UML (Unified
Modeling Language) as shown in Figure 2.
Cardinalities
1..1 = Shall occur exactly once.
1..* = Shall occur one or more times.
0..1 = May occur zero or one time.
0..* = May occur zero or more times.
Figure 2 — Terminological metamodel — UML diagram (adapted from ISO 16642:2017)
An example of a terminology database excerpt compliant with the terminological metamodel is given in
Annex A.
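As an informal illustration of the levels described in this subclause, the nesting of concept entry, language section, term section and term component section can be sketched in code. This is a minimal sketch only, not part of ISO 16642:2017 or of this document: the class names, field names and sample terms are assumptions chosen for illustration, and the global information (GI) and complementary information (CI) containers are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TermComponentSection:
    """TCS: one word of a multiword term, or a component such as a morpheme."""
    component: str


@dataclass
class TermSection:
    """TS: a term and its term-specific information."""
    term: str
    components: List[TermComponentSection] = field(default_factory=list)


@dataclass
class LanguageSection:
    """LS: language-specific information and the terms in that language."""
    language: str
    terms: List[TermSection] = field(default_factory=list)


@dataclass
class ConceptEntry:
    """CE: one entry describes one concept (concept orientation)."""
    entry_id: str  # hypothetical identifier field
    languages: List[LanguageSection] = field(default_factory=list)

    def all_terms(self) -> List[str]:
        """Collect every term of the concept, across all language sections."""
        return [ts.term for ls in self.languages for ts in ls.terms]


# Illustrative sample data: one concept, two languages.
entry = ConceptEntry(
    entry_id="CE-1",
    languages=[
        LanguageSection("en", [
            TermSection("terminology database", [
                TermComponentSection("terminology"),
                TermComponentSection("database"),
            ]),
        ]),
        LanguageSection("de", [TermSection("Terminologiedatenbank")]),
    ],
)
```

Because all terms hang below a single concept entry, synonyms and equivalents in other languages stay together in one entry instead of being spread across term-oriented records.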
4.3 Data categories
4.3.1 General
Concept entries are made up of specific units of information, such as terms, definitions and contexts, and
each class of these units of information is identified by a data category. Over recent decades terminology
practitioners have gathered, standardized, classified, and published relevant data categories that are
stored in recognized data category repositories, such as DatCatInfo (see Reference [11]). For instance,
/grammatical gender/ is a data category that would typically be used as a field name, and "feminine",
"masculine", "neuter", etc. would occur as values of this field. Terminology database designers
can also create and use their own data categories according to their specific needs. However, care shall be taken
when creating new data categories or when adapting the names for the data categories proposed by
recognized data category repositories to avoid confusion or misinterpretation.
ISO 10241-1:2011 and ISO 12616:2002 include a list of common data categories used to document the
following units of information:
— terms (in any desired language);
— term-specific information (such as grammatical attributes, term types, register, status, contexts, etc.);
— language-specific information (such as definitions, notes, etc.);
— concept-specific information (such as subject fields, definitions, examples, notes, graphics, etc.);
— administrative and bibliographical information:
— identifiers of various sorts, such as to identify products or projects with which the terms are
associated;
— dates, names of people who created or modified the concept entry or parts of it;
— entry status, for example "submitted", "working", "approved";
— sources of terms, definitions, contexts, notes, etc.
Designers who need new data categories that are not included in recognized data category repositories
shall adhere to the requirements for creating, documenting, harmonizing, and maintaining data
category specifications as defined in ISO 12620.
4.3.2 Types of data categories
4.3.2.1 General
Terminology databases typically contain several different types of data categories depending on the
kind of information they contain. Data categories can be classified in three ways, namely as:
— open and closed data categories;
— mandatory, optional, system-generated, and default data categories;
— read-write, read-only, and hidden data categories.
Careful selection of data category types ensures higher quality content in a terminology database.
4.3.2.2 Open and closed data categories
Open data categories can contain any text. For instance, /term/ is considered open, because the actual
term that can be recorded in the corresponding data field is unpredictable. Other examples of open data
categories are /definition/ or /context/.
In contrast, closed data categories can only contain one or more of a finite set of permissible values.
When documenting terms in the German language, for instance, /grammatical gender/ may only
take the values "masculine", "feminine", or "neuter". Apart from /grammatical gender/, typical
representatives of closed data categories are /part of speech/, /term type/, /geographical usage/,
/administrative status/ or /subject field/.
Values themselves can constitute data categories such as /masculine/, for example, with the Boolean
values "yes|no" or "true|false".
The use of picklists ensures that only predefined values can be selected, thus preventing the insertion of
inadmissible values or misspelled or variant forms. For instance, left to their own devices, users might
type "masculine", "masc.", or simply "m." for a masculine noun. Providing a uniform representation of
these values ensures consistency throughout the terminology database, which is important for ensuring
the performance of searching, filtering, data exchange, and other data management tasks.
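The effect of a picklist can be sketched as follows. This is a minimal sketch under stated assumptions: the names GENDER_PICKLIST, GENDER_VARIANTS and normalize_gender are hypothetical, and the variant mapping (useful, for instance, when cleaning legacy free-text data) covers only the variants mentioned above.

```python
from typing import Dict, FrozenSet

# Hypothetical picklist for /grammatical gender/ in a German language section.
GENDER_PICKLIST: FrozenSet[str] = frozenset({"masculine", "feminine", "neuter"})

# Hypothetical mapping from free-text variants to the uniform picklist value.
GENDER_VARIANTS: Dict[str, str] = {"masc.": "masculine", "m.": "masculine"}


def normalize_gender(raw: str) -> str:
    """Return the picklist value for raw input, or raise if it is inadmissible."""
    value = raw.strip().lower()
    value = GENDER_VARIANTS.get(value, value)
    if value not in GENDER_PICKLIST:
        raise ValueError(f"{raw!r} is not a permissible value for /grammatical gender/")
    return value
```

With this check in place, "m." and "masc." are both stored as "masculine", and a misspelling is rejected instead of being saved.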
4.3.2.3 Mandatory, optional, system-generated, and default data categories
Data categories can be either mandatory or optional. Typically, a terminology management system does
not allow users to save concept entries if a mandatory field is empty. At least one language and one term
in that language shall be required for each concept entry, unless the concept has yet to be named or it is
represented by a diagram or a node in the concept diagram.
Other typical mandatory fields include information about the subject field or part of speech, both of
which can be essential for differentiating homographs. However, designating certain data categories to
be mandatory can sometimes be problematic. For instance, it can take considerable time and effort to
find definitions and contexts. Conditions can also be more complex, such as requiring a source for every
definition, or requiring the grammatical gender for nouns in a specific language section.
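A save-time check for mandatory data categories could look like the following. This is a minimal sketch, assuming a concept entry is held as a plain dictionary with a hypothetical "languages" key that maps language codes to lists of terms; the function name and the allow_unnamed flag (covering concepts that have yet to be named or that are represented by a diagram) are likewise assumptions.

```python
from typing import Dict, List


def validation_problems(entry: Dict, mandatory_fields: List[str],
                        allow_unnamed: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the entry can be saved."""
    # Every mandatory field must be present and non-empty.
    problems = [
        f"mandatory field /{name}/ is empty"
        for name in mandatory_fields
        if not entry.get(name)
    ]
    # At least one language section must contain at least one term,
    # unless the concept is explicitly allowed to be unnamed.
    languages = entry.get("languages", {})
    has_term = any(terms for terms in languages.values())
    if not has_term and not allow_unnamed:
        problems.append("at least one language with one term is required")
    return problems
```

Returning the problems as a list, rather than failing on the first one, lets a user interface report all empty mandatory fields at once.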
System-generated data fields are not inserted manually by the user; they are populated by the system.
For instance, many terminology management systems automatically assign entry numbers to concept
entries, as well as creation and modification dates, and the names of the users who created or modified
the concept entry.
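System-generated fields of this kind can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular terminology management system: the in-memory counter, the AdminInfo class and the functions new_entry and touch are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count

# Hypothetical in-memory source of entry numbers; a real system would
# typically use a database sequence instead.
_entry_numbers = count(1)


@dataclass
class AdminInfo:
    """Administrative information populated by the system, not by the user."""
    created_by: str
    created_at: datetime
    modified_by: str
    modified_at: datetime


def new_entry(user: str) -> dict:
    """Create a concept entry with system-generated number, dates and user names."""
    now = datetime.now(timezone.utc)
    return {
        "entry_number": next(_entry_numbers),
        "admin": AdminInfo(user, now, user, now),
    }


def touch(entry: dict, user: str) -> None:
    """Record a modification; creation data is left unchanged."""
    entry["admin"].modified_by = user
    entry["admin"].modified_at = datetime.now(timezone.utc)
```

The same AdminInfo structure could be attached to other levels of the concept entry if administrative information is to be recorded at more than one level.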
Part of terminology database design involves making decisions about the levels within the concept
entry that will be documented with such administrative information. It is generally insufficient to
record such administrative information only for the concept entry as a whole, because different people
can be responsible for diff
...