Management of terminology resources - Data categories - Part 2: Repositories

This document establishes criteria for the management of data categories for use in the creation and maintenance of language resources within a given community of practice (CoP). It defines the roles and responsibilities associated with the creation and maintenance of such repositories. It also specifies procedures to establish a governance structure for the management of a data category repository (DCR), including the addition of new data category specifications and continuous quality assurance.

Gestion des ressources terminologiques — Catégories de données — Partie 2: Répertoires

Upravljanje terminoloških virov - Podatkovne kategorije - 2. del: Repozitoriji

General Information

Status: Published
Publication Date: 07-Jul-2022
Current Stage: 9092 - International Standard to be revised
Start Date: 17-Nov-2025
Completion Date: 13-Dec-2025

Relations

Effective Date: 18-Nov-2023
Effective Date: 04-Dec-2021

Overview

ISO 12620-2:2022 provides criteria and governance practices for managing data category repositories (DCRs) used in the creation and maintenance of language resources. The standard defines roles, responsibilities and procedures for establishing and operating a DCR within a community of practice (CoP), including how to add new data category specifications and run continuous quality assurance. Its goal is to support the interoperability of language resources such as terminological resources, lexicographical resources and annotated text corpora.

Key topics and requirements

  • Scope and purpose: Management of data categories for language resources to improve interoperability and reuse.
  • Governance roles: Clear definitions of CoPs, profile experts, profile managers, profile management groups, and a DCR management board responsible for oversight and publicity.
  • DCR functional requirements:
    • Electronic availability (online or limited environment)
    • Central collection of data category specifications
    • Mechanisms to prevent duplicate specifications
    • Submission and feedback processes for new or revised data category specifications
    • Search filters (by date, creator, fields) and export capabilities (e.g., CSV, XML)
    • User access controls for write permissions
    • Ability to define data category selections (DC selections) and export them
    • Subsetting based on an ontology of data category concepts
  • Profiles and selections:
    • Support for profiles (subsets of DCR entries managed by subject-matter experts)
    • DC selections to define application-specific sets of data categories in combination with a data model
  • Quality and workflow:
    • Procedures for specification lifecycle, governance, and continuous quality assurance (workflow for specification and maintenance is defined in the standard)
  • Public information: Guidance for transparency and interaction with stakeholders
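
Several of the functional requirements listed above (duplicate prevention, write permissions, search filters, and export of a DC selection) can be illustrated with a small in-memory sketch. It is not taken from the standard: the class names, the record fields, and the rule of treating names that differ only in case or spacing as duplicates are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DCSpecification:
    """Hypothetical, heavily simplified data category specification."""
    identifier: str
    name: str
    definition: str
    creator: str
    created: date


class DuplicateSpecificationError(ValueError):
    """Raised when a submission collides with an existing specification."""


class DataCategoryRepository:
    """Illustrative in-memory DCR; a real DCR would be a governed database."""

    def __init__(self, writers):
        self._writers = set(writers)   # users holding write permission
        self._specs = {}               # normalized name -> DCSpecification

    @staticmethod
    def _normalize(name):
        # Assumed duplicate rule: ignore case and repeated whitespace.
        return " ".join(name.lower().split())

    def submit(self, user, spec):
        """Add a specification, enforcing write access and uniqueness."""
        if user not in self._writers:
            raise PermissionError(f"{user} has no write permission")
        key = self._normalize(spec.name)
        if key in self._specs:
            raise DuplicateSpecificationError(f"'{spec.name}' already exists")
        self._specs[key] = spec

    def search(self, creator=None, created_after=None):
        """Filter specifications by creator and/or creation date."""
        results = []
        for spec in self._specs.values():
            if creator is not None and spec.creator != creator:
                continue
            if created_after is not None and spec.created <= created_after:
                continue
            results.append(spec)
        return sorted(results, key=lambda s: s.identifier)

    def export_csv(self, selection=None):
        """Export all specifications, or only a DC selection, as CSV text."""
        wanted = None if selection is None else set(selection)
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["identifier", "name", "definition", "creator", "created"])
        for spec in sorted(self._specs.values(), key=lambda s: s.identifier):
            if wanted is None or spec.identifier in wanted:
                writer.writerow([spec.identifier, spec.name, spec.definition,
                                 spec.creator, spec.created.isoformat()])
        return buffer.getvalue()
```

In this sketch a DC selection is simply a list of identifiers passed to `export_csv`. A production DCR would more likely add persistent storage, a review workflow for submissions, and an XML export alongside CSV.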

Applications and practical value

ISO 12620-2 is directly applicable to:

  • Terminology managers and lexicographers building consistent field names and value sets
  • NLP engineers and corpus developers who need interoperable metadata and annotation schemas
  • Software developers and data architects implementing exchange formats or terminology databases
  • Research communities and CoPs seeking a governed, shared source of vetted data categories

Practical benefits include improved interoperability, faster integration of heterogeneous language resources, reduced duplication of schema elements, and clearer governance for evolving data categories.

Related standards

  • ISO 12620-1:2022 - Data category specifications (structure and rationale)
  • ISO 30042 - TBX and data model considerations (referenced for module/dialect composition)
  • ISO 16642 - Metamodel referenced for terminology exchange
  • DatCatInfo is cited as an example DCR for language resource descriptions.

Keywords: ISO 12620-2:2022, data category repository, DCR, data category specifications, terminology resources, profiles, data category selection, governance, interoperability.

Standard

ISO 12620-2:2022 - Management of terminology resources — Data categories — Part 2: Repositories
Released: 8 July 2022

English language
8 pages

Frequently Asked Questions

ISO 12620-2:2022 is a standard published by the International Organization for Standardization (ISO). Its full title is "Management of terminology resources - Data categories - Part 2: Repositories". This standard covers: This document establishes criteria for the management of data categories for use in the creation and maintenance of language resources within a given community of practice (CoP). It defines the roles and responsibilities associated with the creation and maintenance of such repositories. It also specifies procedures to establish a governance structure for the management of a data category repository (DCR), including the addition of new data category specifications and continuous quality assurance.

ISO 12620-2:2022 is classified under the following ICS (International Classification for Standards) categories: 01.020 - Terminology (principles and coordination); 35.240.30 - IT applications in information, documentation and publishing. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO 12620-2:2022 has the following relationships with other standards: it has inter-standard links to ISO 19634 and ISO 12620:2019. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

Standards Content (Sample)


SLOVENIAN STANDARD
01-January-2023
Supersedes:
SIST ISO 12620:2019
Upravljanje terminoloških virov - Podatkovne kategorije - 2. del: Repozitoriji
Management of terminology resources - Data categories - Part 2: Repositories
Gestion des ressources terminologiques - Catégories de données - Partie 2: Répertoires
This Slovenian standard is identical to: ISO 12620-2:2022
ICS:
01.020 Terminology (principles and coordination)
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

INTERNATIONAL ISO
STANDARD 12620-2
First edition
2022-07
Management of terminology
resources — Data categories —
Part 2:
Repositories
Gestion des ressources terminologiques — Catégories de données —
Partie 2: Répertoires
Reference number
© ISO 2022
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Data category repositories
4.1 Data category environments
4.2 Requirements for a DCR
4.3 Data category profiles
4.4 Data category selections
5 Roles with governance responsibilities
5.1 Communities of practice
5.1.1 Description
5.1.2 Procedures for establishing a DCR
5.2 Profile experts
5.2.1 Description
5.2.2 Roles and responsibilities
5.3 Profile managers
5.3.1 Description
5.3.2 Roles and responsibilities
5.4 DCR management board
5.4.1 Description
5.4.2 Roles and responsibilities
6 Workflow for data category specification and maintenance
7 Public information and interactions
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO’s adherence to
the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology,
Subcommittee SC 3, Management of terminology resources.
This first edition of ISO 12620-2, together with ISO 12620-1:2022, cancels and replaces ISO 12620:2019,
which has been divided into parts and technically revised. The main changes are as follows:
— ISO 12620:2019 described procedures for defining data categories used in language resources and
described requirements for maintaining a pragmatic, consensus-based repository of harmonized
data category specifications for use in language resources. ISO 12620-1 has been narrowed to focus
on the structure and rationale associated with data category specifications per se.
— The sections of ISO 12620:2019 that dealt with the creation and maintenance of data category
repositories have been moved to this document.
A list of all parts in the ISO 12620 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
ISO 12620-1 provides requirements and recommendations governing data category specifications for
language resources. It specifies mechanisms for creating, documenting, harmonizing and maintaining
data category specifications in a data category repository (DCR) with the goal of increasing the
interoperability of language resources such as terminological resources, lexicographical resources and
annotated text corpora. This document specifies procedures and practices for the creation, management
and maintenance of DCRs.
Interoperability of language resources is a key factor for supporting innovation and progress in various
focus areas of the language industry, such as terminology management, natural language processing
and annotation schemes. These areas support important sectors of the economy and social development
such as global communication and trade, knowledge extraction and content management.
Researchers and software developers working with language resources benefit greatly from being
able to access a trusted source of information about data categories. Providing a precise description
of the data categories that are used within a given data collection allows for a quick diagnosis of its
compatibility with other data collections or its suitability for use in computer processes. A DCR
containing vetted data category specifications provides users with the information they need in
order to implement data categories in a manner that is consistent with other users. Consequently, the
interoperability of language resources is greatly enhanced.
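
The "quick diagnosis" of compatibility described above can be sketched as a comparison of the data category sets declared by two collections. This is only an illustration under stated assumptions: the function name and the sample category names are hypothetical, and a real diagnosis would compare full data category specifications from a DCR rather than bare names.

```python
def diagnose_compatibility(categories_a, categories_b):
    """Compare the data categories declared by two data collections.

    Returns the categories both collections share and those found in
    only one of them (hypothetical helper, not defined by the standard).
    """
    a, b = set(categories_a), set(categories_b)
    return {
        "shared": sorted(a & b),
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
    }


# Hypothetical category names, used purely for illustration.
termbase = ["term", "definition", "partOfSpeech"]
lexicon = ["term", "partOfSpeech", "pronunciation"]
report = diagnose_compatibility(termbase, lexicon)
```

Categories that appear in only one collection are the ones that would need mapping, or a new specification in the DCR, before the two resources can be exchanged.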
Data category specifications are normally stored in electronic format in a specially designed database.
This database is called a “data category repository (DCR)”. Today, it is essential for DCRs to be sharable
for all stakeholders. See, for instance, Reference [3], a DCR for language resource descriptions named
DatCatInfo.
INTERNATIONAL STANDARD ISO 12620-2:2022(E)
Management of terminology resources — Data
categories —
Part 2:
Repositories
1 Scope
This document establishes criteria for the management of data categories for use in the creation and
maintenance of language resources within a given community of practice (CoP). It defines the roles
and responsibilities associated with the creation and maintenance of such repositories. It also specifies
procedures to establish a governance structure for the management of a data category repository
(DCR), including the addition of new data category specifications and continuous quality assurance.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 12620-1, Management of terminology resources — Data categories — Part 1: Specifications
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 12620-1 and the following
apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
data category specification
DC specification
complete descriptive record of a data category
[SOURCE: ISO 12620-1:2022, 3.5]
3.2
data category repository
DCR
digital collection of data category specifications (3.1)
EXAMPLE DatCatInfo, a DCR for language resources (see Reference [3]).
Note 1 to entry: Data category repositories are used as references when specif
...
