Language resource management — Corpus Annotation Project Management — Part 1: Core model

This standard describes the basic principles and recommended procedures for corpus annotation project management as its core model. The core model consists of a series of recommended work packages that fulfil the basic requirements for error-free corpus annotation, including training the people involved and validating intermediate results. The core model therefore contains recommendations on:
- corpus annotation project organization,
- internal structures and work packages for corpus annotation project management,
- project team members' qualification,
- workflow among the internal structures of the project.

Gestion des ressources linguistiques — Gestion de projet d'annotation de corpus — Partie 1: Modèle de base

Upravljanje jezikovnih virov - Projektno vodenje anotacije korpusa - 1. del: Jedrni model

General Information

Status: Not Published
Public Enquiry End Date: 26-Jun-2025

Draft
ISO/DIS 24635-1:2025 - BARVE
English language
24 pages

Standards Content (Sample)


SLOVENIAN STANDARD
01-June-2025
Upravljanje jezikovnih virov - Projektno vodenje anotacije korpusa - 1. del: Jedrni model
Language resource management — Corpus Annotation Project Management — Part 1:
Core model
Gestion des ressources linguistiques — Gestion de projet d'annotation de corpus —
Partie 1: Modèle de base
This Slovenian standard is identical to: ISO/PRF 24635-1
ICS:
01.020 Terminology (principles and coordination)
2003-01. Slovenski inštitut za standardizacijo (Slovenian Institute for Standardization). Reproduction of this standard, in whole or in part, is not permitted.

DRAFT International Standard
ISO/DIS 24635-1
ISO/TC 37/SC 4
Secretariat: KATS
Voting begins on: 2024-11-05
Voting terminates on: 2025-01-28
Language resource management — Corpus Annotation Project Management — Part 1: Core model
Gestion des ressources linguistiques — Gestion de projet d'annotation de corpus — Partie 1: Modèle de base
ICS: 01.020
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENTS AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
This document is circulated as received from the committee secretariat.
Reference number ISO/DIS 24635-1:2024(en)
© ISO 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Terms and definitions for corpus annotation
3.2 Terms and definitions for project management
4 Purpose and justification
5 Core Model
5.1 Project organization and role
5.1.1 Project manager
5.1.2 Project technical manager
5.1.3 Work package manager
5.1.4 Process team leader
5.1.5 Team member
5.2 Process groups for corpus annotation project
5.3 Corpus annotation project work package and process
5.3.1 Integrated management of corpus annotation project
5.3.2 Corpus annotation work management
5.3.3 Corpus annotation project quality control
6 Publication and archiving of the corpus annotation (optional)
Annex A (informative) Process flow in the scope of process groups and work packages
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent
rights identified during the development of the document will be in the Introduction and/or on the ISO list of
patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and Terminology, Subcommittee
SC 4, Language Resource Management.
A list of all parts in the ISO 24635 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
Corpus annotation is the process of adding linguistic information to primary data. The goal
of a corpus annotation project is to produce high-quality deliverables that follow the annotation specification
within a limited-resource environment.
Language resource management – Corpus Annotation Project Management is a series of proposed standards
that aims to give recommendations for constructing high-quality annotated corpora effectively and efficiently.
The series consists of three parts: the Core Model, the Training Model, and the Validation Model.
Part 1: Core Model presents the basic principles, including considerations for corpus annotation, procedures
for a corpus annotation project, project organization, and work packages and tasks that can be applied to a corpus
annotation project regardless of its scale, complexity, and duration.
Part 2: Training Model presents the basic principles for training the project participants and maintaining their
ability to execute the project.
Part 3: Validation Model presents the basic principles for quality control of deliverables, with the aim of error-free
annotation that follows the annotation specification.

Language resource management — Corpus Annotation
Project Management —
Part 1:
Core model
1 Scope
This document is part of a series of standards for corpus annotation project management. Part 1
describes the core model of project management for corpus annotation and specifies the work packages of
project teams, the required processes, and the deliverables. Parts 2 and 3 of the series describe
the training model for the human resources involved and the validation model, respectively.
This document does not specify a methodology for resolving issues such as quality control, human training,
reusability, licensing and copyright. Instead, it presents the components needed to deal with these issues and
other areas of corpus project management, and specifies which work packages, subtasks, and workflows among
them are required to manage a corpus annotation project that handles such issues.
The core model of project management for corpus annotation therefore gives recommendations on which
work packages and deliverables a project requires, with workflows and processes that deal with:
— Integration and Communication Among Work Packages: This includes ensuring that all work
packages are well-coordinated, particularly in terms of the adoption of broader annotation standards
and integration with ontologies to enhance interoperability. Effective communication across work
packages is crucial for the seamless sharing of annotated documents with other projects.
— Human Resource Management and Interrater Reliability: This covers the management of human
resources, focusing on training and qualification, as well as the implementation of interrater reliability
practices. These practices include training, testing, and the use of appropriate tools to ensure consistency
across annotations.
— Annotation Guideline Management and Software Utilization: This involves managing the guidelines
for annotation tasks and utilizing annotation software and tools, particularly in environments leveraging
artificial intelligence (AI) and machine learning (ML) techniques. It includes the cautious application of
AI/ML methods, such as weakly supervised learning, to support the annotation process.
— Quality Control, Validation, and Structured Documentation: This encompasses the processes
...
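The interrater reliability practices named in the scope above can be illustrated with a small agreement check. This document does not prescribe any particular metric, so the sketch below is only an assumption: it uses Cohen's kappa for two annotators, and the function name `cohen_kappa` and the sample part-of-speech labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions, summed over labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators for the same six tokens.
annotator_1 = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB"]
annotator_2 = ["NOUN", "VERB", "ADJ", "ADJ", "NOUN", "NOUN"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A project team could run such a check after a training round and compare the result against a threshold it has agreed on; the choice of metric and threshold is left to the project and is not taken from this document.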
