Language resource management -- Semantic annotation framework (SemAF) - Part 11: Measurable Quantitative information (MQI)

Gestion des ressources linguistiques -- Cadre d'annotation sémantique - Partie 11: Mesurer l'information quantitative (MQI)

Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 11. del: Merljive kvantitativne informacije (MQI)

General Information

Status: Not Published
Public Enquiry End Date: 14-Mar-2021
Current Stage: 4020 - Public enquiry (PE) (Adopted Project)
Start Date: 01-Feb-2021
Due Date: 21-Jun-2021
Completion Date: 09-Apr-2021

Buy Standard

Draft
ISO/FDIS 24617-11 - Language resource management -- Semantic annotation framework (SemAF)
English language
21 pages
Draft
oSIST ISO/DIS 24617-11:2021 - BARVE na PDF-str 15
English language
29 pages

Standards Content (sample)

FINAL DRAFT INTERNATIONAL STANDARD
ISO/FDIS 24617-11
ISO/TC 37/SC 4
Secretariat: KATS
Voting begins on: 2021-05-10
Voting terminates on: 2021-07-05

Language resource management — Semantic annotation framework (SemAF) —
Part 11: Measurable quantitative information (MQI)

Gestion des ressources linguistiques — Cadre d'annotation sémantique (SemAF) —
Partie 11: Mesurer l'information quantitative (MQI)

RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.

IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.

Reference number ISO/FDIS 24617-11:2021(E)
© ISO 2021
COPYRIGHT PROTECTED DOCUMENT
© ISO 2021

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may

be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting

on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address

below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abstract specification of QML
  4.1 Overview
  4.2 Characteristics of QML
  4.3 Metamodel
  4.4 Abstract syntax of QML (QML_as)
  4.5 Concrete syntaxes of QML (QML_cs) and its subsets
5 XML-based concrete syntax of QML (QML_csx)
  5.1 General
  5.2 Tag names with ID prefixes
  5.3 Attribute specification of the root
  5.4 Attribute specification of the basic element types
  5.5 Attribute specification of the link types
  5.6 Illustrations of QML_csx
    5.6.1 General
    5.6.2 Sample data
    5.6.3 Procedure of annotation
6 TEI-based concrete syntax of QML (QML_cst)
  6.1 Concrete syntaxes of QML (QML_cst)
    6.1.1 Overall
    6.1.2 Tag names with ID prefixes
    6.1.3 Attribute specification of the basic element types
    6.1.4 Attribute specification of the two link types
  6.2 Illustrations of QML_cst
    6.2.1 Overall
    6.2.2 Sample data
    6.2.3 Illustrations of TEI-based Concrete Syntax
Annex A (informative) Illustrations of QML_csx with more samples
Annex B (informative) Informal statements of MQI
Annex C (informative) The representation of units
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards

bodies (ISO member bodies). The work of preparing International Standards is normally carried out

through ISO technical committees. Each member body interested in a subject for which a technical

committee has been established has the right to be represented on that committee. International

organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.

ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of

electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are

described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the

different types of ISO documents should be noted. This document was drafted in accordance with the

editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of

patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of

any patent rights identified during the development of the document will be in the Introduction and/or

on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not

constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and

expressions related to conformity assessment, as well as information about ISO's adherence to the

World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see

www.iso.org/iso/foreword.html.

This document was prepared by Technical Committee ISO/TC 37, Language and terminology,

Subcommittee SC 4, Language resource management.
A list of all parts in the ISO 24617 series can be found on the ISO website.

Any feedback or questions on this document should be directed to the user’s national standards body. A

complete listing of these bodies can be found at www.iso.org/members.html.
Introduction

Measurable quantitative information (MQI), such as the ‘165 cm’ or ‘60 kg’ that describes the height or weight of a person named ‘John’, is very common in ordinary language. MQI describes one of the basic properties associated with the magnitude aspect of quantity. The main characteristic of MQI is that quantitative information is presented as measures expressed in terms of a pair <n, u>, consisting of a numerically expressed quantity n and a unit u, which is either basic or derived, or either normalized or conventionally used. Such information is even more abundant in scientific publications and technical reports, to the extent that it constitutes an essential part of the communicative segments of language in general. The processing of such information is thus required for any successful language resource management.

In the era of big data, demands from industry and academic communities for the precise acquisition of

measurable quantitative information have increased. For example, business investment companies

frequently need to aggregate various sorts of information covering net sales, gross profit, operating

expenses, operating profit, interest expense, net profit before taxes, net income, etc., of the target

companies from their annual reports. The fast-growing field of medical informatics also needs

to process large amounts of medical text to analyse the dose of medicine, the eligibility criteria of

clinical trials, the phenotypic characteristics of patients, the lab tests in clinical records, etc. [8]. All these

demands either in industry or in medical research require the accurate and consistent representation

of measurable quantitative information for automated processing, computation, and exchange.

However, in the IR and NLP areas, there is no standardized way of representing measurable quantitative

information currently available. Each application system developed in industrial sectors has hitherto

used its own format to annotate measurable quantitative information. A flexible, interoperable and

standardized measurable quantitative information representation format for IR and NLP tasks to work

with many different application systems is called for.

This document aims to formulate a general annotation scheme that follows the principles of semantic annotation laid down in ISO 24617-6 and the basic requirements of ISO 24611, and that facilitates the processing of MQI in scientific and technical language. The annotation scheme is designed to be interoperable with the other parts of ISO 24617. It also utilizes various ISO standards on lexical resources and morpho-syntactic annotation frameworks, and it aims to be compatible with other existing relevant standards.

NOTE ISO 24617-1 and ISO 24617-7, for instance, have proposed a way of annotating measures on time

(durations or time amounts) and space (distances), respectively. ISO 24612 provides a pivotal form (graphic

annotation framework) that makes all the annotation of temporal or spatial measures in these two annotation

schemes.

QML is normalized at the abstract level that allows various serialization formats representing annotated

measurable quantitative information such as an XML-based representation. The normalization of QI

(quantitative information) annotation is stated at the abstract level of annotation, and the standoff

annotation format is adopted at the concrete level of serialization.

Focusing on measurements in scientific and technological language, this document is expected to contribute to information retrieval (IR) [9], question answering (QA), text summarization (TS), and other natural language processing (NLP) applications [10].
FINAL DRAFT INTERNATIONAL STANDARD ISO/FDIS 24617-11:2021(E)
Language resource management — Semantic annotation
framework (SemAF) —
Part 11:
Measurable Quantitative information (MQI)
1 Scope

This document covers the measurable or magnitudinal aspect of quantity so that it can focus on the

technical or practical use of measurements in IR (information retrieval), QA (question answering), TS

(text summarization), and other NLP (natural language processing) applications. It is applicable to the

domains of technology that carry more applicational relevance than some theoretical issues found in

the ordinary use of language.

NOTE ISO 24617-12 deals with more general and theoretical issues of quantification and quantitative

information.

This document also treats temporal durations that are discussed in ISO 24617-1, and spatial

measures such as distances that are treated in ISO 24617-7, while making them interoperable with other

measure types. It also accommodates the treatment of measures or amounts that are introduced in

ISO 24617-6:2016, 8.3.
2 Normative references

The following documents are referred to in the text in such a way that some or all of their content

constitutes requirements of this document. For dated references, only the edition cited applies. For

undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 24612, Language resource management — Linguistic annotation framework (LAF)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
quantity
property of a measurable object referring to its magnitude or multitude

[SOURCE: ISO/IEC Guide 99:2007, 1.1, modified — Definition substantially redrafted, and Notes

removed.]
3.2
base quantity

quantity (3.1) in a conventionally chosen subset of a given system of quantities, where no quantity in

the subset can be expressed in terms of the other quantities within that subset

Note 1 to entry: Kinds of quantities include seven base quantities defined by the International System of

Quantities (ISQ).

[SOURCE: ISO/IEC Guide 99:2007, 1.4, modified — "no subset quantity" replaced with "no quantity in

the subset", "the others" replaced with "the other quantities within that subset", and Notes and Example

removed.]
3.3
derived quantity

quantity (3.1), in a system of quantities, defined in terms of the base quantities (3.2) of that system

EXAMPLE Speed is a derived quantity defined by length (distance) over time (LT⁻¹), where length (L) and

time (T) are base quantities.
[SOURCE: ISO/IEC Guide 99:2007, 1.5, modified — Example replaced.]
3.4
quantitative information
measurement associated with the quantity (3.1) of a measurable object
3.5
measurable quantitative information
MQI
quantitative information (3.4) that can be expressed in unitized numeric terms
3.6
measurable quantitative information markup language
markup language of measurable quantitative information
quantitative markup language
QML

specification language for the annotation of measurable quantitative information (3.5) extractable from

text or other medium types of language
3.7
measurement unit
unit of measurement
unit

scalar basis, defined and adopted by convention, of measuring objects by multiplying their quantitative

values expressed in real numbers

Note 1 to entry: The expressions that are used in measurement such as “metre”, “litre”, and “µmol/kg” are units

by the definition given above. The multitude expressions such as “bottles”, “boxes”, or “two” as in “two bottles of

milk”, “a box of apples”, and “two coffees” sometimes fail to be regarded as units, but they can also be if they are

accepted as units by convention or agreement in some communities. ISO 24617 SemAF Part 12: Quantification

treats such multitude expressions as genuine units.

[SOURCE: ISO/IEC Guide 99:2007, 1.9, modified — Definition substantially redrafted, original Notes

removed, new Note 1 to entry added.]
3.8
base unit
measurement unit (3.7) that is adopted by convention for a base quantity (3.2)

Note 1 to entry: There are seven base units chosen by the International System of Units (SI) associated with

seven ISQ base quantities to measure quantities, as shown in Table 1.
Table 1 — Base units

SI base unit (unit symbol)    Associated ISQ base quantity (base quantity symbol)
metre (m)                     length (L)
kilogram (kg)                 mass (M)
second (s)                    time (T)
ampere (A)                    electric current (I)
kelvin (K)                    thermodynamic temperature (Θ)
mole (mol)                    amount of substance (N)
candela (cd)                  luminous intensity (J)

[SOURCE: ISO/IEC Guide 99:2007, 1.10, modified — Notes and Examples removed, new Note 1 to entry

and Table 1 added.]
3.9
derived unit
measurement unit (3.7) for a derived quantity (3.3)

EXAMPLE The unit “newton” (N) is a derived unit for a derived quantity “force” (F), which is defined to be “mass times acceleration” (MLT⁻²), where the quantity “acceleration” is a derived quantity defined by “velocity divided by time” (VT⁻¹) and “velocity” is defined by “length (distance) divided by time” (LT⁻¹).

Note 1 to entry: Table 2 illustrates some of the derived units.

[SOURCE: ISO/IEC Guide 99:2007, 1.11, modified — Examples removed, new Example and Note 1 to

entry added.]
Table 2 — Derived units

Derived unit (unit symbol)                        Associated derived quantity
kilometre per minute (km/min)                     speed = length (L) / time (T)
gram per cubic metre (gram/m³)                    density = mass (M) / volume (L³)
kilogram metre per square second (kg x m/s²)      force = mass (M) x length (L) / time (T²)
lumen per square metre (lm/m²)                    illuminance = luminous intensity (J) / area (L²)
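
The relation between base and derived quantities in 3.3 and 3.9 can be checked mechanically. The following is a minimal sketch, not part of this document, that represents a dimension as a mapping from ISQ base quantity symbols (Table 1) to integer exponents. The function names `multiply`, `power` and `divide` are illustrative assumptions.

```python
# A dimension is a dict from ISQ base quantity symbol to integer exponent,
# e.g. {"L": 1, "T": -1} for LT^-1. Helper names are hypothetical.

def multiply(a, b):
    """Multiply two dimensions by adding exponents; drop zero exponents."""
    result = dict(a)
    for symbol, exponent in b.items():
        result[symbol] = result.get(symbol, 0) + exponent
    return {s: e for s, e in result.items() if e != 0}


def power(a, n):
    """Raise a dimension to an integer power by scaling every exponent."""
    return {s: e * n for s, e in a.items() if e * n != 0}


def divide(a, b):
    """Divide dimension a by dimension b."""
    return multiply(a, power(b, -1))


# Base quantities used below (symbols from Table 1).
LENGTH = {"L": 1}
MASS = {"M": 1}
TIME = {"T": 1}

velocity = divide(LENGTH, TIME)           # length divided by time
acceleration = divide(velocity, TIME)     # velocity divided by time
force = multiply(MASS, acceleration)      # mass times acceleration
density = divide(MASS, power(LENGTH, 3))  # mass divided by volume
```

Under these assumptions, `force` evaluates to the exponents of M, L and T given in the example of 3.9, and dividing a dimension by itself yields the empty mapping, i.e. a dimensionless quantity.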
4 Abstract specification of QML
4.1 Overview

The quantitative markup language (QML) (3.6) is specified at two levels, abstract and concrete. Some

characteristics of QML are listed in 4.2. The overall structure of QML is represented by a metamodel, as

introduced in 4.3. The abstract syntax of QML as QML_as shall be a set-theoretic specification of QML in

conceptual terms that are independent of ways of representing the annotation (content) of measurable

quantitative information. The concrete syntax of QML as QML_cs shall be a specification of a set of

representation formats, based on QML_as, for the annotation of measurable quantitative information

in a computationally tractable way. The QML_as is introduced in 4.4, while QML_cs is presented in

4.5. Equivalent concrete syntaxes, including an XML-based concrete syntax QML_csx and a TEI-based

concrete syntax QML_cst, are described in Clause 5 and Clause 6, respectively.

NOTE There can be many equivalent concrete syntaxes defined on a single abstract syntax.

4.2 Characteristics of QML
QML shall have the following characteristics.

a) QML shall focus on the annotation of the measurable attributes of entities, for example “BMI between 10-20 kg/m²”.

b) QML shall provide a way to annotate the relations of measures, for example “age 40 or older” and “fpg>=100 mg/dl or a1c not less than 5,8 %”.

c) QML shall cover the complex uses of unitized numeric quantities, for example “14,0 × 10⁹” and “glycosylated haemoglobin (hba1c) <1,15 times the upper limit of normal”.

d) QML shall facilitate the identification of normalized numeric values and units as the measurable attribute of an associated entity.

NOTE QML does not specify ways of annotating the normalization (e.g. “millimoles per litre” is normalized

to “mmol/L”) or complete specification (e.g. “kg/m” is “kg/m²” for BMI) of units, which will be dealt with in

another part of ISO 24617 addressing automated implementation of MQI.
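
As an informal illustration of the kind of expressions listed in a) to c), the sketch below shows how a tool might split a simple expression into a relator symbol, a numeric value and a unit string before annotation. This is an assumption about one possible preprocessing step, not something QML specifies. The pattern `MEASURE_PATTERN` and the function `parse_measure` are hypothetical names, and ranges such as “10-20” or verbal relators such as “not less than” are deliberately not handled.

```python
import re

# Hypothetical pattern: an optional relator symbol, a number written with a
# decimal comma or point, then a unit token. Not defined by QML.
MEASURE_PATTERN = re.compile(
    r"(?P<relator>>=|<=|>|<|=)?\s*"
    r"(?<!\w)(?P<numeric>\d+(?:[.,]\d+)?)\s*"
    r"(?P<unit>[^\s\d]\S*)"
)


def parse_measure(text):
    """Return relator, numeric and unit of the first simple measure in text,
    or None when no measure is found."""
    match = MEASURE_PATTERN.search(text)
    if match is None:
        return None
    return {
        "relator": match.group("relator"),
        # Normalize the decimal comma used in the examples ("5,8") to a point.
        "numeric": float(match.group("numeric").replace(",", ".")),
        "unit": match.group("unit"),
    }
```

The lookbehind keeps the digit inside an identifier such as “a1c” from being read as a quantity. A production system would need a far richer grammar than this sketch.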
4.3 Metamodel

The overall structure of measurable quantitative information is represented by the metamodel in

Figure 1.
Figure 1 — Metamodel of measurable quantitative information

This metamodel shall consist of seven class components, represented as square boxes in Figure 1:

a) source data as input to the annotation of MQI,
b) markables extracted from data sources,
c) three types of basic elements: entity, measure, and relator,
d) two types of links: measure link and comparison link.

The element “entity” shall be any object that has a measurable quantity, represented by “@quantity”, as one of its properties. The term “entity”, as used in this document, shall be a very general term that refers to any object: not just to individual entities, but also to their properties, such as the “height” of a building or the “speed” of a car, and also to any kind of eventuality, such as states, processes or transitions.
EXAMPLE 1 We drove at more than 200 kilometres per hour on a German autobahn.

The speed expressed by “more than 200 kilometres per hour” is a quantitative property of a motion: the measure applies to the motion of driving mentioned in the example.

The element “measure” represents a measurable quantity of an entity in terms of three attributes:

quantity, unit, and type.
EXAMPLE 2 The height of Mt. Hall is 1 950 metres.

The measure shall consist of a quantity referred to by a numeric expression “1 950” and a unit “metre”.

It applies to the “height” quantity of the geographical object, named “Mt. Hall”.

The element “relator” which is associated with markables such as “equal to”, “greater than”, “<=”,

“between”, or “at least” has only a functional status of relating two or more measures.

EXAMPLE 3 One pound equals 16 ounces.
The word “equals” is a relator of identity between two measures, “one pound” and “16 ounces”.
EXAMPLE 4 1 foot is less than 1 metre, for it is exactly equal to 30,48 cm.

This example illustrates two types of links between measures: the relation of being “less than”, and that

of being an identity.

A link of the type “measure” shall relate a measure to the quantitative property of an entity. Such a link

is triggered by a measure element.

A link of the type “comparison” shall relate a measure to one or more other measures. Such a link is often triggered by a “relator” element.
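
The metamodel can be pictured as a small set of record types. The following is a minimal sketch, assuming Python dataclasses, that instantiates EXAMPLE 2 and EXAMPLE 3. The attributes of the basic elements follow 4.4, and the ID prefixes follow Table 3. The field names of the two link classes (`measure`, `entity`, `relator`, `measures`) and the value `"equal"` are illustrative assumptions, since the link attributes are not given in this part of the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    id: str
    type: str
    quantity: str  # the measurable property, e.g. "height"


@dataclass(frozen=True)
class Measure:
    id: str
    numeric: float
    unit: str
    type: str


@dataclass(frozen=True)
class Relator:
    id: str
    type: str


@dataclass(frozen=True)
class MeasureLink:  # field names are assumptions
    id: str
    measure: Measure
    entity: Entity


@dataclass(frozen=True)
class ComparisonLink:  # field names are assumptions
    id: str
    relator: Relator
    measures: Tuple[Measure, ...]


# EXAMPLE 2: "The height of Mt. Hall is 1 950 metres."
mt_hall = Entity("x1", "geographical", "height")
height = Measure("me1", 1950.0, "metre", "length")
ml1 = MeasureLink("mL1", height, mt_hall)

# EXAMPLE 3: "One pound equals 16 ounces."
pound = Measure("me2", 1.0, "pound", "mass")
ounces = Measure("me3", 16.0, "ounce", "mass")
equals = Relator("c1", "equal")
cl1 = ComparisonLink("cL1", equals, (pound, ounces))
```

A measure link pairs exactly one measure with one entity, whereas a comparison link holds a tuple so that a relator such as “between” can relate more than two measures.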
4.4 Abstract syntax of QML (QML_as)

A markup language QML shall be a specification language for the annotation of MQI. The abstract syntax of QML shall specify an annotation scheme in set-theoretic terms based on a conceptual understanding of MQI. The abstract syntax QML_as is understood to be structured as a triple <B, R, @> such that

a) B is a set of three basic element types: entity, measure, and relator;
b) R is a set of two link types: measure and comparison types;

c) @ is a set of assignments that specify the list of attributes and their value types associated with

each of the basic element types in B and each of the link types in R.

Every element in B shall have at least one attribute, @type, and so shall every link in R. The values of @type are CDATA associated with each of the elements. For instance, the entity “mountain” is of the “geographical” type, and the entity named “John” is of the “person” type.

The values of @quantity for an entity are CDATA that may include values such as height, width, or

weight, and so on.

The assignment of measure shall have three attributes: @numeric, @unit, and @type. A possible value

of the attribute @numeric is a real number. A possible value of @unit is one of the units in a system

conventionally accepted such as one of the SI base units or derived units. A possible value of @type is

one of the quantities listed as ISQ base quantities or derived quantities, such as length, mass, voltage,

and so on.
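
The assignment @ can be read as a table from element and link types to the attributes they carry. The sketch below is one possible reading, not a normative definition. It assumes that @quantity is required for an entity (the text only says what its values are), that the link types are keyed as `measureLink` and `comparisonLink`, and that a check on @numeric is wanted. `REQUIRED_ATTRIBUTES` and `validate` are hypothetical names.

```python
from numbers import Real

# Attribute names come from 4.4; treating them all as required is an assumption.
REQUIRED_ATTRIBUTES = {
    "entity": {"type", "quantity"},
    "measure": {"numeric", "unit", "type"},
    "relator": {"type"},
    "measureLink": {"type"},
    "comparisonLink": {"type"},
}


def validate(element_type, attributes):
    """Return a list of problems; an empty list means nothing was flagged."""
    if element_type not in REQUIRED_ATTRIBUTES:
        return [f"unknown element or link type: {element_type}"]
    missing = REQUIRED_ATTRIBUTES[element_type] - set(attributes)
    problems = [f"missing @{name}" for name in sorted(missing)]
    if element_type == "measure" and "numeric" in attributes:
        value = attributes["numeric"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            problems.append("@numeric is not a real number")
    return problems
```

For instance, a measure given as `{"numeric": "1 950", "unit": "metre"}` would be flagged twice under these assumptions: once for the missing @type and once because the numeric value is still a string.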
4.5 Concrete syntaxes of QML (QML_cs) and its subsets

An abstract syntax shall allow several semantically equivalent concrete syntaxes. QML_as likewise allows a set of equivalent concrete syntaxes of QML (QML_cs). This document introduces two kinds of concrete syntaxes, QML_csx and QML_cst, in Clause 5 and Clause 6, respectively.

The two concrete syntaxes, QML_csx and QML_cst, are both based on the abstract syntax QML_as, while adopting XML as their representation language. They shall comply with the requirement of standoff annotation in ISO 24612.

These two concrete syntaxes do, however, differ from each other in at least two aspects. Just like the other parts of ISO 24617 on semantic annotation, such as ISO 24617-1, ISO 24617-6, and ISO 24617-7, QML_csx does not separate annotation content structures from their anchoring (referencing) structures, although this separation is required by LAF for linguistic annotation.

In contrast, QML_cst is feature-structure-based. It shall follow LAF in separating the two structures, anchoring and content structures, when representing measurement information in feature structures. Furthermore, QML_cst, as specified in this document, shall adopt the names of XML elements and attributes with value type specifications from the TEI P5 Guidelines of the Text Encoding Initiative Consortium for the representation of MQI.
5 XML-based concrete syntax of QML (QML_csx)
5.1 General

The XML-based concrete syntax QML_csx is introduced in two steps. The first step is to list the tag

names and ID prefixes of QML_csx in 5.2. The second step is to specify the attribute assignments for the

XML root in 5.3, for each of the basic element types listed in 5.4, and for each of the link types listed in

5.5.

NOTE The root tag is introduced in XML to embed a list of XML elements into a single structure.

5.2 Tag names with ID prefixes

Corresponding to each of the basic element types and the link types for QML_csx, there is a unique tag

and a unique ID prefix, as shown in Table 3.
Table 3 — List of tags and ID prefixes of QML_csx

Tags                 ID prefixes   Comment
Root                 mqi           XML root tag
Basic element types
  Entity             x             object to which a measure applies
  Measure            me            unitized numeric quantities only
  Relator            c             triggers a link relating measures
Link types
  Measure link       mL            relates a measure to an entity and is triggered by a measure
  Comparison link    cL            relates a measure to another or other more measures
NOTE The attribute name for each ID in XML is xml:id and each of its values is an ID prefix followed by a

positive integer, e.g. .
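
The ID convention in the NOTE above (an ID prefix from Table 3 followed by a positive integer) is easy to check automatically. The following is a minimal sketch under the assumption that leading zeros are not allowed in the integer part. `ID_PATTERN` and `split_id` are hypothetical names and are not defined by QML_csx.

```python
import re

# ID prefixes from Table 3, followed by a positive integer without leading
# zeros (the leading-zero restriction is an assumption of this sketch).
ID_PATTERN = re.compile(r"(?P<prefix>mqi|me|mL|cL|x|c)(?P<number>[1-9][0-9]*)")


def split_id(identifier):
    """Return (prefix, number) for a well-formed ID, otherwise None."""
    match = ID_PATTERN.fullmatch(identifier)
    if match is None:
        return None
    return match.group("prefix"), int(match.group("number"))
```

Because `fullmatch` must consume the whole string, a relator ID beginning with “c” and a comparison link ID beginning with “cL” are told apart without any extra logic.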
5.3 Attribute specification of the root
List 1: A list of attributes for in extended BNF (Backus-Naur form)
attributes = identifier, target, [lang], [mediumTyp
...

SLOVENIAN STANDARD
oSIST ISO/DIS 24617-11:2021
01-March-2021
Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 11. del: Merljive kvantitativne informacije (MQI)
Language resource management -- Semantic annotation framework (SemAF) - Part 11: Measurable Quantitative information (MQI)

Gestion des ressources linguistiques -- Cadre d'annotation sémantique - Partie 11: Mesurer l'information quantitative (MQI)

This Slovenian standard is identical to: ISO/DIS 24617-11
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
oSIST ISO/DIS 24617-11:2021 en

2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

DRAFT INTERNATIONAL STANDARD
ISO/DIS 24617-11
ISO/TC 37/SC 4
Secretariat: KATS
Voting begins on: 2020-03-16
Voting terminates on: 2020-06-08
Language resource management — Semantic annotation
framework (SemAF) —
Part 11:
Measurable Quantitative information (MQI)
Gestion des ressources linguistiques — Cadre d'annotation sémantique —
Partie 11: Mesurer l'information quantitative (MQI)
ICS: 01.020
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENT AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.

IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.

RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.

This document is circulated as received from the committee secretariat.

Reference number
ISO/DIS 24617-11:2020(E)
© ISO 2020
COPYRIGHT PROTECTED DOCUMENT
© ISO 2020

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may

be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting

on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address

below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Background and Motivations
5 Purposes and Requirements
6 Abstract Specification of SemAF-MQI
6.1 Overview
6.2 Characteristics of SemAF-MQI
6.3 Metamodel
6.4 Abstract syntax of QML (QML_as)
6.5 Concrete Syntaxes of QML (QML_cs)
7 XML-based Concrete Syntax of QML (QML_csx)
7.1 Overall
7.2 Tag names with ID prefixes
7.3 Attribute specification of the root
7.4 Attribute specification of the basic element types
7.5 Attribute specification of the link types and
7.6 Illustrations of QML_csx
7.6.1 Overall
7.6.2 Sample data
7.6.3 Procedure of annotation
8 TEI-based Concrete Syntax of QML (QML_cst)
8.1 Concrete syntaxes of QML (QML_cst)
8.1.1 Overall
8.1.2 Tag names with ID prefixes
8.1.3 Attribute specification of the basic element types
8.1.4 Attribute specification of the two link types
8.2 Illustrations of QML_cst
8.2.1 Overall
8.2.2 Sample data
8.2.3 Illustrations of TEI-based Concrete Syntax
Annex A (informative) Illustrations of QML_csx with more samples
Annex B (informative) Informal statements of Measurable Quantitative Information
Annex C (informative) The representation of units
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards

bodies (ISO member bodies). The work of preparing International Standards is normally carried out

through ISO technical committees. Each member body interested in a subject for which a technical

committee has been established has the right to be represented on that committee. International

organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.

ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of

electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are

described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for the

different types of ISO documents should be noted. This document was drafted in accordance with the

editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of

patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of

any patent rights identified during the development of the document will be in the Introduction and/or

on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not

constitute an endorsement.

For an explanation of the meaning of ISO-specific terms and expressions related to conformity
assessment, as well as information about ISO's adherence to the World Trade Organization (WTO)
principles in the Technical Barriers to Trade (TBT), see the ISO website (www.iso.org).

The committee responsible for this document is ISO/TC 37, Language and Terminology, Subcommittee
SC 4, Language resource management.

ISO 24617 consists of the following parts under the general title Language resource management —

Semantic annotation framework (SemAF):
— Part 1: Time and events (TimeML)
— Part 2: Dialogue acts (DA)
— Part 3: Named entity
— Part 4: Semantic roles (SR)
— Part 5: Discourse structures (DS)
— Part 6: Principles of semantic annotation (SemAF Principles)
— Part 7: Spatial information
— Part 8: Semantic relations in discourse, core annotation schema (DR-core)
— Part 9: Reference annotation framework (RAF)
— Part 10: Visual information (VoxML)
— Part 11: Measurable quantitative information (MQI)
— Part 12: Quantification
— Part 13: Gestures
Introduction

Measurable quantitative information (MQI), such as ‘165 cm’ or ‘60 kg’ applied to the height or
weight of a person named ‘John’, is very common in ordinary language. MQI describes one of the basic
properties associated with the magnitude aspect of quantity. Such information is even more abundant
in scientific publications and technical reports, to the extent that it constitutes an essential part of
the communicative segments of language in general. The processing of such information is thus required
for any successful language resource management.

This document, named ‘SemAF-MQI’, specifies a general annotation scheme that follows the principles
of semantic annotation laid down in ISO 24617-6 and the basic requirements of ISO 24612 Linguistic
annotation framework (LAF). The scheme facilitates the processing of MQI in scientific and technical
language and makes it interoperable with other semantic annotation schemes, such as the other parts
of ISO 24617.

NOTE 1 ISO 24617-1:2012 (E) TimeML and ISO 24617-7:2014 (E) Spatial information, for instance,
have proposed ways of annotating measures on time (durations or time amounts) and space (distances),
respectively. The serious discussion of annotating measures as part of ISO 24617 was initiated at the 11th joint
ACL-ISO/TC 37/SC 4/WG 2 Workshop on Interoperable Semantic Annotation (ISA-11) [1] and was continued at
the ISA-13 [2], ISA-14 [3], and ISA-15 [4] workshops. ISO 24612:2012 (E) LAF provides a pivot format (GrAF,
graph annotation format) that makes all the annotations of temporal or spatial measures in these two annotation
schemes interchangeable with the measure annotations in the new document SemAF-MQI.

Focusing on measurements in scientifico-technological language, SemAF-MQI as an ISO standard is
expected to contribute to information retrieval (IR) [5], question answering (QA), text summarization
(TS), and other natural language processing (NLP) applications [6].

NOTE 2 To enhance the readability of this document and to correct some obvious editorial errors, some
editorial changes were made to the earlier version of CD 24617-11 MQI, which had been submitted to the
successful CD ballot (2019-09-11 ~ 2019-11-06) and received 100% approval with no comments.

• Each item in the Bibliography as well as in Clause 2 Normative references is now referred to in
the main part of the current version of the document.

• Three of the illustrative examples in clause 7.6 Illustrations of QML_csx were moved to a newly
created Annex A (informative) without any change of content, in order to lighten the burden
of reading clause 7.6.

• Incorrect wordings or obvious typos were corrected.

• The black-and-white coloring of Figure 1 — Metamodel of QML was changed to multiple colors
to bring out each of the different components of the metamodel.
Language resource management — Semantic annotation
framework (SemAF) —
Part 11:
Measurable Quantitative information (MQI)
1 Scope

As one of the basic physical properties, quantity is associated with multitude (how many) and magnitude
(how much). Focusing on the magnitudinal aspect of quantity, this document, henceforth named
“SemAF-MQI”, aims at formulating a specification language for the construction of an annotation
scheme for measurable quantitative information (MQI) in scientifico-technological language. The main
characteristic of SemAF-MQI is that quantitative information is presented as measures expressed in
terms of a pair <n, u>, consisting of a numerically expressed quantity n and a unit u, which is either
basic or derived, and either normalized or conventionally used.
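
As an informal illustration of the pair of a numeric quantity n and a unit u described above, a measure can be modelled as a small immutable record. The Python sketch below is not part of the specification: the class name Measure and the field names num and unit are assumptions chosen for readability, and the sample values are the ‘165 cm’ and ‘60 kg’ examples from the Introduction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """A measure as a pair of a numeric quantity n and a unit u."""

    num: float  # n: the numerically expressed quantity (field name is hypothetical)
    unit: str   # u: the unit, kept as an unanalyzed string (field name is hypothetical)

    def __str__(self):
        # The 'g' format drops a trailing ".0" so that 165 prints as "165".
        return f"{self.num:g} {self.unit}"


height = Measure(165, "cm")
weight = Measure(60, "kg")
print(height)
print(weight)
```

Keeping the unit as a plain string mirrors the treatment of a unit as a single element of the pair; nothing here decides whether the unit is base or derived.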

NOTE 1 MQI stands for “measurable quantitative information”, whereas SemAF-MQI refers to Part 11 of
ISO 24617. [See 3.5 for the definition of MQI.]

The scope of SemAF-MQI is restricted to the measurable or magnitudinal aspect of quantity so that it

can focus on the technical or practical use of measurements in IR (information retrieval), QA (question

answering), TS (text summarization), and other NLP (natural language processing) applications. The

scope is restricted to the domains of technology that carry more applicational relevance than some

theoretical issues found in the ordinary use of language. The subsequent part of ISO 24617 (Part 12)

deals with more general and theoretical issues of quantification and quantitative information.

NOTE 2 The scope of this document is intentionally restricted to the measurable or magnitudinal aspect of

quantity so that SemAF-MQI focuses on the technical or practical use of measurements in IR, QA, TS, and other

NLP applications. The scope is restricted to domains of technology that carry more applicational relevance than

theoretical issues found in the ordinary use of language. Fruit as well as meat is, for instance, sold at markets

in terms of weight but not of pieces. Furthermore, the subsequent part of ISO 24617 (Part 12) deals with more

general and theoretical issues of quantification and plurals (e.g., “three apples”), including quantitative
information that includes multitudinal aspects.

The scope of SemAF-MQI also treats temporal durations that are discussed in Part 1 of ISO 24617

SemAF-Time (ISO-TimeML) and spatial measures such as distances that are treated in Part 7 of

ISO 24617 Spatial information (ISO-Space), while making them interoperable with other measure types.

It also accommodates the treatment of measures or amounts that are introduced in ISO 24617-6 SemAF

Principles (Clause 8.3).

NOTE 3 The scope of this document (Part 11) also treats temporal durations that are discussed in Part 1 of

ISO 24617 SemAF-Time (TimeML) and spatial measures such as distances that are treated in Part 7 of ISO 24617

Spatial information, while making them interoperable with other measure types. It also accommodates the

treatment of measures or amounts that are introduced in ISO 24617-6 SemAF Principles. Its scope thus covers

temporal durations treated in XML Schema and the TEI Guidelines.
2 Normative references

The following documents, in whole or in part, are normatively referenced in this document and are

indispensable for its application. For dated references, only the edition cited applies. For undated

references, the latest edition of the referenced document (including any amendments) applies.

ISO 24612:2012, Language resource management — Linguistic annotation framework (LAF)


ISO 24617-1:2012, Language resource management — Semantic annotation framework (SemAF) — Part 1:

Time and events (SemAF-Time, ISO-TimeML)

ISO 24617-6:2016, Language resource management — Semantic annotation framework — Part 6:

Principles of semantic annotation (SemAF Principles)

ISO 24617-7:2014, Language resource management — Semantic annotation framework — Part 7: Spatial

information (ISOspace)

ISO/IEC 14977:1996, Information technology - Syntactic metalanguage - Extended BNF

ISO 80000-1:2009, Quantities and units — Part 1: General

NOTE 1 The following two documents are de-facto standards to be followed by SemAF-MQI:

[7]

TEI P5: Guidelines for Electronic Text Encoding and Interchange, The TEI Consortium, 2019 .

[8]

XML Schema, Part 2: Datatypes, 2nd edition, W3C Recommendation, 28 October 2004 .

3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

— IEC Electropedia: available at http://www.electropedia.org/
— ISO Online browsing platform: available at https://www.iso.org/obp
3.1
quantity

property of a measurable object referring to its magnitude (how much) or multitude (how many)

Note 1 to entry: Compare with ISO 80000-1:2009, 3 Terms and Definitions, 3.1: property of a phenomenon, body,

or substance, where the property has a magnitude that can be expressed by means of a number and a reference.

3.2
base quantity

quantity in a conventionally chosen subset of a given system of quantities, where no quantity in the

subset can be expressed in terms of the other quantities within that subset

Note 1 to entry: Kinds of quantities include seven base quantities defined by the International System of

Quantities (ISQ), as listed in Table 1.
Table 1 — ISQ base quantities
Base quantity                 Base quantity symbol
length                        L
mass                          M
time                          T
electric current              I
thermodynamic temperature     Θ
amount of substance           N
luminous intensity            J

Note 2 to entry: In ISO 80000-1:2009, 3 Terms and Definitions, symbols such as L and M, which are called base
quantity symbols in this document, are called dimension symbols of quantity.
3.3
derived quantity

quantity, in a system of quantities, defined in terms of the base quantities of that system

EXAMPLE Speed is a derived quantity defined by length (distance) over time (LT⁻¹), where length (L) and
time (T) are base quantities.
[SOURCE: ISO 80000-1:2009, 3 Terms and Definition, 3.5 derived quantity]
3.4
quantitative information
measure associated with the quantity (3.1) of a measurable object
3.5
measurable quantitative information
MQI
quantitative information (3.4) that can be expressed in unitized numeric terms
3.6
measurable quantitative information markup language
markup language of measurable quantitative information
QML

specification language for the annotation of measurable quantitative information (3.5) extractable

from text or other medium types of language
3.7
unit
unit of measurement
measurement unit

scalar basis, defined and adopted by convention, of measuring objects by multiplying their quantitative

values expressed in real numbers

Note 1 to entry: The expressions that are used in measurement such as “meter”, “liter”, and “µmol/kg” are units

by the definition given above. The multitude expressions such as “bottles”, “boxes”, or “two” as in “two bottles of

milk”, “a box of apples”, and “two coffees” sometimes fail to be regarded as units, but they can also be if they are

accepted as units by convention or agreement in some communities. ISO 24617 SemAF Part 12: Quantification

treats such multitude expressions as genuine units.
Note 2 to entry: There are two major types of units, base and derived.

[Refer to ISO 80000-1:2009, 3 Terms and Definitions, 3.9 Unit, 3.10 Base unit, and 3.11 Derived unit.]

[SOURCE: Refer to: ISO 80000-1:2009, 3 Terms and Definitions, 3.9, real scalar quantity, defined and

adopted by convention, with which any other quantity of the same kind can be compared to express the

ratio of the second quantity to the first one as a number.]
3.8
base unit
measurement unit that is adopted by convention for a base quantity (3.2)

Note 1 to entry: There are seven base units chosen by the International System of Units (SI) associated with

seven ISQ base quantities to measure quantities, as shown in Table 2.
Table 2 — base units
SI base unit (unit symbol)    Associated ISQ base quantity (base quantity dimension symbol)
meter (m)                     length (L)
kilogram (kg)                 mass (M)
second (s)                    time (T)
ampere (A)                    electric current (I)
kelvin (K)                    thermodynamic temperature (Θ)
mole (mol)                    amount of substance (N)
candela (cd)                  luminous intensity (J)

[SOURCE: ISO 80000-1:2009, 3 Terms and Definitions, 3.9 Unit, 3.10 Base unit, and 3.11 Derived unit.]

3.9
derived unit
measurement unit for a derived quantity

EXAMPLE The unit “newton” (N) is a derived unit for the derived quantity “force” (F), which is defined as
“mass times acceleration” (MLT⁻²), where the quantity “acceleration” is a derived quantity defined as “velocity
divided by time” (VT⁻¹) and “velocity” is defined as “length (distance) divided by time” (LT⁻¹).

Note 1 to entry: Table 3 illustrates some of the derived units.

[Refer to ISO 80000-1:2009, 3 Terms and Definitions, 3.9 Unit, 3.10 Base unit, and 3.11 Derived unit.]

Table 3 — derived units
Derived unit (unit symbol)                        Associated derived quantity
kilometer per minute (km/min)                     speed = length (L) / time (T)
gram per cubic meter (gram/m³)                    density = mass (M) / volume (L³)
kilogram meter per square second (kg x m/s²)      force = mass (M) x length (L) / time (T²)
lumen per square meter (lm/m²)                    illuminance = luminous intensity (J) / area (L²)
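
The way derived quantities are built from base quantities can be worked through with dimension exponents. The Python sketch below is only an illustration under stated assumptions: a dimension is represented as a mapping from a base quantity symbol (L, M, T) to an integer exponent, and the helper names combine, times and per are hypothetical.

```python
def combine(a, b, sign):
    """Add (sign=+1) or subtract (sign=-1) the exponents of b to those of a."""
    out = dict(a)
    for symbol, exponent in b.items():
        out[symbol] = out.get(symbol, 0) + sign * exponent
    # Drop symbols whose exponent cancels to zero.
    return {s: e for s, e in out.items() if e != 0}


def times(a, b):
    """Dimension of a product of two quantities."""
    return combine(a, b, +1)


def per(a, b):
    """Dimension of a quotient of two quantities."""
    return combine(a, b, -1)


# Base quantities, using the ISQ base quantity symbols.
LENGTH = {"L": 1}
MASS = {"M": 1}
TIME = {"T": 1}

speed = per(LENGTH, TIME)                       # length over time
acceleration = per(speed, TIME)                 # speed over time
force = times(MASS, acceleration)               # mass times acceleration
volume = times(times(LENGTH, LENGTH), LENGTH)   # length cubed
density = per(MASS, volume)                     # mass over volume

print(speed, acceleration, force, density)
```

Under this representation, speed comes out with exponents L 1 and T -1, and force with M 1, L 1 and T -2, which agrees with the speed and force rows of the table.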
4 Background and Motivations

Quantity exists as a multitude (e.g., “two watermelons”) or magnitude (“one kilogram of watermelon”).

The two basic divisions of quantity imply the principal distinction between continuity (continuum)

and discontinuity, which are two ways of determining quantity. SemAF-MQI only focuses on the

measurement information in scientific and technical texts. Therefore, quantity is regarded as a

magnitude property in this document, which is consistent with ISO 80000-1:2009 Quantities and units.

As in ISO 80000-1:2009, the term “unit” is defined in relation to quantity and is used for real scalar

quantity, defined and adopted by convention, with which any other quantity of the same kind can be

compared to express the ratio of the second quantity to the first one as a number. There are two types

of units: base unit and derived unit.

This document treats complex derived units as unanalyzed wholes. It does not annotate their internal

structures and components unless some special use case requires it. Nor does this document
require specifying ways of converting one unit to another, for the following reasons:

1) Complex derived units such as speed “km/h” (LT⁻¹) or acceleration “m/s²” (LT⁻²) are understood as
they are in ordinary situations.

2) Certain domain-specific units cannot be decomposed during their conversion to other equivalent
units. For example, the estimated glomerular filtration rate (eGFR) frequently uses the unit
“mL/min/1.73m²” in the medical domain. Kidney function can thus be classified into various stages
depending on eGFR, where stage 1 is defined as “normal eGFR greater than or equal to 90
mL/min/1.73m²”. In some cases, the unit can be written as “mL/min/((173/100).m²)”. In all these cases,
“1.73” or “173/100” in the units cannot be annotated separately for automatic conversion, since they
combine with the other parts to form a complete unit.

3) Units can be converted automatically and efficiently, for example with a conversion
table. By directly using the equivalence of “1 mmol/l” to “18 mg/dl”, a computer can convert a value
from one unit to the other in a single computation, rather than converting each part
of the unit and then computing the total value.

4) Incomplete units exist. During language processing, there are incomplete units which need to

be detected by using different methods such as by formulating some specific rules or guidelines.

Such rules could be designed to extend a unit into a more complete representation or to complete

missing parts of a derived unit according to some clues such as contextual information or variable-

specific default unit information.
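
The table-based conversion described in item 3 can be sketched as follows. This Python example is an illustration rather than a prescribed method: the table layout and the function name convert are assumptions, units are treated as unanalyzed strings in line with the approach above, and the only factor used is the “1 mmol/l” to “18 mg/dl” equivalence quoted in the text.

```python
# Conversion factors keyed by (source unit, target unit).
# Each unit is a whole string; no part of a unit is converted separately.
CONVERSION_TABLE = {
    ("mmol/l", "mg/dl"): 18.0,  # equivalence quoted in item 3
}


def convert(value, source, target):
    """Convert a numeric value between two units with one table lookup."""
    if source == target:
        return value
    if (source, target) in CONVERSION_TABLE:
        return value * CONVERSION_TABLE[(source, target)]
    if (target, source) in CONVERSION_TABLE:
        # Use the stored factor in the reverse direction.
        return value / CONVERSION_TABLE[(target, source)]
    raise KeyError(f"no conversion known from {source!r} to {target!r}")


print(convert(5, "mmol/l", "mg/dl"))
print(convert(36, "mg/dl", "mmol/l"))
```

A single multiplication or division per conversion is the point of the design: the unit string is looked up as a whole, so units such as the eGFR unit in item 2 need no internal analysis.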

With the recent advent of artificial intelligence technologies, many IR and NLP applications, such as
question answering systems, automatic speech translation systems, and intelligent assistant systems,
have been developed with the acquisition of meta information from unstructured texts as a core module.
The texts processed by such systems usually contain a large amount of measurable quantitative
information, which constitutes an essential portion of the meta information for information
extraction, text understanding, and data analysis.

In the era of big data in particular, demands from industry and academic communities for the precise
acquisition of measurable quantitative information have increased. For example, business investment

companies frequently need to aggregate various sorts of information covering net sales, gross profit,

operating expenses, operating profit, interest expense, net profit before taxes, net income, etc., of the

target companies from their annual reports. The fast-growing medical informatics research also needs

to process a large amount of medical texts
...
