Language resource management — Semantic annotation framework (SemAF) — Part 15: Measurable quantitative information extraction (MQIE)

This document establishes a measurable quantitative information extraction (MQIE) scheme, which is based on the semantic annotation scheme specified in ISO 24617-11. It is applicable to the domains of technology that carry more applicational relevance than some theoretical issues found in the ordinary use of language.
NOTE ISO 24617-12 deals with more general and theoretical issues of quantification and quantitative information.
This document also treats temporal durations that are discussed in ISO 24617-1, and spatial measures such as distances that are treated in ISO 24617-7, while making them interoperable with other measure types. It also accommodates the treatment of measures or amounts that are introduced in ISO 24617-6:2016, 8.3.

Gestion des ressources linguistiques — Cadre d’annotation sémantique (SemAF) — Partie 15: Extraction d’informations quantitatives mesurables (MQIE)


Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 15. del: Ekstrakcija merljivih kvantitativnih informacij (MQIE)

General Information

Status: Not Published
Public Enquiry End Date: 21-Nov-2024
Current Stage: 5020 - Formal vote (FV) (Adopted Project)
Start Date: 10-Apr-2025
Due Date: 29-May-2025

Standard
ISO 24617-15:2025 - Language resource management — Semantic annotation framework (SemAF) — Part 15: Measurable quantitative information extraction (MQIE). Released: 2025-05-01
English language
15 pages
Standard
ISO 24617-15:2025 - Gestion des ressources linguistiques — Cadre d’annotation sémantique (SemAF) — Partie 15: Extraction d’informations quantitatives mesurables (MQIE). Released: 2025-05-01
French language
16 pages
Draft
ISO/DIS 24617-15:2024 - BARVE
English language
20 pages

Standards Content (Sample)


International Standard
ISO 24617-15
First edition
2025-05
Language resource management — Semantic annotation framework (SemAF) — Part 15: Measurable quantitative information extraction (MQIE)
Gestion des ressources linguistiques — Cadre d’annotation sémantique (SemAF) — Partie 15: Extraction d’informations quantitatives mesurables (MQIE)
Reference number
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 General framework of MQIE
4.1 Overview
4.2 Primary requirements of MQIE
4.3 Framework
4.4 Preprocessing
4.5 Basic element identification
4.6 Link identification
4.7 Measure normalization
4.8 Verification and filtering
5 Examples
5.1 General
5.2 Sample data
5.3 Procedure of extraction
5.3.1 Overview
5.3.2 Preprocessing
5.3.3 Basic element extraction
5.3.4 Link identification
5.3.5 Measure normalization
5.3.6 Verification and filtering
Annex A (informative) Examples of applications extended based on MQIE
Annex B (informative) Informal statements of MQI during extraction
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee
SC 4, Language resource management.
A list of all parts in the ISO 24617 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
Measurable quantitative information (MQI) describes one of the basic properties associated with the
magnitude aspect of quantity, and is very common in ordinary language. The main characteristic of MQI, as
described in ISO 24617-11, is that quantitative information is presented as measures, each expressed as
a pair consisting of a numerically expressed quantity and a unit. Such information is much more abundant in
scientific publications and technical reports, to the extent that it constitutes an essential part of the
communicative segments of language in general. The processing of such information is thus required for any
successful language resource management.
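As a rough illustration of the "quantity plus unit" pair described above, the following Python sketch pulls simple measures out of a sentence. It is not the extraction procedure specified by this document: the `Measure` class, its `num` and `unit` field names, the `UNITS` list and the `extract_measures` function are all hypothetical names chosen for the example, and the pattern only handles plain decimal numbers followed by a short unit symbol.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """A measure as a pair: a numeric quantity and a unit (hypothetical model)."""
    num: float   # numerically expressed quantity
    unit: str    # unit symbol as written in the text


# Assumption: a deliberately tiny unit list for illustration only.
UNITS = ("mg", "kg", "km", "ml", "cm", "g", "m", "h")

# A decimal number, optional whitespace, then one of the known unit symbols.
_PATTERN = re.compile(
    r"(?<![\w.])(\d+(?:\.\d+)?)\s*(" + "|".join(UNITS) + r")\b"
)


def extract_measures(text: str) -> list[Measure]:
    """Return every (number, unit) pair found in the text, in order."""
    return [Measure(float(num), unit) for num, unit in _PATTERN.findall(text)]


if __name__ == "__main__":
    for measure in extract_measures("Take 20 mg twice daily and walk 1.5 km."):
        print(measure)
```

A real system would also need to link each measure to the entity it describes and to normalize units, which this sketch leaves out.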
In the era of big data, demands from industry and academic communities for an accurate extraction of MQI
have increased [1]. For example, business investment companies frequently need to identify and aggregate
various information covering net sales, gross profit, operating expenses, operating profit, interest expense,
net profit before taxes, net income, etc., of the target companies from their annual reports. The fast-growing
field of medical informatics research also needs to process a large amount of medical text to analyse the dose
of medicine, the eligibility criteria of clinical trials, the phenotypic characteristics of patients, the
laboratory tests in clinical records, etc. [2][3] All these demands, whether in industry or in medical research,
require the effective extraction of MQI for automated identification, aggregation, computation and analysis [4].
However, in the areas of information retrieval (IR) and natural language processing (NLP), no standardized way
of extracting measurable quantitative information is currently available. Each application system developed
in industrial sectors has hitherto used common NLP models or its own models to identify measurable
quantitative information from unstructured text, and there is currently no standard extraction procedure for
ensuring the quality of the extraction. What is called for is a general, interoperable and standardized
measurable quantitative information extraction scheme that allows IR and NLP tasks to work with many
different application systems.
This document formulates a general extraction scheme while following the basic requirements of semantic
annotation laid down in ISO 24617-11, which facilitates the annotation of MQI in scientific and technical
language and makes it interoperable with other semantic annotation schemes such as those given in the
parts of the ISO 24617 series. The extraction scheme also utilizes various International Standards on lexical
resources and morpho-syntactic annotation frameworks. It aims at being compatible with other existing
relevant standards such as ISO 24617-9.
NOTE ISO 24617-11 provides a standardized schema of annotating measurable quantitative information from
unstructured text.
Focusing on measurements in scientifico-technological language, this document is expected to contribute
to information retrieval (IR), question answering (QA), text summarization (TS) and other natural language
processing (NLP) applications [5][6][7].

International Standard ISO 24617-15:2025(en)
Language resource management — Semantic annotation
framework (SemAF) —
Part 15:
Measurable quantitative information extraction (MQIE)
1 Scope
This document establishes a measurable quantitative information extraction (MQIE) scheme, which is based
on the semantic annotation scheme specified in ISO 24617-11. It is applicable to the domains of technology
that carry more applicational relevance than some theoretical issues found in the ordinary use of language.
NOTE ISO 24617-12 deals with more general and theoretical issues of quantification and quantitative information.
This document also treats temporal durations that are discussed in ISO 24617-1, and spatial measures such
as distances that are treated in ISO 24617-7, while making them interoperable with other measure types. It
also accommodates the treatment of measures or amounts that are introduced in ISO 24617-6:2016, 8.3.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 24617-6:2016, Language resource management — Semantic annotation framework — Part 6: Principles of
semantic annotation (SemAF Principles)
ISO 24617-11:2021, Language resource management — Semantic annotation framework (SemAF) — Part 11:
Measurable quantitative information (MQI)
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 24617-6:2016, ISO 24617-11:2021
and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
information extraction
IE
identifying specific structured information from natural language, semi-structured texts an
...



SLOVENIAN STANDARD
oSIST ISO/DIS 24617-15:2024
01 November 2024
Upravljanje jezikovnih virov - Ogrodje za semantično označevanje (SemAF) - 15. del: Ekstrakcija merljivih kvantitativnih informacij (MQIE)
Language resource management — Semantic annotation framework (SemAF) — Part 15: Measurable quantitative information extraction (MQIE)
Gestion des ressources linguistiques — Cadre d’annotation sémantique (SemAF) — Partie 15: Extraction d’informations quantitatives mesurables (MQIE)
This Slovenian standard is identical to: ISO/DIS 24617-15
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
oSIST ISO/DIS 24617-15:2024 en,fr
2003-01. Slovenian Institute for Standardization. Reproduction of all or part of this standard is not permitted.

DRAFT International Standard
ISO/DIS 24617-15
ISO/TC 37/SC 4
Secretariat: KATS
Voting begins on: 2024-08-13
Voting terminates on: 2024-11-05
Language resource management — Semantic annotation framework (SemAF) — Part 15: Measurable quantitative information extraction (MQIE)
ICS: 01.020
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENTS AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
This document is circulated as received from the committee secretariat.
Reference number
ISO/DIS 24617-15:2024(en)
© ISO 2024
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on
the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below
or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 General framework of MQIE
4.1 Overview
4.2 Primary requirements of MQIE
4.3 Framework
4.4 Preprocessing
4.5 Basic element identification
4.6 Link identification
4.7 Measure normalization
4.8 Verification and Filtering
5 Examples
5.1 General
5.2 Sample data
5.3 Procedure of extraction
5.3.1 Overview
5.3.2 Pre-processing
5.3.3 Basic element extraction
5.3.4 Link identification
5.3.5 Measure normalization
5.3.6 Verification and Filtering
Annex A (informative) The examples of applications extended based on MQIE
Annex B (informative) Informal statements of MQI during extraction
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent
rights identified during the development of the document will be in the Introduction and/or on the ISO list of
patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee
SC 4, Language resource management.
A list of all parts in the ISO 24617 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
Measurable quantitative information (MQI) describes one of the basic properties associated with the
magnitude aspect of quantity, and is very common in ordinary language. The main characteristic of MQI, as
described in ISO 24617-11 [1], is that quantitative information is presented as measures, each expressed as
a pair consisting of a numerically expressed quantity and a unit. Such information is much more abundant in
scientific publications and technical reports, to the extent that it constitutes an essential part of the
communicative segments of language in general. The processing of such information is thus required for any
successful language resource management.
In the era of big data, demands from industry and academic communities for an accurate extraction of MQI
have increased [2]. For example, business investment companies frequently need to identify and aggregate
various information covering net sales, gross profit, operating expenses, operating profit, interest expense,
net profit before taxes, net income, etc., of the target companies from their annual reports. The fast-growing
field of medical informatics research also needs to process a large amount of medical text to analyze the dose
of medicine, the eligibility criteria of clinical trials, the phenotypic characteristics of patients, the lab
tests in clinical records, etc. [3,4] All these demands, whether in industry or in medical research, require the
effective extraction of MQI for automated identification, aggregation, computation, and analysis [5].
However, in the areas of information retrieval (IR) and natural language processing (NLP), no standardized way
of extracting measurable quantitative information is currently available. Each application system developed
in industrial sectors has hitherto used common NLP models or its own models to identify measurable
quantitative information from unstructured text, and there is currently no standard extraction procedure for
ensuring the quality of the extraction. What is called for is a general, interoperable and standardized
measurable quantitative information extraction scheme that allows IR and NLP tasks to work with many
different application systems.
This document, named ‘SemAF-MQIE’, aims to formulate a general extraction scheme that follows the
basic requirements of semantic annotation laid down in ISO 24617-11, which facilitates the annotation
of MQI in scientific and technical language and makes it interoperable with other semantic annotation
schemes such as those in the ISO 24617 series. The extracti
...
