SIST ISO 24616:2013
Language resources management -- Multilingual information framework
This International Standard provides a generic platform for modelling and managing multilingual information in various domains: localization, translation, multimedia annotation, document management, digital library support, and information or business modelling applications. MLIF (multilingual information framework) provides a metamodel and a set of generic data categories [ISO 12620:2009] for various application domains. MLIF also provides strategies for the interoperability and/or linking of models including, but not limited to, XLIFF, TMX, smilText and ITS.
Gestion des ressources langagières -- Plateforme d'informations multilingues
Upravljanje z jezikovnimi viri - Ogrodje za večjezične informacije
SLOVENIAN STANDARD
SIST ISO 24616:2013
1 July 2013
Upravljanje z jezikovnimi viri - Ogrodje za večjezične informacije
Language resources management -- Multilingual information framework
Gestion des ressources langagières -- Plateforme d'informations multilingues
This Slovenian standard is identical to: ISO 24616:2012
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
SIST ISO 24616:2013 en,fr,de
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
INTERNATIONAL ISO
STANDARD 24616
First edition
2012-09-01
Language resources management —
Multilingual information framework
Gestion des ressources langagières — Plateforme d'informations
multilingues
Reference number
ISO 24616:2012(E)
ISO 2012
COPYRIGHT PROTECTED DOCUMENT
© ISO 2012
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Specification principles
4.1 Key standard used in the specification: Unified Modeling Language (UML)
4.2 Metamodel and adornment
4.3 XML serialization
5 Metamodel specification
6 MLIF compliance
7 Metamodel adornment
7.1 Introduction
7.2 General principles concerning the use of W3C generic attributes
7.3 Recommended adornment for GI
7.4 Recommended adornment for GroupC
7.5 Recommended adornment for MultiC
7.6 Recommended and mandatory adornment for MonoC
7.7 Recommended adornment for SegC
7.8 Recommended adornment for HistoC
7.9 Recommended online annotation adornment
7.10 Recommended adornment for localization
7.11 Recommended adornment for internationalization
7.12 Recommended adornment for temporal synchronization
8 Relation with other standards
Annex A (informative) Example using MLIF for Computer-Assisted Translation (CAT)
Annex B (informative) Example: representing TMX data
Annex C (informative) Example of XLIFF data representation
Annex D (informative) Example: representing smilText data
Annex E (informative) Example of MLIF usage for subtitles (captioning)
Annex F (informative) Using MLIF for MAF data
Annex G (normative) Detailed specification
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 24616 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
INTERNATIONAL STANDARD ISO 24616:2012(E)
Language resources management — Multilingual information
framework
1 Scope
This International Standard provides a generic platform for modelling and managing multilingual information in
various domains: localization, translation, multimedia annotation, document management, digital library
support, and information or business modelling applications. MLIF (multilingual information framework)
provides a metamodel and a set of generic data categories [ISO 12620:2009] for various application domains.
MLIF also provides strategies for the interoperability and/or linking of models including, but not limited to,
XLIFF, TMX, smilText and ITS.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 12620:2009, Terminology and other language and content resources — Specification of data categories
and management of a Data Category Registry for language resources
ISO 8879, Information processing — Text and office systems — Standard Generalized Markup Language (SGML)
Extensible Markup Language. Fifth Edition, T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, F. Yergeau
Editors, W3C Recommendation, 26 November 2008, http://www.w3.org/TR/xml
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply:
3.1
adornment
data category attached to a component of a metamodel
3.2
inline code
inline instructions inserted in a source document
Note to entry: Native code can, for instance, provide presentational instructions (e.g. HTML codes).
3.3
subtitle
textual versions of the dialog in films, television programs, video games, etc., usually displayed at the bottom
of the screen
3.4
working language
language in which linguistic sequences are expressed
4 Specification principles
4.1 Key standard used in the specification: Unified Modeling Language (UML)
The MLIF specification complies with the modelling principles of UML as defined by the Object Management
Group (OMG) [UML]. The specification uses the UML subset that is relevant for the purposes of MLIF.
4.2 Metamodel and adornment
In line with Terminological Markup Framework (TMF) as defined in ISO 16642, MLIF defines a metamodel that
is adorned by data categories, as defined in ISO 12620.
4.3 XML serialization
Associated with the metamodel and its adornment, MLIF proposes a representation in XML called “XML
serialization”, in line with Extensible Markup Language (XML) as defined in ISO 8879.
5 Metamodel specification
The MLIF metamodel is specified in the UML object diagram in Figure 1.
Figure 1 — MLIF metamodel
The MLIF metamodel is defined by the following seven "core components", listed here according to their
XML serialization:
<MLDC> (Multilingual Data Collection), which represents a collection of data containing global information
and several multilingual units;
<GI> (Global Information), which represents technical and administrative information applying to the
entire multilingual data collection;
<GroupC> (Grouping Component), which represents a sub-collection of multilingual data that have a
common origin or purpose within a given project;
<MultiC> (Multilingual Component), which groups together all variants of a given textual content;
<MonoC> (Monolingual Component), which groups together information related to one language and is
part of a multilingual component (MultiC);
<HistoC> (History Component), which traces modifications to the component to which it is anchored (i.e.
versioning);
<SegC> (Segmentation Component), which allows any level of segmentation for textual information,
possibly in a recursive manner.
6 MLIF compliance
Any format compliant with this International Standard may use the MLIF metamodel in two possible ways:
by fully implementing the MLIF metamodel starting at the level of ;
by specifically embedding MLIF-compliant information within another model, by implementing one of the
lower level MLIF elements, namely , or .
7 Metamodel adornment
7.1 Introduction
The MLIF XML serialization proposes a set of XML elements and XML attributes, which are described in the
following sections, where the characters “<” and “>” delimit the name of the element. Following the TEI
guidelines (http://www.tei-c.org), some attributes are specified by means of a class attribute, with the
convention that the name of the class attribute is prefixed by “att.” (e.g. “att.xlink”). The other XML attributes
are listed with the convention that two quotes delimit the name of the attribute (e.g. “xml:lang”). The
specifications in Annex G shall be applied.
7.2 General principles concerning the use of W3C generic attributes
The following W3C attributes are to be used by all MLIF-compliant applications:
the attribute xml:lang shall be used in accordance with W3C recommendations to represent the working
language of any relevant element and, in particular, shall be used systematically for any implementation
of MonoC;
the attribute xml:id shall be used in accordance with W3C recommendations to provide a unique identifier
to an element of the MLIF metamodel.
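The following is a minimal sketch of how these two attributes might appear in practice, written with Python's standard xml.etree.ElementTree module. The nesting (a MultiC holding one MonoC per language, each holding a SegC) follows the component descriptions in Clause 5. The root tag name MLDC and the helper names build_collection and monoc_without_language are assumptions made for illustration; the normative element definitions are those of Annex G.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the reserved "xml" prefix (xml:lang, xml:id).
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_LANG = f"{{{XML_NS}}}lang"
XML_ID = f"{{{XML_NS}}}id"


def build_collection(unit_id, variants):
    """Build a tiny MLIF-like tree: one MultiC holding one MonoC per language.

    `variants` maps a language tag to the text in that language.
    The root tag name "MLDC" is an assumption of this sketch.
    """
    root = ET.Element("MLDC")
    multi = ET.SubElement(root, "MultiC", {XML_ID: unit_id})
    for lang, text in variants.items():
        # xml:lang carries the working language of each monolingual component.
        mono = ET.SubElement(multi, "MonoC", {XML_LANG: lang})
        seg = ET.SubElement(mono, "SegC")
        seg.text = text
    return root


def monoc_without_language(root):
    """Return every MonoC element that lacks the xml:lang attribute."""
    return [mono for mono in root.iter("MonoC") if not mono.get(XML_LANG)]


collection = build_collection(
    "unit1", {"en": "The meal is nice.", "fr": "Le repas est bon."}
)
serialized = ET.tostring(collection, encoding="unicode")
```

A tool could run a check such as monoc_without_language before exchanging data, since the language attribute is the one adornment that every MonoC is required to carry.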
7.3 Recommended adornment for GI
7.4 Recommended adornment for GroupC
7.5 Recommended adornment for MultiC
7.6 Recommended and mandatory adornment for MonoC
att.lang
att.xlink
The language attribute is mandatory on MonoC. All other adornments are optional.
7.7 Recommended adornment for SegC
att.linguistic
att.xlink
7.8 Recommended adornment for HistoC
The HistoC component is a generic component that traces modifications made on the component to which it is
anchored (e.g. creation, modification and validation). In the MLIF metamodel, the HistoC component may be
anchored to the GI, MultiC or MonoC component. This makes it possible for all evolutions of, or
enhancements to, the component to be recorded.
HistoC may be adorned by four elements:
7.9 Recommended online annotation adornment
Multilingual text documents are often only one stage in a complex workflow that involves external document
sources in a wide variety of formats. From these, it is often necessary to keep inline markup indicating the
presentational features that have to be retained in a translated target document. To this end, MLIF-compliant
applications should use the following elements, in relation to the element, that map onto similar
subsets in TMX and XLIFF:
7.10 Recommended adornment for localization
All the following elements should be used to provide localization-related information:
7.11 Recommended adornment for internationalization
7.12 Recommended adornment for temporal synchronization
The following elements should be used when textual content has to be conveyed (in written or spoken form)
together with some constraints:
8 Relation with other standards
As with the “Terminological Markup Framework” TMF [ISO 16642] in terminology, MLIF introduces a
metamodel that combines with selected data categories as a way of ensuring interoperability between several
multilingual applications and corpora. MLIF deals with multilingual corpora, multilingual fragments, and the
translation relations between them. In each domain where MLIF is applicable, a specific granularity may be
considered for segmentation and description. These two last processes may rely on MAF [ISO 24611], SynAF
[ISO 24615] and TMF for morphological description, syntactical annotation and terminological description
respectively.
MLIF supports the construction and the interoperability of localization and translation memory resources,
and also deals with the description of a metamodel for multilingual content. MLIF does not propose a closed
list of description features. Rather, it provides a list of data categories that is much easier to update and
extend. This list represents a point of reference for multilingual information in the context of various application
scenarios.
However, MLIF not only describes elementary linguistic segments (e.g. sentence, syntactical fragment, word
and part of speech), but may also be used to represent document structure (e.g. title, abstract, paragraph and
section). In addition, MLIF allows for external and internal links (annotations and references).
MLIF is designed to provide a common framework that facilitates the interoperability with formats such as
TMX (LISA OSCAR) and XLIFF (OASIS). MLIF can be seen as a parent of these formats, since both of them
deal with multilingual data expressed in the form of segments or text units. Both can be stored, manipulated
and translated in a similar manner.
Examples of using MLIF are given in Annexes A to F.
Annex A
(informative)
Example using MLIF for Computer-Assisted Translation (CAT)
The main reason for lemma, part-of-speech and morphological features is to allow CAT tools based on
translation memory to produce translations of new words and sentences that are not in the translation
database.
For example, using a translation memory that contains the English sentence "The meal is nice." and its
translation in French "Le repas est bon.", current CAT tools such as SDL TRADOS Translator's Workbench
are not able to provide the predicted translation for the sentence "The meals are nice." even though the word
lemmas of "The meal is nice." and "The meals are nice." match. This weakness is due to the fact that
these tools use limited linguistic criteria during the translation process.
The data produced by TRADOS Translator's Workbench is as follows:
<header
creationtool="TRADOS Translator's Workbench for Windows"
creationtoolversion="Edition 8 Build 863"
segtype="sentence"
o-tmf="TW4Win 2.0 Format"
adminlang="EN-US"
srclang="EN-GB"
datatype="rtf"
creationdate="20100528T144322Z"
creationid="USER"/>
The meal is nice.
Le repas est bon.
SDL TRADOS Translator's Workbench is an example of a suitable product available commercially. This information is
given for the convenience of users of this International Standard and does not constitute an endorsement by ISO of this
product.
To translate the sentence "The meals are nice.", an MLIF-compliant tool should implement the following
procedure:
Step-1 Represent in MLIF and add linguistic properties to all the words within the translation memory.
Step-2 Run a part-of-speech tagger on the sentence in order to obtain the right morphosyntactic word
categories.
Step-3 Translate the lemmas using an English-to-French bilingual lexicon.
Step-4 Consult a French lexicon of inflected forms in order to retrieve the correct inflected form using the
lemma and morphological features.
Step-5 Generate the translation of "The meals are nice." by substituting each English word with its French
inflected form as follows:
"The meals are nice." => "Les repas sont bons."
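The procedure above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the three lookup tables are hypothetical toy data standing in for a part-of-speech tagger (Step 2), a bilingual lexicon (Step 3) and a French lexicon of inflected forms (Step 4); only grammatical number is modelled, and gender is fixed to masculine.

```python
# Step 2 stand-in: toy analyser mapping each English surface form to
# (lemma, part of speech, grammatical number). All entries are illustrative.
EN_ANALYSES = {
    "the": ("the", "definiteArticle", None),
    "meal": ("meal", "commonNoun", "singular"),
    "meals": ("meal", "commonNoun", "plural"),
    "is": ("be", "verb", "singular"),
    "are": ("be", "verb", "plural"),
    "nice": ("nice", "qualifierAdjective", None),
}

# Step 3 stand-in: English-to-French lexicon over lemmas.
EN_FR_LEMMAS = {"the": "le", "meal": "repas", "be": "être", "nice": "bon"}

# Step 4 stand-in: French inflected forms indexed by (lemma, number).
FR_FORMS = {
    ("le", "singular"): "le", ("le", "plural"): "les",
    ("repas", "singular"): "repas", ("repas", "plural"): "repas",
    ("être", "singular"): "est", ("être", "plural"): "sont",
    ("bon", "singular"): "bon", ("bon", "plural"): "bons",
}


def translate(sentence):
    """Translate word by word through lemmas and morphological features."""
    words = sentence.rstrip(".").lower().split()
    analyses = [EN_ANALYSES[word] for word in words]
    # Words with no number of their own (article, adjective) agree with
    # the first word that carries one, which is the noun in this example.
    sentence_number = next(num for _, _, num in analyses if num is not None)
    french = []
    for lemma, _pos, number in analyses:
        fr_lemma = EN_FR_LEMMAS[lemma]
        french.append(FR_FORMS[(fr_lemma, number or sentence_number)])
    # Step 5: substitute each English word with its French inflected form.
    result = " ".join(french)
    return result[0].upper() + result[1:] + "."
```

With these tables, translate("The meals are nice.") produces "Les repas sont bons." even though only the singular sentence would be stored in the translation memory, because the match is made on lemmas rather than on surface forms.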
The XML data will include a feature structure declaration defining a tagset (e.g. for "nS"), with a word
segmentation and tagset defined in MAF:
SEMMAR
20090922T140653Z
The meal is nice.
Le repas est bon.
The
class="word"
lemma="meal"
pos="commonNoun"
tag="#nS">meal
class="word"
lemma="be"
pos="verb"
tag="#mP #p1 #nS">is
nice
.
class="word"
lemma="le"
pos="definiteArticle"
tag="#gM #nS">Le
class="word"
lemma="repas"
pos="commonNoun"
tag="#gM #nS">repas
class="word"
lemma="être"
pos="verb"
tag="#mP #p1 #nS">est
class="word"
lemma="bon"
pos="qualifierAdjective"
tag="#gM #nS">bon
.
Annex B
(informative)
Example: representing TMX data
B.1 Introduction
TMX (Translation Memory eXchange) is the vendor-neutral open XML standard for the exchange of
Translation Memory (TM) data created by computer-aided translation (CAT) and localization tools. The
purpose of TMX is to allow easier exchange of translation memory data between tools and/or translation
vendors with little or no loss of critical data during the process. TMX, which has been on the market since
1998, is a certifiable standard format. It was developed, and is maintained by, OSCAR (Open Standards for
Container/Content Allowing Re-use), a LISA Special Interest Group.
B.2 Mapping TMX to MLIF
TMX is nearly isomorphic to the MLIF metamodel. The core elements of the TMX macro-structure map to
MLIF as follows: maps onto the element;
is a container for the element and maps onto the element;
maps onto the element; maps onto the element;
maps onto the element;
of type term maps onto the element of type term.
Further TMX elements and attributes map onto MLIF elements as follows:
The "creationtool" attribute maps onto the element;
The "creationdate" attribute maps onto the element;
The "tuid" attribute maps onto the element within MultiC.
The element does not map onto any specific element as it represents a generic placeholder for
application-dependent data. When applicable, a specific element is explicitly mapped onto MLIF
elements or onto a standardized ISO/TC 37 data category as available from ISOCat.
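As a rough illustration of such a mapping, the sketch below converts the translation units of a small TMX 1.4 document into an MLIF-like tree with Python's standard xml.etree.ElementTree module. The correspondences used here (tu to MultiC, tuv to MonoC, seg to SegC) and the root tag name MLDC are assumptions of this sketch, chosen to match the component descriptions in Clause 5; the sample document, its tuid value and the function name tmx_to_mlif are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical TMX 1.4 input; header values reuse the example in B.3.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="Heartsome TM Server" creationtoolversion="1.0.1"
          creationdate="20040731T164933Z" adminlang="en" srclang="*all*"
          segtype="block" datatype="xml" o-tmf="unknown"/>
  <body>
    <tu tuid="tu-1">
      <tuv xml:lang="en"><seg>The meal is nice.</seg></tuv>
      <tuv xml:lang="fr"><seg>Le repas est bon.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_to_mlif(tmx_text):
    """Map each TMX translation unit onto an MLIF-like multilingual component.

    Assumed mapping for this sketch: tu -> MultiC, tuv -> MonoC, seg -> SegC.
    Header metadata and inline markup inside seg are not handled.
    """
    tmx = ET.fromstring(tmx_text)
    root = ET.Element("MLDC")
    for tu in tmx.iter("tu"):
        multi = ET.SubElement(root, "MultiC")
        for tuv in tu.findall("tuv"):
            # Every tuv is expected to carry xml:lang, which becomes the
            # mandatory language attribute of the MonoC.
            mono = ET.SubElement(multi, "MonoC", {XML_LANG: tuv.get(XML_LANG)})
            seg = ET.SubElement(mono, "SegC")
            seg.text = tuv.findtext("seg")
    return root


mlif = tmx_to_mlif(TMX_SAMPLE)
```

A fuller converter would also carry over the header attributes and the tuid, following the attribute mappings listed above.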
B.3 Example of data
The following example, based on TMX version 1.4, focuses on the multilingual units of a TMX document and
does not translate all the details of the header.
<header
adminlang="en"
creationdate="20040731T164933Z"
creationtool="Heartsome TM Server"
creationtoolversion="1.0.1"
datatype="xml"
o-tmf="unknown"
segtype="block"
srclang="*all*"/>
Le processus de contrôle de
qualité en dix étapes qu'il a créé il y a plus
de 1300 ans est beaucoup plus complet et précis que ceux
existant aujourd'hui.
His 10-stage quality
control process initiated more than 1300 years
ago is far more thorough and exacting than any existing
today.
El proceso de control de
calidad en diez pasos que inició hace más de
1300 años es mucho más completo y preciso que los que
existen en la actualidad.
Il suo metodo di controllo di qualità in 10 fasi risale a più
di 1300 anni fa ed è molto più accurato e preciso di qualsiasi metodo attuale.
그가 1300여년 전 시작한 10단계 품질
관리 방법은 현존하는 것보다 훨씬 더 철저하고 정확하다.
The corresponding MLIF default representation is as follows:
TMX
1.4
20040731T164933Z
Heartsome TM Server
1.0.1
1091303313515
20020930T004233Z
Le processus de contrôle
de qualité en dix étapes qu'il a créé il y a
plus de 1300 ans est beaucoup plus complet et précis que
ceux existant aujourd'hui.
His 10-stage quality
control process initiated more than 1300
years ago is far more thorough and exacting than any
existing today.
B.4 Example of TMX and MLIF interaction
Figure B.1 illustrates the interaction between TMX and MLIF. This process involves subsequent steps of
extraction, translation and merging. The process begins with a TMX document containing linguistic content in
English (en) and German (de). The extraction process (1) generates a “Skeleton File” (2) containing all TM
formatting information, and an MLIF Document Linguistic Content (3) in which only relevant linguistic
information is stored. As most translators (human beings or automatic software modules) work with TMX
software-oriented tools, an XSL style-sheet makes it possible to transform an MLIF document into a TMX
document. This file does not contain any formatting information. Once the translator has added the
appropriate Japanese (ja) translation, another XSL style-sheet transforms the TMX document into an MLIF
document (4). Finally, the new MLIF document (containing the Japanese translation) is merged with the
“Skeleton File” to produce a new TMX formatted document (5).
Figure B.1 — TMX and MLIF interaction
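The extraction and merging steps can be pictured with a small Python sketch. It is a simplified model, not the XSL-based process described above: translation units are plain dictionaries, the field names "format" and "segments", the key scheme and the sample texts are all hypothetical, and the translation step is reduced to adding one entry.

```python
def extract(units):
    """Split units into a skeleton (formatting only) and linguistic content."""
    skeleton, content = [], {}
    for index, unit in enumerate(units):
        key = f"u{index}"
        # The skeleton keeps the formatting and a key pointing to the text.
        skeleton.append({"key": key, "format": unit["format"]})
        # The content keeps only the language-to-text mapping (copied).
        content[key] = dict(unit["segments"])
    return skeleton, content


def merge(skeleton, content):
    """Recombine the skeleton with the (possibly extended) content."""
    return [
        {"format": entry["format"], "segments": dict(content[entry["key"]])}
        for entry in skeleton
    ]


# One unit with English and German text, as in the scenario above.
units = [{"format": {"datatype": "rtf"}, "segments": {"en": "Hello", "de": "Hallo"}}]

skeleton, content = extract(units)
content["u0"]["ja"] = "こんにちは"  # the translator adds the Japanese text
merged = merge(skeleton, content)
```

Keeping the formatting in a separate skeleton means the translator only ever sees linguistic content, and the original formatting is restored unchanged when the two parts are merged.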
Annex C
(informative)
Example of XLIFF data representation
C.1 Introduction
The purpose of XLIFF is to define and promote the adoption of a specification for the interchange of
localizable software- and document-based objects and related metadata.
C.2 Ma
...
INTERNATIONAL ISO
STANDARD 24616
First edition
2012-09-01
Language resources management —
Multilingual information framework
Gestion des ressources langagières — Plateforme d'informations
multilingues
Reference number
ISO 24616:2012(E)
ISO 2012
---------------------- Page: 1 ----------------------
ISO 24616:2012(E)
COPYRIGHT PROTECTED DOCUMENT
© ISO 2012
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.ISO copyright office
Case postale 56 CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO 2012 – All rights reserved
---------------------- Page: 2 ----------------------
ISO 24616:2012(E)
Contents Page
Foreword ............................................................................................................................................................ iv
1 Scope ...................................................................................................................................................... 1
2 Normative references ............................................................................................................................ 1
3 Terms and definitions ........................................................................................................................... 1
4 Specification principles ........................................................................................................................ 2
4.1 Key standard used in the specification: Unified Modeling Language (UML) .................................. 2
4.2 Metamodel and adornment ................................................................................................................... 2
4.3 XML serialization ................................................................................................................................... 2
5 Metamodel specification ....................................................................................................................... 2
6 MLIF compliance ................................................................................................................................... 3
7 Metamodel adornment .......................................................................................................................... 3
7.1 Introduction ............................................................................................................................................ 3
7.2 General principles concerning the use of W3C generic attributes .................................................. 3
7.3 Recommended adornment for GI ........................................................................................................ 4
7.4 Recommended adornment for GroupC ............................................................................................... 4
7.5 Recommended adornment for MultiC
7.6 Recommended and mandatory adornment for MonoC
7.7 Recommended adornment for SegC
7.8 Recommended adornment for HistoC
7.9 Recommended online annotation adornment
7.10 Recommended adornment for localization
7.11 Recommended adornment for internationalization
7.12 Recommended adornment for temporal synchronization
8 Relation with other standards
Annex A (informative) Example using MLIF for Computer-Assisted Translation (CAT)
Annex B (informative) Example: representing TMX data
Annex C (informative) Example of XLIFF data representation
Annex D (informative) Example: representing smilText data
Annex E (informative) Example of MLIF usage for subtitles (captioning)
Annex F (informative) Using MLIF for MAF data
Annex G (normative) Detailed specification
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 24616 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
INTERNATIONAL STANDARD ISO 24616:2012(E)
Language resources management — Multilingual information
framework
1 Scope
This International Standard provides a generic platform for modelling and managing multilingual information in
various domains: localization, translation, multimedia annotation, document management, digital library
support, and information or business modelling applications. MLIF (multilingual information framework)
provides a metamodel and a set of generic data categories [ISO 12620:2009] for various application domains.
MLIF also provides strategies for the interoperability and/or linking of models including, but not limited to,
XLIFF, TMX, smilText and ITS.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 12620:2009, Terminology and other language and content resources — Specification of data categories
and management of a Data Category Registry for language resources
ISO 8879, Information processing — Text and office systems — Standard Generalized Markup Language (SGML)
Extensible Markup Language (XML) 1.0 (Fifth Edition), T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, F. Yergeau,
Editors, W3C Recommendation, 26 November 2008, http://www.w3.org/TR/xml
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply:
3.1
adornment
data category attached to a component of a metamodel
3.2
inline code
inline instructions inserted in a source document
Note to entry: Native code can, for instance, provide presentational instructions (e.g. HTML codes).
3.3
subtitle
textual versions of the dialog in films, television programs, video games, etc., usually displayed at the bottom
of the screen
3.4
working language
language in which linguistic sequences are expressed
4 Specification principles
4.1 Key standard used in the specification: Unified Modeling Language (UML)
The MLIF specification complies with the modelling principles of UML as defined by the Object Management
Group (OMG) [UML]. The specification uses the UML subset that is relevant for the purposes of MLIF.
4.2 Metamodel and adornment
In line with the Terminological Markup Framework (TMF) as defined in ISO 16642, MLIF defines a metamodel that
is adorned by data categories, as defined in ISO 12620.
4.3 XML serialization
Associated with the metamodel and its adornment, MLIF proposes a representation in XML called “XML
serialization”, in line with Extensible Markup Language (XML), the W3C Recommendation derived from SGML (ISO 8879).
5 Metamodel specification
The MLIF metamodel is specified in the UML object diagram in Figure 1.
Figure 1 — MLIF metamodel
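The containment relations between the seven core components described in this clause can be sketched in code. The following Python sketch is illustrative only and is not part of the specification: the class names follow the component names (MLDC abbreviating Multilingual Data Collection), while every field name, the placement of MultiC inside GroupC, and the `depth` and `languages` helpers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HistoC:
    """History Component: one recorded change (field names are hypothetical)."""
    action: str  # e.g. "creation", "modification", "validation"
    date: str


@dataclass
class SegC:
    """Segmentation Component: text that may be split again, recursively."""
    text: str
    children: List["SegC"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of segmentation levels, counting this segment as level 1."""
        return 1 + max((child.depth() for child in self.children), default=0)


@dataclass
class MonoC:
    """Monolingual Component: everything that belongs to one language."""
    lang: str
    segments: List[SegC] = field(default_factory=list)
    history: List[HistoC] = field(default_factory=list)


@dataclass
class MultiC:
    """Multilingual Component: all language variants of one textual content."""
    variants: List[MonoC] = field(default_factory=list)
    history: List[HistoC] = field(default_factory=list)

    def languages(self) -> List[str]:
        return [variant.lang for variant in self.variants]


@dataclass
class GroupC:
    """Grouping Component: multilingual units sharing an origin or purpose."""
    units: List[MultiC] = field(default_factory=list)


@dataclass
class GI:
    """Global Information: data applying to the whole collection."""
    history: List[HistoC] = field(default_factory=list)


@dataclass
class MLDC:
    """Multilingual Data Collection: global information plus grouped units."""
    gi: GI
    groups: List[GroupC] = field(default_factory=list)
```

HistoC appears only on GI, MultiC and MonoC in this sketch because those are the three anchoring points named in 7.8.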
The MLIF metamodel is defined by seven core components, listed here by the name of their XML serialization:
<MLDC> (Multilingual Data Collection), which represents a collection of data containing global information
and several multilingual units;
<GI> (Global Information), which represents technical and administrative information applying to the
entire multilingual data collection;
<GroupC> (Grouping Component), which represents a sub-collection of multilingual data that have a
common origin or purpose within a given project;
<MultiC> (Multilingual Component), which groups together all variants of a given textual content;
<MonoC> (Monolingual Component), which groups together information related to one language and is
part of a multilingual component (MultiC);
<HistoC> (History Component), which traces modifications to the component to which it is anchored (i.e.
versioning);
<SegC> (Segmentation Component), which allows any level of segmentation for textual information,
possibly in a recursive manner.
6 MLIF compliance
Any format compliant with this International Standard may use the MLIF metamodel in one of two ways:
by fully implementing the MLIF metamodel starting at the level of <MLDC>;
by embedding MLIF-compliant information within another model, implementing one of the
lower level MLIF elements, namely , or .
7 Metamodel adornment
7.1 Introduction
The MLIF XML serialization proposes a set of XML elements and XML attributes, which are described in the
following sections, where the characters “<” and “>” delimit the name of the element. Following the TEI
guidelines (http://www.tei-c.org), some attributes are specified by means of a class attribute, with the
convention that the name of the class attribute is prefixed by “att.” (e.g. “att.xlink”). The other XML attributes
are listed with the convention that two quotes delimit the name of the attribute (e.g. “xml:lang”). The
specifications in Annex G shall be applied.
7.2 General principles concerning the use of W3C generic attributes
The following W3C attributes are to be used by all MLIF-compliant applications:
the attribute xml:lang shall be used in accordance with W3C recommendations to represent the working
language of any relevant element and, in particular, shall be used systematically for any implementation
of MonoC;
the attribute xml:id shall be used in accordance with W3C recommendations to provide a unique identifier
to an element of the MLIF metamodel.
7.3 Recommended adornment for GI
7.4 Recommended adornment for GroupC
7.5 Recommended adornment for MultiC
7.6 Recommended and mandatory adornment for MonoC
att.lang
att.xlink
The language attribute is mandatory on MonoC. All other adornments are optional.
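The two rules stated so far (xml:lang is mandatory on MonoC, and xml:id values identify elements uniquely) lend themselves to an automated check. The sketch below is one possible way to do this with Python's standard `xml.etree.ElementTree` module. It assumes that MLIF elements are serialized without a namespace and under their component names (`MonoC`, `MultiC`, and so on); the function name `check_generic_attributes` and both sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang and xml:id under the reserved XML namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_LANG = "{%s}lang" % XML_NS
XML_ID = "{%s}id" % XML_NS


def check_generic_attributes(xml_text):
    """Return a list of problems found; an empty list means both checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    seen_ids = set()
    for element in root.iter():
        # Rule 1 (assumed element name): every MonoC carries a non-empty xml:lang.
        if element.tag == "MonoC" and not element.get(XML_LANG):
            problems.append("MonoC without xml:lang")
        # Rule 2: an xml:id value may occur only once in the document.
        identifier = element.get(XML_ID)
        if identifier is not None:
            if identifier in seen_ids:
                problems.append("duplicate xml:id %r" % identifier)
            seen_ids.add(identifier)
    return problems


# Hypothetical sample documents, written for this example only.
GOOD = (
    '<MLDC><MultiC xml:id="u1">'
    '<MonoC xml:lang="en"><SegC>The meal is nice.</SegC></MonoC>'
    '<MonoC xml:lang="fr"><SegC>Le repas est bon.</SegC></MonoC>'
    '</MultiC></MLDC>'
)
BAD = (
    '<MLDC><MultiC xml:id="u1"><MonoC><SegC>The meal is nice.</SegC></MonoC></MultiC>'
    '<MultiC xml:id="u1"/></MLDC>'
)
```

Calling `check_generic_attributes(GOOD)` returns an empty list, while the second sample is reported both for the MonoC that lacks a language and for the repeated identifier. The check does not validate the language tag itself.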
7.7 Recommended adornment for SegC
att.linguistic
att.xlink
7.8 Recommended adornment for HistoC
The HistoC component is a generic component that traces modifications made to the component to which it is
anchored (e.g. creation, modification and validation). In the MLIF metamodel, the HistoC component may be
anchored to the GI, MultiC or MonoC component. This makes it possible to record all evolutions of, or
enhancements to, that component.
HistoC may be adorned by four elements:
7.9 Recommended online annotation adornment
Multilingual text documents are often only one stage in a complex workflow that involves external document
sources in a wide variety of formats. From these, it is often necessary to keep inline markup indicating the
presentational features that have to be retained in a translated target document. To this end, MLIF-compliant
applications should use the following elements, in relation to the element, that map onto similar
subsets in TMX and XLIFF:
7.10 Recommended adornment for localization
All the following elements should be used to provide localization-related information:
7.11 Recommended adornment for internationalization
7.12 Recommended adornment for temporal synchronization
The following elements should be used when textual content has to be conveyed (in written or spoken form)
together with some constraints:
8 Relation with other standards
As with the “Terminological Markup Framework” TMF [ISO 16642] in terminology, MLIF introduces a
metamodel that combines with selected data categories as a way of ensuring interoperability between several
multilingual applications and corpora. MLIF deals with multilingual corpora, multilingual fragments, and the
translation relations between them. In each domain where MLIF is applicable, a specific granularity may be
considered for segmentation and description. These two last processes may rely on MAF [ISO 24611], SynAF
[ISO 24615] and TMF for morphological description, syntactical annotation and terminological description
respectively.
MLIF supports the construction and the interoperability of localization and translation memory resources,
and also deals with the description of a metamodel for multilingual content. MLIF does not propose a closed
list of description features. Rather, it provides a list of data categories that is much easier to update and
extend. This list represents a point of reference for multilingual information in the context of various application
scenarios.
However, MLIF not only describes elementary linguistic segments (e.g. sentence, syntactical fragment, word
and part of speech), but may also be used to represent document structure (e.g. title, abstract, paragraph and
section). In addition, MLIF allows for external and internal links (annotations and references).
MLIF is designed to provide a common framework that facilitates the interoperability with formats such as
TMX (LISA OSCAR) and XLIFF (OASIS). MLIF can be seen as a parent of these formats, since both of them
deal with multilingual data expressed in the form of segments or text units. Both can be stored, manipulated
and translated in a similar manner.
Examples of using MLIF are given in Annexes A to F.
Annex A
(informative)
Example using MLIF for Computer-Assisted Translation (CAT)
The main reason for recording lemma, part-of-speech and morphological features is to allow CAT tools based on
translation memory to produce translations of new words and sentences that are not in the translation
database.
For example, using a translation memory that contains the English sentence "The meal is nice." and its
French translation "Le repas est bon.", current CAT tools such as SDL TRADOS Translator's Workbench
are not able to provide the predicted translation for the sentence "The meals are nice." even though the word
lemmas of "The meal is nice." and "The meals are nice." match. This weakness is due to the fact that
these tools use limited linguistic criteria during the translation process.
The data produced by TRADOS Translator's Workbench is as follows:
<header
creationtool="TRADOS Translator's Workbench for Windows"
creationtoolversion="Edition 8 Build 863"
segtype="sentence"
o-tmf="TW4Win 2.0 Format"
adminlang="EN-US"
srclang="EN-GB"
datatype="rtf"
creationdate="20100528T144322Z"
creationid="USER"/>
The meal is nice.
Le repas est bon.
SDL TRADOS Translator's Workbench is an example of a suitable product available commercially. This information is
given for the convenience of users of this International Standard and does not constitute an endorsement by ISO of this
product.
To translate the sentence "The meals are nice.", an MLIF-compliant tool should implement the following
procedure:
Step-1 Represent the translation memory in MLIF and add linguistic properties to all the words within it.
Step-2 Run a part-of-speech tagger on the sentence in order to obtain the right morphosyntactic word
categories.
Step-3 Translate the lemmas using an English-to-French bilingual lexicon.
Step-4 Consult a French lexicon of inflected forms in order to retrieve the correct inflected form using the
lemma and morphological features.
Step-5 Generate the translation of "The meals are nice." by substituting each English word with its French
inflected form as follows:
"The meals are nice." => "Les repas sont bons."
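Steps 2 to 5 of the procedure above can be illustrated with a toy program. The sketch below is not taken from this International Standard: the three lookup tables stand in for a real part-of-speech tagger, a bilingual lexicon and a French lexicon of inflected forms, and agreement is handled naively by copying number and gender from the single noun in the sentence. All names (`EN_ANALYSIS`, `EN_FR_LEMMAS`, `FR_FORMS`, `translate`) are hypothetical.

```python
# Step 2 stand-in for a tagger: surface form -> (lemma, part of speech, number).
EN_ANALYSIS = {
    "the": ("the", "definiteArticle", None),
    "meal": ("meal", "commonNoun", "singular"),
    "meals": ("meal", "commonNoun", "plural"),
    "is": ("be", "verb", "singular"),
    "are": ("be", "verb", "plural"),
    "nice": ("nice", "qualifierAdjective", None),
}

# Step 3: English-to-French lexicon, keyed by lemma.
EN_FR_LEMMAS = {"the": "le", "meal": "repas", "be": "être", "nice": "bon"}

# Gender of French noun lemmas, needed for agreement.
FR_NOUN_GENDER = {"repas": "masculine"}

# Step 4: French inflected forms, keyed by (lemma, gender, number).
FR_FORMS = {
    ("le", "masculine", "singular"): "le",
    ("le", "masculine", "plural"): "les",
    ("repas", "masculine", "singular"): "repas",
    ("repas", "masculine", "plural"): "repas",
    ("être", None, "singular"): "est",
    ("être", None, "plural"): "sont",
    ("bon", "masculine", "singular"): "bon",
    ("bon", "masculine", "plural"): "bons",
}


def translate(sentence):
    """Translate a toy English sentence word by word through lemmas."""
    words = sentence.rstrip(".").split()
    analysed = [EN_ANALYSIS[word.lower()] for word in words]

    # Naive agreement: number and gender come from the only noun present.
    noun_lemma, _, number = next(a for a in analysed if a[1] == "commonNoun")
    gender = FR_NOUN_GENDER[EN_FR_LEMMAS[noun_lemma]]

    # Step 5: substitute each word with the matching French inflected form.
    forms = []
    for lemma, pos, _ in analysed:
        french_lemma = EN_FR_LEMMAS[lemma]
        key_gender = None if pos == "verb" else gender
        forms.append(FR_FORMS[(french_lemma, key_gender, number)])

    text = " ".join(forms)
    return text[0].upper() + text[1:] + "."
```

With these tables, `translate("The meals are nice.")` yields "Les repas sont bons." although only the singular sentence would be stored in the translation memory. A real tool would take the lemma and feature information from the MLIF data produced in Step 1 instead of hard-coded dictionaries.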
The XML data will include a feature structure declaration defining a tagset (e.g. for "nS"), with a word
segmentation and tagset defined in MAF:
SEMMAR
20090922T140653Z
The meal is nice.
Le repas est bon.
The
class="word"
lemma="meal"
pos="commonNoun"
tag="#nS">meal
class="word"
lemma="be"
pos="verb"
tag="#mP #p1 #nS">is
nice
.
class="word"
lemma="le"
pos="definiteArticle"
tag="#gM #nS">Le
class="word"
lemma="repas"
pos="commonNoun"
tag="#gM #nS">repas
class="word"
lemma="être"
pos="verb"
tag="#mP #p1 #nS">est
class="word"
lemma="bon"
pos="qualifierAdjective"
tag="#gM #nS">bon
.
Annex B
(informative)
Example: representing TMX data
B.1 Introduction
TMX (Translation Memory eXchange) is the vendor-neutral open XML standard for the exchange of
Translation Memory (TM) data created by computer-aided translation (CAT) and localization tools. The
purpose of TMX is to allow easier exchange of translation memory data between tools and/or translation
vendors with little or no loss of critical data during the process. TMX, which has been on the market since
1998, is a certifiable standard format. It was developed, and is maintained by, OSCAR (Open Standards for
Container/Content Allowing Re-use), a LISA Special Interest Group.
B.2 Mapping TMX to MLIF
TMX is nearly isomorphic to the MLIF metamodel. The core elements of the TMX macro-structure map to
MLIF as follows: maps onto the element;
is a container for the element and maps onto the element;
maps onto the element; maps onto the element;
maps onto the element;
of type term maps onto the element of type term.
Further TMX elements and attributes map onto MLIF elements as follows:
The "creationtool" attribute maps onto the element;
The "creationdate" attribute maps onto the element;
The "tuid" attribute maps onto the element within MultiC.
The element does not map onto any specific element as it represents a generic placeholder for
application-dependent data. When applicable, a specific element is explicitly mapped onto MLIF
elements or onto a standardized ISO/TC 37 data category as available from ISOCat.
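Because the two structures are nearly isomorphic, a structural conversion can be written as a simple recursive renaming of elements. The following Python sketch illustrates the idea under stated assumptions: the TMX element names (`tmx`, `header`, `body`, `tu`, `tuv`, `seg`) are those of the TMX format, but the pairing with MLIF component names in `ELEMENT_MAP` is an assumption made for illustration and is not the normative mapping. Attributes other than the language, and any inline markup inside segments, are ignored by this sketch; the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed pairing of TMX elements with MLIF components (illustration only).
ELEMENT_MAP = {
    "tmx": "MLDC",
    "header": "GI",
    "body": "GroupC",
    "tu": "MultiC",
    "tuv": "MonoC",
    "seg": "SegC",
}


def tmx_to_mlif(node):
    """Recursively copy a TMX element tree, renaming elements per ELEMENT_MAP."""
    target = ET.Element(ELEMENT_MAP.get(node.tag, node.tag))
    # Carry the language over as xml:lang (older TMX files use a plain "lang").
    language = node.get(XML_LANG) or node.get("lang")
    if language:
        target.set(XML_LANG, language)
    target.text = node.text
    for child in node:
        target.append(tmx_to_mlif(child))
    return target


# Hypothetical minimal TMX document used to exercise the sketch.
SAMPLE_TMX = (
    '<tmx version="1.4"><header creationtool="ExampleTool"/>'
    '<body><tu tuid="1">'
    '<tuv xml:lang="en"><seg>The meal is nice.</seg></tuv>'
    '<tuv xml:lang="fr"><seg>Le repas est bon.</seg></tuv>'
    '</tu></body></tmx>'
)

mlif_root = tmx_to_mlif(ET.fromstring(SAMPLE_TMX))
```

The header attributes and the "tuid" attribute would need dedicated handling, since the mapping above sends them to MLIF elements rather than copying them as attributes.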
B.3 Example of data
The following example, based on TMX version 1.4, focuses on the multilingual units of a TMX document and
does not translate all the details of the header.
<header
adminlang="en"
creationdate="20040731T164933Z"
creationtool="Heartsome TM Server"
creationtoolversion="1.0.1"
datatype="xml"
o-tmf="unknown"
segtype="block"
srclang="*all*"/>
Le processus de contrôle de
qualité en dix étapes qu'il a créé il y a plus
de 1300 ans est beaucoup plus complet et précis que ceux
existant aujourd'hui.
His 10-stage quality
control process initiated more than 1300 years
ago is far more thorough and exacting than any existing
today.
El proceso de control de
calidad en diez pasos que inició hace más de
1300 años es mucho más completo y preciso que los que
existen en la actualidad.
Il suo metodo di controllo di qualità in 10 fasi risale a più
di 1300 anni fa ed è molto più accurato e preciso di qualsiasi metodo attuale.
그가 1300여년 전 시작한 10단계 품질
관리 방법은 현존하는 것보다 훨씬 더 철저하고 정확하다.
The corresponding MLIF default representation is as follows:
TMX
1.4
20040731T164933Z
Heartsome TM Server
1.0.1
1091303313515
20020930T004233Z
Le processus de contrôle
de qualité en dix étapes qu'il a créé il y a
plus de 1300 ans est beaucoup plus complet et précis que
ceux existant aujourd'hui.
His 10-stage quality
control process initiated more than 1300
years ago is far more thorough and exacting than any
existing today.
B.4 Example of TMX and MLIF interaction
Figure B.1 illustrates the interaction between TMX and MLIF. This process involves subsequent steps of
extraction, translation and merging. The process begins with a TMX document containing linguistic content in
English (en) and German (de). The extraction process (1) generates a “Skeleton File” (2) containing all TM
formatting information, and an MLIF Document Linguistic Content (3) in which only relevant linguistic
information is stored. As most translators (human beings or automatic software modules) work with TMX
software-oriented tools, an XSL style-sheet makes it possible to transform an MLIF document into a TMX
document. This file does not contain any formatting information. Once the translator has added the
appropriate Japanese (ja) translation, another XSL style-sheet transforms the TMX document into an MLIF
document (4). Finally, the new MLIF document (containing the Japanese translation) is merged with the
“Skeleton File” to produce a new TMX formatted document (5).
Figure B.1 — TMX and MLIF interaction
Annex C
(informative)
Example of XLIFF data representation
C.1 Introduction
The purpose of XLIFF is to define and promote the adoption of a specification for the interchange of
localizable software- and document-based objects and related metadata.
C.2 Mapping XLIFF to MLIF
XLIFF differs from the MLIF metamodel in that it draws a clear distinction between source and target language
for monolingual information. This is handled through the appropriate use of the data
category in together with the language declarations ( and ) in
.The core elements of the XLIFF macro-structure map to MLIF as follows:
maps onto the element;
is a container for the element and maps onto the element;
the element maps onto the element; maps onto the element;
element to . The corresponding textual content is placed in a element;
maps onto the element and simultaneously sets the value of the
element to . The corresponding textual content is placed in a element;
maps onto the element and simultaneously sets the value of the
element to alternate.XLIFF further elements and attrib
...
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 24616 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.iv © ISO 2012 – All rights reserved
---------------------- Page: 6 ----------------------
SIST ISO 24616:2013
INTERNATIONAL STANDARD ISO 24616:2012(E)
Language resources management — Multilingual information
framework
1 Scope
This International Standard provides a generic platform for modelling and managing multilingual information in
various domains: localization, translation, multimedia annotation, document management, digital library
support, and information or business modelling applications. MLIF (multilingual information framework)
provides a metamodel and a set of generic data categories [ISO 12620:2009] for various application domains.
MLIF also provides strategies for the interoperability and/or linking of models including, but not limited to,
XLIFF, TMX, smilText and ITS.2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.ISO 12620:2009; Terminology and other language and content resources — Specification of data categories
and management of a Data Category Registry for language resourcesISO 8879, Information processing — Text and office systems —Generalized Markup Language (SGML)
Extensible Markup Language. Fifth Edition, T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, F. Yergeau
Editors, W3C Recommendation, 26 November 2008, http://www.w3.org/TR/xml3 Terms and definitions
For the purposes of this document, the following terms and definitions apply:
3.1
adornment
data category attached to a component of a metamodel
3.2
inline code
inline instructions inserted in a source document
Note to entry: Native code can, for instance, provide presentational instructions (e.g. HTML codes).
3.3subtitle
textual versions of the dialog in films, television programs, video games, etc., usually displayed at the bottom
of the screen3.4
working language
language in which linguistic sequences are expressed
© ISO 2012 – All rights reserved 1
---------------------- Page: 7 ----------------------
SIST ISO 24616:2013
ISO 24616:2012(E)
4 Specification principles
4.1 Key standard used in the specification: Unified Modeling Language (UML)
The MLIF specification complies with the modelling principles of UML as defined by the Object Management
Group (OMG) [UML]. The specification uses the UML subset that is relevant for the purposes of MLIF.
4.2 Metamodel and adornmentIn line with Terminological Markup Framework (TMF) as defined in ISO 16642, MLIF defines a metamodel that
is adorned by data categories, as defined in ISO 12620.4.3 XML serialization
Associated with the metamodel and its adornment, MLIF proposes a representation in XML called “XML
serialization”, in line with Extensible Markup Language (XML) as defined in ISO 8879.
5 Metamodel specificationThe MLIF metamodel is specified in the UML object diagram in Figure 1.
Figure 1 — MLIF metamodel
2 © ISO 2012 – All rights reserved
---------------------- Page: 8 ----------------------
SIST ISO 24616:2013
ISO 24616:2012(E)
The MLIF metamodel is defined by the following seven "core components". These components are listed as
follows, according to their XML serialization: (Multilingual Data Collection), which represents a collection of data containing global information
and several multilingual units; (Global Information), which represents technical and administrative information applying to the
entire multilingual data collection; (Grouping components), which represents a sub-collection of multilingual data that have a
common origin or purpose within a given project; (Multilingual Component), which groups together all variants of a given textual content;
(Monolingual Component), which groups together information related to one language and is
part of a multilingual component (MultiC); (History Component), which traces modifications to the component to which it is anchored (i.e.
versioning); (Segmentation Component), which allows any level of segmentation for textual information,
possibly in a recursive manner.6 MLIF compliance
Any format compliant with this International Standard may use the MLIF metamodel in two possible ways:
by fully implementing the MLIF metamodel starting at the level of ; by specifically embedding MLIF-compliant information within another model, by implementing one of the
lower level MLIF elements, namely , or .7 Metamodel adornment
7.1 Introduction
The MLIF XML serialization proposes a set of XML elements and XML attributes, which are described in the
following sections, where the characters “<” and “>” delimit the name of the element. Following the TEI
guidelines (http://www.tei-c.org), some attributes are specified by means of a class attribute, with the
convention that the name of the class attribute is prefixed by “att.” (e.g. “att.xlink”). The other XML attributes
are listed with the convention that two quotes delimit the name of the attribute (e.g. “xml:lang”). The
specifications in Annex G shall be applied.
7.2 General principles concerning the use of W3C generic attributes
The following W3C attributes are to be used by all MLIF-compliant applications:
- the attribute xml:lang shall be used in accordance with W3C recommendations to represent the working language of any relevant element and, in particular, shall be used systematically for any implementation of MonoC;
- the attribute xml:id shall be used in accordance with W3C recommendations to provide a unique identifier to an element of the MLIF metamodel.
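The two requirements above can be checked mechanically. The following Python sketch is illustrative only: the function name check_w3c_attributes, the sample document and the nesting of SegC inside MonoC are assumptions made for the example, not part of this International Standard.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang and xml:id under the reserved XML namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_LANG = f"{{{XML_NS}}}lang"
XML_ID = f"{{{XML_NS}}}id"


def check_w3c_attributes(root):
    """Return a list of problems: MonoC without xml:lang, repeated xml:id."""
    problems = []
    seen_ids = set()
    for elem in root.iter():
        # xml:lang is required on every MonoC element.
        if elem.tag == "MonoC" and not elem.get(XML_LANG):
            problems.append("MonoC without xml:lang")
        # xml:id values, where present, must be unique in the document.
        ident = elem.get(XML_ID)
        if ident is not None:
            if ident in seen_ids:
                problems.append(f"duplicate xml:id: {ident}")
            seen_ids.add(ident)
    return problems


# Hypothetical fragment with two deliberate mistakes in the second MonoC.
SAMPLE = """<MultiC xml:id="m1">
  <MonoC xml:id="m1-en" xml:lang="en"><SegC>The meal is nice.</SegC></MonoC>
  <MonoC xml:id="m1-en"><SegC>Le repas est bon.</SegC></MonoC>
</MultiC>"""

problems = check_w3c_attributes(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```

The check only covers presence and uniqueness; it does not validate that the xml:lang value is a well-formed language tag.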
7.3 Recommended adornment for GI
7.4 Recommended adornment for GroupC
7.5 Recommended adornment for MultiC
7.6 Recommended and mandatory adornment for MonoC
att.lang
att.xlink
The language attribute is mandatory on MonoC. All other adornments are optional.
7.7 Recommended adornment for SegC
att.linguistic
att.xlink
7.8 Recommended adornment for HistoC
The HistoC component is a generic component that traces modifications made on the component to which it is
anchored (e.g. creation, modification and validation). In the MLIF metamodel, the HistoC component may be
anchored to the GI, MultiC or MonoC component. This makes it possible for all evolutions of, or
enhancements to, the component to be recorded.
HistoC may be adorned by four elements:
7.9 Recommended online annotation adornment
Multilingual text documents are often only one stage in a complex workflow that involves external document
sources in a wide variety of formats. From these, it is often necessary to keep inline markup indicating the
presentational features that have to be retained in a translated target document. To this end, MLIF-compliant
applications should use the following elements, in relation to the element, that map onto similar
subsets in TMX and XLIFF:
7.10 Recommended adornment for localization
All the following elements should be used to provide localization-related information:
7.11 Recommended adornment for internationalization
7.12 Recommended adornment for temporal synchronization
The following elements should be used when textual content has to be conveyed (in written or spoken form)
together with some constraints:
8 Relation with other standards
As with the “Terminological Markup Framework” TMF [ISO 16642] in terminology, MLIF introduces a
metamodel that combines with selected data categories as a way of ensuring interoperability between several
multilingual applications and corpora. MLIF deals with multilingual corpora, multilingual fragments, and the
translation relations between them. In each domain where MLIF is applicable, a specific granularity may be
considered for segmentation and description. These last two processes may rely on MAF [ISO 24611], SynAF
[ISO 24615] and TMF for morphological description, syntactical annotation and terminological description
respectively.
MLIF supports the construction and the interoperability of localization and translation memory resources,
and also deals with the description of a metamodel for multilingual content. MLIF does not propose a closed
list of description features. Rather, it provides a list of data categories that is much easier to update and
extend. This list represents a point of reference for multilingual information in the context of various application
scenarios.
However, MLIF not only describes elementary linguistic segments (e.g. sentence, syntactical fragment, word
and part of speech), but may also be used to represent document structure (e.g. title, abstract, paragraph and
section). In addition, MLIF allows for external and internal links (annotations and references).
MLIF is designed to provide a common framework that facilitates the interoperability with formats such as
TMX (LISA OSCAR) and XLIFF (OASIS). MLIF can be seen as a parent of these formats, since both of them
deal with multilingual data expressed in the form of segments or text units. Both can be stored, manipulated
and translated in a similar manner.
Examples of using MLIF are given in Annexes A to F.
Annex A
(informative)
Example using MLIF for Computer-Assisted Translation (CAT)
The main reason for recording lemma, part-of-speech and morphological features is to allow CAT tools based on
translation memory to produce translations of new words and sentences that are not in the translation
database.
For example, using a translation memory that contains the English sentence "The meal is nice." and its
translation in French "Le repas est bon.", current CAT tools such as SDL TRADOS Translator's Workbench
are not able to provide the predicted translation for the sentence "The meals are nice." even though the word
lemmas of "The meal is nice." and "The meals are nice." match. This weakness is due to the fact that
these tools use limited linguistic criteria during the translation process.
The data produced by TRADOS Translator's Workbench is as follows:
<header
  creationtool="TRADOS Translator's Workbench for Windows"
  creationtoolversion="Edition 8 Build 863"
  segtype="sentence"
  o-tmf="TW4Win 2.0 Format"
  adminlang="EN-US"
  srclang="EN-GB"
  datatype="rtf"
  creationdate="20100528T144322Z"
  creationid="USER"/>
The meal is nice.
Le repas est bon.
SDL TRADOS Translator's Workbench is an example of a suitable product available commercially. This information is
given for the convenience of users of this International Standard and does not constitute an endorsement by ISO of this
product.
To translate the sentence "The meals are nice.", an MLIF-compliant tool should implement the following
procedure:
Step-1 Represent in MLIF and add linguistic properties to all the words within the translation memory.
Step-2 Run a part-of-speech tagger on the sentence in order to obtain the right morphosyntactic word
categories.
Step-3 Translate the lemmas using an English-to-French bilingual lexicon.
Step-4 Consult a French lexicon of inflected forms in order to retrieve the correct inflected form using the
lemma and morphological features.
Step-5 Generate the translation of "The meals are nice." by substituting each English word with its French
inflected form as follows:
"The meals are nice." => "Les repas sont bons."
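Steps 2 to 5 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the procedure of any real CAT tool: the three dictionaries stand in for a part-of-speech tagger, a bilingual lexicon and a French lexicon of inflected forms, all names (EN_ANALYSIS, EN_FR_LEMMAS, FR_FORMS, translate) are hypothetical, and masculine gender is hard-coded because "repas" is the only noun covered.

```python
# Step-2 stand-in: surface form -> (lemma, part of speech, number).
# A real tool would run a part-of-speech tagger instead of a lookup table.
EN_ANALYSIS = {
    "the": ("the", "definiteArticle", None),
    "meal": ("meal", "commonNoun", "singular"),
    "meals": ("meal", "commonNoun", "plural"),
    "is": ("be", "verb", "singular"),
    "are": ("be", "verb", "plural"),
    "nice": ("nice", "qualifierAdjective", None),
}

# Step-3 stand-in: English lemma -> French lemma.
EN_FR_LEMMAS = {"the": "le", "meal": "repas", "be": "être", "nice": "bon"}

# Step-4 stand-in: (French lemma, number) -> inflected form (masculine only).
FR_FORMS = {
    ("le", "singular"): "le", ("le", "plural"): "les",
    ("repas", "singular"): "repas", ("repas", "plural"): "repas",
    ("être", "singular"): "est", ("être", "plural"): "sont",
    ("bon", "singular"): "bon", ("bon", "plural"): "bons",
}


def translate(sentence):
    """Translate a sentence word by word using lemmas and number agreement."""
    words = sentence.rstrip(".").split()
    analyses = [EN_ANALYSIS[word.lower()] for word in words]
    # English articles and adjectives carry no number, so they agree
    # with the number of the noun in the French output.
    noun_number = next(n for _, pos, n in analyses if pos == "commonNoun")
    forms = [
        FR_FORMS[(EN_FR_LEMMAS[lemma], number or noun_number)]
        for lemma, _, number in analyses
    ]
    # Step-5: assemble the inflected forms and restore sentence casing.
    text = " ".join(forms)
    return text[0].upper() + text[1:] + "."


print(translate("The meals are nice."))
```

Because the lookup is keyed on lemma and number rather than on the surface sentence, the plural sentence is translated even though only the singular sentence is stored in the translation memory.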
The XML data will include a feature structure declaration defining a tagset (e.g. for "nS"), with a word
segmentation and tagset defined in MAF:
SEMMAR
20090922T140653Z
The meal is nice.
Le repas est bon.
The
class="word"
lemma="meal"
pos="commonNoun"
tag="#nS">meal
class="word"
lemma="be"
pos="verb"
tag="#mP #p1 #nS">is
nice
.
class="word"
lemma="le"
pos="definiteArticle"
tag="#gM #nS">Le
class="word"
lemma="repas"
pos="commonNoun"
tag="#gM #nS">repas
class="word"
lemma="être"
pos="verb"
tag="#mP #p1 #nS">est
class="word"
lemma="bon"
pos="qualifierAdjective"
tag="#gM #nS">bon
.
Annex B
(informative)
Example: representing TMX data
B.1 Introduction
TMX (Translation Memory eXchange) is the vendor-neutral open XML standard for the exchange of
Translation Memory (TM) data created by computer-aided translation (CAT) and localization tools. The
purpose of TMX is to allow easier exchange of translation memory data between tools and/or translation
vendors with little or no loss of critical data during the process. TMX, which has been on the market since
1998, is a certifiable standard format. It was developed, and is maintained by, OSCAR (Open Standards for
Container/Content Allowing Re-use), a LISA Special Interest Group.
B.2 Mapping TMX to MLIF
TMX is nearly isomorphic to the MLIF metamodel. The core elements of the TMX macro-structure map to
MLIF as follows: maps onto the element;
is a container for the element and maps onto the element;
maps onto the element; maps onto the element;
maps onto the element;
of type term maps onto the element of type term.
Further TMX elements and attributes map onto MLIF elements as follows:
The "creationtool" attribute maps onto the element;
The "creationdate" attribute maps onto the element;
The "tuid" attribute maps onto the element within MultiC.
The element does not map onto any specific element as it represents a generic placeholder for
application-dependent data. When applicable, a specific element is explicitly mapped onto MLIF
elements or onto a standardized ISO/TC 37 data category as available from ISOCat.
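A structural conversion of this kind can be sketched with the Python standard library. The element pairing in ELEMENT_MAP (tmx to MLDC, header to GI, body to GroupC, tu to MultiC, tuv to MonoC, seg to SegC) is an assumption made for illustration and is not quoted from this annex; the sample TMX is also hypothetical and omits most header attributes. Header details, the "tuid" attribute and inline markup are not carried over.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed pairing of TMX elements with MLIF core components.
ELEMENT_MAP = {
    "tmx": "MLDC",
    "header": "GI",
    "body": "GroupC",
    "tu": "MultiC",
    "tuv": "MonoC",
    "seg": "SegC",
}


def convert(node):
    """Recursively rebuild a TMX element as its assumed MLIF counterpart."""
    out = ET.Element(ELEMENT_MAP[node.tag])
    lang = node.get(XML_LANG)
    if lang is not None:
        out.set(XML_LANG, lang)  # keep xml:lang, mandatory on MonoC
    if node.tag == "seg":
        # Flatten the segment to plain text; inline markup is dropped.
        out.text = "".join(node.itertext())
    else:
        for child in node:
            if child.tag in ELEMENT_MAP:  # unmapped elements are skipped
                out.append(convert(child))
    return out


# Hypothetical, trimmed TMX document used only to exercise the sketch.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="ExampleTool" srclang="en"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>The meal is nice.</seg></tuv>
      <tuv xml:lang="fr"><seg>Le repas est bon.</seg></tuv>
    </tu>
  </body>
</tmx>"""

mlif = convert(ET.fromstring(TMX_SAMPLE))
print(ET.tostring(mlif, encoding="unicode"))
```

A production converter would more likely be written as an XSL style-sheet, as described in B.4; the recursive function above only shows that a near-isomorphic mapping reduces to renaming elements while preserving the tree.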
B.3 Example of data
The following example, based on TMX version 1.4, focuses on the multilingual units of a TMX document and
does not translate all the details of the header.
<header
  adminlang="en"
  creationdate="20040731T164933Z"
  creationtool="Heartsome TM Server"
  creationtoolversion="1.0.1"
  datatype="xml"
  o-tmf="unknown"
  segtype="block"
  srclang="*all*"/>
Le processus de contrôle de
qualité en dix étapes qu'il a créé il y a plus
de 1300 ans est beaucoup plus complet et précis que ceux
existant aujourd'hui.
His 10-stage quality
control process initiated more than 1300 years
ago is far more thorough and exacting than any existing
today.
El proceso de control de
calidad en diez pasos que inició hace más de
1300 años es mucho más completo y preciso que los que
existen en la actualidad.
Il suo metodo di controllo di qualità in 10 fasi risale a più
di 1300 anni fa ed è molto più accurato e preciso di qualsiasi metodo attuale.
그가 1300여년 전 시작한 10단계 품질
관리 방법은 현존하는 것보다 훨씬 더 철저하고 정확하다.
The corresponding MLIF default representation is as follows:
TMX
1.4
20040731T164933Z
Heartsome TM Server
1.0.1
1091303313515
20020930T004233Z
Le processus de contrôle
de qualité en dix étapes qu'il a créé il y a
plus de 1300 ans est beaucoup plus complet et précis que
ceux existant aujourd'hui.
His 10-stage quality
control process initiated more than 1300
years ago is far more thorough and exacting than any
existing today.
B.4 Example of TMX and MLIF interaction
Figure B.1 illustrates the interaction between TMX and MLIF. This process involves successive steps of
extraction, translation and merging. The process begins with a TMX document containing linguistic content in
English (en) and German (de). The extraction process (1) generates a “Skeleton File” (2) containing all TM
formatting information, and an MLIF Document Linguistic Content (3) in which only relevant linguistic
information is stored. As most translators (human beings or automatic software modules) work with TMX
software-oriented tools, an XSL style-sheet makes it possible to transform an MLIF document into a TMX
document. This file does not contain any formatting information. Once the translator has added the
appropriate Japanese (ja) translation, another XSL style-sheet transforms the TMX document into an MLIF
document (4). Finally, the new MLIF document (containing the Japanese translation) is merged with the
“Skeleton File” to produce a new TMX formatted document (5).
Figure B.1 — TMX and MLIF interaction
Annex C
(informative)
Example of XLIFF data representation
C.1 Introduction
The purpose of XLIFF is to define and promote the adoption of a specification for the interchange of
localizable software- and document-based objects and related metadata.
C.2 Mapping XLIFF to MLIF
XLIFF differs from the MLIF metamodel in that it draws a clear distinction between source and target language
for monolingual inform...