Language resource management -- Feature structures -- Part 2: Feature system declaration

This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural language applications, for both annotation and production of linguistic data. It is ultimately designed to provide a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature specifications and operations on feature structures, thus offering means to check the conformance of each feature structure with regards to a reference specification. Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced by and for language engineering applications. A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that makes use of fs (that is, feature structure) elements. The FSD serves four purposes. - It provides an encoding by which types and their subtyping and inheritance relationships can be introduced and defined, thus laying the basis for constructing a feature system. - It provides a mechanism by which the encoder can list all of the feature names and feature values and give a prose description as to what each represents. - It provides a mechanism by which type constraints can be declared, against which typed feature structures are validated relative to a given theory stated in typed feature logic. These constraints may involve constraints on the range of a feature's value, constraints on which features are permitted within certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value pairs. The source of these constraints is normally the empirical domain being modelled. - It provides a mechanism by which the encoder can define the intended interpretation of underspecified feature structures. This involves defining default values (whether literal or computed) for missing features. The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure representations of ISO 24610-1 specify data structures that are subject to the typing conventions and constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used within some of the elements defined in ISO 24610-2.

Gestion des ressources langagières -- Structures de traits -- Partie 2: Déclaration de système de structures de traits

Upravljanje z jezikovnimi viri - Strukture lastnosti - 2. del: Deklaracija sistema lastnosti

Ta del standarda ISO 24610 zagotavlja format za predstavitev, shranjevanje in izmenjavo struktur lastnosti v aplikacijah za naravni jezik za označevanje in oblikovanje jezikovnih podatkov. Njegov glavni namen je zagotovitev računalniškega formata za določanje hierarhije tipov in opis omejitev, ki veljajo za sklop specifikacij lastnosti in operacij na strukturah lastnosti, in tako nudi način za preverjanje skladnosti vsake strukture lastnosti glede na referenčno specifikacijo. Strukture lastnosti so ključni sestavni del številnih jezikovnih formalizmov in osnovni mehanizmi za predstavitev informacij, ki jih predelajo ali oblikujejo aplikacije za jezikovno inženirstvo, ali ki so oblikovane za te aplikacije. Deklaracija sistema lastnosti (FSD) je pomožna datoteka, ki se uporablja z določeno vrsto besedila, in uporablja elemente struktur lastnosti. FSD ima štiri namene. – Zagotavlja kodiranje, prek katerega se lahko uvedejo in definirajo tipi in njihovi podtipi ter odnosi dedovanja, s čimer se postavijo temelji za gradnjo sistema lastnosti. – Zagotavlja mehanizem, s katerim lahko kodirnik oblikuje seznam vseh imen lastnosti in vrednosti lastnosti ter besedilu doda opis o tem, kaj predstavlja. – Zagotavlja mehanizem, s katerim se lahko navedejo omejitve tipa in s čimer se potrdijo tipizirane strukture lastnosti glede na dano teorijo, navedeno v logiki tipiziranih lastnosti. Te omejitve lahko vključujejo omejitve glede razpona vrednosti lastnosti, omejitve glede tega, katere lastnosti so dovoljene v določenih vrstah struktur lastnosti, ali omejitve, ki onemogočajo sopojavitev določenih parov lastnost-vrednost. Vir teh omejitev je običajno empirično modeliranje. – Zagotavlja mehanizem, s katerim lahko kodirnik določi želeno interpretacijo premalo specificiranih struktur lastnosti. To vključuje določanje privzetih vrednosti (dobesednih ali izračunanih) za manjkajoče vrednosti. Shema, ki jo opisuje ta del standarda ISO 24610, se lahko uporablja za dokumentiranje katerega koli sistema lastnosti, a je v prvi vrsti namenjena uporabi pri predstavitvi tipiziranih struktur lastnosti iz standarda ISO 24610-1. Predstavitve struktur lastnosti iz standarda ISO 24610-1 določajo strukture podatkov, za katere veljajo tipološke norme in omejitve, določene s pomočjo standarda ISO 24610-2. Predstavitve struktur lastnosti iz standarda ISO 24610-1 se uporabljajo tudi pri nekaterih elementih, določenih v standardu ISO 24610-2.

General Information

Status
Published
Publication Date
06-Jun-2013
Current Stage
6060 - National Implementation/Publication (Adopted Project)
Start Date
30-May-2013
Due Date
04-Aug-2013
Completion Date
07-Jun-2013

Buy Standard

Standard
SIST ISO 24610-2:2013
English language
55 pages
sale 10% off
Preview
sale 10% off
Preview

e-Library read for
1 day
Standard
ISO 24610-2:2011 - Language resource management -- Feature structures
English language
50 pages
sale 15% off
Preview
sale 15% off
Preview
Standard
SIST ISO 24610-2:2013
English language
55 pages
sale 10% off
Preview
sale 10% off
Preview

e-Library read for
1 day

Standards Content (sample)

SLOVENSKI STANDARD
SIST ISO 24610-2:2013
01-julij-2013

Upravljanje z jezikovnimi viri - Strukture lastnosti - 2. del: Deklaracija sistema

lastnosti
Language resource management -- Feature structures -- Part 2: Feature system
declaration

Gestion des ressources langagières -- Structures de traits -- Partie 2: Déclaration de

système de structures de traits
Ta slovenski standard je istoveten z: ISO 24610-2:2011
ICS:
01.020 Terminologija (načela in Terminology (principles and
koordinacija) coordination)
01.140.20 Informacijske vede Information sciences
35.240.30 Uporabniške rešitve IT v IT applications in information,
informatiki, dokumentiranju in documentation and
založništvu publishing
SIST ISO 24610-2:2013 en,fr,de

2003-01.Slovenski inštitut za standardizacijo. Razmnoževanje celote ali delov tega standarda ni dovoljeno.

---------------------- Page: 1 ----------------------
SIST ISO 24610-2:2013
---------------------- Page: 2 ----------------------
SIST ISO 24610-2:2013
INTERNATIONAL ISO
STANDARD 24610-2
First edition
2011-10-01
Language resource management —
Feature structures —
Part 2:
Feature system declaration
Gestion des ressources langagières — Structures de traits —
Partie 2: Déclaration de système de structures de traits
Reference number
ISO 24610-2:2011(E)
ISO 2011
---------------------- Page: 3 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
COPYRIGHT PROTECTED DOCUMENT
© ISO 2011

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,

electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or

ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO 2011 – All rights reserved
---------------------- Page: 4 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
Contents Page

Foreword ............................................................................................................................................................ iv

Introduction ......................................................................................................................................................... v

1  Scope ...................................................................................................................................................... 1

2  Normative references ............................................................................................................................ 1

3  Terms and definitions ........................................................................................................................... 2

4  Overall structure .................................................................................................................................... 5

5  Basic concepts ...................................................................................................................................... 6

5.1  Typed feature structures reviewed ...................................................................................................... 6

5.2  Types ...................................................................................................................................................... 7

5.3  Type inheritance hierarchies ................................................................................................................ 9

5.4  Type constraints .................................................................................................................................. 11

5.5  Optional (default) values and underspecification ............................................................................ 12

5.6  Subsumption ........................................................................................................................................ 12

6  Defining well-formedness versus validity......................................................................................... 14

6.1  Overview ............................................................................................................................................... 14

6.2  ISO 24610 ............................................................................................................................................. 14

7  A feature system for a grammar ........................................................................................................ 19

7.1  Overview ............................................................................................................................................... 19

7.2  Sample FSDs ........................................................................................................................................ 20

8  Declaration of a feature system ......................................................................................................... 23

8.1  Overview ............................................................................................................................................... 24

8.2  Linking a text to feature system declarations .................................................................................. 24

8.3  Overall structure of a feature system declaration ........................................................................... 25

8.4  Feature declarations ........................................................................................................................... 27

8.5  Feature structure constraints ............................................................................................................ 33

Annex A (normative) XML schema for feature structures ............................................................................ 36

Annex B (informative) A complete example ................................................................................................... 46

Bibliography ...................................................................................................................................................... 50

© ISO 2011 – All rights reserved iii
---------------------- Page: 5 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies

(ISO member bodies). The work of preparing International Standards is normally carried out through ISO

technical committees. Each member body interested in a subject for which a technical committee has been

established has the right to be represented on that committee. International organizations, governmental and

non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the

International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards

adopted by the technical committees are circulated to the member bodies for voting. Publication as an

International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent

rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content

resources, Subcommittee SC 4, Language resource management.

ISO 24610 consists of the following parts, under the general title Language resource management — Feature

structures:
 Part 1: Feature structure representation
 Part 2: Feature system declaration
iv © ISO 2011 – All rights reserved
---------------------- Page: 6 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
Introduction
ISO 24610 is organized in two separate main parts.

 Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an

informal and yet explicit outline of their characteristics, as well as an XML-based structured way of

representing feature structures in general and typed feature structures in particular. It is designed to lay a

basis for constructing an XML-based reference format for exchanging (typed) feature structures between

applications.

 Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature

structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set

of features and their respective admissible feature values and finally by introducing a set of validity

conditions on feature structures for particular applications, especially related to the goal of language

resource management.

A feature structure is a general-purpose data structure that identifies and groups together individual features

by assigning a particular value to each. Because of the generality of feature structures, they can be used to

represent many different kinds of information. Interrelations among various pieces of information and their

instantiation in markup provide a meta-language for representing linguistic content. Moreover, this

instantiation allows a specification of a set of features and values associated with specific types and their

restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part

of ISO 24610.

Some of the statements here are copied from ISO 24610-1:2006 in order to make this part standalone without

referring to part 1.
© ISO 2011 – All rights reserved v
---------------------- Page: 7 ----------------------
SIST ISO 24610-2:2013
---------------------- Page: 8 ----------------------
SIST ISO 24610-2:2013
INTERNATIONAL STANDARD ISO 24610-2:2011(E)
Language resource management — Feature structures —
Part 2:
Feature system declaration
1 Scope

This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural

language applications, for both annotation and production of linguistic data. It is ultimately designed to provide

a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature

specifications and operations on feature structures, thus offering means to check the conformance of each

feature structure with regards to a reference specification. Feature structures are an essential part of many

linguistic formalisms as well as an underlying mechanism for representing the information consumed or

produced by and for language engineering applications.

A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that

makes use of fs (that is, feature structure) elements. The FSD serves four purposes.

 It provides an encoding by which types and their subtyping and inheritance relationships can be

introduced and defined, thus laying the basis for constructing a feature system.

 It provides a mechanism by which the encoder can list all of the feature names and feature values and

give a prose description as to what each represents.

 It provides a mechanism by which type constraints can be declared, against which typed feature

structures are validated relative to a given theory stated in typed feature logic. These constraints may

involve constraints on the range of a feature's value, constraints on which features are permitted within

certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value

pairs. The source of these constraints is normally the empirical domain being modelled.

 It provides a mechanism by which the encoder can define the intended interpretation of underspecified

feature structures. This involves defining default values (whether literal or computed) for missing features.

The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily

intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure

representations of ISO 24610-1 specify data structures that are subject to the typing conventions and

constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used

within some of the elements defined in ISO 24610-2.
2 Normative references

The following referenced documents are indispensable for the application of this document. For dated

references, only the edition cited applies. For undated references, the latest edition of the referenced

document (including any amendments) applies.

ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure

representation

ISO/IEC 19757-2, Information technology — Document Schema Definition Language (DSDL) — Part 2:

Regular-grammar-based validation — RELAX NG
© ISO 2011 – All rights reserved 1
---------------------- Page: 9 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 19757-2 and the following apply.

3.1
admissibility constraint
feature admissibility constraint

specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a

specific type (3.24)
3.2
admissible feature
appropriate feature

feature which any feature structure (3.14) of a given type (3.24) may bear a value (3.17) for

NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear

a value for every admissible feature. This term does not imply that the feature is obligatory here.

3.3
admissible feature value
admissible value
value restriction
range restriction

value (3.17) that the value of an admissible feature (3.2) must be subsumed by in feature structures (3.14)

of a given type (3.24)
3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited
3.5
bag
multiset

triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S

NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can

occur more than once).
3.6
built-in

non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature

value (3.17)

NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex

built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).

3.7
collection

feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)

3.8
constraint

unit of specification that identifies some collection of feature structures (3.14) as invalid

NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility

constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.

NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered

to be valid.
2 © ISO 2011 – All rights reserved
---------------------- Page: 10 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature structure may not bear a feature without a corresponding value.
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.
3.12
feature

property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding

value (3.17)
3.13
feature specification

pairing of a feature (3.12) with a value (3.17) in a feature structure description

3.14
feature structure

record structure that associates one value (3.17) to each of a collection of features

NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.

NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature

structures.
3.15
feature system

type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility

constraints (3.1) and implicational constraints (3.18)
NOTE cf. type declaration (3.25)
3.16
feature system declaration
FSD
specification of a particular feature system (3.15)
3.17
feature value
value

entity or aggregation of entities that characterize some property or aspect of another entity

© ISO 2011 – All rights reserved 3
---------------------- Page: 11 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3.18
implicational constraint

constraint of the form, “if G, then H,” where G and H are feature structures (3.14)

NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid

extension in common. See subsumption (3.21) and 8.5. Often used to refer to implicational constraints that are not also

admissibility constraints.
3.19
interpretation

minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is

consistent with a set of constraints declared by an FSD (3.16)
3.20
partial order
partially ordered set

set S equipped with a relation  over S  S that is (1) reflexive (for all s  S, s  s), (2) anti-symmetric (for all p,

q  S, if p  q and q  p, then p  q), and (3) transitive (for all p, q, r  S, if p  q and q  r, then p  r)

NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q  Z, either p  q or

q  p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species,

for example, is a partial order that does not. Type hierarchies may not necessarily. The typed feature structures of a

feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every

y type is constrained to have exactly one appropriate feature.
3.21
subsumption

property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F

carries all of the information with it that G does
NOTE A formal definition is provided in 5.6.
3.22
subtype

type (3.24) to which another type confers its constraints and appropriate features

3.23
supertype
base type

type (3.24) from which another type inherits constraints and appropriate features

NOTE s is a subtype of t iff t is a supertype of s. Every type is a subtype and supertype of itself.

3.24
semantic type
type

referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and

conceptually significant class

NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature

structures or their specifications syntactically.
3.25
type declaration

structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3),

admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)

NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration,

in addition to those that it has inherited from its supertypes.
4 © ISO 2011 – All rights reserved
---------------------- Page: 12 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3.26
type hierarchy
partial order (3.20) over a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing

assignment of a semantic type (3.24) to a built-in (3.6) or feature structure (3.14), either atomic or complex

NOTE Semantic types in feature systems are partially ordered, with multiple inheritance.

3.29
underspecification
provision of partial information about a value (3.17)

NOTE An underspecification generally subsumes one of a range of candidate values that could be resolved to a

single value through subsequent constraint resolution. See subsumption (3.21).
3.30
well-formedness
syntactic conformity of a feature structure (3.14) representation to ISO 24610-1
3.31
validity

conformity of a typed feature structure (3.27) to the constraints (3.8) of a particular feature system (3.15)

NOTE See Clause 6.
4 Overall structure
The main part of the document consists of four clauses: Clauses 5, 6, 7 and 8.

 Clause 5, Basic concepts, reviews the definition of typed feature structures and the notions of atomic and

complex types, collections and other operators that may appear in feature values. It then describes the

notions of type inheritance hierarchies, type constraints, default values and underspecification that are

essential to the construction of feature systems.

 Clause 6, Defining well-formedness versus validity, discusses the conditions of well-formedness and

validity.

 Clause 7, A feature system for a grammar, illustrates how to define types with a type hierarchy and type

constraints which declare what features and values are admissible for specific types.

 Finally, Clause 8, Declaration of a feature system, discusses how a feature system can be declared and

developed into a validator.

The main part of the document is followed by two annexes: Annex A contains the XML schema for this part of

ISO 24610; Annex B contains a complete example.
© ISO 2011 – All rights reserved 5
---------------------- Page: 13 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
5 Basic concepts
5.1 Typed feature structures reviewed

Typed feature structures (TFSs) are introduced as basic records for language resource management.

For more information, refer to ISO 24610-1:2006, 4.7, Typed feature structure, and Annex C, Type inheritance

hierarchies.

Here, a TFS is formally defined as a tuple over a finite set Feat of features, a collection X of

non-feature-structure elements, and a type hierarchy Type, , where Type is a finite set of types and  is a

subtyping relation over Type.
A feature structure is a tuple , in which
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function,

such that, for all q ∈ Q, there exists a path of features F , ..., F such that δ[F , ... δ(F , γ) ... ]  q.

1 n n 1

elements denote nodes. This definition deviates from the standard one used in linguistics and theoretical

computer science in that (1) typing is partial, not total, i.e. not all feature structures have types, and (2) feature

values might not be feature structures, but instead be drawn from a collection denoted by other XML elements

such as string, numeric, symbol, and binary (the X above). Note that nodes are typed, but features themselves

are not.

The following XML representation of a feature structure is considered well-formed, where the attribute type is

assigned to each of the two elements.
EXAMPLE Typed feature structure:


had












The feature name ORTH above stands for orthography, the conventional written form of a word or phrase.

This XML representation shows how the morpho-syntactic features of an English word “had” are specified as

a past-tensed and non-auxiliary verb.

In the alternative, “matrix” or “AVM” notation, type names are conventionally in the lower-case, sometimes

italicized or in the text type font, feature names in the upper-case, and strings in quotes. Binary values are

6 © ISO 2011 – All rights reserved
---------------------- Page: 14 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)

indicated with  or . These conventions are followed in this document, too. The above feature structure would

be depicted in matrix notation as shown in Figure 1.
Figure 1 — Matrix notation
5.2 Types
5.2.1 Atomic types

Alongside the built-ins (, , and ), it is possible for a feature structure to

have a type but no features. These are called simple or atomic feature structures, and types that allow for no

features in their feature system declaration (FSD) are called atomic types.

There is, as a result, always the possibility of declaring new atomic types and using these instead of the

above-mentioned built-ins to specify simple values. The above feature structure, for example, could have

instead been rendered as follows, assuming the extra types had, past and false were declared in an FSD.

EXAMPLE Typed feature structure: alternative formulation















There is a difference also noticed between the two classes of built-ins: on the one hand, and

, and on the other. Any kind of string is permissible as the content of the

element, whereas a very restricted set of values is permissible in , and

elements. To reflect this difference, members of the latter class specify their values using the attribute value.

The type , for instance, is associated with four values: true, false, plus (equivalent to true) and minus

(equivalent to false).

NOTE ISO 24610-1:2006 introduced the type binary, but the W3C's XML schema (2001) names it boolean.

It is the duty of the encoder to choose between atomic-type encodings and built-in encodings consistently.

This part of ISO 24610 does not regard one as identical or even consistent with the other.

5.2.2 Complex types

Types that are not atomic are called complex. These include all of the types declared by the encoder in an

FSD that declare or inherit admissible features. A feature is only admissible to a type if feature structures of

that type are permitted by the FSD to have values for that feature. This does not mean that well-formed

feature structures cannot arbitrarily associate types with feature structures regardless of their featural content

– they can. But only those feature structures that use only admissible features to their type, as specified by

© ISO 2011 – All rights reserved 7
---------------------- Page: 15 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)

some FSD, could be validated against that FSD. The distinction between validity and well-formedness is

further elaborated upon in Clause 6.

All user-declared types, no matter whether they are atomic or complex, are semantic, i.e. syntactically, they

look no different from each other, apart from the value of their type attribute. It is the role of a validator to

interpret the real significance of these types through enforcing restrictions on admissibility, restrictions on the

possible values that admissible features can have (), and other constraints that take the form of

logical implications. All of these are specified, for each type, in an FSD.

The built-ins defined by the ISO 24610-1:2006 feature structure representations (FSRs) standard are purely

syntactic. They can be used without declaration in an FSD, and they cannot be declared in an FSD. They can

appear in value range restrictions, or in implicational constraints, but they cannot have such restrictions (since

they have no admissible features) or constraints of their own.
5.2.3 Collections

Not all built-ins are as simple as those mentioned above, however. Some grammatical features such as

specifiers (SPR), complements (COMPS) and arguments (ARGS) are considered as having a list of

[10]

grammatical values, especially in Head-driven Phrase Structure Grammars (Pollard and Sag 1994 ; Sag,

[12]

Wasow, Bender 2003 ). For languages other than English, some of these features may take other kinds of

collections, namely sets or multisets, as their value. In a language (e.g. German, Korean or Japanese) that

allows a relatively free word order, the feature COMPS may be analysed as taking a set or multiset, instead of

a list, of complements. For more general applications, ISO 24610-1:2006 thus introduces sets and multisets

as well as lists as built-in ways of assembling complex feature values.

Collections (; ISO 24610-1:2006, 5.8, Collections as complex feature values) take the organization

(org) attribute, with the values “list”, “set” and “bag”. In lists, order and multi

...

INTERNATIONAL ISO
STANDARD 24610-2
First edition
2011-10-01
Language resource management —
Feature structures —
Part 2:
Feature system declaration
Gestion des ressources langagières — Structures de traits —
Partie 2: Déclaration de système de structures de traits
Reference number
ISO 24610-2:2011(E)
ISO 2011
---------------------- Page: 1 ----------------------
ISO 24610-2:2011(E)
COPYRIGHT PROTECTED DOCUMENT
© ISO 2011

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,

electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or

ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO 2011 – All rights reserved
---------------------- Page: 2 ----------------------
ISO 24610-2:2011(E)
Contents Page

Foreword ............................................................................................................................................................ iv

Introduction ......................................................................................................................................................... v

1  Scope ...................................................................................................................................................... 1

2  Normative references ............................................................................................................................ 1

3  Terms and definitions ........................................................................................................................... 2

4  Overall structure .................................................................................................................................... 5

5  Basic concepts ...................................................................................................................................... 6

5.1  Typed feature structures reviewed ...................................................................................................... 6

5.2  Types ...................................................................................................................................................... 7

5.3  Type inheritance hierarchies ................................................................................................................ 9

5.4  Type constraints .................................................................................................................................. 11

5.5  Optional (default) values and underspecification ............................................................................ 12

5.6  Subsumption ........................................................................................................................................ 12

6  Defining well-formedness versus validity......................................................................................... 14

6.1  Overview ............................................................................................................................................... 14

6.2  ISO 24610 ............................................................................................................................................. 14

7  A feature system for a grammar ........................................................................................................ 19

7.1  Overview ............................................................................................................................................... 19

7.2  Sample FSDs ........................................................................................................................................ 20

8  Declaration of a feature system ......................................................................................................... 23

8.1  Overview ............................................................................................................................................... 24

8.2  Linking a text to feature system declarations .................................................................................. 24

8.3  Overall structure of a feature system declaration ........................................................................... 25

8.4  Feature declarations ........................................................................................................................... 27

8.5  Feature structure constraints ............................................................................................................ 33

Annex A (normative) XML schema for feature structures ............................................................................ 36

Annex B (informative) A complete example ................................................................................................... 46

Bibliography ...................................................................................................................................................... 50

© ISO 2011 – All rights reserved iii
---------------------- Page: 3 ----------------------
ISO 24610-2:2011(E)
Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies

(ISO member bodies). The work of preparing International Standards is normally carried out through ISO

technical committees. Each member body interested in a subject for which a technical committee has been

established has the right to be represented on that committee. International organizations, governmental and

non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the

International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards

adopted by the technical committees are circulated to the member bodies for voting. Publication as an

International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent

rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content

resources, Subcommittee SC 4, Language resource management.

ISO 24610 consists of the following parts, under the general title Language resource management — Feature

structures:
 Part 1: Feature structure representation
 Part 2: Feature system declaration
iv © ISO 2011 – All rights reserved
---------------------- Page: 4 ----------------------
ISO 24610-2:2011(E)
Introduction
ISO 24610 is organized in two separate main parts.

 Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an

informal and yet explicit outline of their characteristics, as well as an XML-based structured way of

representing feature structures in general and typed feature structures in particular. It is designed to lay a

basis for constructing an XML-based reference format for exchanging (typed) feature structures between

applications.

 Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature

structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set

of features and their respective admissible feature values and finally by introducing a set of validity

conditions on feature structures for particular applications, especially related to the goal of language

resource management.

A feature structure is a general-purpose data structure that identifies and groups together individual features

by assigning a particular value to each. Because of the generality of feature structures, they can be used to

represent many different kinds of information. Interrelations among various pieces of information and their

instantiation in markup provide a meta-language for representing linguistic content. Moreover, this

instantiation allows a specification of a set of features and values associated with specific types and their

restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part

of ISO 24610.

Some of the statements here are copied from ISO 24610-1:2006 in order to make this part standalone without

referring to part 1.
© ISO 2011 – All rights reserved v
---------------------- Page: 5 ----------------------
INTERNATIONAL STANDARD ISO 24610-2:2011(E)
Language resource management — Feature structures —
Part 2:
Feature system declaration
1 Scope

This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural

language applications, for both annotation and production of linguistic data. It is ultimately designed to provide

a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature

specifications and operations on feature structures, thus offering means to check the conformance of each

feature structure with regards to a reference specification. Feature structures are an essential part of many

linguistic formalisms as well as an underlying mechanism for representing the information consumed or

produced by and for language engineering applications.

A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that

makes use of fs (that is, feature structure) elements. The FSD serves four purposes.

 It provides an encoding by which types and their subtyping and inheritance relationships can be

introduced and defined, thus laying the basis for constructing a feature system.

 It provides a mechanism by which the encoder can list all of the feature names and feature values and

give a prose description as to what each represents.

 It provides a mechanism by which type constraints can be declared, against which typed feature

structures are validated relative to a given theory stated in typed feature logic. These constraints may

involve constraints on the range of a feature's value, constraints on which features are permitted within

certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value

pairs. The source of these constraints is normally the empirical domain being modelled.

 It provides a mechanism by which the encoder can define the intended interpretation of underspecified

feature structures. This involves defining default values (whether literal or computed) for missing features.

The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily

intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure

representations of ISO 24610-1 specify data structures that are subject to the typing conventions and

constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used

within some of the elements defined in ISO 24610-2.
2 Normative references

The following referenced documents are indispensable for the application of this document. For dated

references, only the edition cited applies. For undated references, the latest edition of the referenced

document (including any amendments) applies.

ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure

representation

ISO/IEC 19757-2, Information technology — Document Schema Definition Language (DSDL) — Part 2:

Regular-grammar-based validation — RELAX NG
© ISO 2011 – All rights reserved 1
---------------------- Page: 6 ----------------------
ISO 24610-2:2011(E)
3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 19757-2 and the following apply.

3.1
admissibility constraint
feature admissibility constraint

specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a

specific type (3.24)
3.2
admissible feature
appropriate feature

feature which any feature structure (3.14) of a given type (3.24) may bear a value (3.17) for

NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear

a value for every admissible feature. This term does not imply that the feature is obligatory here.

3.3
admissible feature value
admissible value
value restriction
range restriction

value (3.17) that the value of an admissible feature (3.2) must be subsumed by in feature structures (3.14)

of a given type (3.24)
3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited
3.5
bag
multiset

triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S

NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can

occur more than once).
3.6
built-in

non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature

value (3.17)

NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex

built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).

3.7
collection

feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)

3.8
constraint

unit of specification that identifies some collection of feature structures (3.14) as invalid

NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility

constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.

NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered

to be valid.
2 © ISO 2011 – All rights reserved
---------------------- Page: 7 ----------------------
ISO 24610-2:2011(E)
3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature structure may not bear a feature without a corresponding value.
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.
3.12
feature

property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding

value (3.17)
3.13
feature specification

pairing of a feature (3.12) with a value (3.17) in a feature structure description

3.14
feature structure

record structure that associates one value (3.17) to each of a collection of features

NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.

NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature

structures.
3.15
feature system

type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility

constraints (3.1) and implicational constraints (3.18)
NOTE cf. type declaration (3.25)
3.16
feature system declaration
FSD
specification of a particular feature system (3.15)
3.17
feature value
value

entity or aggregation of entities that characterize some property or aspect of another entity

© ISO 2011 – All rights reserved 3
---------------------- Page: 8 ----------------------
ISO 24610-2:2011(E)
3.18
implicational constraint

constraint of the form, “if G, then H,” where G and H are feature structures (3.14)

NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid

extension in common. See subsumption (3.21) and 8.5. Often used to refer to implicational constraints that are not also

admissibility constraints.
3.19
interpretation

minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is

consistent with a set of constraints declared by an FSD (3.16)
3.20
partial order
partially ordered set

set S equipped with a relation  over S  S that is (1) reflexive (for all s  S, s  s), (2) anti-symmetric (for all p,

q  S, if p  q and q  p, then p  q), and (3) transitive (for all p, q, r  S, if p  q and q  r, then p  r)

NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q  Z, either p  q or

q  p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species,

for example, is a partial order that does not. Type hierarchies may not necessarily. The typed feature structures of a

feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every

y type is constrained to have exactly one appropriate feature.
3.21
subsumption

property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F

carries all of the information with it that G does
NOTE A formal definition is provided in 5.6.
3.22
subtype

type (3.24) to which another type confers its constraints and appropriate features

3.23
supertype
base type

type (3.24) from which another type inherits constraints and appropriate features

NOTE s is a subtype of t iff t is a supertype of s. Every type is a subtype and supertype of itself.

3.24
semantic type
type

referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and

conceptually significant class

NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature

structures or their specifications syntactically.
3.25
type declaration

structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3),

admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)

NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration,

in addition to those that it has inherited from its supertypes.
4 © ISO 2011 – All rights reserved
---------------------- Page: 9 ----------------------
ISO 24610-2:2011(E)
3.26
type hierarchy
partial order (3.20) over a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing

assignment of a semantic type (3.24) to a built-in (3.6) or feature structure (3.14), either atomic or complex

NOTE Semantic types in feature systems are partially ordered, with multiple inheritance.

3.29
underspecification
provision of partial information about a value (3.17)

NOTE An underspecification generally subsumes one of a range of candidate values that could be resolved to a

single value through subsequent constraint resolution. See subsumption (3.21).
3.30
well-formedness
syntactic conformity of a feature structure (3.14) representation to ISO 24610-1
3.31
validity

conformity of a typed feature structure (3.27) to the constraints (3.8) of a particular feature system (3.15)

NOTE See Clause 6.
4 Overall structure
The main part of the document consists of four clauses: Clauses 5, 6, 7 and 8.

 Clause 5, Basic concepts, reviews the definition of typed feature structures and the notions of atomic and

complex types, collections and other operators that may appear in feature values. It then describes the

notions of type inheritance hierarchies, type constraints, default values and underspecification that are

essential to the construction of feature systems.

 Clause 6, Defining well-formedness versus validity, discusses the conditions of well-formedness and

validity.

 Clause 7, A feature system for a grammar, illustrates how to define types with a type hierarchy and type

constraints which declare what features and values are admissible for specific types.

 Finally, Clause 8, Declaration of a feature system, discusses how a feature system can be declared and

developed into a validator.

The main part of the document is followed by two annexes: Annex A contains the XML schema for this part of

ISO 24610; Annex B contains a complete example.
© ISO 2011 – All rights reserved 5
---------------------- Page: 10 ----------------------
ISO 24610-2:2011(E)
5 Basic concepts
5.1 Typed feature structures reviewed

Typed feature structures (TFSs) are introduced as basic records for language resource management.

For more information, refer to ISO 24610-1:2006, 4.7, Typed feature structure, and Annex C, Type inheritance

hierarchies.

Here, a TFS is formally defined as a tuple over a finite set Feat of features, a collection X of

non-feature-structure elements, and a type hierarchy Type, , where Type is a finite set of types and  is a

subtyping relation over Type.
A feature structure is a tuple , in which
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function,

such that, for all q ∈ Q, there exists a path of features F , ..., F such that δ[F , ... δ(F , γ) ... ]  q.

1 n n 1

elements denote nodes. This definition deviates from the standard one used in linguistics and theoretical

computer science in that (1) typing is partial, not total, i.e. not all feature structures have types, and (2) feature

values might not be feature structures, but instead be drawn from a collection denoted by other XML elements

such as string, numeric, symbol, and binary (the X above). Note that nodes are typed, but features themselves

are not.

The following XML representation of a feature structure is considered well-formed, where the attribute type is

assigned to each of the two elements.
EXAMPLE Typed feature structure:


had












The feature name ORTH above stands for orthography, the conventional written form of a word or phrase.

This XML representation shows how the morpho-syntactic features of an English word “had” are specified as

a past-tensed and non-auxiliary verb.

In the alternative, “matrix” or “AVM” notation, type names are conventionally in the lower-case, sometimes

italicized or in the text type font, feature names in the upper-case, and strings in quotes. Binary values are

6 © ISO 2011 – All rights reserved
---------------------- Page: 11 ----------------------
ISO 24610-2:2011(E)

indicated with  or . These conventions are followed in this document, too. The above feature structure would

be depicted in matrix notation as shown in Figure 1.
Figure 1 — Matrix notation
5.2 Types
5.2.1 Atomic types

Alongside the built-ins (, , and ), it is possible for a feature structure to

have a type but no features. These are called simple or atomic feature structures, and types that allow for no

features in their feature system declaration (FSD) are called atomic types.

There is, as a result, always the possibility of declaring new atomic types and using these instead of the

above-mentioned built-ins to specify simple values. The above feature structure, for example, could have

instead been rendered as follows, assuming the extra types had, past and false were declared in an FSD.

EXAMPLE Typed feature structure: alternative formulation















There is a difference also noticed between the two classes of built-ins: on the one hand, and

, and on the other. Any kind of string is permissible as the content of the

element, whereas a very restricted set of values is permissible in , and

elements. To reflect this difference, members of the latter class specify their values using the attribute value.

The type , for instance, is associated with four values: true, false, plus (equivalent to true) and minus

(equivalent to false).

NOTE ISO 24610-1:2006 introduced the type binary, but the W3C's XML schema (2001) names it boolean.

It is the duty of the encoder to choose between atomic-type encodings and built-in encodings consistently.

This part of ISO 24610 does not regard one as identical or even consistent with the other.

5.2.2 Complex types

Types that are not atomic are called complex. These include all of the types declared by the encoder in an

FSD that declare or inherit admissible features. A feature is only admissible to a type if feature structures of

that type are permitted by the FSD to have values for that feature. This does not mean that well-formed

feature structures cannot arbitrarily associate types with feature structures regardless of their featural content

– they can. But only those feature structures that use only admissible features to their type, as specified by

© ISO 2011 – All rights reserved 7
---------------------- Page: 12 ----------------------
ISO 24610-2:2011(E)

some FSD, could be validated against that FSD. The distinction between validity and well-formedness is

further elaborated upon in Clause 6.

All user-declared types, no matter whether they are atomic or complex, are semantic, i.e. syntactically, they

look no different from each other, apart from the value of their type attribute. It is the role of a validator to

interpret the real significance of these types through enforcing restrictions on admissibility, restrictions on the

possible values that admissible features can have (), and other constraints that take the form of

logical implications. All of these are specified, for each type, in an FSD.

The built-ins defined by the ISO 24610-1:2006 feature structure representations (FSRs) standard are purely

syntactic. They can be used without declaration in an FSD, and they cannot be declared in an FSD. They can

appear in value range restrictions, or in implicational constraints, but they cannot have such restrictions (since

they have no admissible features) or constraints of their own.
5.2.3 Collections

Not all built-ins are as simple as those mentioned above, however. Some grammatical features such as

specifiers (SPR), complements (COMPS) and arguments (ARGS) are considered as having a list of

[10]

grammatical values, especially in Head-driven Phrase Structure Grammars (Pollard and Sag 1994 ; Sag,

[12]

Wasow, Bender 2003 ). For languages other than English, some of these features may take other kinds of

collections, namely sets or multisets, as their value. In a language (e.g. German, Korean or Japanese) that

allows a relatively free word order, the feature COMPS may be analysed as taking a set or multiset, instead of

a list, of complements. For more general applications, ISO 24610-1:2006 thus introduces sets and multisets

as well as lists as built-in ways of assembling complex feature values.

Collections (; ISO 24610-1:2006, 5.8, Collections as complex feature values) take the organization

(org) attribute, with the values “list”, “set” and “bag”. In lists, order and multiplicity of elements matter. In bags,

only multiplicity matters (these are often called multisets). In sets, neither order nor multiplicity matter.

For example, the feature ARGS of verbs can be represented by specifying the organization of as a list

of values, each of which is of type phrase.
EXAMPLE List value


put
























8 © ISO 2011 – All rights reserved
---------------------- Page: 13 ----------------------
ISO 24610-2:2011(E)

Some would call the type of this collection list (phrase), but polymorphic lists are not yet supported in this part

of ISO 24610. This is equivalent to the following AVM notation, where NP stands for a feature structure of the

type phrase with a positive NOMINAL feature, namely a noun phrase, and PP, a feature structure of the type

phrase with a positive PREPOSITIONAL feature, namely a prepositional
...

SLOVENSKI STANDARD
SIST ISO 24610-2:2013
01-julij-2013

Upravljanje z jezikovnimi viri - Strukture lastnosti - 2. del: Deklaracija sistema

lastnosti
Language resource management -- Feature structures -- Part 2: Feature system
declaration

Gestion des ressources langagières -- Structures de traits -- Partie 2: Déclaration de

système de structures de traits
Ta slovenski standard je istoveten z: ISO 24610-2:2011
ICS:
01.140.20 Informacijske vede Information sciences
SIST ISO 24610-2:2013 en,fr,de

2003-01.Slovenski inštitut za standardizacijo. Razmnoževanje celote ali delov tega standarda ni dovoljeno.

---------------------- Page: 1 ----------------------
SIST ISO 24610-2:2013
---------------------- Page: 2 ----------------------
SIST ISO 24610-2:2013
INTERNATIONAL ISO
STANDARD 24610-2
First edition
2011-10-01
Language resource management —
Feature structures —
Part 2:
Feature system declaration
Gestion des ressources langagières — Structures de traits —
Partie 2: Déclaration de système de structures de traits
Reference number
ISO 24610-2:2011(E)
ISO 2011
---------------------- Page: 3 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
COPYRIGHT PROTECTED DOCUMENT
© ISO 2011

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,

electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or

ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO 2011 – All rights reserved
---------------------- Page: 4 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
Contents Page

Foreword ............................................................................................................................................................ iv

Introduction ......................................................................................................................................................... v

1  Scope ...................................................................................................................................................... 1

2  Normative references ............................................................................................................................ 1

3  Terms and definitions ........................................................................................................................... 2

4  Overall structure .................................................................................................................................... 5

5  Basic concepts ...................................................................................................................................... 6

5.1  Typed feature structures reviewed ...................................................................................................... 6

5.2  Types ...................................................................................................................................................... 7

5.3  Type inheritance hierarchies ................................................................................................................ 9

5.4  Type constraints .................................................................................................................................. 11

5.5  Optional (default) values and underspecification ............................................................................ 12

5.6  Subsumption ........................................................................................................................................ 12

6  Defining well-formedness versus validity......................................................................................... 14

6.1  Overview ............................................................................................................................................... 14

6.2  ISO 24610 ............................................................................................................................................. 14

7  A feature system for a grammar ........................................................................................................ 19

7.1  Overview ............................................................................................................................................... 19

7.2  Sample FSDs ........................................................................................................................................ 20

8  Declaration of a feature system ......................................................................................................... 23

8.1  Overview ............................................................................................................................................... 24

8.2  Linking a text to feature system declarations .................................................................................. 24

8.3  Overall structure of a feature system declaration ........................................................................... 25

8.4  Feature declarations ........................................................................................................................... 27

8.5  Feature structure constraints ............................................................................................................ 33

Annex A (normative) XML schema for feature structures ............................................................................ 36

Annex B (informative) A complete example ................................................................................................... 46

Bibliography ...................................................................................................................................................... 50

© ISO 2011 – All rights reserved iii
---------------------- Page: 5 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies

(ISO member bodies). The work of preparing International Standards is normally carried out through ISO

technical committees. Each member body interested in a subject for which a technical committee has been

established has the right to be represented on that committee. International organizations, governmental and

non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the

International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards

adopted by the technical committees are circulated to the member bodies for voting. Publication as an

International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent

rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content

resources, Subcommittee SC 4, Language resource management.

ISO 24610 consists of the following parts, under the general title Language resource management — Feature

structures:
 Part 1: Feature structure representation
 Part 2: Feature system declaration
iv © ISO 2011 – All rights reserved
---------------------- Page: 6 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
Introduction
ISO 24610 is organized in two separate main parts.

 Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an

informal and yet explicit outline of their characteristics, as well as an XML-based structured way of

representing feature structures in general and typed feature structures in particular. It is designed to lay a

basis for constructing an XML-based reference format for exchanging (typed) feature structures between

applications.

 Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature

structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set

of features and their respective admissible feature values and finally by introducing a set of validity

conditions on feature structures for particular applications, especially related to the goal of language

resource management.

A feature structure is a general-purpose data structure that identifies and groups together individual features

by assigning a particular value to each. Because of the generality of feature structures, they can be used to

represent many different kinds of information. Interrelations among various pieces of information and their

instantiation in markup provide a meta-language for representing linguistic content. Moreover, this

instantiation allows a specification of a set of features and values associated with specific types and their

restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part

of ISO 24610.

Some of the statements here are copied from ISO 24610-1:2006 in order to make this part standalone without

referring to part 1.
© ISO 2011 – All rights reserved v
---------------------- Page: 7 ----------------------
SIST ISO 24610-2:2013
---------------------- Page: 8 ----------------------
SIST ISO 24610-2:2013
INTERNATIONAL STANDARD ISO 24610-2:2011(E)
Language resource management — Feature structures —
Part 2:
Feature system declaration
1 Scope

This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural

language applications, for both annotation and production of linguistic data. It is ultimately designed to provide

a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature

specifications and operations on feature structures, thus offering means to check the conformance of each

feature structure with regards to a reference specification. Feature structures are an essential part of many

linguistic formalisms as well as an underlying mechanism for representing the information consumed or

produced by and for language engineering applications.

A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that

makes use of fs (that is, feature structure) elements. The FSD serves four purposes.

 It provides an encoding by which types and their subtyping and inheritance relationships can be

introduced and defined, thus laying the basis for constructing a feature system.

 It provides a mechanism by which the encoder can list all of the feature names and feature values and

give a prose description as to what each represents.

 It provides a mechanism by which type constraints can be declared, against which typed feature

structures are validated relative to a given theory stated in typed feature logic. These constraints may

involve constraints on the range of a feature's value, constraints on which features are permitted within

certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value

pairs. The source of these constraints is normally the empirical domain being modelled.

 It provides a mechanism by which the encoder can define the intended interpretation of underspecified

feature structures. This involves defining default values (whether literal or computed) for missing features.

The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily

intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure

representations of ISO 24610-1 specify data structures that are subject to the typing conventions and

constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used

within some of the elements defined in ISO 24610-2.
2 Normative references

The following referenced documents are indispensable for the application of this document. For dated

references, only the edition cited applies. For undated references, the latest edition of the referenced

document (including any amendments) applies.

ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure

representation

ISO/IEC 19757-2, Information technology — Document Schema Definition Language (DSDL) — Part 2:

Regular-grammar-based validation — RELAX NG
© ISO 2011 – All rights reserved 1
---------------------- Page: 9 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 19757-2 and the following apply.

3.1
admissibility constraint
feature admissibility constraint

specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a

specific type (3.24)
3.2
admissible feature
appropriate feature

feature which any feature structure (3.14) of a given type (3.24) may bear a value (3.17) for

NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear

a value for every admissible feature. This term does not imply that the feature is obligatory here.

3.3
admissible feature value
admissible value
value restriction
range restriction

value (3.17) that the value of an admissible feature (3.2) must be subsumed by in feature structures (3.14)

of a given type (3.24)
3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited
3.5
bag
multiset

triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S

NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can

occur more than once).
3.6
built-in

non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature

value (3.17)

NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex

built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).

3.7
collection

feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)

3.8
constraint

unit of specification that identifies some collection of feature structures (3.14) as invalid

NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility

constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.

NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered

to be valid.
2 © ISO 2011 – All rights reserved
---------------------- Page: 10 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature structure may not bear a feature without a corresponding value.
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.
3.12
feature

property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding

value (3.17)
3.13
feature specification

pairing of a feature (3.12) with a value (3.17) in a feature structure description

3.14
feature structure

record structure that associates one value (3.17) to each of a collection of features

NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.

NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature

structures.
3.15
feature system

type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility

constraints (3.1) and implicational constraints (3.18)
NOTE cf. type declaration (3.25)
3.16
feature system declaration
FSD
specification of a particular feature system (3.15)
3.17
feature value
value

entity or aggregation of entities that characterize some property or aspect of another entity

© ISO 2011 – All rights reserved 3
---------------------- Page: 11 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3.18
implicational constraint

constraint of the form, “if G, then H,” where G and H are feature structures (3.14)

NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid

extension in common. See subsumption (3.21) and 8.5. Often used to refer to implicational constraints that are not also

admissibility constraints.
3.19
interpretation

minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is

consistent with a set of constraints declared by an FSD (3.16)
3.20
partial order
partially ordered set

set S equipped with a relation  over S  S that is (1) reflexive (for all s  S, s  s), (2) anti-symmetric (for all p,

q  S, if p  q and q  p, then p  q), and (3) transitive (for all p, q, r  S, if p  q and q  r, then p  r)

NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q  Z, either p  q or

q  p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species,

for example, is a partial order that does not. Type hierarchies may not necessarily. The typed feature structures of a

feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every

y type is constrained to have exactly one appropriate feature.
3.21
subsumption

property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F

carries all of the information with it that G does
NOTE A formal definition is provided in 5.6.
3.22
subtype

type (3.24) to which another type confers its constraints and appropriate features

3.23
supertype
base type

type (3.24) from which another type inherits constraints and appropriate features

NOTE s is a subtype of t iff t is a supertype of s. Every type is a subtype and supertype of itself.

3.24
semantic type
type

referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and

conceptually significant class

NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature

structures or their specifications syntactically.
3.25
type declaration

structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3),

admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)

NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration,

in addition to those that it has inherited from its supertypes.
4 © ISO 2011 – All rights reserved
---------------------- Page: 12 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
3.26
type hierarchy
partial order (3.20) over a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing

assignment of a semantic type (3.24) to a built-in (3.6) or feature structure (3.14), either atomic or complex

NOTE Semantic types in feature systems are partially ordered, with multiple inheritance.

3.29
underspecification
provision of partial information about a value (3.17)

NOTE An underspecification generally subsumes one of a range of candidate values that could be resolved to a

single value through subsequent constraint resolution. See subsumption (3.21).
3.30
well-formedness
syntactic conformity of a feature structure (3.14) representation to ISO 24610-1
3.31
validity

conformity of a typed feature structure (3.27) to the constraints (3.8) of a particular feature system (3.15)

NOTE See Clause 6.
4 Overall structure
The main part of the document consists of four clauses: Clauses 5, 6, 7 and 8.

 Clause 5, Basic concepts, reviews the definition of typed feature structures and the notions of atomic and

complex types, collections and other operators that may appear in feature values. It then describes the

notions of type inheritance hierarchies, type constraints, default values and underspecification that are

essential to the construction of feature systems.

 Clause 6, Defining well-formedness versus validity, discusses the conditions of well-formedness and

validity.

 Clause 7, A feature system for a grammar, illustrates how to define types with a type hierarchy and type

constraints which declare what features and values are admissible for specific types.

 Finally, Clause 8, Declaration of a feature system, discusses how a feature system can be declared and

developed into a validator.

The main part of the document is followed by two annexes: Annex A contains the XML schema for this part of

ISO 24610; Annex B contains a complete example.
© ISO 2011 – All rights reserved 5
---------------------- Page: 13 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)
5 Basic concepts
5.1 Typed feature structures reviewed

Typed feature structures (TFSs) are introduced as basic records for language resource management.

For more information, refer to ISO 24610-1:2006, 4.7, Typed feature structure, and Annex C, Type inheritance

hierarchies.

Here, a TFS is formally defined as a tuple over a finite set Feat of features, a collection X of

non-feature-structure elements, and a type hierarchy Type, , where Type is a finite set of types and  is a

subtyping relation over Type.
A feature structure is a tuple , in which
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function,

such that, for all q ∈ Q, there exists a path of features F , ..., F such that δ[F , ... δ(F , γ) ... ]  q.

1 n n 1

elements denote nodes. This definition deviates from the standard one used in linguistics and theoretical

computer science in that (1) typing is partial, not total, i.e. not all feature structures have types, and (2) feature

values might not be feature structures, but instead be drawn from a collection denoted by other XML elements

such as string, numeric, symbol, and binary (the X above). Note that nodes are typed, but features themselves

are not.

The following XML representation of a feature structure is considered well-formed, where the attribute type is

assigned to each of the two elements.
EXAMPLE Typed feature structure:


had












The feature name ORTH above stands for orthography, the conventional written form of a word or phrase.

This XML representation shows how the morpho-syntactic features of an English word “had” are specified as

a past-tensed and non-auxiliary verb.

In the alternative, “matrix” or “AVM” notation, type names are conventionally in the lower-case, sometimes

italicized or in the text type font, feature names in the upper-case, and strings in quotes. Binary values are

6 © ISO 2011 – All rights reserved
---------------------- Page: 14 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)

indicated with  or . These conventions are followed in this document, too. The above feature structure would

be depicted in matrix notation as shown in Figure 1.
Figure 1 — Matrix notation
5.2 Types
5.2.1 Atomic types

Alongside the built-ins (, , and ), it is possible for a feature structure to

have a type but no features. These are called simple or atomic feature structures, and types that allow for no

features in their feature system declaration (FSD) are called atomic types.

There is, as a result, always the possibility of declaring new atomic types and using these instead of the

above-mentioned built-ins to specify simple values. The above feature structure, for example, could have

instead been rendered as follows, assuming the extra types had, past and false were declared in an FSD.

EXAMPLE Typed feature structure: alternative formulation















There is a difference also noticed between the two classes of built-ins: on the one hand, and

, and on the other. Any kind of string is permissible as the content of the

element, whereas a very restricted set of values is permissible in , and

elements. To reflect this difference, members of the latter class specify their values using the attribute value.

The type , for instance, is associated with four values: true, false, plus (equivalent to true) and minus

(equivalent to false).

NOTE ISO 24610-1:2006 introduced the type binary, but the W3C's XML schema (2001) names it boolean.

It is the duty of the encoder to choose between atomic-type encodings and built-in encodings consistently.

This part of ISO 24610 does not regard one as identical or even consistent with the other.

5.2.2 Complex types

Types that are not atomic are called complex. These include all of the types declared by the encoder in an

FSD that declare or inherit admissible features. A feature is only admissible to a type if feature structures of

that type are permitted by the FSD to have values for that feature. This does not mean that well-formed

feature structures cannot arbitrarily associate types with feature structures regardless of their featural content

– they can. But only those feature structures that use only admissible features to their type, as specified by

© ISO 2011 – All rights reserved 7
---------------------- Page: 15 ----------------------
SIST ISO 24610-2:2013
ISO 24610-2:2011(E)

some FSD, could be validated against that FSD. The distinction between validity and well-formedness is

further elaborated upon in Clause 6.

All user-declared types, no matter whether they are atomic or complex, are semantic, i.e. syntactically, they

look no different from each other, apart from the value of their type attribute. It is the role of a validator to

interpret the real significance of these types through enforcing restrictions on admissibility, restrictions on the

possible values that admissible features can have (), and other constraints that take the form of

logical implications. All of these are specified, for each type, in an FSD.

The built-ins defined by the ISO 24610-1:2006 feature structure representations (FSRs) standard are purely

syntactic. They can be used without declaration in an FSD, and they cannot be declared in an FSD. They can

appear in value range restrictions, or in implicational constraints, but they cannot have such restrictions (since

they have no admissible features) or constraints of their own.
5.2.3 Collections

Not all built-ins are as simple as those mentioned above, however. Some grammatical features such as

specifiers (SPR), complements (COMPS) and arguments (ARGS) are considered as having a list of

[10]

grammatical values, especially in Head-driven Phrase Structure Grammars (Pollard and Sag 1994 ; Sag,

[12]

Wasow, Bender 2003 ). For languages other than English, some of these features may take other kinds of

collections, namely sets or multisets, as their value. In a language (e.g. German, Korean or Japanese) that

allows a relatively free word order, the feature COMPS may be analysed as taking a set or multiset, instead of

a list, of complements. For more general applications, ISO 24610-1:2006 thus introduces sets and multisets

as well as lists as built-in ways of assembling complex feature values.

Collections (; ISO 24610-1:2006, 5.8, Collections as complex feature values) take the organization

(org) attribute, with the values “list”, “set” and “bag”. In lists, order and multiplicity of elements matter. In bags,

only multiplicity matters (these are often called multisets). In sets, neither order nor multiplicity matter.

For example, the feature ARGS of verbs can be repres
...

Questions, Comments and Discussion

Ask us and Technical Secretary will try to provide an answer. You can facilitate discussion about the standard in here.