Language resource management - Feature structures - Part 2: Feature system declaration

ISO 24610-2:2011 provides a format to represent, store or exchange feature structures in natural language applications, for both annotation and production of linguistic data. It is ultimately designed to provide a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature specifications and operations on feature structures, thus offering means to check the conformance of each feature structure with regards to a reference specification. Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced by and for language engineering applications. A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that makes use of fs (that is, feature structure) elements. The FSD serves four purposes. 1) It provides an encoding by which types and their subtyping and inheritance relationships can be introduced and defined, thus laying the basis for constructing a feature system. 2) It provides a mechanism by which the encoder can list all of the feature names and feature values and give a prose description as to what each represents. 3) It provides a mechanism by which type constraints can be declared, against which typed feature structures are validated relative to a given theory stated in typed feature logic. These constraints may involve constraints on the range of a feature's value, constraints on which features are permitted within certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value pairs. The source of these constraints is normally the empirical domain being modelled. 4) It provides a mechanism by which the encoder can define the intended interpretation of underspecified feature structures. This involves defining default values (whether literal or computed) for missing features. 
The scheme described in ISO 24610-2:2011 may be used to document any feature system, but is primarily intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure representations of ISO 24610-1 specify data structures that are subject to the typing conventions and constraints specified using ISO 24610-2:2011. The feature structure representations of ISO 24610-1 are also used within some of the elements defined in ISO 24610-2:2011.

Gestion des ressources langagières — Structures de traits — Partie 2: Déclaration de système de structures de traits

Upravljanje z jezikovnimi viri - Strukture lastnosti - 2. del: Deklaracija sistema lastnosti


General Information

Status: Published
Publication Date: 25-Sep-2011
Current Stage: 9093 - International Standard confirmed
Start Date: 18-Feb-2024
Completion Date: 13-Dec-2025

Overview

ISO 24610-2:2011 - Language resource management - Feature structures - Part 2: Feature system declaration - defines a machine-readable format for documenting and validating typed feature structures used in natural language processing (NLP) and language resource management. The standard specifies the structure of a feature system declaration (FSD): an auxiliary XML-based file that introduces type hierarchies, lists features and values, declares constraints, and defines interpretation rules (defaults and underspecification). ISO 24610-2 is intended to be used primarily with ISO 24610-1 (feature structure representation) to enable exchange, validation and conformance checking of linguistic annotations and generated data.

Key topics and requirements

  • Definition of a feature system declaration (FSD) for representing a feature system that includes:
    • Type hierarchies and subtyping/inheritance relationships
    • A catalog of feature names and admissible feature values with prose descriptions
    • Admissibility constraints and implicational constraints (rule-like “if G then H” constraints) to validate typed feature structures
    • Mechanisms for underspecification and default values to define intended interpretations of incomplete feature structures
  • Formal notions covered: typed feature structures, subsumption/extension, feature value ranges, and validity vs. well-formedness
  • XML support: normative references and an XML schema (annex) and examples demonstrating FSD usage
  • Validation guidance for checking conformance of fs elements to a declared theory expressed in typed feature logic

Practical applications and users

ISO 24610-2 is practical for anyone who needs a standardized, interoperable way to define and validate linguistic feature specifications:

  • NLP engineers and computational linguists defining grammar feature systems for parsers, taggers or unification-based formalisms
  • Corpus builders and annotation tool developers creating or enforcing annotation schemas for linguistic corpora
  • Language resource managers, lexicographers and digital humanities projects that exchange annotated data between tools
  • Localization and text-processing teams that need deterministic handling of underspecified annotations via default value definitions

Practical uses include schema-driven validation of annotations, documenting type/feature inventories for reproducible research, and enabling tool-to-tool interoperability using a shared FSD.

Related standards

  • ISO 24610-1:2006 - Feature structure representation (primary companion standard)
  • ISO/IEC 19757-2 (RELAX NG) - referenced for XML validation mechanisms

Keywords: ISO 24610-2, feature structures, feature system declaration, FSD, typed feature structures, language resource management, XML schema, validation, type hierarchy, NLP, annotation.

Standard

ISO 24610-2:2011 - Language resource management — Feature structures — Part 2: Feature system declaration
Released: 9/26/2011

English language
50 pages
Standard

ISO 24610-2:2011 - Language resource management — Feature structures — Part 2: Feature system declaration
Released: 3/3/2014

Russian language
50 pages

Frequently Asked Questions

ISO 24610-2:2011 is a standard published by the International Organization for Standardization (ISO). Its full title is "Language resource management - Feature structures - Part 2: Feature system declaration".

ISO 24610-2:2011 is classified under the following ICS (International Classification for Standards) categories: 01.140.20 - Information sciences. The ICS classification helps identify the subject area and facilitates finding related standards.

Standards Content (Sample)


SLOVENIAN STANDARD
01 July 2013
Upravljanje z jezikovnimi viri - Strukture lastnosti - 2. del: Deklaracija sistema lastnosti
Language resource management -- Feature structures -- Part 2: Feature system declaration
Gestion des ressources langagières -- Structures de traits -- Partie 2: Déclaration de système de structures de traits
This Slovenian standard is identical to: ISO 24610-2:2011
ICS: 01.140.20 Information sciences
2003-01. Slovenian Institute for Standardization (Slovenski inštitut za standardizacijo). Reproduction of this standard in whole or in part is not permitted.

INTERNATIONAL STANDARD ISO 24610-2
First edition
2011-10-01
Language resource management —
Feature structures —
Part 2:
Feature system declaration
Gestion des ressources langagières — Structures de traits —
Partie 2: Déclaration de système de structures de traits

Reference number
© ISO 2011
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overall structure
5 Basic concepts
5.1 Typed feature structures reviewed
5.2 Types
5.3 Type inheritance hierarchies
5.4 Type constraints
5.5 Optional (default) values and underspecification
5.6 Subsumption
6 Defining well-formedness versus validity
6.1 Overview
6.2 ISO 24610
7 A feature system for a grammar
7.1 Overview
7.2 Sample FSDs
8 Declaration of a feature system
8.1 Overview
8.2 Linking a text to feature system declarations
8.3 Overall structure of a feature system declaration
8.4 Feature declarations
8.5 Feature structure constraints
Annex A (normative) XML schema for feature structures
Annex B (informative) A complete example
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
ISO 24610 consists of the following parts, under the general title Language resource management — Feature
structures:
  • Part 1: Feature structure representation
  • Part 2: Feature system declaration

Introduction
ISO 24610 is organized in two separate main parts.
  • Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an
    informal and yet explicit outline of their characteristics, as well as an XML-based structured way of
    representing feature structures in general and typed feature structures in particular. It is designed to lay a
    basis for constructing an XML-based reference format for exchanging (typed) feature structures between
    applications.
  • Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature
    structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set
    of features and their respective admissible feature values and finally by introducing a set of validity
    conditions on feature structures for particular applications, especially related to the goal of language
    resource management.
A feature structure is a general-purpose data structure that identifies and groups together individual features
by assigning a particular value to each. Because of the generality of feature structures, they can be used to
represent many different kinds of information. Interrelations among various pieces of information and their
instantiation in markup provide a meta-language for representing linguistic content. Moreover, this
instantiation allows a specification of a set of features and values associated with specific types and their
restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part
of ISO 24610.
Some of the statements here are copied from ISO 24610-1:2006 so that this part can be read on its own,
without reference to Part 1.
INTERNATIONAL STANDARD ISO 24610-2:2011(E)

Language resource management — Feature structures —
Part 2:
Feature system declaration
1 Scope
This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural
language applications, for both annotation and production of linguistic data. It is ultimately designed to provide
a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature
specifications and operations on feature structures, thus offering means to check the conformance of each
feature structure with regards to a reference specification. Feature structures are an essential part of many
linguistic formalisms as well as an underlying mechanism for representing the information consumed or
produced by and for language engineering applications.
A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that
makes use of fs (that is, feature structure) elements. The FSD serves four purposes.
  • It provides an encoding by which types and their subtyping and inheritance relationships can be
    introduced and defined, thus laying the basis for constructing a feature system.
  • It provides a mechanism by which the encoder can list all of the feature names and feature values and
    give a prose description as to what each represents.
  • It provides a mechanism by which type constraints can be declared, against which typed feature
    structures are validated relative to a given theory stated in typed feature logic. These constraints may
    involve constraints on the range of a feature's value, constraints on which features are permitted within
    certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value
    pairs. The source of these constraints is normally the empirical domain being modelled.
  • It provides a mechanism by which the encoder can define the intended interpretation of underspecified
    feature structures. This involves defining default values (whether literal or computed) for missing features.
The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily
intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure
representations of ISO 24610-1 specify data structures that are subject to the typing conventions and
constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used
within some of the elements defined in ISO 24610-2.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure
representation
ISO/IEC 19757-2, Information technology — Document Schema Definition Language (DSDL) — Part 2:
Regular-grammar-based validation — RELAX NG
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 19757-2 and the following apply.
3.1
admissibility constraint
feature admissibility constraint
specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a
specific type (3.24)
3.2
admissible feature
appropriate feature
feature which any feature structure (3.14) of a given type (3.24) may bear a value (3.17) for
NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear
a value for every admissible feature. This term does not imply that the feature is obligatory here.
3.3
admissible feature value
admissible value
value restriction
range restriction
value (3.17) that the value of an admissible feature (3.2) must be subsumed by in feature structures (3.14)
of a given type (3.24)
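As a rough illustration of 3.1 to 3.3, the following Python sketch checks a flat feature structure against a per-type table of admissible features and values. The table `ADMISSIBLE`, the type name `verb` and the feature names are hypothetical. Value restrictions are simplified here to set membership rather than subsumption, and inheritance is ignored.

```python
# Hypothetical admissibility table: type -> feature -> allowed values.
ADMISSIBLE = {
    "verb": {
        "TENSE": {"past", "present"},
        "AUX": {True, False},
    }
}


def admissibility_errors(fs_type, features, admissible=ADMISSIBLE):
    """List admissibility problems for a flat feature structure.

    A missing feature is not an error: admissible does not mean obligatory.
    """
    declared = admissible.get(fs_type, {})
    errors = []
    for name, value in features.items():
        if name not in declared:
            errors.append(f"{name}: not admissible for type {fs_type}")
        elif value not in declared[name]:
            errors.append(f"{name}: value {value!r} outside declared range")
    return errors


ok = admissibility_errors("verb", {"TENSE": "past"})
bad = admissibility_errors("verb", {"TENSE": "future", "CASE": "nom"})
```

In this sketch `ok` is empty even though AUX is absent, which mirrors the note under 3.2 that an admissible feature need not be present.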
3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited
3.5
bag
multiset
triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S
NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can
occur more than once).
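The triple in this definition can be sketched in Python. This is only an illustration: the class `Bag`, its field names and the helper `bag_from` are hypothetical and are not part of ISO 24610. Two bags are compared by how often each element occurs, so order is ignored while repetition is kept.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable


@dataclass(frozen=True)
class Bag:
    n: int                            # number of positions
    elements: FrozenSet[Hashable]     # the set S
    index: Callable[[int], Hashable]  # maps 1..n to elements of S

    def counts(self) -> Counter:
        """Multiplicity of each element, independent of position."""
        return Counter(self.index(i) for i in range(1, self.n + 1))


def bag_from(items):
    """Build a Bag from any iterable of hashable items."""
    items = list(items)
    return Bag(len(items), frozenset(items), lambda i: items[i - 1])


a = bag_from(["sg", "pl", "sg"])
b = bag_from(["sg", "sg", "pl"])
same_bag = a.counts() == b.counts()
```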
3.6
built-in
non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature
value (3.17)
NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex
built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).
3.7
collection
feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)
3.8
constraint
unit of specification that identifies some collection of feature structures (3.14) as invalid
NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility
constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.
NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered
to be valid.

3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature structure may not bear a feature without a corresponding value.
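A minimal sketch of default handling, assuming flat feature structures stored as Python dictionaries. The function `apply_defaults`, the table `DEFAULTS` and the feature names are hypothetical. A default may be a literal or a function computed from the features already present, which loosely mirrors the "literal or computed" distinction made in the Scope.

```python
def apply_defaults(fs, defaults):
    """Return a copy of fs with missing features filled from defaults.

    A callable default is computed from the features filled so far.
    """
    out = dict(fs)
    for feature, default in defaults.items():
        if feature not in out:
            out[feature] = default(out) if callable(default) else default
    return out


# Hypothetical declaration: a literal default and a computed one.
DEFAULTS = {
    "GENDER": "masculine",                   # literal default
    "LEMMA": lambda fs: fs["ORTH"].lower(),  # computed; assumes ORTH is present
}

filled = apply_defaults({"ORTH": "Stoel"}, DEFAULTS)
kept = apply_defaults({"ORTH": "Boek", "GENDER": "neuter"}, DEFAULTS)
```

Explicitly specified values are never overwritten, so `kept` retains its own GENDER value.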
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.
3.12
feature
property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding
value (3.17)
3.13
feature specification
pairing of a feature (3.12) with a value (3.17) in a feature structure description
3.14
feature structure
record structure that associates one value (3.17) to each of a collection of features
NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.
NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature
structures.
3.15
feature system
type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility
constraints (3.1) and implicational constraints (3.18)
NOTE cf. type declaration (3.25)
3.16
feature system declaration
FSD
specification of a particular feature system (3.15)
3.17
feature value
value
entity or aggregation of entities that characterize some property or aspect of another entity
3.18
implicational constraint
constraint of the form, “if G, then H,” where G and H are feature structures (3.14)
NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid
extension in common. See subsumption (3.21) and 8.5. Often used to refer to implicational constraints that are not also
admissibility constraints.
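The reading of "if G, then H" given in the note can be sketched as follows, under strong simplifying assumptions: feature structures are untyped nested dictionaries without structure sharing, atomic values are compared by equality, `None` is not used as a value, and "valid extension in common" is approximated by plain unification. The function names and the sample constraint are hypothetical.

```python
def subsumes(g, f):
    """G subsumes F if F carries all the information that G does."""
    if isinstance(g, dict):
        return isinstance(f, dict) and all(
            k in f and subsumes(v, f[k]) for k, v in g.items()
        )
    return g == f


def unify(a, b):
    """Most general common extension of a and b, or None if none exists."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for k, v in b.items():
            if k in merged:
                sub = unify(merged[k], v)
                if sub is None:
                    return None
                merged[k] = sub
            else:
                merged[k] = v
        return merged
    return a if a == b else None


def violates(f, g, h):
    """True if F is excluded by the constraint 'if G, then H'."""
    return subsumes(g, f) and unify(f, h) is None


# Hypothetical constraint: if TENSE is past, then FINITE is true.
G = {"TENSE": "past"}
H = {"FINITE": True}
```

A structure that simply lacks FINITE is not excluded, because it can still be extended consistently with H; only a conflicting value triggers the violation.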
3.19
interpretation
minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is
consistent with a set of constraints declared by an FSD (3.16)
3.20
partial order
partially ordered set
set S equipped with a relation ≤ over S × S that is (1) reflexive (for all s ∈ S, s ≤ s), (2) anti-symmetric (for all p,
q ∈ S, if p ≤ q and q ≤ p, then p = q), and (3) transitive (for all p, q, r ∈ S, if p ≤ q and q ≤ r, then p ≤ r)
NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q ∈ Z, either p ≤ q or
q ≤ p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species,
for example, is a partial order that does not. Type hierarchies do not necessarily have it. The typed feature structures of a
feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every
type is constrained to have exactly one appropriate feature.
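For a finite set, the three conditions and the additional totality property mentioned in the note can be checked by brute force. The sketch below is illustrative only; the function names are hypothetical, and divisibility is used as an example of an order that is partial but not total.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Check reflexivity, anti-symmetry and transitivity of leq."""
    elements = list(elements)
    reflexive = all(leq(s, s) for s in elements)
    antisymmetric = all(
        p == q or not (leq(p, q) and leq(q, p))
        for p, q in product(elements, repeat=2)
    )
    transitive = all(
        leq(p, r) or not (leq(p, q) and leq(q, r))
        for p, q, r in product(elements, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def is_total(elements, leq):
    """Check that every pair is comparable in at least one direction."""
    elements = list(elements)
    return all(leq(p, q) or leq(q, p) for p, q in product(elements, repeat=2))


nums = [1, 2, 3, 6]


def divides(p, q):
    return q % p == 0


def at_most(p, q):
    return p <= q
```

Here 2 and 3 are not comparable under `divides`, so that order is partial without being total, whereas `at_most` is both.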
3.21
subsumption
property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F
carries all of the information with it that G does
NOTE A formal definition is provided in 5.6.
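Pending the formal definition, the informal idea can be sketched for a simplified case: untyped feature structures written as nested Python dictionaries, with no structure sharing and with atomic values compared by equality. The names `subsumes` and `extends` and the sample features are hypothetical.

```python
def subsumes(g, f):
    """G subsumes F if every feature path and value in G also occurs in F."""
    if isinstance(g, dict):
        return isinstance(f, dict) and all(
            k in f and subsumes(v, f[k]) for k, v in g.items()
        )
    return g == f


def extends(f, g):
    """Extension is the converse of subsumption (3.11)."""
    return subsumes(g, f)


general = {"SYN": {"TENSE": "past"}}
specific = {"ORTH": "had", "SYN": {"TENSE": "past", "AUX": False}}
empty = {}
```

In this encoding the empty dictionary plays the role of the empty feature structure of 3.10 and subsumes every other dictionary.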
3.22
subtype
type (3.24) to which another type confers its constraints and appropriate features
3.23
supertype
base type
type (3.24) from which another type inherits constraints and appropriate features
NOTE s is a subtype of t iff t is a supertype of s. Every type is a subtype and supertype of itself.
3.24
semantic type
type
referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and
conceptually significant class
NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature
structures or their specifications syntactically.
3.25
type declaration
structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3),
admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)
NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration,
in addition to those that it has inherited from its supertypes.

3.26
type hierarchy
partial order (3.20) over a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
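A type hierarchy with multiple inheritance can be sketched as a mapping from each type to its directly declared supertypes. The subtype test below follows the note under 3.23 in treating every type as a subtype of itself. The hierarchy `PARENTS` and its type names are hypothetical examples, not taken from the standard.

```python
def is_subtype(s, t, parents):
    """True if s is a subtype of t; every type is a subtype of itself."""
    seen, stack = set(), [s]
    while stack:
        current = stack.pop()
        if current == t:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return False


# Hypothetical hierarchy: "gerund" inherits from both "verb" and "noun".
PARENTS = {
    "word": [],
    "verb": ["word"],
    "noun": ["word"],
    "gerund": ["verb", "noun"],
}
```

The `seen` set keeps the search finite even if a declaration accidentally introduces a cycle.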
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing
assignment of a semantic type (3.24) to a built-in (3.6) or feature structure (3.14), either atomic or complex
NOTE Semantic types in feature systems are partially ordered, with multiple inheritance.
3.29
underspecification
provision of partial information about a value (3.17)
NOTE An underspecification generally subsumes one of a range of candidate values that could be resolved to a
single value through subsequent constraint resolution. See subsumption (3.21).
3.30
well-formedness
syntactic conformity of a feature structure (3.14) representation to ISO 24610-1
3.31
validity
conformity of a typed feature structure (3.27) to the constraints (3.8) of a particular feature system (3.15)
NOTE See Clause 6.
4 Overall structure
The main part of the document consists of four clauses: Clauses 5, 6, 7 and 8.
  • Clause 5, Basic concepts, reviews the definition of typed feature structures and the notions of atomic and
    complex types, collections and other operators that may appear in feature values. It then describes the
    notions of type inheritance hierarchies, type constraints, default values and underspecification that are
    essential to the construction of feature systems.
  • Clause 6, Defining well-formedness versus validity, discusses the conditions of well-formedness and
    validity.
  • Clause 7, A feature system for a grammar, illustrates how to define types with a type hierarchy and type
    constraints which declare what features and values are admissible for specific types.
  • Finally, Clause 8, Declaration of a feature system, discusses how a feature system can be declared and
    developed into a validator.
The main part of the document is followed by two annexes: Annex A contains the XML schema for this part of
ISO 24610; Annex B contains a complete example.
5 Basic concepts
5.1 Typed feature structures reviewed
Typed feature structures (TFSs) are introduced as basic records for language resource management.
For more information, refer to ISO 24610-1:2006, 4.7, Typed feature structure, and Annex C, Type inheritance
hierarchies.
Here, a TFS is formally defined as a tuple over a finite set Feat of features, a collection X of
non-feature-structure elements, and a type hierarchy ⟨Type, ⊑⟩, where Type is a finite set of types and ⊑ is a
subtyping relation over Type.
A feature structure is a tuple ⟨Q, γ, θ, δ⟩, in which
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function,
such that, for all q ∈ Q, there exists a path of features F_1, ..., F_n such that δ[F_n, ... δ(F_1, γ) ...] = q.
<fs> elements denote nodes. This definition deviates from the standard one used in linguistics and theoretical
computer science in that (1) typing is partial, not total, i.e. not all feature structures have types, and (2) feature
values might not be feature structures, but instead be drawn from a collection denoted by other XML elements
such as string, numeric, symbol, and binary (the X above). Note that nodes are typed, but features themselves
are not.
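The tuple definition above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: nodes are modelled as integers, elements of X as strings, and the class name FeatureStructure, the feature name SYN and the type names word and verb are hypothetical and not taken from this part of ISO 24610.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureStructure:
    nodes: set                                   # Q
    root: int                                    # γ, expected to be in Q
    typing: dict = field(default_factory=dict)   # θ: node -> type name (partial)
    delta: dict = field(default_factory=dict)    # δ: (feature, node) -> node or built-in

    def reachable(self):
        """Nodes reachable from the root by following features."""
        seen, stack = {self.root}, [self.root]
        while stack:
            q = stack.pop()
            for (_feature, source), target in self.delta.items():
                # Only integer targets are nodes; strings stand in for X.
                if source == q and isinstance(target, int) and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    def is_rooted(self):
        """Check that every node in Q lies on some feature path from the root."""
        return self.root in self.nodes and self.reachable() == self.nodes


# Hypothetical encoding of the word "had" as a two-node structure.
had = FeatureStructure(
    nodes={0, 1},
    root=0,
    typing={0: "word", 1: "verb"},
    delta={("ORTH", 0): "had", ("SYN", 0): 1,
           ("TENSE", 1): "past", ("AUX", 1): "false"},
)
```

Because typing is a plain dictionary, a node may simply be absent from it, which mirrors the partiality of θ.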
The following XML representation of a feature structure is considered well-formed, where the attribute type is
assigned to each of the two <fs> elements.
EXAMPLE Typed feature structure:
[XML listing not preserved in this copy; only the string value "had" survives.]
The feature name ORTH above stands for orthography, the conventional written form of a word or phrase.
This XML representation shows how the morpho-syntactic features of an English word “had” are specified as
a past-tensed and non-auxiliary verb.
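As a rough illustration of how a representation of this shape might be processed, the following Python sketch parses an <fs> element and converts it to a nested dictionary. The element names fs, f, string, symbol and binary follow the ISO 24610-1 representation; the feature names orth, syn, tense and aux, the type names word and verb, and the helper to_dict are assumptions made for illustration and are not the listing of this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the word "had"; names are illustrative only.
XML = """
<fs type="word">
  <f name="orth"><string>had</string></f>
  <f name="syn">
    <fs type="verb">
      <f name="tense"><symbol value="past"/></f>
      <f name="aux"><binary value="false"/></f>
    </fs>
  </f>
</fs>
"""


def to_dict(fs):
    """Convert an <fs> element into a nested dict keyed by feature name."""
    out = {"type": fs.get("type")}
    for f in fs.findall("f"):
        child = f[0]                      # the single value of this feature
        if child.tag == "fs":
            out[f.get("name")] = to_dict(child)
        elif child.tag == "string":
            out[f.get("name")] = child.text
        else:                             # symbol, binary, numeric use @value
            out[f.get("name")] = child.get("value")
    return out


had = to_dict(ET.fromstring(XML.strip()))
```

The sketch only checks that the markup parses; whether the structure is valid against a feature system is a separate question.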
In the alternative, “matrix” or “AVM” notation, type names are conventionally in lower case, sometimes
italicized or in the text type font, feature names in upper case, and strings in quotes. Binary values are
indicated with + or -. These conventions are followed in this document, too. The above feature structure would
be depicted in matrix notation as shown in Figure 1.

Figure 1 — Matrix notation
5.2 Types
5.2.1 Atomic types
Alongside the built-ins (<binary>, <symbol>, <numeric> and <string>), it is possible for a feature structure to
have a type but no features. These are called simple or atomic feature structures, and types that allow for no
features in their feature system declaration (FSD) are called atomic types.
There is, as a result, always the possibility of declaring new atomic types and using these instead of the
above-mentioned built-ins to specify simple values. The above feature structure, for example, could have
instead been rendered as follows, assuming the extra types had, past and false were declared in an FSD.
EXAMPLE Typed feature structure: alternative formulation
[XML listing not preserved in this copy.]
There is also a difference between the two classes of built-ins: <string> on the one hand, and
<binary>, <symbol> and <numeric> on the other. Any kind of string is permissible as the content of the <string>
element, whereas a very restricted set of values is permissible in <binary>, <symbol> and <numeric>
elements. To reflect this difference, members of the latter class specify their values using the attribute value.
The type <binary>, for instance, is associated with four values: true, false, plus (equivalent to true) and minus
(equivalent to false).
NOTE ISO 24610-1:2006 introduced the type binary, but the W3C's XML schema (2001) names it boolean.
It is the duty of the encoder to choose between atomic-type encodings and built-in encodings consistently.
This part of ISO 24610 does not regard one as identical or even consistent with the other.
5.2.2 Complex types
Types that are not atomic are called complex. These include all of the types declared by the encoder in an
FSD that declare or inherit admissible features. A feature is only admissible to a type if feature structures of
that type are permitted by the FSD to have values for that feature. This does not mean that well-formed
feature structures cannot associate types with feature structures regardless of their featural content;
they can. However, only feature structures all of whose features are admissible for their type, as specified by
some FSD, can be validated against that FSD. The distinction between validity and well-formedness is
further elaborated upon in Clause 6.
All user-declared types, no matter whether they are atomic or complex, are semantic, i.e. syntactically, they
look no different from each other, apart from the value of their type attribute. It is the role of a validator to
interpret the real significance of these types through enforcing restrictions on admissibility, restrictions on the
possible values that admissible features can have (<vRange>), and other constraints that take the form of
logical implications. All of these are specified, for each type, in an FSD.
The built-ins defined by the ISO 24610-1:2006 feature structure representations (FSRs) standard are purely
syntactic. They can be used without declaration in an FSD, and they cannot be declared in an FSD. They can
appear in value range restrictions, or in implicational constraints, but they cannot have such restrictions (since
they have no admissible features) or constraints of their own.
5.2.3 Collections
Not all built-ins are as simple as those mentioned above, however. Some grammatical features such as
specifiers (SPR), complements (COMPS) and arguments (ARGS) are considered as having a list of
grammatical values, especially in Head-driven Phrase Structure Grammars (Pollard and Sag 1994 [10]; Sag,
Wasow, Bender 2003 [12]). For languages other than English, some of these features may take other kinds of
collections, namely sets or multisets, as their value. In a language (e.g. German, Korean or Japanese) that
allows a relatively free word order, the feature COMPS may be analysed as taking a set or multiset, instead of
a list, of complements. For more general applications, ISO 24610-1:2006 thus introduces sets and multisets
as well as lists as built-in ways of assembling complex feature values.
Collections (<vColl>; ISO 24610-1:2006, 5.8, Collections as complex feature values) take the organization
(org) attribute, with the values “list”, “set” and “bag”. In lists, order and multiplicity of elements matter. In bags,
only multiplicity matters (these are often called multisets). In sets, neither order nor multiplicity matter.
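The three organizations differ only in what counts as "the same collection". A minimal Python sketch of that difference follows; the function name same_collection is hypothetical, and the elements are assumed to be hashable atomic values rather than full feature structures.

```python
from collections import Counter


def same_collection(a, b, org):
    """Compare two sequences of values under the given org attribute value."""
    if org == "list":
        return list(a) == list(b)        # order and multiplicity matter
    if org == "bag":
        return Counter(a) == Counter(b)  # only multiplicity matters
    if org == "set":
        return set(a) == set(b)          # neither order nor multiplicity matters
    raise ValueError(f"unknown org value: {org!r}")
```

For instance, ["NP", "PP"] and ["PP", "NP"] differ as lists but coincide as bags and as sets.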
For example, the feature ARGS of verbs can be represented by specifying the organization of <vColl> as a list
of values, each of which is of type phrase.
EXAMPLE List value
[XML listing not preserved in this copy; only the string value "put" survives.]
Some would call the type of this collection list (phrase), but polymorphic lists are not yet supported in this part
of ISO 24610. This is equivalent to the following AVM notation, where NP stands for a feature structure of the
type phrase with a positive NOMINAL feature, namely a noun phrase, and PP, a feature structure of the type
phrase with a positive PREPOSITIONAL feature, namely a prepositional phrase. The boxed integers are the
labels for marking structure sharing as shown in Figure 2.

Figure 2 — Marking structure sharing
5.2.4 Operators
The other class of built-ins are operators that take one or more built-ins or feature structures as arguments,
but instead of constructing a collection from them, denote a value that is in some other way derived from them.
Alternations (<vAlt>; ISO 24610-1:2006, 5.9.2, Alternations) denote one of their arguments' values. A feature
structure containing an alternation does not denote multiple feature structures, however. An alternation is a
single value that underspecifies which of several possible alternatives it is. Alternations can be regarded as
the joins of their arguments in the partial order induced by subsumption (see 5.6).
Negations (<vNot>; ISO 24610-1:2006, 5.9.3, Negation) take a single argument, and denote a value which is
not its argument. A negation is equivalent to an alternation among all values that are inconsistent with its
argument. A negation is actually not a logical negation of a value, but rather the complement of that value in
the full Boolean lattice that contains the partial order induced by subsumption.
A merge (<vMerge>; ISO 24610-1:2006, 5.9.4, Collection of values) denotes the concatenation or union of
several values and/or collections of values, according to how its org attribute is set. org takes the same values,
with the same meanings, as in <vColl>.
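The behaviour of a merge under each org value can be sketched as follows. This is an illustration only: the function names merge and as_items are hypothetical, arguments are assumed to be atomic values or Python collections of them, and bags are modelled with collections.Counter.

```python
from collections import Counter


def as_items(arg):
    """Treat a bare value as a one-element collection."""
    if isinstance(arg, Counter):
        return list(arg.elements())
    if isinstance(arg, (list, tuple, set, frozenset)):
        return list(arg)
    return [arg]


def merge(org, *args):
    """Combine values and collections according to the org attribute value."""
    items = [x for arg in args for x in as_items(arg)]
    if org == "list":
        return items            # concatenation
    if org == "bag":
        return Counter(items)   # multiset union with summed multiplicities
    if org == "set":
        return set(items)       # plain union
    raise ValueError(f"unknown org value: {org!r}")
```

Mixing bare values and collections in one call reflects the wording "several values and/or collections of values".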
5.3 Type inheritance hierarchies
The type hierarchy is discussed at length in ISO 24610-1:2006, Annex C, Type inheritance
hierarchies. This structure is normally depicted as a directed acyclic graph with a unique top node. The label
of this top node is often top, and represents the most general type, the type that is consistent with all typed
feature structures. Subtypes are connected to, and appear below, their supertypes. The most specific types
appear at the bottom of the graph. These are mutually incompatible with each other, which is generally
understood implicitly, or, on occasion, depicted by another special type, bottom as the unique bottom-most
element. Bottom is not used in this part of ISO 24610.
Figure 3 provides an example that depicts a part of the natural world:

Figure 3 — Type hierarchy for living beings
According to this picture, living beings consist of plants and animals. Animals are subclassified into fish, birds
and mammals. Dogs, humans and bovines (oxen, cows, bulls) belong to the class of mammals.
Type hierarchies are not always trees; they may have two or more branches meeting at a single node. When
this happens, it means that a type has multiple supertypes, and properties multiply inherited from all of them.
Figure 4 provides an example of this.

Figure 4 — Medieval hierarchy of beings
Here, the type human has two parent types, animal and rational. Hence, a human is viewed as an animal like
a dog, but also a spiritual and rational being like an angel. A human thus shares some properties with both
dogs and angels.
All these types are partially ordered by a subtyping relation, ⊑, over types. A type τ is a subtype of type σ if
and only if σ is more general than τ, i.e. if the set of feature structures of type σ contains the set of feature
structures of type τ. Since the type animate is more general than the type animal in the above example, all
animals are asserted to be animate. A type σ is said to be a supertype of a type τ if and only if τ is a subtype
of σ. The immediate supertypes of a type are often called its parents.
A subtype inherits all of the properties from its supertype. The type human, for instance, inherits all the
properties from its supertypes (being, animate, animal, spiritual and rational).
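Inheritance through multiple parents can be sketched by recording each type's immediate supertypes and walking upwards. The parent table below is an assumption reconstructed from the prose (Figure 4 itself is not reproduced here), and the names PARENTS, supertypes and is_subtype are hypothetical.

```python
# Immediate supertypes (parents) of each type; an assumed reading of Figure 4.
PARENTS = {
    "being": [],
    "animate": ["being"],
    "spiritual": ["being"],
    "animal": ["animate"],
    "rational": ["spiritual"],
    "dog": ["animal"],
    "human": ["animal", "rational"],   # two parents: multiple inheritance
    "angel": ["rational"],
}


def supertypes(t):
    """All supertypes of t, including t itself."""
    seen, stack = {t}, [t]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subtype(t, s):
    """True if t is a subtype of s (every type is a subtype of itself)."""
    return s in supertypes(t)
```

Under this table, human collects supertypes along both branches, whereas dog reaches only the animate branch.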
A linguistic example, modified from Grammar 2 of Copestake (2002) [2], is shown in Figure 5.

Figure 5 — Type hierarchy for a simple grammar top
The type hierarchy has a unique top element. It is the most general type with no parents or immediate
supertypes. Top is only a subtype of itself.
Each type has a name and every type, except for the top-most type, has exactly one parent. The type top has
four immediate subtypes. phrase and det are incomparable – neither is a subtype or supertype of the other.
Depending on the complexity of a grammar, the type hierarchy can be very complex. Some portions of the
hierarchy may be universal to all languages, while others are very language-specific. The agreement type
agr-cat in English, for example, has only two immediate subtypes: 3sing and non-3sing (e.g. “sings” versus
“sing”).
The type det stands for a determiner such as “the” or “a”; 3sing stands for 3rd person singular, and non-3sing
stands for agreement categories other than 3sing.
This distinction is the one that is apparent in English verb agreement.

5.4 Type constraints
The type hierarchy is the skeleton on which the rest of the grammar grows. The rest of the grammar takes the
form of constraints over feature structures of these user-defined types. These constraints are at least of the
following three kinds: (1) implicational constraints, (2) constraints on admissible features, and (3) constraints
on admissible feature values. Actually, all of them can be thought of implicationally, e.g.
- if a feature structure is of type verb, then it may have the feature AUXiliary,
- if a feature structure is of type verb, then it may have the feature INVerted,
- if a feature structure is of type verb, then its AUX value must be “binary”,
- if a feature structure is of type verb, then its INV value must be “binary”,
- if a feature structure is of type verb and its AUX value is negative, then its INV value must be “negative”.
The first two of these are feature admissibility constraints. They tell us that a particular feature can be used in
feature structures of a particular type. The next two are constraints on admissible feature values,
sometimes called “value restrictions” or “range restrictions”. They tell us what kind of value a particular feature
must take when it occurs in a feature structure of some given type. The last of these is of a more general form:
this kind of constraint says that whenever a feature structure takes some particular form (determined
by types, feature values, etc.), it must satisfy some other criteria (again stated in terms of types, feature values,
etc.). This last form of constraint is generally what is meant by the phrase implicational constraint. Each of
these three forms has a different syntax in an FSD. The above constraints on verb would be encoded as
follows.
EXAMPLE Constraint on the type verb
[XML listing not preserved in this copy.]
The first two kinds are specified together inside an <fDecl> element, the second being the <vRange> portion
of that declaration, whereas the third is specified as an if-then conditional (<cond>).
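The five example constraints on verb can also be read as a small checking procedure. The Python sketch below is not the FSD encoding; it is a hypothetical validator (the names ADMISSIBLE, NEGATIVE and violations are assumptions) that treats a feature structure as a type name plus a flat dictionary of feature values, with binary values written as in 5.2.1.

```python
BINARY = {"true", "false", "plus", "minus"}   # the four binary values
NEGATIVE = {"false", "minus"}

# Admissible features per type, each with its range restriction.
ADMISSIBLE = {"verb": {"AUX": BINARY, "INV": BINARY}}


def violations(fs_type, features):
    """Return a list of messages, one per violated constraint."""
    problems = []
    allowed = ADMISSIBLE.get(fs_type, {})
    for name, value in features.items():
        if name not in allowed:
            problems.append(f"{name} is not admissible for {fs_type}")
        elif value not in allowed[name]:
            problems.append(f"{name}={value} is outside its value range")
    # Implicational constraint: AUX negative implies INV negative.
    # A missing INV is not a violation, since it can still be extended.
    if (fs_type == "verb" and features.get("AUX") in NEGATIVE
            and "INV" in features and features["INV"] not in NEGATIVE):
        problems.append("AUX negative requires INV negative")
    return problems
```

Note that admissibility does not make a feature obligatory here, so a verb with no INV value passes all three kinds of check.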
5.5 Optional (default) values and underspecification
In a feature structure, some features must be specified and others need not be. In French, for example, the
specification of the features NUMBER and GENDER is obligatory for nouns and adjectives. In English, the
feature NUMBER must be specified for each noun, but the specification of the feature GENDER is optional. It
is obligatory for the third person singular pronouns, “he”, “she” and “it”.
Nevertheless, there are cases in which some obligatory features are not specified. In those cases, there are
two possibilities: (1) if a default value has been defined, then it is understood to be its value; and (2) if not,
then the value of the feature is inferred from the feature's range restriction.
English mass nouns such as “water” and “air” are uncountable and singular by default. Hence, their NUMBER
feature need not be specified, although the feature NUMBER is obligatory. Some countable nouns such as
“sheep” can be either singular or plural. When the NUMBER is not specified, its value is understood to be
some more general type such as number, which is a supertype of all of the admissible feature values.
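The two possibilities for an unspecified obligatory feature can be sketched as a simple fill-in step. The function name interpret and the declaration layout (a dictionary with a "range" entry and an optional "default" entry) are assumptions for illustration, not the FSD syntax.

```python
def interpret(features, declarations):
    """Fill in unspecified features from a default, else from the range."""
    result = dict(features)
    for name, decl in declarations.items():
        if name not in result:
            # (1) use the declared default if any, (2) else the range restriction
            result[name] = decl.get("default", decl["range"])
    return result


# Hypothetical declarations for a mass noun and a countable noun.
MASS_NOUN = {"NUMBER": {"range": "number", "default": "singular"}}
COUNT_NOUN = {"NUMBER": {"range": "number"}}

water = interpret({"ORTH": "water"}, MASS_NOUN)
sheep = interpret({"ORTH": "sheep"}, COUNT_NOUN)
```

A value that is already specified is left untouched, so the step only ever adds information.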
Grammatical descriptions are often underspecified in order to capture generalizations. In English, for instance,
verbs are subclassified according to the number of complements that they require. Intransitive verbs (“smile”, “bark”)
take only a subject, transitive verbs (“love”, “attack”) take a subject and a direct object, and ditransitive verbs
(“give”, “put”) take a subject, direct object and indirect object. Many grammatical phenomena, however,
do not refer to any of these specific subclasses. Examples include subject-verb agreement (“The dog barks”
versus ill-formed “The dog bark”) or subject-verb inversion (“Does the dog bark?” versus ill-formed “Do the dog
attacks Jane?”). Since the verb subclass is irrelevant in describing these grammatical
phenomena, it is left underspecified.
Here is another example for underspecification. The analysis of a sentence like “The sheep attacked Jane”
may be underspecified with respect to the NUMBER value of “sheep”. Only if necessary is its lexical ambiguity
displayed.
Default values are specified in FSDs with <vDefault>, as explained in 8.4, and can be referenced in FSRs with
the <default> element (ISO 24610-1:2006, 5.10, Default values).
5.6 Subsumption
A feature structure F subsumes another feature structure G (F ⊑ G) if and only if G contains all of the
information that F does. “Information” is delivered by a feature structure in two ways: typing and path equality.
When one views feature structures as a pair consisting of an equivalence relation on paths (≡), and a partial
typing function on paths (Θ), then formally ⟨≡_F, Θ_F⟩ ⊑ ⟨≡_G, Θ_G⟩ if and only if ≡_F ⊆ ≡_G and for all
π ∈ Paths_F ∩ Paths_G, if Θ_F(π) is defined, then Θ_G(π) is defined and Θ_G(π) is a subtype of Θ_F(π). When F ⊑ G,
we say that G extends F.
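The typing half of this definition can be illustrated with a simplified check. The Python sketch below is deliberately partial: it models feature structures as nested dictionaries without re-entrancy, so path equality (≡) is ignored, and the type table PARENTS and the function names are hypothetical.

```python
# A small assumed type hierarchy: child -> list of parents.
PARENTS = {"top": [], "number": ["top"], "singular": ["number"],
           "plural": ["number"], "noun": ["top"]}


def is_subtype(t, s):
    """True if t equals s or s is reachable from t through parents."""
    return t == s or any(is_subtype(p, s) for p in PARENTS.get(t, []))


def subsumes(f, g):
    """True if f subsumes g, i.e. g carries at least the information in f."""
    if isinstance(f, dict) and not f:
        return True                  # the empty structure subsumes everything
    if not isinstance(f, dict):
        return f == g                # built-ins are only comparable to themselves
    if not isinstance(g, dict):
        return False
    f_type, g_type = f.get("type"), g.get("type")
    if f_type is not None and (g_type is None or not is_subtype(g_type, f_type)):
        return False
    return all(name in g and subsumes(value, g[name])
               for name, value in f.items() if name != "type")


F = {"type": "noun", "NUMBER": {"type": "number"}}
G = {"type": "noun", "ORTH": "sheep", "NUMBER": {"type": "singular"}}
```

Here F subsumes G because G adds a feature and narrows number to singular, while the converse fails.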
The view of typed feature structures taken here is still more general than is often the case in either the
linguistics literature or the formal literature on typed feature logic, because of the presence of symbols, strings,
numbers and feature values other than <fs> elements. With respect to extensions and subsumption, strings,
symbols, numbers and Booleans (binary values) behave as though they were types with no admissible
features that are discretely ordered alongside, but not connected to, the rest of the type inheritance hierarchy,
i.e. they have no subtype relationships with any type but themselves. Feature structures of these “types” are
only subsumed by themselves and the most general untyped feature structure, <fs/>, and they have no
extensions other than themselves. Some caution must be exercised, however, with respect to determining
subsumption within this extended view of typed feature structures, because re-entrancies may still exist or not
exist between identical-looking symbols, strings, numbers, etc. The more extensional view of identity that
usually accompanies these other entities is inconsistent with the view of identity that the logic of typed feature
structures takes with respect to feature structures over its own types. It is the latter that this part of ISO 24610
uses and applies to both feature structures and these other entities, when they occur within feature structures.
Alternations are also often excluded from more formal work on typed feature logic, but can be thought of as
joins of their respective typed feature structure arguments in the partial order of typed feature structures
induced by subsumption. The negation of a value can similarly be thought of as the join of every structure that
is inconsistent with that value under unification. The treatment of collections depends on their organization.
Lists appear in the subsumption partial order as though they were encoded as typed feature structures using
the following FSD.
EXAMPLE Sample FSD
[XML listing not preserved in this copy; the surviving descriptions are "Empty lists" and "Non-empty lists".]
One bag (multiset) B_1 subsumes another bag B_2 if and only if there exists a total surjection σ between the
elements of the two bags such that, for all b_1 in the domain of B_1 with multiplicity µ_1(b_1), and all b_2 in the
domain of B_2 with multiplicity µ_2(b_2):
1) b_1 ⊑ σ(b_1),
2) µ_2(b_2) = ∑ µ_1(b_1), where the sum ranges over all b_1 such that σ(b_1) = b_2,
and σ can be extended to a total function, σ*, between the substructures of the elements of the two bags,
such that, for all substructures, c, of the elements of B_1:
3) σ*(c) = σ(c) if c is an element of B_1, and
4) σ * [δ(F, c)] = δ[F, σ*(c)
...


SLOVENIAN STANDARD
1 July 2013
Language resource management -- Feature structures -- Part 2: Feature system
declaration
This Slovenian standard is identical to: ISO 24610-2:2011
ICS:
01.020 Terminology (principles and coordination)
01.140.20 Information sciences
35.240.30 IT applications in information, documentation and publishing
2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.

INTERNATIONAL STANDARD ISO 24610-2
First edition
2011-10-01
Language resource management —
Feature structures —
Part 2:
Feature system declaration
Gestion des ressources langagières — Structures de traits —
Partie 2: Déclaration de système de structures de traits

© ISO 2011
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56  CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overall structure
5 Basic concepts
5.1 Typed feature structures reviewed
5.2 Types
5.3 Type inheritance hierarchies
5.4 Type constraints
5.5 Optional (default) values and underspecification
5.6 Subsumption
6 Defining well-formedness versus validity
6.1 Overview
6.2 ISO 24610
7 A feature system for a grammar
7.1 Overview
7.2 Sample FSDs
8 Declaration of a feature system
8.1 Overview
8.2 Linking a text to feature system declarations
8.3 Overall structure of a feature system declaration
8.4 Feature declarations
8.5 Feature structure constraints
Annex A (normative) XML schema for feature structures
Annex B (informative) A complete example
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
ISO 24610 consists of the following parts, under the general title Language resource management — Feature
structures:
- Part 1: Feature structure representation
- Part 2: Feature system declaration

Introduction
ISO 24610 is organized in two separate main parts.
- Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an
informal and yet explicit outline of their characteristics, as well as an XML-based structured way of
representing feature structures in general and typed feature structures in particular. It is designed to lay a
basis for constructing an XML-based reference format for exchanging (typed) feature structures between
applications.
- Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature
structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set
of features and their respective admissible feature values and finally by introducing a set of validity
conditions on feature structures for particular applications, especially related to the goal of language
resource management.
A feature structure is a general-purpose data structure that identifies and groups together individual features
by assigning a particular value to each. Because of the generality of feature structures, they can be used to
represent many different kinds of information. Interrelations among various pieces of information and their
instantiation in markup provide a meta-language for representing linguistic content. Moreover, this
instantiation allows a specification of a set of features and values associated with specific types and their
restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part
of ISO 24610.
Some of the statements here are copied from ISO 24610-1:2006 in order to make this part standalone without
referring to part 1.
INTERNATIONAL STANDARD ISO 24610-2:2011(E)

Language resource management — Feature structures —
Part 2:
Feature system declaration
1 Scope
This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural
language applications, for both annotation and production of linguistic data. It is ultimately designed to provide
a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature
specifications and operations on feature structures, thus offering means to check the conformance of each
feature structure with regards to a reference specification. Feature structures are an essential part of many
linguistic formalisms as well as an underlying mechanism for representing the information consumed or
produced by and for language engineering applications.
A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that
makes use of fs (that is, feature structure) elements. The FSD serves four purposes.
- It provides an encoding by which types and their subtyping and inheritance relationships can be
introduced and defined, thus laying the basis for constructing a feature system.
- It provides a mechanism by which the encoder can list all of the feature names and feature values and
give a prose description as to what each represents.
- It provides a mechanism by which type constraints can be declared, against which typed feature
structures are validated relative to a given theory stated in typed feature logic. These constraints may
involve constraints on the range of a feature's value, constraints on which features are permitted within
certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value
pairs. The source of these constraints is normally the empirical domain being modelled.
- It provides a mechanism by which the encoder can define the intended interpretation of underspecified
feature structures. This involves defining default values (whether literal or computed) for missing features.
The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily
intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure
representations of ISO 24610-1 specify data structures that are subject to the typing conventions and
constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used
within some of the elements defined in ISO 24610-2.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure
representation
ISO/IEC 19757-2, Information technology — Document Schema Definition Language (DSDL) — Part 2:
Regular-grammar-based validation — RELAX NG
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 19757-2 and the following apply.
3.1
admissibility constraint
feature admissibility constraint
specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a
specific type (3.24)
3.2
admissible feature
appropriate feature
feature which any feature structure (3.14) of a given type (3.24) may bear a value (3.17) for
NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear
a value for every admissible feature. This term does not imply that the feature is obligatory here.
3.3
admissible feature value
admissible value
value restriction
range restriction
value (3.17) that the value of an admissible feature (3.2) must be subsumed by in feature structures (3.14)
of a given type (3.24)
3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited
3.5
bag
multiset
triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S
NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can
occur more than once).
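The triple in this definition can be turned into a more familiar multiset object. The Python sketch below is an illustration only; the function name bag_from_triple is hypothetical, and collections.Counter is used as one convenient stand-in for a bag.

```python
from collections import Counter


def bag_from_triple(n, S, f):
    """Build a bag from (n, S, f), where f maps each of 1..n to an element of S."""
    members = [f(i) for i in range(1, n + 1)]
    if not all(m in S for m in members):
        raise ValueError("f must map every index to an element of S")
    return Counter(members)   # the indices are forgotten, multiplicities are kept


# Two different index functions over the same set describe the same bag.
first = bag_from_triple(3, {"NP", "PP"}, {1: "NP", 2: "PP", 3: "NP"}.get)
second = bag_from_triple(3, {"NP", "PP"}, {1: "PP", 2: "NP", 3: "NP"}.get)
```

That the two triples yield equal bags shows why the integer indices carry no ordering information.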
3.6
built-in
non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature
value (3.17)
NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex
built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).
3.7
collection
feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)
3.8
constraint
unit of specification that identifies some collection of feature structures (3.14) as invalid
NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility
constraints. See validity (3.31) and 5.4. All feature structures not explicitly excluded as invalid are considered to be valid.
NOTE 2 A feature structure that has not been so identified by any of the constraints in a feature system is considered
to be valid.

3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature structure may not bear a feature without a corresponding value.
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.
3.12
feature
property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding
value (3.17)
3.13
feature specification
pairing of a feature (3.12) with a value (3.17) in a feature structure description
3.14
feature structure
record structure that associates one value (3.17) to each of a collection of features
NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.
NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature
structures.
3.15
feature system
type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility
constraints (3.1) and implicational constraints (3.18)
NOTE cf. type declaration (3.25)
3.16
feature system declaration
FSD
specification of a particular feature system (3.15)
3.17
feature value
value
entity or aggregation of entities that characterize some property or aspect of another entity
3.18
implicational constraint
constraint of the form, “if G, then H,” where G and H are feature structures (3.14)
NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid
extension in common. See subsumption (3.21) and 8.5. Often used to refer to implicational constraints that are not also
admissibility constraints.
3.19
interpretation
minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is
consistent with a set of constraints declared by an FSD (3.16)
3.20
partial order
partially ordered set
set S equipped with a relation ⊑ over S × S that is (1) reflexive (for all s ∈ S, s ⊑ s), (2) anti-symmetric (for all p,
q ∈ S, if p ⊑ q and q ⊑ p, then p = q), and (3) transitive (for all p, q, r ∈ S, if p ⊑ q and q ⊑ r, then p ⊑ r)
NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q ∈ Z, either p ≤ q or
q ≤ p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species,
for example, is a partial order that does not. Type hierarchies do not necessarily have it either. The typed feature structures of a
feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or every
type is constrained to have exactly one appropriate feature.
3.21
subsumption
property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F
carries all of the information with it that G does
NOTE A formal definition is provided in 5.6.
3.22
subtype
type (3.24) to which another type confers its constraints and appropriate features
3.23
supertype
base type
type (3.24) from which another type inherits constraints and appropriate features
NOTE s is a subtype of t iff t is a supertype of s. Every type is a subtype and supertype of itself.
3.24
semantic type
type
referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and
conceptually significant class
NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature
structures or their specifications syntactically.
3.25
type declaration
structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3),
admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)
NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration,
in addition to those that it has inherited from its supertypes.
3.26
type hierarchy
partial order (3.20) over a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing
assignment of a semantic type (3.24) to a built-in (3.6) or feature structure (3.14), either atomic or complex
NOTE Semantic types in feature systems are partially ordered, with multiple inheritance.
3.29
underspecification
provision of partial information about a value (3.17)
NOTE An underspecification generally subsumes one of a range of candidate values that could be resolved to a
single value through subsequent constraint resolution. See subsumption (3.21).
3.30
well-formedness
syntactic conformity of a feature structure (3.14) representation to ISO 24610-1
3.31
validity
conformity of a typed feature structure (3.27) to the constraints (3.8) of a particular feature system (3.15)
NOTE See Clause 6.
4 Overall structure
The main part of the document consists of four clauses: Clauses 5, 6, 7 and 8.
- Clause 5, Basic concepts, reviews the definition of typed feature structures and the notions of atomic and
complex types, collections and other operators that may appear in feature values. It then describes the
notions of type inheritance hierarchies, type constraints, default values and underspecification that are
essential to the construction of feature systems.
- Clause 6, Defining well-formedness versus validity, discusses the conditions of well-formedness and
validity.
- Clause 7, A feature system for a grammar, illustrates how to define types with a type hierarchy and type
constraints which declare what features and values are admissible for specific types.
- Finally, Clause 8, Declaration of a feature system, discusses how a feature system can be declared and
developed into a validator.
The main part of the document is followed by two annexes: Annex A contains the XML schema for this part of
ISO 24610; Annex B contains a complete example.
5 Basic concepts
5.1 Typed feature structures reviewed
Typed feature structures (TFSs) are introduced as basic records for language resource management.
For more information, refer to ISO 24610-1:2006, 4.7, Typed feature structure, and Annex C, Type inheritance
hierarchies.
Here, a TFS is formally defined as a tuple over a finite set Feat of features, a collection X of
non-feature-structure elements, and a type hierarchy ⟨Type, ⊑⟩, where Type is a finite set of types and ⊑ is a
subtyping relation over Type.
A feature structure is a tuple ⟨Q, γ, θ, δ⟩, in which
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function,
such that, for all q ∈ Q, there exists a path of features F1, ..., Fn such that δ(Fn, ... δ(F1, γ) ...) = q.
<fs> elements denote nodes. This definition deviates from the standard one used in linguistics and theoretical
computer science in that (1) typing is partial, not total, i.e. not all feature structures have types, and (2) feature
values might not be feature structures, but instead be drawn from a collection denoted by other XML elements
such as string, numeric, symbol, and binary (the X above). Note that nodes are typed, but features themselves
are not.
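The tuple definition above can be made concrete with a small data structure. The following Python sketch is illustrative only and is not part of this document's normative content: the class names, the node identifiers and the feature names SYN, TENSE and AUX are assumptions introduced for the example. It stores Q, γ, θ and δ directly and checks the requirement that every node be reachable from the root by some path of features.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class BuiltIn:
    """A non-feature-structure value, i.e. an element of the collection X."""
    kind: str      # e.g. "string", "symbol", "binary" (illustrative labels)
    value: object


@dataclass
class FeatureStructure:
    nodes: Set[str]                                     # Q
    root: str                                           # gamma, the root node
    types: Dict[str, str]                               # theta (partial: untyped nodes are absent)
    delta: Dict[Tuple[str, str], Union[str, BuiltIn]]   # delta, keyed by (feature, node)

    def reachable(self) -> Set[str]:
        """Return all nodes reachable from the root by following features."""
        seen = {self.root}
        stack = [self.root]
        while stack:
            q = stack.pop()
            for (_feat, src), target in self.delta.items():
                if src == q and isinstance(target, str) and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    def is_rooted(self) -> bool:
        """True if every node in Q lies on some feature path from the root."""
        return self.root in self.nodes and self.reachable() == self.nodes


# Hypothetical encoding of a two-node structure for the word "had".
had = FeatureStructure(
    nodes={"q0", "q1"},
    root="q0",
    types={"q0": "word", "q1": "verb"},
    delta={
        ("ORTH", "q0"): BuiltIn("string", "had"),
        ("SYN", "q0"): "q1",
        ("TENSE", "q1"): BuiltIn("symbol", "past"),
        ("AUX", "q1"): BuiltIn("binary", False),
    },
)
```

Because θ is stored as a dictionary, a node without a type is simply absent from it, which mirrors the partiality of the typing function.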
The following XML representation of a feature structure is considered well-formed, where the attribute type is
assigned to each of the two elements.
EXAMPLE Typed feature structure:


had












The feature name ORTH above stands for orthography, the conventional written form of a word or phrase.
This XML representation shows how the morpho-syntactic features of an English word “had” are specified as
a past-tensed and non-auxiliary verb.
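A representation of this kind could be read programmatically as sketched below. The sketch is a hedged illustration rather than a reference implementation: the element names fs, f, string, symbol and binary are assumed to follow ISO 24610-1, the feature names SYN, TENSE and AUX are hypothetical, and the helper to_dict is introduced only for this example. It uses the Python standard library XML parser.

```python
import xml.etree.ElementTree as ET

# Assumed serialization of a typed feature structure for "had".
XML = """
<fs type="word">
  <f name="ORTH"><string>had</string></f>
  <f name="SYN">
    <fs type="verb">
      <f name="TENSE"><symbol value="past"/></f>
      <f name="AUX"><binary value="false"/></f>
    </fs>
  </f>
</fs>
"""


def to_dict(fs):
    """Convert an <fs> element into a nested dict keyed by feature name."""
    result = {"type": fs.get("type")}   # None when the fs is untyped
    for f in fs.findall("f"):
        value = f[0]                     # the single child that carries the value
        if value.tag == "fs":
            result[f.get("name")] = to_dict(value)
        elif value.tag == "string":
            result[f.get("name")] = value.text
        else:                            # symbol, binary, numeric: value attribute
            result[f.get("name")] = value.get("value")
    return result


had = to_dict(ET.fromstring(XML))
```

Only syntactic well-formedness is involved here; nothing in the sketch checks the structure against a feature system declaration.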
In the alternative, “matrix” or “AVM” notation, type names are conventionally in lower case, sometimes
italicized or in the text type font, feature names in upper case, and strings in quotes. Binary values are
indicated with + or −. These conventions are followed in this document, too. The above feature structure would
be depicted in matrix notation as shown in Figure 1.

Figure 1 — Matrix notation
5.2 Types
5.2.1 Atomic types
Alongside the built-ins (<string>, <numeric>, <symbol> and <binary>), it is possible for a feature structure to
have a type but no features. These are called simple or atomic feature structures, and types that allow for no
features in their feature system declaration (FSD) are called atomic types.
There is, as a result, always the possibility of declaring new atomic types and using these instead of the
above-mentioned built-ins to specify simple values. The above feature structure, for example, could have
instead been rendered as follows, assuming the extra types had, past and false were declared in an FSD.
EXAMPLE Typed feature structure: alternative formulation















There is also a difference between two classes of built-ins: <string> on the one hand, and
<numeric>, <symbol> and <binary> on the other. Any kind of string is permissible as the content of the <string>
element, whereas a very restricted set of values is permissible in <numeric>, <symbol> and <binary>
elements. To reflect this difference, members of the latter class specify their values using the attribute value.
The type <binary>, for instance, is associated with four values: true, false, plus (equivalent to true) and minus
(equivalent to false).
NOTE ISO 24610-1:2006 introduced the type binary, but the W3C's XML schema (2001) names it boolean.
It is the duty of the encoder to choose between atomic-type encodings and built-in encodings consistently.
This part of ISO 24610 does not regard one as identical or even consistent with the other.
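The four permissible binary values, and the fact that a built-in encoding and an atomic-type encoding are kept distinct, can be illustrated with the following Python sketch. The function name binary_value and the tuple encodings are assumptions made for the example only.

```python
# The four permissible values and the truth value each one stands for.
BINARY_VALUES = {"true": True, "plus": True, "false": False, "minus": False}


def binary_value(attr: str) -> bool:
    """Map the value attribute of a binary built-in to a truth value."""
    if attr not in BINARY_VALUES:
        raise ValueError(f"not a permissible binary value: {attr!r}")
    return BINARY_VALUES[attr]


# Two hypothetical encodings of "not an auxiliary":
builtin_encoding = ("binary", binary_value("minus"))  # built-in binary value
atomic_encoding = ("fs", "false")                     # atomic type named false

# The two are deliberately not treated as the same value.
encodings_differ = builtin_encoding != atomic_encoding
```

An encoder who mixes the two styles within one resource would therefore produce values that do not compare as equal, which is why consistency is the encoder's responsibility.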
5.2.2 Complex types
Types that are not atomic are called complex. These include all of the types declared by the encoder in an
FSD that declare or inherit admissible features. A feature is only admissible to a type if feature structures of
that type are permitted by the FSD to have values for that feature. This does not mean that well-formed
feature structures cannot arbitrarily associate types with feature structures regardless of their featural content
– they can. But only those feature structures that use only admissible features to their type, as specified by
some FSD, could be validated against that FSD. The distinction between validity and well-formedness is
further elaborated upon in Clause 6.
All user-declared types, no matter whether they are atomic or complex, are semantic, i.e. syntactically, they
look no different from each other, apart from the value of their type attribute. It is the role of a validator to
interpret the real significance of these types through enforcing restrictions on admissibility, restrictions on the
possible values that admissible features can have (<vRange>), and other constraints that take the form of
logical implications. All of these are specified, for each type, in an FSD.
The built-ins defined by the ISO 24610-1:2006 feature structure representations (FSRs) standard are purely
syntactic. They can be used without declaration in an FSD, and they cannot be declared in an FSD. They can
appear in value range restrictions, or in implicational constraints, but they cannot have such restrictions (since
they have no admissible features) or constraints of their own.
5.2.3 Collections
Not all built-ins are as simple as those mentioned above, however. Some grammatical features such as
specifiers (SPR), complements (COMPS) and arguments (ARGS) are considered as having a list of
grammatical values, especially in Head-driven Phrase Structure Grammars (Pollard and Sag 1994 [10]; Sag,
Wasow, Bender 2003 [12]). For languages other than English, some of these features may take other kinds of
collections, namely sets or multisets, as their value. In a language (e.g. German, Korean or Japanese) that
allows a relatively free word order, the feature COMPS may be analysed as taking a set or multiset, instead of
a list, of complements. For more general applications, ISO 24610-1:2006 thus introduces sets and multisets
as well as lists as built-in ways of assembling complex feature values.
Collections (<vColl>; ISO 24610-1:2006, 5.8, Collections as complex feature values) take the organization
(org) attribute, with the values “list”, “set” and “bag”. In lists, order and multiplicity of elements matter. In bags,
only multiplicity matters (these are often called multisets). In sets, neither order nor multiplicity matter.
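The three organizations differ only in what counts as the same collection. A minimal Python sketch, in which the function name same_collection and the use of plain strings such as "NP" and "PP" as elements are assumptions for illustration:

```python
from collections import Counter


def same_collection(org, xs, ys):
    """Compare two collections of hashable values under an organization."""
    if org == "list":   # order and multiplicity both matter
        return list(xs) == list(ys)
    if org == "bag":    # only multiplicity matters
        return Counter(xs) == Counter(ys)
    if org == "set":    # neither order nor multiplicity matters
        return set(xs) == set(ys)
    raise ValueError(f"unknown organization: {org!r}")
```

For instance, ["NP", "PP"] and ["PP", "NP"] differ as lists but coincide as bags and as sets, whereas ["NP", "NP"] and ["NP"] coincide only as sets.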
For example, the feature ARGS of verbs can be represented by specifying the organization of <vColl> as a list
of values, each of which is of type phrase.
EXAMPLE List value


put
























Some would call the type of this collection list (phrase), but polymorphic lists are not yet supported in this part
of ISO 24610. This is equivalent to the following AVM notation, where NP stands for a feature structure of the
type phrase with a positive NOMINAL feature, namely a noun phrase, and PP, a feature structure of the type
phrase with a positive PREPOSITIONAL feature, namely a prepositional phrase. The boxed integers are the
labels for marking structure sharing as shown in Figure 2.

Figure 2 — Marking structure sharing
5.2.4 Operators
The other class of built-ins are operators that take one or more built-ins or feature structures as arguments,
but instead of constructing a collection from them, denote a value that is in some other way derived from them.
Alternations (<vAlt>; ISO 24610-1:2006, 5.9.2, Alternations) denote one of their arguments' values. A feature
structure containing an alternation does not denote multiple feature structures, however. An alternation is a
single value that underspecifies which of several possible alternatives it is. Alternations can be regarded as
the joins of their arguments in the partial order induced by subsumption (see 5.6).
Negations (<vNot>; ISO 24610-1:2006, 5.9.3, Negation) take a single argument, and denote a value which is
not its argument. A negation is equivalent to an alternation among all values that are inconsistent with its
argument. A negation is actually not a logical negation of a value, but rather the complement of that value in
the full Boolean lattice that contains the partial order induced by subsumption.
A merge (<vMerge>; ISO 24610-1:2006, 5.9.4, Collection of values) denotes the concatenation or union of
several values and/or collections of values, according to how its org attribute is set. org takes the same values,
with the same meanings, as in <vColl>.
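The dependence of a merge on its org attribute can be sketched as follows. This is an illustration under stated assumptions, not a definition: the function name merge is hypothetical, and the treatment of bags (multiplicities are added) is an assumption of this sketch that the text above does not spell out.

```python
from collections import Counter
from itertools import chain


def merge(org, *collections):
    """Combine several collections according to the organization org."""
    if org == "list":
        # Concatenation: order and repetitions are preserved.
        return list(chain.from_iterable(collections))
    if org == "set":
        # Union: duplicates collapse.
        return set().union(*collections)
    if org == "bag":
        # ASSUMPTION: multiplicities are summed across the arguments.
        total = Counter()
        for c in collections:
            total.update(c)
        return total
    raise ValueError(f"unknown organization: {org!r}")
```

A single value can be merged with a collection by first wrapping it in a one-element collection.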
5.3 Type inheritance hierarchies
The type hierarchy is discussed at great length in ISO 24610-1:2006, Annex C Type inheritance
hierarchies. This structure is normally depicted as a directed acyclic graph with a unique top node. The label
of this top node is often top, and represents the most general type, the type that is consistent with all typed
feature structures. Subtypes are connected to, and appear below, their supertypes. The most specific types
appear at the bottom of the graph. These are mutually incompatible with each other, which is generally
understood implicitly, or, on occasion, depicted by another special type, bottom as the unique bottom-most
element. Bottom is not used in this part of ISO 24610.
Figure 3 provides an example that depicts a part of the natural world:

Figure 3 — Type hierarchy for living beings
According to this picture, living beings consist of plants and animals. Animals are subclassified into fish, birds
and mammals. Dogs, humans and bovines (oxen, cows, bulls) belong to the class of mammals.
Type hierarchies are not always trees; they may have two or more branches meeting at a single node. When
this happens, it means that a type has multiple supertypes, and properties multiply inherited from all of them.
Figure 4 provides an example of this.

Figure 4 — Medieval hierarchy of beings
Here, the type human has two parent types, animal and rational. Hence, a human is viewed as an animal like
a dog, but also a spiritual and rational being like an angel. A human thus shares some properties with both
dogs and angels.
All these types are partially ordered by a subtyping relation, ⊑, over types. A type τ is a subtype of type σ if
and only if σ is more general than τ, i.e. if the set of feature structures of type σ contains the set of feature
structures of type τ. Since the type animate is more general than the type animal in the above example, all
animals are asserted to be animate. A type σ is said to be a supertype of a type τ if and only if τ is a subtype
of σ. The immediate supertypes of a type are often called its parents.
A subtype inherits all of the properties from its supertype. The type human, for instance, inherits all the
properties from its supertypes (being, animate, animal, spiritual and rational).
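Multiple inheritance of this kind can be modelled by recording the parents of each type and collecting supertypes transitively. In the Python sketch below, the names PARENTS, supertypes and is_subtype are hypothetical, and the individual edges (for example, that rational lies below spiritual and that angel lies below rational) are assumptions chosen to agree with the supertypes of human listed above.

```python
# Immediate supertypes (parents) of each type; edges are assumed.
PARENTS = {
    "being": [],
    "animate": ["being"],
    "spiritual": ["being"],
    "animal": ["animate"],
    "rational": ["spiritual"],
    "dog": ["animal"],
    "angel": ["rational"],
    "human": ["animal", "rational"],   # two parents: multiple inheritance
}


def supertypes(t, parents=PARENTS):
    """All supertypes of t, including t itself (subtyping is reflexive)."""
    seen = set()
    stack = [t]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def is_subtype(s, t, parents=PARENTS):
    """True if s is a subtype of t, i.e. t is among the supertypes of s."""
    return t in supertypes(s, parents)
```

Under these assumed edges, human is a subtype of both animate and spiritual, while dog is a subtype of animate only.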
A linguistic example, modified from Grammar 2 of Copestake (2002) [2], is shown in Figure 5.

Figure 5 — Type hierarchy for a simple grammar top
The type hierarchy has a unique top element. It is the most general type with no parents or immediate
supertypes. Top is only a subtype of itself.
Each type has a name and every type, except for the top-most type, has exactly one parent. The type top has
four immediate subtypes. phrase and det are incomparable – neither is a subtype or supertype of the other.
Depending on the complexity of a grammar, the type hierarchy can be very complex. Some portions of the
hierarchy may be universal to all languages, while others are very language-specific. The agreement type
agr-cat in English, for example, has only two immediate subtypes: 3sing and non-3sing (e.g. “sings” versus
“sing”).
The type det stands for a determiner such as “the” or “a”; 3sing stands for 3rd person singular, and non-3sing
stands for agreement categories other than 3sing.
This distinction is the one that is apparent in English verb agreement.
5.4 Type constraints
The type hierarchy is the skeleton on which the rest of the grammar grows. The rest of the grammar takes the
form of constraints over feature structures of these user-defined types. These constraints are at least of the
following three kinds: (1) implicational constraints, (2) constraints on admissible features, and (3) constraints
on admissible feature values. In fact, all of them can be expressed as implications, e.g.
- if a feature structure is of type verb, then it may have the feature AUXiliary,
- if a feature structure is of type verb, then it may have the feature INVerted,
- if a feature structure is of type verb, then its AUX value must be “binary”,
- if a feature structure is of type verb, then its INV value must be “binary”,
- if a feature structure is of type verb and its AUX value is negative, then its INV value must be “negative”.
The first two of these are feature admissibility constraints. They tell us that a particular feature can be used in
feature structures of a particular type. The second two are constraints on admissible feature values,
sometimes called “value restrictions” or “range restrictions”. They tell us what kind of value a particular feature
must take when it occurs in a feature structure of some given type. The last of these is of a more general form;
however, this kind of constraint says that whenever a feature structure takes some particular form (determined
by types, feature values, etc.), it must satisfy some other criteria (again stated in terms of types, feature values,
etc.). This last form of constraint is generally what is meant by the phrase implicational constraint. Each of
these three forms has a different syntax in an FSD. The above constraints on verb would be encoded as
follows.
EXAMPLE Constraint on the type verb



























The first two kinds are specified together inside an <fDecl> element, the second being the <vRange> portion
of that declaration, whereas the third is specified as an if-then conditional (<cond>).
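The effect of the three kinds of constraint on the type verb can be sketched as a small checker. This Python sketch is illustrative and does not reproduce the FSD syntax: the table ADMISSIBLE, the function violations and the dictionary layout of a feature structure are assumptions made for the example. An unspecified INV value is not reported, on the reasoning that such a structure can still be extended with a negative INV value.

```python
# Admissible features of each type, with the Python type standing in
# for the value restriction "binary".
ADMISSIBLE = {"verb": {"AUX": bool, "INV": bool}}


def violations(fs):
    """Return a list of constraint violations for a feature structure.

    fs is a dict with the keys "type" and "features" (hypothetical layout).
    """
    errors = []
    declared = ADMISSIBLE.get(fs["type"], {})
    feats = fs["features"]
    for feat, value in feats.items():
        if feat not in declared:
            # Feature admissibility constraint.
            errors.append(f"{feat} is not admissible for type {fs['type']}")
        elif not isinstance(value, declared[feat]):
            # Value (range) restriction.
            errors.append(f"{feat} must take a binary value")
    # Implicational constraint: AUX negative implies INV negative.
    if fs["type"] == "verb" and feats.get("AUX") is False and feats.get("INV") is True:
        errors.append("INV must be negative when AUX is negative")
    return errors
```

A validator built along these lines would report an empty list for a non-auxiliary, non-inverted verb and one violation for a non-auxiliary verb marked as inverted.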
5.5 Optional (default) values and underspecification
In a feature structure, some features must be specified and others need not be. In French, for example, the
specification of the features NUMBER and GENDER is obligatory for nouns and adjectives. In English, the
feature NUMBER must be specified for each noun, but the specification of the feature GENDER is optional. It
is obligatory for the third person singular pronouns, “he”, “she” and “it”.
Nevertheless, there are cases in which some obligatory features are not specified. In those cases, there are
two possibilities: (1) if a default value has been defined, then it is understood to be its value; and (2) if not,
then the value of the feature is inferred from the feature's range restriction.
English mass nouns such as “water” and “air” are uncountable and singular by default. Hence, their NUMBER
feature need not be specified, although the feature NUMBER is obligatory. Some countable nouns such as
“sheep” can be either singular or plural. When the NUMBER is not specified, its value is understood to be
some more general type such as number, which is a supertype of all of the admissible feature values.
Grammatical descriptions are often underspecified in order to capture generalizations. In English, for instance,
verbs are subclassified into the number of complements that they require. Intransitive verbs (“smile”, “bark”)
take only a subject, transitive verbs (“love”, “attack”) take a subject and a direct object, and ditransitive verbs
(“give”, “put”) take a subject, direct object and indirect object. Many of the grammatical phenomena, however,
do not refer to one of these specific subclasses. Examples include subject-verb agreement (“The dog barks”
versus ill-formed “the dog bark”) or subject-verb inversion (“Does the dog bark?” versus ill-formed “Do the dog
attacks Jane?”). Since the specification of this feature is irrelevant in describing these grammatical
phenomena, it is left underspecified.
Here is another example for underspecification. The analysis of a sentence like “The sheep attacked Jane”
may be underspecified with respect to the NUMBER value of “sheep”. Only if necessary is its lexical ambiguity
displayed.
Default values are specified in FSDs with <vDefault>, as explained in 8.4, and can be referenced in FSRs with
the <default> element (ISO 24610-1:2006, 5.10, Default values).
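The two possibilities for an unspecified feature, a declared default or else the range restriction, can be sketched as follows. The type names mass-noun and count-noun, the table DECLARATIONS and the function interpret are hypothetical and serve only to mirror the “water” and “sheep” examples above.

```python
# Per-type feature declarations: a range restriction and an optional default.
DECLARATIONS = {
    "mass-noun": {"NUMBER": {"range": "number", "default": "singular"}},
    "count-noun": {"NUMBER": {"range": "number", "default": None}},
}


def interpret(fs_type, features):
    """Fill in unspecified features of a structure of type fs_type."""
    result = dict(features)          # leave the input untouched
    for feat, decl in DECLARATIONS[fs_type].items():
        if feat in result:
            continue                 # an explicit value always wins
        if decl["default"] is not None:
            result[feat] = decl["default"]   # (1) a default has been defined
        else:
            result[feat] = decl["range"]     # (2) fall back on the range restriction
    return result
```

With these declarations, an unspecified NUMBER is read as singular for a mass noun such as “water” and as the more general type number for a count noun such as “sheep”.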
5.6 Subsumption
A feature structure F subsumes another feature structure G (F ⊑ G) if and only if G contains all of the
information that F does. “Information” is delivered by a feature structure in two ways: typing and path equality.
When one views feature structures as a pair consisting of an equivalence relation on paths (≡), and a partial
typing function on paths (Θ), then formally ⟨≡_F, Θ_F⟩ ⊑ ⟨≡_G, Θ_G⟩ if and only if ≡_F ⊆ ≡_G and, for all
π ∈ Paths_F ∩ Paths_G, if Θ_F(π) is defined, then Θ_G(π) is defined and Θ_G(π) is a subtype of Θ_F(π). When F ⊑ G,
we say that G extends F.
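Subsumption can be tested on graph representations by looking for a mapping from the nodes of F to the nodes of G that preserves the root, the features and the typing. The Python sketch below takes this graph-based route in place of the path-based formulation; the two are intended to agree, since two paths that lead to one node of F must then lead to one node of G. The representation (a root identifier plus a dictionary of nodes), the single-parent table PARENT, the type name word and the function names are assumptions for illustration.

```python
# Hypothetical single-parent hierarchy: agr-cat has subtypes 3sing and non-3sing.
PARENT = {"3sing": "agr-cat", "non-3sing": "agr-cat"}


def is_subtype(s, t):
    """True if s equals t or t is reached by following parents from s."""
    while s is not None:
        if s == t:
            return True
        s = PARENT.get(s)
    return False


def subsumes(f, g):
    """True if f subsumes g.

    Each argument is (root_id, nodes), where nodes maps a node id to a
    pair (type_or_None, {feature: target_node_id}).
    """
    (f_root, f_nodes), (g_root, g_nodes) = f, g
    mapping = {}
    stack = [(f_root, g_root)]
    while stack:
        fq, gq = stack.pop()
        if fq in mapping:
            if mapping[fq] != gq:
                return False     # paths shared in f are not shared in g
            continue
        mapping[fq] = gq
        f_type, f_arcs = f_nodes[fq]
        g_type, g_arcs = g_nodes[gq]
        if f_type is not None and (g_type is None or not is_subtype(g_type, f_type)):
            return False         # g lacks the type information carried by f
        for feat, f_target in f_arcs.items():
            if feat not in g_arcs:
                return False     # g lacks a feature that f specifies
            stack.append((f_target, g_arcs[feat]))
    return True


general = ("r", {"r": ("word", {"AGR": "a"}), "a": ("agr-cat", {})})
specific = ("r", {"r": ("word", {"AGR": "a"}), "a": ("3sing", {})})
shared = ("r", {"r": (None, {"X": "n", "Y": "n"}), "n": (None, {})})
unshared = ("r", {"r": (None, {"X": "n1", "Y": "n2"}), "n1": (None, {}), "n2": (None, {})})
empty = ("r", {"r": (None, {})})
```

In these examples, general subsumes specific because 3sing is a subtype of agr-cat, and unshared subsumes shared because structure sharing adds path-equality information; neither converse holds.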
The view of typed feature structures taken here is still more general than is often the case in either the
linguistics literature or the formal literature on typed feature logic, because of the presence of symbols, strings,
numbers and other feature values than <fs> elements. With respect to extensions and subsumption, strings,
symbols, numbers and Booleans (binary values) behave as though they were types with no admissible
features that are discretely ordered alongside, but not connected to, the rest of the type inheritance hierarchy,
i.e. they have no subtype relationships with any type but themselves. Feature structures of these “types” are
only subsumed by themselves and the most general untyped feature structure, <fs/>, and they have no
extensions other than themselves. Some caution must be exercised, however, with respect to determining
subsumption within this extended view of typed feature structures, because re-entrancies may still exist or not
exist between identical-looking symbols, strings, numbers, etc. The more extensional view of identity that
usually accompanies these other entities is inconsistent with the view of identity that the logic of typed feature
structures takes with respect to feature structures over its own types. It is the latter that this part of ISO 24610
uses and applies to both feature structures and these other entities, when they occur within feature structures.
Alternations are also often excluded from more formal work on typed feature logic, but can be thought of as
joins of their respective typed feature structure arguments in the partial order of typed feature structures
induced by subsumption. The negation of a value can similarly be thought of as the join of every structure that
is inconsistent with that value under unification. Collections actually depend on the organization. Lists appear
in the subsumption partial order as though they were encoded as typed feature structures using this FSD.
EXAMPLE Sample FSD


Empty lists


Non-empty lists







One bag (multiset) B1 subsumes another bag B2 if and only if there exists a total surjection σ between the
elements of the two bags such that, for all b1 in the domain of B1 with multiplicity µ1(b1), and all b2 in the
domain of B2 with multiplicity µ2(b2):
1) b1 ⊑ σ(b1),
2) µ2(b2) = ∑ µ1(b1), where the sum ranges over all b1 such that σ(b1) = b2,
and σ can be extended to a total function, σ*, between th
...


INTERNATIONAL ISO
STANDARD 24610-2
First edition
2011-10-01
Language resource management —
Feature structures —
Part 2:
Feature system declaration
Gestion des ressources langagières — Structures de traits —
Partie 2: Déclaration de système de structures de traits

©  ISO 2011
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overall structure
5 Basic concepts
5.1 Typed feature structures reviewed
5.2 Types
5.3 Type inheritance hierarchies
5.4 Type constraints
5.5 Optional (default) values and underspecification
5.6 Subsumption
6 Defining well-formedness versus validity
6.1 Overview
6.2 ISO 24610
7 A feature system for a grammar
7.1 Overview
7.2 Sample FSDs
8 Declaration of a feature system
8.1 Overview
8.2 Linking a text to feature system declarations
8.3 Overall structure of a feature system declaration
8.4 Feature declarations
8.5 Feature structure constraints
Annex A (normative) XML schema for feature structures
Annex B (informative) A complete example
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies
(ISO member bodies). The work of preparing International Standards is normally carried out through ISO
technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft International Standards
adopted by the technical committees are circulated to the member bodies for voting. Publication as an
International Standard requires approval by at least 75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights.
ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content
resources, Subcommittee SC 4, Language resource management.
ISO 24610 consists of the following parts, under the general title Language resource management — Feature
structures:
- Part 1: Feature structure representation
- Part 2: Feature system declaration
Introduction
ISO 24610 is organized in two separate main parts.
- Part 1, Feature structure representation, is dedicated to the description of feature structures, providing an
informal and yet explicit outline of their characteristics, as well as an XML-based structured way of
representing feature structures in general and typed feature structures in particular. It is designed to lay a
basis for constructing an XML-based reference format for exchanging (typed) feature structures between
applications.
- Part 2, Feature system declaration, will provide an implementation standard for XML-based typed feature
structures, first by defining a set of types and their hierarchy, then by formulating type constraints on a set
of features and their respective admissible feature values and finally by introducing a set of validity
conditions on feature structures for particular applications, especially related to the goal of language
resource management.
A feature structure is a general-purpose data structure that identifies and groups together individual features
by assigning a particular value to each. Because of the generality of feature structures, they can be used to
represent many different kinds of information. Interrelations among various pieces of information and their
instantiation in markup provide a meta-language for representing linguistic content. Moreover, this
instantiation allows a specification of a set of features and values associated with specific types and their
restrictions, by means of feature system declarations, or other XML mechanisms to be discussed in this part
of ISO 24610.
Some of the statements here are copied from ISO 24610-1:2006 in order to make this part standalone without
referring to part 1.
INTERNATIONAL STANDARD ISO 24610-2:2011(E)

Language resource management — Feature structures —
Part 2:
Feature system declaration
1 Scope
This part of ISO 24610 provides a format to represent, store or exchange feature structures in natural
language applications, for both annotation and production of linguistic data. It is ultimately designed to provide
a computer format to define a type hierarchy and to declare the constraints that bear on a set of feature
specifications and operations on feature structures, thus offering means to check the conformance of each
feature structure with regards to a reference specification. Feature structures are an essential part of many
linguistic formalisms as well as an underlying mechanism for representing the information consumed or
produced by and for language engineering applications.
A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of text that
makes use of fs (that is, feature structure) elements. The FSD serves four purposes.
- It provides an encoding by which types and their subtyping and inheritance relationships can be
introduced and defined, thus laying the basis for constructing a feature system.
- It provides a mechanism by which the encoder can list all of the feature names and feature values and
give a prose description as to what each represents.
- It provides a mechanism by which type constraints can be declared, against which typed feature
structures are validated relative to a given theory stated in typed feature logic. These constraints may
involve constraints on the range of a feature's value, constraints on which features are permitted within
certain types of feature structures, or constraints that prevent the co-occurrence of certain feature-value
pairs. The source of these constraints is normally the empirical domain being modelled.
- It provides a mechanism by which the encoder can define the intended interpretation of underspecified
feature structures. This involves defining default values (whether literal or computed) for missing features.
The scheme described in this part of ISO 24610 may be used to document any feature system, but is primarily
intended for use with the typed feature structure representation defined in ISO 24610-1. The feature structure
representations of ISO 24610-1 specify data structures that are subject to the typing conventions and
constraints specified using ISO 24610-2. The feature structure representations of ISO 24610-1 are also used
within some of the elements defined in ISO 24610-2.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure
representation
ISO/IEC 19757-2, Information technology — Document Schema Definition Language (DSDL) — Part 2:
Regular-grammar-based validation — RELAX NG
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 19757-2 and the following apply.
3.1
admissibility constraint
feature admissibility constraint
specification of a set of admissible features (3.2) and admissible feature values (3.3) associated with a
specific type (3.24)
3.2
admissible feature
appropriate feature
feature for which any feature structure (3.14) of a given type (3.24) may bear a value (3.17)
NOTE This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear
a value for every admissible feature. This term does not imply that the feature is obligatory here.
3.3
admissible feature value
admissible value
value restriction
range restriction
value (3.17) that must subsume the value of an admissible feature (3.2) in feature structures (3.14)
of a given type (3.24)
3.4
atomic type
user-defined type (3.24) with no admissible features (3.2) declared or inherited
3.5
bag
multiset
triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S
NOTE A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can
occur more than once).
3.6
built-in
non-user-defined element that may appear in place of a feature structure (3.14), for example, as a feature
value (3.17)
NOTE Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex
built-ins are collections (3.7) and applications of the operators, i.e. alternation, negation and merge (5.2.4).
3.7
collection
feature value (3.17) consisting of potentially many values, organized as a list, set or bag (3.5)
3.8
constraint
unit of specification that identifies some collection of feature structures (3.14) as invalid
NOTE 1 All constraints are implicational in their syntactic form, although some are distinguished as admissibility
constraints. See validity (3.31) and 5.4.
NOTE 2 A feature structure that has not been identified as invalid by any of the constraints in a feature system is
considered to be valid.
2 © ISO 2011 – All rights reserved

3.9
default value
value (3.17) otherwise assigned to a feature (3.12) when one is not specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
NOTE A feature structure may not bear a feature without a corresponding value.
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE An empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.
3.12
feature
property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding
value (3.17)
3.13
feature specification
pairing of a feature (3.12) with a value (3.17) in a feature structure description
3.14
feature structure
record structure that associates one value (3.17) to each of a collection of features
NOTE 1 Each value is either a feature structure or a simpler built-in (3.6) such as a string.
NOTE 2 Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature
structures.
3.15
feature system
type hierarchy (3.26) in which each type (3.24) has been associated with a collection of admissibility
constraints (3.1) and implicational constraints (3.18)
NOTE cf. type declaration (3.25)
3.16
feature system declaration
FSD
specification of a particular feature system (3.15)
3.17
feature value
value
entity or aggregation of entities that characterize some property or aspect of another entity
3.18
implicational constraint
constraint of the form, “if G, then H,” where G and H are feature structures (3.14)
NOTE This identifies any feature structure F as invalid for which G subsumes F, and yet F and H have no valid
extension in common. See subsumption (3.21) and 8.5. Often used to refer to implicational constraints that are not also
admissibility constraints.
3.19
interpretation
minimally informative (or equivalently, most general) extension (3.11) of a feature structure (3.14) that is
consistent with a set of constraints declared by an FSD (3.16)
3.20
partial order
partially ordered set
set S equipped with a relation ≤ over S × S that is (1) reflexive (for all s ∈ S, s ≤ s), (2) anti-symmetric (for all p,
q ∈ S, if p ≤ q and q ≤ p, then p = q), and (3) transitive (for all p, q, r ∈ S, if p ≤ q and q ≤ r, then p ≤ r)
NOTE The set of integers Z is partially ordered, but it has an additional property: for every p, q ∈ Z, either p ≤ q or
q ≤ p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species,
for example, is a partial order that does not. Type hierarchies do not necessarily have it either. The typed feature structures
of a feature system do not, unless (a) their type hierarchy does, and (b) either the type hierarchy has exactly one type, or
every type is constrained to have exactly one appropriate feature.
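The three defining properties, and the additional property that the integers have, can be checked mechanically on a finite set. The following is a minimal sketch, not part of the standard: the function names and the divisibility example are illustrative assumptions.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Return True if leq is reflexive, anti-symmetric and transitive on elements."""
    elems = list(elements)
    reflexive = all(leq(s, s) for s in elems)
    antisymmetric = all(
        p == q or not (leq(p, q) and leq(q, p))
        for p, q in product(elems, repeat=2)
    )
    transitive = all(
        leq(p, r) or not (leq(p, q) and leq(q, r))
        for p, q, r in product(elems, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def is_total(elements, leq):
    """Return True if every pair of elements is comparable in one direction or the other."""
    elems = list(elements)
    return all(leq(p, q) or leq(q, p) for p, q in product(elems, repeat=2))


# A finite slice of the integers under the usual ordering.
ints = range(-3, 4)
int_leq = lambda p, q: p <= q

# Hypothetical second example: divisibility on {1, 2, 3, 6}.
# It is a partial order, but 2 and 3 are incomparable.
divs = [1, 2, 3, 6]
divides = lambda p, q: q % p == 0
```

In this sketch the integer slice passes both checks, whereas divisibility passes only `is_partial_order`, which is the situation the NOTE describes for taxonomies and type hierarchies.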
3.21
subsumption
property that holds between two feature structures, G and F, such that G is said to subsume F if and only if F
carries all of the information with it that G does
NOTE A formal definition is provided in 5.6.
3.22
subtype
type (3.24) to which another type confers its constraints and appropriate features
3.23
supertype
base type
type (3.24) from which another type inherits constraints and appropriate features
NOTE s is a subtype of t iff t is a supertype of s. Every type is a subtype and supertype of itself.
3.24
semantic type
type
referring expression that distinguishes a collection of feature structures (3.14) as an identifiable and
conceptually significant class
NOTE As implied by the name semantic type, types in this part of ISO 24610 do not serve to distinguish feature
structures or their specifications syntactically.
3.25
type declaration
structure that declares the supertypes (3.23), admissible features (3.2), admissible feature values (3.3),
admissibility constraints (3.1) and implicational constraints (3.18) for a given type (3.24)
NOTE The constraints on a type in the resulting feature system are those that have been declared in its declaration,
in addition to those that it has inherited from its supertypes.
3.26
type hierarchy
partial order (3.20) over a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing
assignment of a semantic type (3.24) to a built-in (3.6) or feature structure (3.14), either atomic or complex
NOTE Semantic types in feature systems are partially ordered, with multiple inheritance.
3.29
underspecification
provision of partial information about a value (3.17)
NOTE An underspecification generally subsumes one of a range of candidate values that could be resolved to a
single value through subsequent constraint resolution. See subsumption (3.21).
3.30
well-formedness
syntactic conformity of a feature structure (3.14) representation to ISO 24610-1
3.31
validity
conformity of a typed feature structure (3.27) to the constraints (3.8) of a particular feature system (3.15)
NOTE See Clause 6.
4 Overall structure
The main part of the document consists of four clauses: Clauses 5, 6, 7 and 8.
- Clause 5, Basic concepts, reviews the definition of typed feature structures and the notions of atomic and
complex types, collections and other operators that may appear in feature values. It then describes the
notions of type inheritance hierarchies, type constraints, default values and underspecification that are
essential to the construction of feature systems.
- Clause 6, Defining well-formedness versus validity, discusses the conditions of well-formedness and
validity.
- Clause 7, A feature system for a grammar, illustrates how to define types with a type hierarchy and type
constraints which declare what features and values are admissible for specific types.
- Finally, Clause 8, Declaration of a feature system, discusses how a feature system can be declared and
developed into a validator.
The main part of the document is followed by two annexes: Annex A contains the XML schema for this part of
ISO 24610; Annex B contains a complete example.
5 Basic concepts
5.1 Typed feature structures reviewed
Typed feature structures (TFSs) are introduced as basic records for language resource management.
For more information, refer to ISO 24610-1:2006, 4.7, Typed feature structure, and Annex C, Type inheritance
hierarchies.
Here, a TFS is formally defined as a tuple over a finite set Feat of features, a collection X of
non-feature-structure elements, and a type hierarchy ⟨Type, ⊑⟩, where Type is a finite set of types and ⊑ is a
subtyping relation over Type.
A feature structure is a tuple ⟨Q, γ, θ, δ⟩, in which
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function,
such that, for all q ∈ Q, there exists a path of features F₁, …, Fₙ such that δ[Fₙ, … δ(F₁, γ) …] = q.
In the XML representation, fs elements denote nodes. This definition deviates from the standard one used in
linguistics and theoretical computer science in that (1) typing is partial, not total, i.e. not all feature structures
have types, and (2) feature values might not be feature structures, but instead be drawn from a collection denoted
by other XML elements such as string, numeric, symbol, and binary (the X above). Note that nodes are typed, but
features themselves are not.
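The tuple definition can be mirrored directly in a small data structure. The sketch below is illustrative only: the class and method names are assumptions, built-ins (the members of X) are modelled as tuples so that they cannot be confused with node names, and apart from ORTH the feature and type names used for "had" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureStructure:
    nodes: set                                   # Q, the set of nodes
    root: str                                    # γ, the root node
    theta: dict = field(default_factory=dict)    # θ, partial map: node -> type
    delta: dict = field(default_factory=dict)    # δ, partial map: (feature, node) -> node or built-in

    def reachable(self):
        """Return every node reachable from the root by following δ."""
        seen = {self.root}
        stack = [self.root]
        while stack:
            current = stack.pop()
            for (_feature, source), target in self.delta.items():
                # Built-ins are tuples here, so they never match a node name.
                if source == current and target in self.nodes and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    def is_rooted(self):
        """Check the path condition: every node in Q is reachable from γ."""
        return self.root in self.nodes and self.reachable() == self.nodes


# Hypothetical encoding of the word "had"; only ORTH is a feature name taken from the text.
had = FeatureStructure(
    nodes={"q0", "q1"},
    root="q0",
    theta={"q0": "word", "q1": "verb"},
    delta={
        ("ORTH", "q0"): ("string", "had"),
        ("SYN", "q0"): "q1",
        ("TENSE", "q1"): ("symbol", "past"),
        ("AUX", "q1"): ("binary", False),
    },
)

# The same structure plus a node that no path from the root reaches.
orphaned = FeatureStructure(
    nodes={"q0", "q1", "q2"}, root="q0", theta=had.theta, delta=had.delta
)
```

Under these assumptions `had.is_rooted()` holds, while `orphaned` fails the path condition because `q2` cannot be reached from the root. Because θ is a plain dictionary, leaving a node out of it models partial typing.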
The following XML representation of a feature structure is considered well-formed, where the attribute type is
assigned to each of the two fs elements.
EXAMPLE Typed feature structure:


had












The feature name ORTH above stands for orthography, the conventional written form of a word or phrase.
This XML representation shows how the morpho-syntactic features of an English word “had” are specified as
a past-tensed and non-auxiliary verb.
In the alternative, “matrix” or “AVM” notation, type names are conventionally in lower case, sometimes
italicized or in the text type font, feature names in upper case, and strings in quotes. Binary values are
indicated with + or −. These conventions are followed in this document, too. The above feature structure would
be depicted in matrix notation as shown in Figure 1.

Figure 1 — Matrix notation
5.2 Types
5.2.1 Atomic types
Alongside the built-ins (string, numeric, symbol and binary), it is possible for a feature structure to
have a type but no features. These are called simple or atomic feature structures, and types that allow for no
features in their feature system declaration (FSD) are called atomic types.
There is, as a result, always the possibility of declaring new atomic types and using these instead of the
above-mentioned built-ins to specify simple values. The above feature structure, for example, could have
instead been rendered as follows, assuming the extra types had, past and false were declared in an FSD.
EXAMPLE Typed feature structure: alternative formulation















There is also a difference between two classes of built-ins: string on the one hand, and
numeric, symbol and binary on the other. Any kind of string is permissible as the content of the string
element, whereas a very restricted set of values is permissible in numeric, symbol and binary
elements. To reflect this difference, members of the latter class specify their values using the attribute value.
The type binary, for instance, is associated with four values: true, false, plus (equivalent to true) and minus
(equivalent to false).
NOTE ISO 24610-1:2006 introduced the type binary, but the W3C's XML schema (2001) names it boolean.
It is the duty of the encoder to choose between atomic-type encodings and built-in encodings consistently.
This part of ISO 24610 does not regard one as identical or even consistent with the other.
5.2.2 Complex types
Types that are not atomic are called complex. These include all of the types declared by the encoder in an
FSD that declare or inherit admissible features. A feature is only admissible to a type if feature structures of
that type are permitted by the FSD to have values for that feature. This does not mean that well-formed
feature structures cannot arbitrarily associate types with feature structures regardless of their featural content
– they can. But only those feature structures that use only admissible features to their type, as specified by
some FSD, could be validated against that FSD. The distinction between validity and well-formedness is
further elaborated upon in Clause 6.
All user-declared types, no matter whether they are atomic or complex, are semantic, i.e. syntactically, they
look no different from each other, apart from the value of their type attribute. It is the role of a validator to
interpret the real significance of these types through enforcing restrictions on admissibility, restrictions on the
possible values that admissible features can have (vRange), and other constraints that take the form of
logical implications. All of these are specified, for each type, in an FSD.
The built-ins defined by the ISO 24610-1:2006 feature structure representations (FSRs) standard are purely
syntactic. They can be used without declaration in an FSD, and they cannot be declared in an FSD. They can
appear in value range restrictions, or in implicational constraints, but they cannot have such restrictions (since
they have no admissible features) or constraints of their own.
5.2.3 Collections
Not all built-ins are as simple as those mentioned above, however. Some grammatical features such as
specifiers (SPR), complements (COMPS) and arguments (ARGS) are considered as having a list of
grammatical values, especially in Head-driven Phrase Structure Grammars (Pollard and Sag 1994 [10]; Sag,
Wasow, Bender 2003 [12]). For languages other than English, some of these features may take other kinds of
collections, namely sets or multisets, as their value. In a language (e.g. German, Korean or Japanese) that
allows a relatively free word order, the feature COMPS may be analysed as taking a set or multiset, instead of
a list, of complements. For more general applications, ISO 24610-1:2006 thus introduces sets and multisets
as well as lists as built-in ways of assembling complex feature values.
Collections (vColl; ISO 24610-1:2006, 5.8, Collections as complex feature values) take the organization
(org) attribute, with the values “list”, “set” and “bag”. In lists, order and multiplicity of elements matter. In bags,
only multiplicity matters (these are often called multisets). In sets, neither order nor multiplicity matters.
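The effect of the three org values on when two collections count as the same can be illustrated with a short sketch. It is an assumption-laden simplification: the function name is hypothetical, the elements are plain atomic labels rather than feature structures, and the comparison is identity, not subsumption (for which see 5.6).

```python
from collections import Counter


def same_collection(a, b, org):
    """Compare two sequences of atomic values under the given organization."""
    if org == "list":
        return list(a) == list(b)          # order and multiplicity matter
    if org == "bag":
        return Counter(a) == Counter(b)    # only multiplicity matters
    if org == "set":
        return set(a) == set(b)            # neither order nor multiplicity matters
    raise ValueError(f"unknown org value: {org!r}")
```

For instance, `["NP", "PP"]` and `["PP", "NP"]` differ as lists but coincide as bags and as sets, while `["NP", "NP", "PP"]` and `["NP", "PP"]` coincide only as sets.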
For example, the feature ARGS of verbs can be represented by specifying the organization of vColl as a list
of values, each of which is of type phrase.
EXAMPLE List value


put
























Some would call the type of this collection list (phrase), but polymorphic lists are not yet supported in this part
of ISO 24610. This is equivalent to the following AVM notation, where NP stands for a feature structure of the
type phrase with a positive NOMINAL feature, namely a noun phrase, and PP, a feature structure of the type
phrase with a positive PREPOSITIONAL feature, namely a prepositional phrase. The boxed integers are the
labels for marking structure sharing as shown in Figure 2.

Figure 2 — Marking structure sharing
5.2.4 Operators
The other class of built-ins are operators that take one or more built-ins or feature structures as arguments,
but instead of constructing a collection from them, denote a value that is in some other way derived from them.
Alternations (vAlt; ISO 24610-1:2006, 5.9.2, Alternations) denote one of their arguments' values. A feature
structure containing an alternation does not denote multiple feature structures, however. An alternation is a
single value that underspecifies which of several possible alternatives it is. Alternations can be regarded as
the joins of their arguments in the partial order induced by subsumption (see 5.6).
Negations (vNot; ISO 24610-1:2006, 5.9.3, Negation) take a single argument, and denote a value which is
not its argument. A negation is equivalent to an alternation among all values that are inconsistent with its
argument. A negation is actually not a logical negation of a value, but rather the complement of that value in
the full Boolean lattice that contains the partial order induced by subsumption.
A merge (vMerge; ISO 24610-1:2006, 5.9.4, Collection of values) denotes the concatenation or union of
several values and/or collections of values, according to how its org attribute is set. org takes the same values,
with the same meanings, as in vColl.
5.3 Type inheritance hierarchies
The type hierarchy is discussed at great length in ISO 24610-1:2006, Annex C, Type inheritance
hierarchies. This structure is normally depicted as a directed acyclic graph with a unique top node. The label
of this top node is often top, and represents the most general type, the type that is consistent with all typed
feature structures. Subtypes are connected to, and appear below, their supertypes. The most specific types
appear at the bottom of the graph. These are mutually incompatible with each other, which is generally
understood implicitly, or, on occasion, depicted by another special type, bottom as the unique bottom-most
element. Bottom is not used in this part of ISO 24610.
Figure 3 provides an example that depicts a part of the natural world:

Figure 3 — Type hierarchy for living beings
According to this picture, living beings consist of plants and animals. Animals are subclassified into fish, birds
and mammals. Dogs, humans and bovines (oxen, cows, bulls) belong to the class of mammals.
Type hierarchies are not always trees; they may have two or more branches meeting at a single node. When
this happens, it means that a type has multiple supertypes, and inherits properties from all of them.
Figure 4 provides an example of this.

Figure 4 — Medieval hierarchy of beings
Here, the type human has two parent types, animal and rational. Hence, a human is viewed as an animal like
a dog, but also a spiritual and rational being like an angel. A human thus shares some properties with both
dogs and angels.
All these types are partially ordered by a subtyping relation, ⊑, over types. A type τ is a subtype of type σ if
and only if σ is more general than τ, i.e. if the set of feature structures of type σ contains the set of feature
structures of type τ. Since the type animate is more general than the type animal in the above example, all
animals are asserted to be animate. A type σ is said to be a supertype of a type τ if and only if τ is a subtype
of σ. The immediate supertypes of a type are often called its parents.
A subtype inherits all of the properties from its supertype. The type human, for instance, inherits all the
properties from its supertypes (being, animate, animal, spiritual and rational).
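Subtyping with multiple inheritance amounts to reachability along parent links. The following sketch assumes a particular set of parent links for the medieval hierarchy; the text only confirms that human has the parents animal and rational and the supertypes being, animate, animal, spiritual and rational, so the remaining edges, like the function names, are illustrative assumptions.

```python
# Immediate supertypes (parents) of each type. Edges other than those of
# "human" are assumed for illustration.
PARENTS = {
    "being": [],
    "animate": ["being"],
    "spiritual": ["being"],
    "animal": ["animate"],
    "rational": ["spiritual"],
    "dog": ["animal"],
    "human": ["animal", "rational"],
    "angel": ["rational"],
}


def supertypes(t, parents=PARENTS):
    """Return all supertypes of t, including t itself (reflexive, transitive closure)."""
    seen = set()
    stack = [t]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_subtype(tau, sigma, parents=PARENTS):
    """Return True if tau is a subtype of sigma."""
    return sigma in supertypes(tau, parents)
```

With these links, `supertypes("human")` collects both branches, whereas `dog` is a subtype of `animate` but not of `rational`.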
A linguistic example, modified from Grammar 2 of Copestake (2002) [2], is shown in Figure 5.

Figure 5 — Type hierarchy for a simple grammar top
The type hierarchy has a unique top element. It is the most general type with no parents or immediate
supertypes. Top is only a subtype of itself.
Each type has a name and every type, except for the top-most type, has exactly one parent. The type top has
four immediate subtypes. phrase and det are incomparable – neither is a subtype or supertype of the other.
Depending on the complexity of a grammar, the type hierarchy can be very complex. Some portions of the
hierarchy may be universal to all languages, while others are very language-specific. The agreement type
agr-cat in English, for example, has only two immediate subtypes: 3sing and non-3sing (e.g. “sings” versus
“sing”).
The type det stands for a determiner such as “the” or “a”; 3sing stands for 3rd person singular, and non-3sing
stands for agreement categories other than 3sing.
This distinction is the one that is apparent in English verb agreement.
5.4 Type constraints
The type hierarchy is the skeleton on which the rest of the grammar grows. The rest of the grammar takes the
form of constraints over feature structures of these user-defined types. These constraints are at least of the
following three kinds: (1) implicational constraints, (2) constraints on admissible features, and (3) constraints
on admissible feature values. In fact, all of them can be stated as implications, e.g.
- if a feature structure is of type verb, then it may have the feature AUXiliary,
- if a feature structure is of type verb, then it may have the feature INVerted,
- if a feature structure is of type verb, then its AUX value must be “binary”,
- if a feature structure is of type verb, then its INV value must be “binary”,
- if a feature structure is of type verb and its AUX value is negative, then its INV value must be “negative”.
The first two of these are feature admissibility constraints. They tell us that a particular feature can be used in
feature structures of a particular type. The next two are constraints on admissible feature values,
sometimes called “value restrictions” or “range restrictions”. They tell us what kind of value a particular feature
must take when it occurs in a feature structure of some given type. The last of these is of a more general form,
however: this kind of constraint says that whenever a feature structure takes some particular form (determined
by types, feature values, etc.), it must satisfy some other criteria (again stated in terms of types, feature values,
etc.). This last form of constraint is generally what is meant by the phrase implicational constraint. Each of
these three forms has a different syntax in an FSD. The above constraints on verb would be encoded as
follows.
EXAMPLE Constraint on the type verb



























The first two kinds are specified together inside an fDecl element, the second being the vRange portion
of that declaration, whereas the third is specified as an if-then conditional (cond).
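The five constraints on verb listed above can also be read as a small validation routine. The sketch below is not the FSD syntax of this part of ISO 24610: it assumes a flat dictionary encoding of a feature structure, models binary values as Python booleans, and uses hypothetical names throughout.

```python
# Admissible features and their value ranges, per type (assumed encoding).
DECLARED = {"verb": {"AUX": bool, "INV": bool}}


def violations(fs):
    """Return a list of constraint violations for a flat feature structure."""
    problems = []
    declared = DECLARED.get(fs["type"], {})
    feats = fs["features"]
    for feat, value in feats.items():
        if feat not in declared:
            # Feature admissibility constraint.
            problems.append(f"{feat} is not admissible for type {fs['type']}")
        elif not isinstance(value, declared[feat]):
            # Value (range) restriction.
            problems.append(f"{feat} must be binary")
    # Implicational constraint: a verb with negative AUX must have negative INV.
    # An unspecified INV is not a violation, since it can still be extended to negative.
    if fs["type"] == "verb" and feats.get("AUX") is False and feats.get("INV") is True:
        problems.append("AUX negative requires INV negative")
    return problems
```

A verb with AUX negative and INV positive yields one violation, while a verb with AUX negative and no INV value yields none; this follows the reading of implicational constraints given in 3.18.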
5.5 Optional (default) values and underspecification
In a feature structure, some features must be specified and others need not be. In French, for example, the
specification of the features NUMBER and GENDER is obligatory for nouns and adjectives. In English, the
feature NUMBER must be specified for each noun, but the specification of the feature GENDER is optional. It
is obligatory for the third person singular pronouns, “he”, “she” and “it”.
Nevertheless, there are cases in which some obligatory features are not specified. In those cases, there are
two possibilities: (1) if a default value has been defined, then it is understood to be its value; and (2) if not,
then the value of the feature is inferred from the feature's range restriction.
English mass nouns such as “water” and “air” are uncountable and singular by default. Hence, their NUMBER
feature need not be specified, although the feature NUMBER is obligatory. Some countable nouns such as
“sheep” can be either singular or plural. When the NUMBER is not specified, its value is understood to be
some more general type such as number, which is a supertype of all of the admissible feature values.
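The two possibilities for an unspecified feature can be sketched as a lookup: use the declared default if there is one, otherwise fall back to the feature's range restriction. The type names mass-noun and count-noun, the dictionary layout and the function name are assumptions made for this illustration.

```python
# Hypothetical declarations: each feature has a range and, optionally, a default.
DECLS = {
    "mass-noun": {"NUMBER": {"range": "number", "default": "singular"}},
    "count-noun": {"NUMBER": {"range": "number"}},
}


def interpret(fs_type, features, decls=DECLS):
    """Fill in unspecified declared features with their default or, failing that, their range."""
    result = dict(features)
    for feat, decl in decls[fs_type].items():
        if feat not in result:
            result[feat] = decl.get("default", decl["range"])
    return result
```

Under these declarations, “water” as a mass-noun receives NUMBER singular, while “sheep” as a count-noun receives the more general value number; a NUMBER value that is already specified is left untouched.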
Grammatical descriptions are often underspecified in order to capture generalizations. In English, for instance,
verbs are subclassified according to the number of complements that they require. Intransitive verbs (“smile”, “bark”)
take only a subject, transitive verbs (“love”, “attack”) take a subject and a direct object, and ditransitive verbs
(“give”, “put”) take a subject, direct object and indirect object. Many grammatical phenomena, however,
do not refer to any one of these specific subclasses. Examples include subject-verb agreement (“The dog barks”
versus ill-formed “the dog bark”) and subject-verb inversion (“Does the dog bark?” versus ill-formed “Do the dog
attacks Jane?”). Since the verb's subclass is irrelevant in describing these grammatical
phenomena, it is left underspecified.
Here is another example of underspecification. The analysis of a sentence like “The sheep attacked Jane”
may be underspecified with respect to the NUMBER value of “sheep”. Its lexical ambiguity is displayed only
if necessary.
Default values are specified in FSDs with vDefault, as explained in 8.4, and can be referenced in FSRs with
the default element (ISO 24610-1:2006, 5.10, Default values).
5.6 Subsumption
A feature structure F subsumes another feature structure G (F ⊑ G) if and only if G contains all of the
information that F does. “Information” is delivered by a feature structure in two ways: typing and path equality.
When one views feature structures as a pair consisting of an equivalence relation on paths (≡), and a partial
typing function on paths (Θ), then formally ⟨≡_F, Θ_F⟩ ⊑ ⟨≡_G, Θ_G⟩ if and only if ≡_F ⊆ ≡_G and for all
π ∈ Paths_F ∩ Paths_G, if Θ_F(π) is defined, then Θ_G(π) is defined and Θ_G(π) is a subtype of Θ_F(π). When F ⊑ G,
we say that G extends F.
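The path-based definition translates almost literally into code. In this sketch a feature structure is a pair of a set of path pairs (the relation ≡, including the reflexive pairs) and a dictionary from paths to types (Θ); paths are tuples of feature names. The helper names, the feature names and the assumption that agr-cat sits directly below top are all illustrative.

```python
# Assumed fragment of a type hierarchy: immediate supertypes of each type.
PARENTS = {"top": [], "agr-cat": ["top"], "3sing": ["agr-cat"], "non-3sing": ["agr-cat"]}


def is_subtype(tau, sigma):
    """Return True if tau is a subtype of sigma (reflexive, transitive)."""
    return tau == sigma or any(is_subtype(p, sigma) for p in PARENTS[tau])


def equivalence(path_classes):
    """Build ≡ as a set of ordered path pairs from classes of re-entrant paths."""
    return {(p, q) for cls in path_classes for p in cls for q in cls}


def subsumes(f, g):
    """Return True if F ⊑ G, i.e. G carries all path equalities and typing of F."""
    eq_f, theta_f = f
    eq_g, theta_g = g
    if not eq_f <= eq_g:
        return False
    paths_f = {p for p, _ in eq_f}
    paths_g = {p for p, _ in eq_g}
    for path in paths_f & paths_g:
        if path in theta_f:
            if path not in theta_g or not is_subtype(theta_g[path], theta_f[path]):
                return False
    return True


# Typing: F has AGR of type agr-cat, G refines it to 3sing.
F = (equivalence([[()], [("AGR",)]]), {("AGR",): "agr-cat"})
G = (equivalence([[()], [("AGR",)]]), {("AGR",): "3sing"})

# Path equality: in G2 the paths AGR and SUBJ AGR are re-entrant, in F2 they are not.
F2 = (equivalence([[()], [("SUBJ",)], [("AGR",)], [("SUBJ", "AGR")]]), {})
G2 = (equivalence([[()], [("SUBJ",)], [("AGR",), ("SUBJ", "AGR")]]), {})
```

Here F subsumes G because 3sing is a subtype of agr-cat, and F2 subsumes G2 because G2 only adds a path equality; neither relation holds in the opposite direction.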
The view of typed feature structures taken here is still more general than is often the case in either the
linguistics literature or the formal literature on typed feature logic, because of the presence of symbols, strings,
numbers and other feature values that are not fs elements. With respect to extensions and subsumption, strings,
symbols, numbers and Booleans (binary values) behave as though they were types with no admissible
features that are discretely ordered alongside, but not connected to, the rest of the type inheritance hierarchy,
i.e. they have no subtype relationships with any type but themselves. Feature structures of these “types” are
only subsumed by themselves and the most general untyped feature structure, the empty fs element, and they have no
extensions other than themselves. Some caution must be exercised, however, with respect to determining
subsumption within this extended view of typed feature structures, because re-entrancies may still exist or not
exist between identical-looking symbols, strings, numbers, etc. The more extensional view of identity that
usually accompanies these other entities is inconsistent with the view of identity that the logic of typed feature
structures takes with respect to feature structures over its own types. It is the latter that this part of ISO 24610
uses and applies to both feature structures and these other entities, when they occur within feature structures.
Alternations are also often excluded from more formal work on typed feature logic, but can be thought of as
joins of their respective typed feature structure arguments in the partial order of typed feature structures
induced by subsumption. The negation of a value can similarly be thought of as the join of every structure that
is inconsistent with that value under unification. The treatment of collections depends on their organization. Lists
appear in the subsumption partial order as though they were encoded as typed feature structures using the following FSD.
EXAMPLE Sample FSD


Empty lists


Non-empty lists







One bag (multiset) B₁ subsumes another bag B₂ if and only if there exists a total surjection σ between the
elements of the two bags such that, for all b₁ in the domain of B₁ with multiplicity µ₁(b₁), and all b₂ in the
domain of B₂ with multiplicity µ₂(b₂):
1) b₁ ⊑ σ(b₁),
2) µ₂(b₂) = ∑ µ₁(b₁), where the sum ranges over all b₁ such that σ(b₁) = b₂,
and σ can be extended to a total function, σ*, between the substructures of the elements of the two bags,
such that, for all substructures, c, of the elements of B₁:
3) σ*(c) = σ(c) if c is an element of B₁, and
4) σ*[δ(F, c)] = δ[F, σ*(c)], for every F ∈ Feat such that δ(F, c) is defined.
One set S₁ likewise subsumes another set S₂ if and only if conditions 1), 3) and 4) above apply. This means,
for example, that the two-element set {F₁, F₂} subsumes the one-element set {G₁} if both F₁ ⊑ G₁ and F₂ ⊑ G₁.
This partially ordered interpretation of sets is called the Pollard-Moshier set theory, and it is one of the most
commonly used theories in typed feature logic.
In addition, a bag subsumes any list that is a permutation of its elements. A set subsumes a bag if the domain
of the bag is the set, i.e. all and only the elements of the set appear in the bag one or more times.
A combination of collections (vMerge) occupies the same position in the subsumption partial order as the
result of the concatenation or union that it specifies, with the organization it specifies, would have.
The reflexive and transitive closure of all of these conditions produces the subsumption relation assumed by
this part of ISO 24610.
6 Defining well-formedness versus validity
6.1 Overview
6.1.1 General
This clause distinguishes the use of the concepts of well-form
...


INTERNATIONAL STANDARD ISO 24610-2
First edition
2011-10-01
Language resource management – Feature structures –
Part 2:
Feature system declaration
Responsibility for the preparation of the Russian version lies with GOST R
(Russian Federation) in accordance with Article 18.1 of the ISO Statutes

Reference number
© ISO 2011
COPYRIGHT PROTECTED DOCUMENT

© ISO 2011
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in
any form or by any means, electronic or mechanical, including photocopying and microfilm, without
prior written permission from the publisher.
ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overall structure
5 Basic concepts
5.1 Typed feature structures reviewed
5.2 Types
5.3 Type inheritance hierarchies
5.4 Type constraints
5.5 Optional (default) values and underspecification
5.6 Subsumption
6 Defining well-formedness versus validity
6.1 Overview
6.2 About ISO 24610
7 A feature system for a grammar
7.1 General
7.2 Sample FSDs
8 Declaration of a feature system
8.1 General
8.2 Linking a text to feature system declarations
8.3 Overall structure of a feature system declaration
8.4 Feature declarations
8.5 Feature structure constraints
Annex A (normative) XML schema for feature structures
Annex B (informative) A complete example
Bibliography


Foreword
The International Organization for Standardization (ISO) is a worldwide federation of national
standards bodies (ISO member bodies). The work of preparing International Standards is
normally carried out through ISO technical committees. Each member body interested in a
subject for which a technical committee has been established has the right to be represented on that
committee. International organizations, governmental and non-governmental, in liaison with
ISO, also take part in the work. ISO collaborates closely with the International
Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
International Standards are drafted in accordance with the rules given in the
ISO/IEC Directives, Part 2.
The main task of technical committees is to prepare International Standards. Draft
International Standards adopted by the technical committees are circulated to the member bodies for
voting. Publication as an International Standard requires approval by at least
75 % of the member bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this part of ISO 24610
may be the subject of patent rights. ISO shall not be held responsible for identifying
any or all such patent rights.
ISO 24610-2 was prepared by Technical Committee ISO/TC 37, Terminology and other language and
content resources, Subcommittee SC 4, Language resource management.
ISO 24610 consists of the following parts, under the general title Language
resource management – Feature structures:
- Part 1: Feature structure representation
- Part 2: Feature system declaration

Introduction
ISO 24610 consists of two separate main parts.
- Part 1, Feature structure representation, is devoted to describing feature structures, giving an
informal yet explicit account of their characteristics, and to describing how feature structures
in general, and the various types of such structures in particular, are represented in XML. It
lays the basis for constructing a well-formed XML reference format for exchanging (possibly typed)
feature structures between applications.
- Part 2, Feature system declaration, provides a standard way of implementing typed feature
structures in XML: first by defining a set of types and their hierarchy; then by stating type
constraints on a set of features and their admissible values; and finally by introducing a set of
validity conditions on feature structures for use in particular applications, especially for
language resource management.
A feature structure is a general-purpose data structure that identifies and groups together
individual features by assigning a particular value to each of them. Because of their generality,
feature structures can be used to represent many different kinds of information. The
interrelations among the various pieces of information, and their instantiation in markup, provide
a metalanguage for representing linguistic content. Moreover, such an instantiation makes it
possible to describe the set of features and values associated with particular types and their
restrictions, by means of feature system declarations or other XML mechanisms discussed in this
part of ISO 24610.
Some provisions of this part are taken over from ISO 24610-1:2006 so that Part 2 can be read
independently of Part 1.


INTERNATIONAL STANDARD ISO 24610-2:2011(R)

Language resource management. Feature structures.
Part 2.
Feature system declaration
1 Scope
This part of ISO 24610 provides a format to represent, store or exchange feature structures in
natural language applications, for both annotation and production of linguistic data. It is
ultimately designed to provide a computer format to define a type hierarchy and to declare the
constraints that bear on a set of feature specifications and on operations on feature structures,
thus offering means to check the conformance of each feature structure with regard to a reference
specification. Feature structures are an essential part of many linguistic formalisms as well as
an underlying mechanism for representing the information consumed or produced by and for language
engineering applications.
A feature system declaration (FSD) is an auxiliary file used in conjunction with a certain type of
text that makes use of feature structure elements. The FSD serves four purposes.
- It provides an encoding by which types and their subtyping and inheritance relationships can be
introduced and defined, thus laying the basis for constructing a feature system.
- It provides a mechanism by which the encoder can list all of the feature names and feature
values and give a prose description of what each represents.
- It provides a mechanism by which type constraints can be declared, against which typed feature
structures are validated relative to a given theory stated in typed feature logic. These
constraints may involve constraints on the range of a feature's value, constraints on which
features are permitted within certain types of feature structures, or constraints that prevent
the co-occurrence of certain feature-value pairs. The source of these constraints is normally the
empirical domain being modelled.
- It provides a mechanism by which the encoder can define the intended interpretation of
underspecified feature structures, for example by defining default values (whether literal or
computed) for missing features.
The scheme described in this part of ISO 24610 may be used to document any feature system, but it
is primarily intended for use with the typed feature structure representations defined in
ISO 24610-1. Such representations specify data structures that are subject to the typing
conventions and constraints specified by means of ISO 24610-2. The feature structure
representations of ISO 24610-1 are also used for certain elements defined in ISO 24610-2.
2 Normative references
The following referenced documents are indispensable for the application of this document. For
undated references, the latest edition of the referenced document applies.

ISO 24610-1:2006, Language resource management. Feature structures. Part 1: Feature structure
representation
ISO/IEC 19757-2, Information technology. Document Schema Definition Language (DSDL). Part 2:
Regular-grammar-based validation. RELAX NG
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 19757-2 and the
following apply.
3.1
admissibility constraint
feature admissibility constraint
specification of the set of admissible features (3.2) and admissible feature values (3.3)
associated with a particular type (3.24)
3.2
admissible feature
appropriate feature
feature for which a feature structure (3.14) of a given type (3.24) may bear a value (3.17)
NOTE In some interpretations this term carries a sense of obligation, that is, feature structures
of a given type are taken to be required to bear a value for every admissible feature. In this
document, however, the term does not imply that the feature must be present.
3.3
admissible feature value
admissible value
value restriction
range restriction
value (3.17) that an admissible feature (3.2) is required to take in feature structures (3.14) of
a given type (3.24)
3.4
atomic type
user-defined type (3.24) that has no declared or inherited admissible features (3.2)
3.5
bag
multiset
triple consisting of an integer n, a set S, and a function mapping the integers from 1 to n into
elements of S

NOTE A bag is intermediate between a set (an unordered collection of elements) and a list (in
which individual elements may occur more than once): like a set it is unordered, and like a list
it records repeated occurrences.
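The triple in 3.5 can be made concrete with a short sketch. The function names `make_bag` and `same_bag` are illustrative assumptions and are not defined by ISO 24610; members are assumed to be hashable Python values.

```python
from collections import Counter

def make_bag(items):
    """Build the triple (n, S, f) of 3.5 from a sequence of items."""
    n = len(items)
    members = set(items)                                 # the set S
    f = {i + 1: item for i, item in enumerate(items)}    # maps 1..n into S
    return n, members, f

def same_bag(a, b):
    """Two triples denote the same bag when each member occurs equally often."""
    return Counter(a[2].values()) == Counter(b[2].values())

b1 = make_bag(["NP", "PP", "NP"])
b2 = make_bag(["NP", "NP", "PP"])   # same members and counts, different order
b3 = make_bag(["NP", "PP"])         # same members, different counts
```

Under this reading, `b1` and `b2` are the same bag because order is ignored, while `b3` differs because the number of occurrences of `NP` differs.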
3.6
built-in
element that is not user-defined but may appear in place of a feature structure (3.14), for
example as a feature value (3.17)
NOTE Built-ins may be atomic or complex. The former comprise numeric, string, symbol and binary
built-ins; the latter comprise collections (3.7) and applications of logical operators such as
disjunction, negation and merge (5.2.4).
3.7
collection
feature value (3.17) consisting of a group of potential values, organized as a list, a set or a
bag (3.5)
3.8
constraint
component of a specification that identifies some collection of feature structures (3.14) as
invalid
NOTE 1 All constraints are implicational in their syntactic form, although some are singled out
as admissibility constraints. See validity (3.31) and 5.4. All feature structures that are not
explicitly excluded as invalid are considered valid.
NOTE 2 A feature structure that is not identified in this way as violating some constraint of the
feature system is considered valid.
3.9
default value
value (3.17) assigned to a feature (3.12) when none is specified
EXAMPLE In Danish, when grammatical gender is not explicitly specified, it is assigned the value
"masculine".
NOTE A feature structure cannot contain features for which no value is specified.
3.10
empty feature structure
feature structure (3.14) that contains no information
NOTE The empty feature structure subsumes all other feature structures.
3.11
extension
converse of subsumption (3.21)
NOTE A feature structure F extends G if and only if G subsumes F.

3.12
feature
property or aspect of an entity, formally represented as a function that maps the entity to its
corresponding value (3.17)
3.13
feature specification
pairing of a feature (3.12) with its value (3.17) in the description of a feature structure
3.14
feature structure
record structure that associates one value (3.17) with each feature in a collection of features
NOTE 1 Each value is either a feature structure or a simpler built-in (3.6), such as a string.
NOTE 2 Feature structures are partially ordered. The minimal elements of this ordering are the
empty feature structures.
3.15
feature system
type hierarchy (3.26) in which each type (3.24) is associated with a collection of admissibility
constraints (3.1) and implicational constraints (3.18)
NOTE Compare type declaration (3.25).
3.16
feature system declaration
FSD
description of a particular feature system (3.15)
3.17
feature value
value
entity or aggregation of entities that characterizes some property of another entity
3.18
implicational constraint
constraint of the form "if G, then H", where G and H are feature structures (3.14)
NOTE Such a constraint identifies any feature structure F as invalid when G subsumes F and yet F
and H have no valid common extension. See subsumption (3.21) and 8.5. The term is often used to
refer to those implicational constraints that are not also admissibility constraints.
3.19
interpretation
minimally informative (that is, most general) extension (3.11) of a feature structure (3.14) that
is consistent with the set of constraints declared in a feature system declaration (3.16)

3.20
partial order
partially ordered set
set S equipped with a relation ⊑ on S × S that is 1) reflexive (for all s ∈ S, s ⊑ s),
2) antisymmetric (for all p, q ∈ S, if p ⊑ q and q ⊑ p, then p = q), and 3) transitive (for all
p, q, r ∈ S, if p ⊑ q and q ⊑ r, then p ⊑ r)
NOTE The set of integers Z is partially ordered, but it additionally has the property that, for
every p, q ∈ Z, either p ⊑ q or q ⊑ p. Not every partial order has this property. A partial order
such as the taxonomic classification of organisms into phyla, genera and species, for example,
does not have it, and type hierarchies do not necessarily have it either. The typed feature
structures of a feature system do not have this property unless (a) their type hierarchy has it,
and (b) either the type hierarchy consists of a single type or every type is restricted to a
single appropriate feature.
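The three conditions in 3.20, and the additional totality property mentioned in the note, can be checked mechanically on a finite set. The following is a minimal sketch; the helper names and the small taxonomy are illustrative assumptions (the taxonomy loosely follows the living-beings example of 5.3).

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry and transitivity of leq on a finite set."""
    elements = list(elements)
    reflexive = all(leq(s, s) for s in elements)
    antisymmetric = all(p == q for p, q in product(elements, repeat=2)
                        if leq(p, q) and leq(q, p))
    transitive = all(leq(p, r) for p, q, r in product(elements, repeat=3)
                     if leq(p, q) and leq(q, r))
    return reflexive and antisymmetric and transitive

def is_total(elements, leq):
    """Check the extra property of the note: any two elements are comparable."""
    elements = list(elements)
    return all(leq(p, q) or leq(q, p) for p, q in product(elements, repeat=2))

def int_leq(p, q):
    return p <= q

# Hypothetical taxonomy: each type maps to its single parent (None for the top).
PARENT = {"living_being": None, "animal": "living_being",
          "mammal": "animal", "dog": "mammal", "human": "mammal"}

def taxon_leq(p, q):
    """p is below q when q is p itself or one of p's ancestors."""
    while p is not None:
        if p == q:
            return True
        p = PARENT[p]
    return False

INTS = range(-3, 4)
TAXA = list(PARENT)
```

Both relations satisfy the three conditions, but only the integers are totally ordered: `dog` and `human` are not comparable in the taxonomy.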
3.21
subsumption
property holding between two feature structures G and F such that G is said to subsume F if and
only if F carries all of the information that G contains
NOTE A formal definition is given in 5.6.
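As an informal illustration of 3.21 and of its converse, extension (3.11), the sketch below models feature structures as nested Python dictionaries. This is a simplifying assumption: it ignores types and structure sharing, and it is not the formal definition given in 5.6. The names `subsumes` and `extends` are hypothetical.

```python
def subsumes(g, f):
    """Return True if f carries all the information contained in g."""
    if isinstance(g, dict):
        # Every feature of g must be present in f with a subsumed value.
        return isinstance(f, dict) and all(
            name in f and subsumes(value, f[name]) for name, value in g.items()
        )
    # Atomic values carry the same information only when they are equal.
    return g == f

def extends(f, g):
    """Extension is the converse of subsumption."""
    return subsumes(g, f)

EMPTY = {}
G = {"CAT": "verb"}
F = {"CAT": "verb", "TENSE": "past"}
```

In this model the empty structure `EMPTY` subsumes every other dictionary, `G` subsumes `F`, and `F` extends `G`, but not the other way round.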
3.22
subtype
type (3.24) that inherits the constraints and appropriate features of another type
3.23
supertype
base type
type (3.24) from which another type inherits constraints and appropriate features
NOTE s is a subtype of t if and only if t is a supertype of s. Every type is a subtype and a
supertype of itself.
3.24
semantic type
referring expression by which a collection of feature structures (3.14) is distinguished as an
identifiable and conceptually significant class
NOTE As the name semantic type suggests, the types referred to in this part of ISO 24610 are not
intended to distinguish feature structures or their specifications syntactically.
3.25
type declaration
information structure that declares the supertypes (3.23), admissible features (3.2), admissible
feature values (3.3), admissibility constraints (3.1) and implicational constraints (3.18) for a
given type (3.24)
NOTE The constraints that hold of a type in the resulting feature system are the constraints
stated in its declaration together with those inherited from its supertypes.

3.26
type hierarchy
partial order (3.20) on a set of types (3.24)
NOTE See ISO 24610-1:2006, Annex C, Type inheritance hierarchies.
3.27
typed feature structure
TFS
feature structure (3.14) that bears a type (3.24)
3.28
typing
assignment of a semantic type (3.24) to a built-in (3.6) or to a feature structure (3.14),
whether atomic or complex
NOTE Semantic types in feature systems are partially ordered and may have multiple inheritance.
3.29
underspecification
provision of partial information about a value (3.17)
NOTE An underspecified value generally subsumes each value in a range of possible values;
successive application of constraints can narrow this range down to a single value. See
subsumption (3.21).
3.30
well-formedness
syntactic conformance of the representation of a feature structure (3.14) to ISO 24610-1
3.31
validity
conformance of a typed feature structure (3.27) to the constraints (3.8) in force in a particular
feature system (3.15)
NOTE See Clause 6.
4 Overall structure
The main content of this document is set out in four clauses: 5, 6, 7 and 8.
- Clause 5, Basic concepts, discusses the definition of typed feature structures and introduces
the notions of atomic and complex types of feature structures, collections and other operators
that may occur in feature values; it then describes the notions of type inheritance, type
hierarchies, type constraints, default values and underspecification, which are essential for
constructing feature systems.
- Clause 6, Definition of well-formedness and validity, discusses the well-formedness and validity
conditions on feature structures.

- Clause 7, A feature system for a grammar, illustrates how types are defined by means of a type
hierarchy and type constraints, within which the admissible features and values are declared for
particular types.
- Clause 8, Declaration of a feature system, shows how a feature system can be declared and turned
into a validator.
This main part is followed by two annexes: Annex A contains the XML schema for this part of
ISO 24610; Annex B contains a detailed example.
5 Basic concepts
5.1 Typed feature structures considered
Typed feature structures (TFS) are introduced as the basic records for language resource
management.
For more detail, see ISO 24610-1:2006, 4.7, Typed feature structures, and Annex C, Type
inheritance hierarchies.
In this document a TFS is formally defined over a finite set of features Feat, a collection X of
non-feature-structure elements, and a type hierarchy ⟨Type, ⊑⟩, where Type is a finite set of
types and ⊑ is the subtyping relation on Type.
A feature structure is a tuple ⟨Q, γ, θ, δ⟩, in which:
a) Q is a set of nodes,
b) γ ∈ Q is the root node of the feature structure,
c) θ : Q → Type is a partial typing function, and
d) δ : Feat × Q → Q ∪ X is a partial feature value function such that, for all q ∈ Q, there
exists a sequence of features F₁, ..., Fₙ for which δ(Fₙ, ... δ(F₁, γ) ...) = q.
In this notation, q denotes a node. The above definition differs from the standard one used in
linguistics and computer science in that, first, typing is partial rather than total (that is,
not every feature structure is typed) and, second, feature values need not themselves be feature
structures: they may also be drawn from a collection of other XML elements, such as strings,
numerics, symbols and binaries (denoted X above). Note that nodes are typed, whereas features
themselves are not.
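The tuple of nodes, root, typing function and feature value function can be written down directly as a small data structure. The sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, nodes are identified by strings, built-in values (members of X) are modelled as tagged tuples, and the feature name MORPHOSYNTAX is invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureStructure:
    nodes: set                                   # Q, the set of nodes
    root: str                                    # the root node, a member of Q
    theta: dict = field(default_factory=dict)    # partial typing: node -> type name
    delta: dict = field(default_factory=dict)    # partial: (feature, node) -> node or built-in

    def reachable(self):
        """Collect every node that some feature path leads to from the root."""
        seen, stack = {self.root}, [self.root]
        while stack:
            q = stack.pop()
            for (feature, source), target in self.delta.items():
                if source == q and target in self.nodes and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    def is_rooted(self):
        """Condition d): every node must be reachable from the root."""
        return self.root in self.nodes and self.reachable() == self.nodes

# Built-in values are tagged tuples so they cannot be confused with node names.
had = FeatureStructure(
    nodes={"q0", "q1"},
    root="q0",
    theta={"q0": "word", "q1": "verb"},
    delta={
        ("ORTH", "q0"): ("string", "had"),
        ("MORPHOSYNTAX", "q0"): "q1",        # hypothetical feature name
        ("TENSE", "q1"): ("symbol", "past"),
        ("AUX", "q1"): ("binary", False),
    },
)

# A structure with a node that no feature path reaches violates condition d).
orphan = FeatureStructure(nodes={"q0", "q2"}, root="q0")
```

Because `theta` is an ordinary dictionary, leaving a node out of it models partial typing: the node simply has no type.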
The following XML representation of a feature structure is well formed; in it, the type
attribute is specified for each of the two <fs> elements.
EXAMPLE Typed feature structure:


had













The feature name ORTH stands for orthography, that is, the conventional spelling of a word or
phrase. This XML representation shows how the morphosyntactic features of the English word "had"
are specified as those of a non-auxiliary verb in the past tense.
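The markup of this example is not reproduced here. As a hedged illustration only, the sketch below parses a hypothetical reconstruction that is consistent with the description: two <fs> elements carrying a type attribute, an ORTH feature with the string "had", a past tense and a non-auxiliary verb. The feature names MORPHOSYNTAX, TENSE and AUX, the type names word and verb, and the helper `to_dict` are assumptions; the element names follow the feature structure representation of ISO 24610-1.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the example; not the text of the standard.
HAD_XML = """
<fs type="word">
  <f name="ORTH"><string>had</string></f>
  <f name="MORPHOSYNTAX">
    <fs type="verb">
      <f name="TENSE"><symbol value="past"/></f>
      <f name="AUX"><binary value="false"/></f>
    </fs>
  </f>
</fs>
"""

def to_dict(fs):
    """Convert an <fs> element into a nested dictionary keyed by feature name."""
    result = {"type": fs.get("type")}
    for f in fs.findall("f"):
        value = f[0]                      # the single child holding the value
        if value.tag == "fs":
            result[f.get("name")] = to_dict(value)
        elif value.tag == "string":
            result[f.get("name")] = value.text
        else:
            # <symbol>, <binary> and <numeric> carry their value in an attribute.
            result[f.get("name")] = value.get("value")
    return result

had = to_dict(ET.fromstring(HAD_XML))
```

The resulting dictionary has the type `word` at the top level and a nested structure of type `verb` as the value of the second feature.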
In the alternative "matrix" or "AVM" notation, type names are usually written in lower case,
sometimes in italics or in a typewriter font; feature names are written in upper case, and
strings are enclosed in quotation marks. Binary values are marked with a plus (+) or minus (-)
sign. These conventions are also followed in this document. In matrix notation, the feature
structure above would appear as shown in Figure 1.
Figure 1 – Matrix notation
5.2 Types
5.2.1 Atomic types
Alongside structures with built-ins (<string>, <symbol>, <binary> and <numeric>), there may be
feature structures that have a type but no features. Such structures are called simple, or
atomic, feature structures, and the types for which the feature system declaration (FSD) admits
no features are called atomic types.
As a result, it is always possible to declare new atomic types and to use them in place of the
above built-ins for specifying simple values. For example, provided that the additional types
had, past and false are declared in the FSD, the feature structure above could be represented as
shown below.
EXAMPLE Alternative formulation of the typed feature structure:
















There is a distinction between two classes of built-ins: <string> on the one hand, and <symbol>,
<binary> and <numeric> on the other. Any string is permitted as the content of a <string>
element, whereas in <symbol>, <binary> and <numeric> elements the set of permitted values is
strictly limited. To reflect this distinction, the values of the latter class are specified
using the value attribute. The <binary> type, for example, is associated with four values: true,
false, plus (equivalent to true) and minus (equivalent to false).
NOTE ISO 24610-1:2006 introduced the type binary, but in W3C XML Schema (2001) it is called
Boolean.
It is up to the encoder to choose appropriately between encoding with atomic types and encoding
with built-ins. This part of ISO 24610 draws no distinction between the two.
5.2.2 Complex types
Types that are not atomic are called complex. They comprise all types declared by the encoder in
the FSD for which admissible features are declared or inherited. A feature is admissible for a
type only if the FSD permits feature structures of that type to take values for it. This does
not mean that feature structures cannot be arbitrarily associated with types regardless of the
features they contain. Such an association is possible, but only feature structures containing
nothing but features permitted by some FSD can be checked for validity against that FSD. The
difference between validity and well-formedness is discussed in more detail in Clause 6.
All user-declared types (whether atomic or complex) are semantic, in the sense that they look
alike syntactically apart from the values of their type attributes. Interpreting the actual
meaning of these types by enforcing admissibility constraints, restrictions on the possible
values of admissible features (<vRange>) and other constraints in the form of logical
implications is the task of a validator.
The built-ins defined for feature structure representations (FSR) in ISO 24610-1:2006 are purely
syntactic and may be used without being declared in an FSD; indeed, they cannot be declared in
an FSD. They may appear in value range restrictions or in implicational constraints, but they
cannot themselves be subject to such constraints (since they have no admissible features), nor
can they impose any constraints of their own.
5.2.3 Collections
Not all built-ins are as simple as those described above, however. Some grammatical features,
such as specifiers (SPR), complements (COMPS) and arguments (ARGS), are treated as taking a list
of grammatical values, especially in context-based grammars [10], [12]. In languages other than
English, some of these features may take other collections as their values: sets or multisets.
In a language with relatively free word order (for example German, Korean or Japanese), the
COMPS feature may be analysed as taking a set or multiset, rather than a list, of complements.
For more general applications, ISO 24610-1:2006 therefore introduces sets, multisets and lists
as built-in ways of composing complex feature values.
Collections (<vColl>; ISO 24610-1:2006, 5.8, Collections as complex feature values) carry an
organization attribute (org), which takes the values "list", "set" and "bag". In lists, both the
order and the multiplicity of members matter. In bags (often called multisets), only the
multiplicity of members matters. In sets, neither order nor multiplicity matters.
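The effect of the org attribute on the identity of a collection can be sketched as follows. The function name `same_collection` is an illustrative assumption, and members are assumed to be hashable values.

```python
from collections import Counter

def same_collection(org, a, b):
    """Compare two sequences of members under the given organization."""
    if org == "list":
        return list(a) == list(b)          # order and multiplicity matter
    if org == "bag":
        return Counter(a) == Counter(b)    # only multiplicity matters
    if org == "set":
        return set(a) == set(b)            # neither order nor multiplicity matters
    raise ValueError(f"unknown org value: {org!r}")

A = ["NP", "PP", "NP"]
B = ["NP", "NP", "PP"]   # a reordering of A
C = ["NP", "PP"]         # A without the repeated member
```

As lists, `A` and `B` differ; as bags they coincide; as sets all three sequences coincide.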

For example, the ARGS feature of verbs can be represented by specifying the organization of the
collection as a list of values, each of which is of type phrase.
EXAMPLE List value


put
























This type of collection could be characterized as list(phrase), but polymorphic lists are not
yet supported by this part of ISO 24610. The example is equivalent to the AVM notation below, in
which NP denotes a feature structure of type phrase with a positive NOMINAL feature, namely a
noun phrase, and PP denotes a feature structure of type phrase with a positive PREPOSITIONAL
feature, namely a prepositional phrase. The boxed numbers are tags that mark structure sharing,
as shown in Figure 2.
Figure 2 – Tags marking structure sharing
5.2.4 Operators
Another class of built-ins is that of operators, which take one or more built-ins or feature
structures as their arguments but, instead of constructing a collection from them, denote some
value derived from them in one way or another.
Disjunctions (<vAlt>; ISO 24610-1:2006, 5.9.2) denote one of the values of their arguments. A
feature structure containing a disjunction cannot, however, represent multiple feature
structures. A disjunction is a single value that does not specify exactly which of the possible
alternatives holds. Disjunctions can be regarded as joins of their arguments in the partial
order induced by subsumption (see 5.6).

Negations (<vNot>; ISO 24610-1:2006, 5.9.3) take a single argument and denote a value that is
not their argument. A negation is equivalent to the disjunction of all values that do not match
its argument. Strictly speaking, negation is not the logical negation of a particular value, but
rather the complement of that value in the complete Boolean lattice that contains the partial
order induced by subsumption.
Merges (<vMerge>; ISO 24610-1:2006, 5.9.4, Collection of values) denote the concatenation or
union of several values and/or collections of values, according to the setting of their org
attribute. This attribute takes the same values, with the same meaning, as in <vColl>.
5.3 Type inheritance hierarchies
Type hierarchies are discussed in some detail in Annex C of ISO 24610-1:2006.
Such a hierarchy is usually depicted as a directed acyclic graph with a single top node. This
node is often labelled top and represents the most general type, which is compatible with all
typed feature structures. Subtypes are connected to their supertypes and placed one level below
them. The maximally specific types appear at the bottom of the graph. They are mutually
incompatible, which is usually either left implicit or sometimes indicated by a further specific
type (bottom) that is the unique lowest element. In this part of ISO 24610 the type bottom is
not used.
Figure 3 shows an example illustrating a partial type hierarchy for living beings.

Figure 3 – Type hierarchy for living beings
According to this figure, living beings are divided into plants (plant) and animals (animal).
Animals are further divided into fish (fish), birds (bird) and mammals (mammal). Dogs (dog),
humans (human) and bovines (bovine), that is, oxen, cows and bulls, belong to the class of
mammals.
Type hierarchies are not always tree-shaped; two or more branches may converge on a single node.
When this happens, it means that a type has several supertypes and inherits properties from all
of them. An example of such a hierarchy is given in Figure 4.
Figure 4 – A medieval hierarchy of living beings
Here the type human has two parent types: animal and rational. A human is therefore regarded
both as an animal (like a dog) and as a spiritual, rational being, like an angel. A human thus
has some properties of dogs and some properties of angels at the same time.
All these types are partially ordered by the subtyping relation ⊑ on the set of all types. A
type τ is a subtype of a type σ if and only if σ is more general than τ, that is, if the set of
feature structures of type σ contains the set of feature structures of type τ. Since the type
animate is more general than the type animal in the example above, all animals are animate. A
type σ is a supertype of a type τ if and only if τ is a subtype of σ. The immediate supertypes
of a type are often called its parents.
A subtype inherits all the properties of its supertypes. For example, the type human inherits
all the properties of its supertypes (namely being, animate, animal, spiritual and rational).
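Multiple inheritance of this kind can be sketched with a table of parents. The edges below are an assumption chosen to agree with the text (human has the parents animal and rational, animate is more general than animal, and the supertypes of human are being, animate, animal, spiritual and rational); they are not a transcription of Figure 4, and the function names are hypothetical.

```python
# Hypothetical parent table: each type maps to its immediate supertypes.
PARENTS = {
    "being": [],
    "animate": ["being"],
    "spiritual": ["being"],
    "animal": ["animate"],
    "rational": ["spiritual"],
    "dog": ["animal"],
    "angel": ["rational"],
    "human": ["animal", "rational"],   # two parents: multiple inheritance
}

def supertypes(t):
    """Return all proper supertypes of t by following every parent link."""
    found = set()
    stack = list(PARENTS[t])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found

def is_subtype(t, s):
    """Every type is a subtype of itself and of each of its supertypes."""
    return t == s or s in supertypes(t)
```

With this table, `human` inherits from both branches, whereas `dog` is a subtype of `animate` but not of `rational`.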
Figure 5 gives a slightly modified linguistic example taken from Grammar 2 of Copestake (2000)
[2].
Figure 5 – Type hierarchy with a single top for a simple grammar
This type hierarchy has a single top. It is the most general type, having neither parents nor
immediate supertypes; the only supertype of top is top itself. Each type has a name, and every
type except the topmost one has exactly one parent. The type named top has four immediate
subtypes. The subtypes phrase and det (determiner) are incomparable, in the sense that neither
is a subtype of the other.
Depending on the complexity of the grammar, the type hierarchy can become very complex. Some
parts of it may be universal across languages, while others may be highly language-specific. For
instance, the agreement type agr-cat (agreement category) has only two immediate subtypes in
English: 3sing and non-3sing (for example, "sings" and "sing").
The type det denotes a determiner, such as the articles "the" or "a"; 3sing denotes the third
person singular, and non-3sing denotes the agreement categories other than 3sing. This
distinction is characteristic of verb agreement in English.
5.4 Type constraints
The type hierarchy provides the foundation on which all the other parts of a grammar are built;
these take the form of constraints on feature structures over the set of user-defined types.
There are at least three kinds of such constraints: 1) implicational constraints, 2) constraints
on admissible features, and 3) constraints on admissible feature values. All of them can be
expressed in implicational form:
- if a feature structure is of type verb, then it may have the feature AUXiliary,
- if a feature structure is of type verb, then it may have the feature INVerted,

- if a feature structure is of type verb, then its AUX value must be binary,
- if a feature structure is of type verb, then its INV value must be binary,
- if a feature structure is of type verb and its AUX value is negative (minus), then its INV
value must also be negative (minus).
The first two of these constraints are admissibility constraints. They state that a particular
feature may be used in feature structures of a given type. The next two concern the values of
admissible features and are sometimes called "value restrictions" or "range restrictions". They
state which values a particular feature must take when it occurs in a feature structure of the
given type. The last constraint has the most general form: a constraint of this kind states that
when a feature structure takes a certain form (specified in terms of types, feature values and
so on), it must satisfy certain other criteria (again expressed in terms of types, feature
values and so on). This last form is what is usually meant by an implicational constraint. Each
of these three forms has its own syntax in an FSD. The example below shows how the above
constraints are encoded for the verb.
EXAMPLE Constraint on the type verb

The first two kinds are defined together within the <fDecl> element, the second being described
by the <vRange> part of that declaration, whereas the third is defined as an
"if ..., then ..." conditional (<cond>).
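
The three kinds of constraint on the type verb can be illustrated with a short sketch. The Python code below parses a hypothetical declaration and reports which constraints a given feature structure breaks. The markup is an assumption made for illustration only: it uses TEI-style FSD element names (fsDecl, fDecl, vRange, vAlt, binary, fsConstraints, cond, then), and the value "false" stands in for a negative binary value. The helper names fs_to_dict, load_decl and violations are invented here and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration for the type verb (illustrative assumption,
# not the standard's own example).
FSD = """
<fsDecl type="verb">
  <fDecl name="AUX">
    <vRange><vAlt><binary value="true"/><binary value="false"/></vAlt></vRange>
  </fDecl>
  <fDecl name="INV">
    <vRange><vAlt><binary value="true"/><binary value="false"/></vAlt></vRange>
  </fDecl>
  <fsConstraints>
    <cond>
      <fs><f name="AUX"><binary value="false"/></f></fs>
      <then/>
      <fs><f name="INV"><binary value="false"/></f></fs>
    </cond>
  </fsConstraints>
</fsDecl>
"""


def fs_to_dict(fs):
    """Turn an <fs> element into a {feature name: binary value} dict."""
    return {f.get("name"): f.find("binary").get("value") for f in fs.findall("f")}


def load_decl(xml_text):
    """Return (type name, admissible features with ranges, conditionals)."""
    root = ET.fromstring(xml_text.strip())
    ranges = {
        fdecl.get("name"): {b.get("value") for b in fdecl.iter("binary")}
        for fdecl in root.findall("fDecl")
    }
    conds = []
    for cond in root.iter("cond"):
        antecedent, consequent = cond.findall("fs")
        conds.append((fs_to_dict(antecedent), fs_to_dict(consequent)))
    return root.get("type"), ranges, conds


def violations(fs_type, features, decl):
    """List the constraints of `decl` that a feature structure breaks."""
    decl_type, ranges, conds = decl
    if fs_type != decl_type:
        return []  # the declaration says nothing about other types
    problems = []
    for name, value in features.items():
        if name not in ranges:  # admissible-feature constraint
            problems.append(f"{name} is not admissible for {decl_type}")
        elif value not in ranges[name]:  # value (range) constraint
            problems.append(f"{name}={value} is outside the declared range")
    for antecedent, consequent in conds:  # implicational constraints
        if all(features.get(k) == v for k, v in antecedent.items()):
            for k, v in consequent.items():
                # A missing feature is compatible; only a clash is reported.
                if k in features and features[k] != v:
                    problems.append(f"{k} must be {v} when {antecedent} holds")
    return problems


DECL = load_decl(FSD)
print(violations("verb", {"AUX": "false", "INV": "true"}, DECL))
```

In this sketch a feature absent from the structure never violates a conditional, on the assumption that an underspecified structure remains compatible with the constraint; a stricter checker could report it instead.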
5.5 Optional (default) values and underspecification
Some of the features that make up a structure must be specified, and some need not be.
In French, for example, the features NUMBER and GENDER must be specified for nouns and
adjectives, whereas in English NUMBER must be specified for every noun, while GENDER is
optional and is required only for the third-person singular pronouns "he", "she" and "it".
Nevertheless, there are cases in which some obligatory features are left unspecified. There
are two possible outcomes in such cases: 1) if a default value is declared, that value is
taken to be the one assigned, and 2) if no default value is declared, the value of the
feature is inferred from the range restriction in force for that feature.
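
The two outcomes for an unspecified obligatory feature can be sketched as a simple lookup. The Python code below is a minimal illustration under stated assumptions: the FeatureDecl class, the resolve function and the default "false" for AUX are hypothetical and are not taken from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureDecl:
    name: str
    value_range: str               # type that bounds the admissible values
    default: Optional[str] = None  # declared default value, if any


def resolve(decl: FeatureDecl, features: dict) -> str:
    """Return the value to assume for decl.name in a feature structure."""
    if decl.name in features:
        return features[decl.name]   # explicitly specified
    if decl.default is not None:
        return decl.default          # outcome 1: the declared default
    return decl.value_range          # outcome 2: inferred from the range


# Hypothetical declarations: NUMBER has no default, AUX defaults to "false".
NUMBER = FeatureDecl("NUMBER", "number")
AUX = FeatureDecl("AUX", "binary", default="false")

print(resolve(NUMBER, {"NUMBER": "plural"}))
print(resolve(NUMBER, {}))
print(resolve(AUX, {}))
```

Returning the range type itself for outcome 2 is one possible reading of "inferred from the range restriction": the value stays underspecified, but is known to lie within the declared range.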
English mass nouns such as "water" and "air" are by default uncountable and have no
plural. It follows that NUMBER need not be specified for them, even though the NUMBER
feature itself is obligatory. Some English count nouns (for example, "sheep") have the same
form in the singular and the plural. When the NUMBER feature is not specified, its value is
taken to be of some more general type, such as number, which is a supertype of all the
admissible values of the feature.
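
Treating an unspecified NUMBER as the general type number can be sketched with a small type hierarchy. In the Python code below the type names singular and plural, the PARENT table and the functions subsumes and unify are assumptions introduced for illustration; only the type number comes from the text. The sketch assumes a tree-shaped hierarchy in which each type has at most one immediate supertype.

```python
# Hypothetical hierarchy fragment: each type maps to its immediate supertype.
PARENT = {"singular": "number", "plural": "number", "number": None}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its supertypes."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT[specific]
    return False


def unify(a, b):
    """Return the more specific of two compatible types, or None on a clash."""
    if subsumes(a, b):
        return b
    if subsumes(b, a):
        return a
    return None


# "sheep" leaves NUMBER underspecified, so it is compatible with either value.
print(unify("number", "plural"))
print(unify("singular", "plural"))
```

Because number subsumes both of its subtypes, an underspecified noun such as "sheep" combines with a singular or a plural context, whereas two distinct specific values clash.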
Grammatical descriptions are often underspecified in order to allow generalization. In
English, for example, verbs are divided as needed into a number of subcategories:
intransitive verbs, such as "smile" and "bark", take only a subject; transitive verbs, such
as "love" and "attack", take a subject and require a direct object. There are also
ditransitive verbs, such as "give" and "put", which take a subject together with both a
direct and an indirect object. However, many grammatical phenomena do not refer to any of
the specific subclasses listed above. Examples of such phenomena in English are
subject-verb agreement (the well-formed "The dog barks" versus the ill-formed "the dog
bark") and inversion, in which the verb form is moved to the position before the subject
...
