Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG

ISO/IEC 19757-2:2008 specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a pattern for the structure and content of an XML document. The pattern is specified by using a regular tree grammar. It establishes requirements for RELAX NG schemas and specifies when an XML document matches the pattern specified by a RELAX NG schema.

Technologies de l'information — Langage de définition de schéma de documents (DSDL) — Partie 2: Validation de grammaire orientée courante — RELAX NG

General Information

Status: Published
Publication Date: 09-Dec-2008

ICS: 35.240.30 - IT applications in information, documentation and publishing

Technical Committee: ISO/IEC JTC 1/SC 34 - Document description and processing languages
Drafting Committee: ISO/IEC JTC 1/SC 34 - Document description and processing languages

Current Stage: 9093 - International Standard confirmed
Start Date: 12-Sep-2024
Completion Date: 14-Feb-2026

Relations

Referred By: CEN/TS 16371:2012 - Guidelines for implementors of EN 15744 and EN 15907
Effective Date: 09-Feb-2026

Revises: ISO/IEC 19757-2:2003/Amd 1:2006 - Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG — Amendment 1: Compact Syntax
Effective Date: 14-Aug-2008

Revises: ISO/IEC 19757-2:2003 - Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG
Effective Date: 14-Aug-2008

Overview

ISO/IEC 19757-2:2008 defines RELAX NG, a regular-grammar-based schema language for XML as part of the Document Schema Definition Language (DSDL) family. The standard specifies how a RELAX NG schema describes patterns for the structure and content of XML documents using a regular tree grammar, and establishes the formal requirements for when an XML document “matches” a RELAX NG schema. The document includes full and compact syntaxes, transformation rules that simplify schemas, formal semantics, and conformance criteria for validators.

Key topics and requirements

Schema language specification: Formal definition of RELAX NG patterns for elements, attributes, text, data, lists, interleaving, and grouping.
Regular tree grammar: Use of regular-grammar constructs to express valid XML tree structures and content models.
Syntax and simplification: Full XML-based syntax and the RELAX NG Compact Syntax (compact, human-friendly form), plus rules for simplifying schemas into a canonical simple form.
Semantics and validation: Formal inference rules and semantics that define when an XML document is valid against a RELAX NG schema.
Datatype support: Integration points for datatype libraries (built-in and external) and mechanisms for data/value validation.
Restrictions and conformance: Constraints on schema constructs, prohibited patterns, and formal conformance requirements for RELAX NG validators and translators.
Annexes: A normative RELAX NG schema for RELAX NG itself (Annex A), informative examples (Annex B), and the normative compact syntax specification (Annex C) to aid implementers.

Applications and who uses it

RELAX NG (ISO/IEC 19757-2:2008) is widely used by:

XML schema authors who need an expressive, easy-to-read schema language for complex document grammars.
Tool and validator developers building XML validation engines that implement RELAX NG conformance rules.
Document architects and integrators designing document exchange formats, technical documentation, or content models where flexible content ordering (interleave) and concise schemas are beneficial.
Data governance and standards teams that require formal validation for XML-based data interchange, publishing, or archiving.

Practical applications include document format design, configuration file schemas, content management systems, and data interchange formats where robust, formally defined XML validation is required.

Related standards

ISO/IEC 19757 (DSDL) family - other parts include:
- Part 3: Schematron (rule-based validation)
- Part 4: NVDL (namespace-based validation dispatching)
- Other DSDL parts cover datatypes, character repertoires, and DTD-related declarations

Keywords: ISO/IEC 19757-2:2008, RELAX NG, DSDL, XML schema, regular-grammar-based validation, RELAX NG Compact Syntax, XML validation, schema language.

Frequently Asked Questions

What is ISO/IEC 19757-2:2008?

ISO/IEC 19757-2:2008 is an International Standard published jointly by ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission). Its full title is "Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG". It specifies RELAX NG, a schema language for XML: a RELAX NG schema uses a regular tree grammar to specify a pattern for the structure and content of an XML document. The standard establishes requirements for RELAX NG schemas and specifies when an XML document matches the pattern specified by a RELAX NG schema.

What is the scope of ISO/IEC 19757-2:2008?

What ICS categories does ISO/IEC 19757-2:2008 belong to?

ISO/IEC 19757-2:2008 is classified under the following ICS (International Classification for Standards) categories: 35.240.30 - IT applications in information, documentation and publishing. The ICS classification helps identify the subject area and facilitates finding related standards.

What standards are related to ISO/IEC 19757-2:2008?

ISO/IEC 19757-2:2008 has the following relationships with other standards: it is referred to by CEN/TS 16371:2012, and it revises ISO/IEC 19757-2:2003 and ISO/IEC 19757-2:2003/Amd 1:2006. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

How can I access ISO/IEC 19757-2:2008?

ISO/IEC 19757-2:2008 is available in PDF format for immediate download after purchase. The document can be added to your cart and obtained through the secure checkout process. Digital delivery ensures instant access to the complete standard document.

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 19757-2
Second edition
2008-12-15
Information technology — Document Schema Definition Language (DSDL) —
Part 2: Regular-grammar-based validation — RELAX NG
Technologies de l'information — Langage de définition de schéma de documents (DSDL) —
Partie 2: Validation de grammaire orientée courante — RELAX NG

© ISO/IEC 2008
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO/IEC 2008 – All rights reserved

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Notation
4.1 EBNF
4.2 Inference rules
4.2.1 Variables
4.2.2 Propositions
4.2.3 Expressions
5 Data model
6 Full syntax
7 Simplification
7.1 General
7.2 Annotations
7.3 Whitespace
7.4 datatypeLibrary attribute
7.5 type attribute of value element
7.6 href attribute
7.7 externalRef element
7.8 include element
7.9 name attribute of element and attribute elements
7.10 ns attribute
7.11 QNames
7.12 div element
7.13 Number of child elements
7.14 mixed element
7.15 optional element
7.16 zeroOrMore element
7.17 Constraints
7.18 combine attribute
7.19 grammar element
7.20 define and ref elements
7.21 notAllowed element
7.22 empty element
8 Simple syntax
9 Semantics
9.1 Inference rules
9.2 Name classes
9.3 Patterns
9.3.1 choice pattern
9.3.2 group pattern
9.3.3 empty pattern
9.3.4 text pattern
9.3.5 oneOrMore pattern
9.3.6 interleave pattern
9.3.7 element and attribute pattern
9.3.8 data and value pattern
9.3.9 Built-in datatype library
9.3.10 list pattern
9.4 Validity
10 Restrictions
10.1 General
10.2 Prohibited paths
10.2.1 General
10.2.2 attribute pattern
10.2.3 oneOrMore pattern
10.2.4 list pattern
10.2.5 except element in data pattern
10.2.6 start element
10.3 String sequences
10.4 Restrictions on attributes
10.5 Restrictions on interleave
11 Conformance
Annex A (normative) RELAX NG schema for RELAX NG
Annex B (informative) Examples
B.1 Data model
B.2 Full syntax example
B.3 Simple syntax example
B.4 Validation example
Annex C (normative) RELAX NG Compact syntax
C.1 Introduction
C.2 Syntax
C.3 Lexical structure
C.4 Declarations
C.5 Annotations
C.5.1 Support for annotations
C.5.2 Initial annotations
C.5.3 Documentation shorthand
C.5.4 Following annotations
C.5.5 Grammar annotations
C.6 Conformance
C.6.1 Types of conformance
C.6.2 Validator
C.6.3 Structure preserving translator
C.6.4 Non-structure preserving translator
C.7 Media type registration template for the RELAX NG Compact Syntax
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form
the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in
the development of International Standards through technical committees established by the respective organization
to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual
interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also
take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International Standards
adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International
Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights.
ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 19757-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee
SC 34, Document description and processing languages.
This second edition cancels and replaces the first edition (ISO/IEC 19757-2:2003), of which it constitutes a minor revision.
It also incorporates the Amendment ISO/IEC 19757-2:2003/Amd.1:2006.
ISO/IEC 19757 consists of the following parts, under the general title Information technology — Document Schema
Definition Language (DSDL):
— Part 2: Regular-grammar-based validation — RELAX NG
— Part 3: Rule-based validation — Schematron
— Part 4: Namespace-based validation dispatching language — NVDL
— Part 8: Document semantics renaming language — DSRL
— Part 9: Namespace and datatype declaration in Document Type Definitions (DTDs)
The following parts are under preparation:
— Part 1: Overview
— Part 5: Extensible Datatypes
— Part 7: Character Repertoire Description Language (CREPDL)

Introduction
The structure of this part of ISO/IEC 19757 is as follows. Clause 5 describes the data model, which is the abstraction
of an XML document used throughout the rest of the document. Clause 6 describes the syntax of a RELAX NG
schema. Clause 7 describes a sequence of transformations that are applied to simplify a RELAX NG schema, and
also specifies additional requirements on a RELAX NG schema. Clause 8 describes the syntax that results from
applying the transformations; this simple syntax is a subset of the full syntax. Clause 9 describes the semantics of a
correct RELAX NG schema that uses the simple syntax; the semantics specify when an element is valid with respect
to a RELAX NG schema. Clause 10 describes requirements that apply to a RELAX NG schema after it has been
transformed into simple form. Finally, Clause 11 describes conformance requirements for RELAX NG validators.
This part of ISO/IEC 19757 is based on the RELAX NG Specification [1], and the compact syntax shown in Annex C
is based on the RELAX NG Compact Syntax [3]. A tutorial for RELAX NG is available separately (see the RELAX NG
Tutorial [2]).

INTERNATIONAL STANDARD ISO/IEC 19757-2:2008(E)
Information technology — Document Schema Definition
Language (DSDL) —
Part 2:
Regular-grammar-based validation — RELAX NG
1 Scope
This part of ISO/IEC 19757 specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a
pattern for the structure and content of an XML document. The pattern is specified by using a regular tree grammar.
This part of ISO/IEC 19757 establishes requirements for RELAX NG schemas and specifies when an XML document
matches the pattern specified by a RELAX NG schema.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated references,
only the edition cited applies. For undated references, the latest edition of the referenced document (including any
amendments) applies.
NOTE Each of the following documents has a unique identifier that is used to cite the document in the text. The unique
identifier consists of the part of the reference up to the first comma.
W3C XML, Extensible Markup Language (XML) 1.0 (Second Edition), W3C Recommendation, 6 October 2000,
available at http://www.w3.org/TR/2000/REC-xml-20001006
W3C XML-Names, Namespaces in XML, W3C Recommendation, 14 January 1999, available at
http://www.w3.org/TR/1999/REC-xml-names-19990114/
W3C XLink, XML Linking Language (XLink) Version 1.0, W3C Recommendation, 27 June 2001, available at
http://www.w3.org/TR/2001/REC-xlink-20010627/
W3C XML-Infoset, XML Information Set, W3C Recommendation, 24 October 2001, available at
http://www.w3.org/TR/2001/REC-xml-infoset-20011024/
IETF RFC 2045, Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, Internet
Standards Track Specification, November 1996, available at http://www.ietf.org/rfc/rfc2045.txt
IETF RFC 2046, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, Internet Standards Track
Specification, November 1996, available at http://www.ietf.org/rfc/rfc2046.txt
IETF RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax, Internet Standards Track Specification, August
1998, available at http://www.ietf.org/rfc/rfc2396.txt
IETF RFC 2732, Format for Literal IPv6 Addresses in URL's, Internet Standards Track Specification, December 1999,
available at http://www.ietf.org/rfc/rfc2732.txt
IETF RFC 3023, XML Media Types, Internet Standards Track Specification, January 2001, available at
http://www.ietf.org/rfc/rfc3023.txt
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1
resource
something with identity, potentially addressable by a URI
3.2
URI
compact string of characters that uses the syntax defined in IETF RFC 2396 to identify an abstract or physical resource
3.3
URI reference
URI or relative URI and optional fragment identifier
3.4
relative URI
form of URI reference that can be resolved with respect to a base URI to produce another URI
3.5
base URI
URI used to resolve relative URIs
3.6
fragment identifier
additional information in a URI reference used by a user agent after the retrieval action on a URI has been successfully
performed
3.7
instance
XML document that is being validated with respect to a RELAX NG schema
3.8
space character
character with the code value #x20
3.9
whitespace character
character with the code value #x20, #x9, #xA or #xD
3.10
name
pair of a URI and a local name
3.11
namespace URI
URI that is part of a name
3.12
local name
NCName that is part of a name
3.13
NCName
string that matches the NCName production of W3C XML-Names
3.14
name class
part of a schema that can be matched against a name
3.15
pattern
part of a schema that can be matched against a set of attributes and a sequence of elements and strings
3.16
foreign attribute
attribute with a name whose namespace URI is neither the empty string nor the RELAX NG namespace URI
3.17
foreign element
element with a name whose namespace URI is not the RELAX NG namespace URI
3.18
full syntax
syntax of a RELAX NG grammar before simplification
3.19
simple syntax
syntax of a RELAX NG grammar after simplification
3.20
simplification
transformation of a RELAX NG schema in the full syntax to a schema in the simple syntax
3.21
datatype library
mapping from local names to datatypes
NOTE A datatype library is identified by a URI.
3.22
datatype
set of strings together with an equivalence relation on that set
3.23
axiom
proposition that is provable unconditionally
3.24
inference rule
rule consisting of one or more positive or negative antecedents and exactly one consequent, which makes the
consequent provable if all the positive antecedents are provable and none of the negative antecedents is provable
3.25
valid with respect to a schema
member of the set of XML documents described by the schema
3.26
schema
specification of a set of XML documents
3.27
grammar
start pattern together with a mapping from NCNames to patterns
3.28
correct schema
schema that satisfies all the requirements of this part of ISO/IEC 19757
3.29
validator
software module that determines whether a schema is correct and whether an instance is valid with respect to a schema
3.30
path
list of NCNames separated by / or //
3.31
infoset
an abstraction of an XML document defined by W3C XML-Infoset
3.32
information item
constituent of an information set
3.33
data model
abstract representation of an XML document defined by this part of ISO/IEC 19757
3.34
XML document
string that is a well-formed XML document as defined in W3C XML
3.35
EBNF
Extended BNF
notation used to describe context-free grammars
3.36
weak matching
kind of matching specified in detail in 9.3.7
3.37
in-scope grammar
nearest ancestor grammar element
3.38
content-type
one of the three values empty, complex, or simple
3.39
mixed sequence
sequence that may contain both elements and strings
4 Notation
4.1 EBNF
This part of ISO/IEC 19757 uses EBNF notation to describe the full syntax and the simple syntax of RELAX NG. A
description of a grammar in EBNF consists of one or more production rules. Each production rule consists of the
name of a non-terminal, followed by ::=, followed by a list of alternatives separated by |. Within an alternative, italic
type is used to reference a non-terminal, concatenation indicates sequencing, [] indicates optionality, + indicates
repetition one or more times and * indicates repetition zero or more times; other characters in normal type stand for
themselves.
4.2 Inference rules
4.2.1 Variables
The symbol used for a variable indicates the variable's range as follows:
— n ranges over names
— nc ranges over name classes
— ln ranges over local names; a local name is a string that matches the NCName production of W3C XML-Names,
that is, a name with no colons
— u ranges over URIs
— cx ranges over contexts (as defined in Clause 5)
— a ranges over sets of attributes; a set with a single member is considered the same as that member
— m ranges over sequences of elements and strings; a sequence with a single member is considered the same as
that member; the sequences ranged over by m may contain consecutive strings and may contain strings that are
empty
NOTE There are sequences ranged over by m that cannot occur as the children of an element.
— p ranges over patterns (elements matching the pattern production)
— s ranges over strings
— ws ranges over the empty sequence and strings that consist entirely of whitespace
— params ranges over sequences of parameters
— e ranges over elements
— ct ranges over content-types
4.2.2 Propositions
The following notation is used for propositions:
— n in nc means that name n is a member of name class nc
— cx ⊦ a; m =~ p means that with respect to context cx, the attributes a and the sequence of elements and strings
m matches the pattern p
— disjoint(a_1, a_2) means that there is no name that is the name of both an attribute in a_1 and of an attribute in a_2
— m_1 interleaves m_2; m_3 means that m_1 is an interleaving of m_2 and m_3
— cx ⊦ a; m =~_weak p means that with respect to context cx, the attributes a and the sequence of elements and
strings m weakly matches the pattern p
— okAsChildren(m) means that the mixed sequence m can occur as the children of an element: it does not contain
any member that is an empty string, nor does it contain two consecutive members that are both strings
— deref(ln) = <element> nc p </element> means that the grammar contains
<define name="ln"> <element> nc p </element> </define>
— datatypeAllows(u, ln, params, s, cx) means that in the datatype library identified by URI u, the string s interpreted
with context cx is a legal value of datatype ln with parameters params
— datatypeEqual(u, ln, s_1, cx_1, s_2, cx_2) means that in the datatype library identified by URI u, string s_1 interpreted
with context cx_1 represents the same value of the datatype ln as the string s_2 interpreted in the context of cx_2
— s_1 = s_2 means that s_1 and s_2 are identical
— valid(e) means that the element e is valid with respect to the grammar
— start() = p means that the grammar contains <start> p </start>
— groupable(ct_1, ct_2) means that the content-types ct_1 and ct_2 are groupable
— p :_c ct means that pattern p has content-type ct
— incorrectSchema() means that the schema is incorrect
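Three of the propositions above are simple enough to express directly as predicates. The following Python sketch is illustrative only and is not part of the standard: it assumes a mixed sequence is a Python list whose string members are str objects and whose other members stand for elements, and that a set of attributes is a dict keyed by attribute name. The function names are hypothetical.

```python
from functools import lru_cache


def ok_as_children(m):
    """okAsChildren(m): no empty string and no two consecutive strings."""
    previous_was_string = False
    for item in m:
        is_string = isinstance(item, str)
        if is_string and (item == "" or previous_was_string):
            return False
        previous_was_string = is_string
    return True


def disjoint(a1, a2):
    """disjoint(a1, a2): no attribute name occurs in both sets."""
    return not (set(a1) & set(a2))


def interleaves(m1, m2, m3):
    """m1 interleaves m2; m3: m1 merges m2 and m3, keeping each one's order."""
    m1, m2, m3 = tuple(m1), tuple(m2), tuple(m3)
    if len(m1) != len(m2) + len(m3):
        return False

    @lru_cache(maxsize=None)
    def fits(i, j):
        # i members of m2 and j members of m3 have been consumed so far.
        if i == len(m2) and j == len(m3):
            return True
        k = i + j
        if i < len(m2) and m1[k] == m2[i] and fits(i + 1, j):
            return True
        return j < len(m3) and m1[k] == m3[j] and fits(i, j + 1)

    return fits(0, 0)
```

The interleaving check is recursive, so very long sequences would need an iterative formulation.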
4.2.3 Expressions
The following notation is used for expressions in propositions:
— name( u, ln ) returns a name with URI u and local name ln
— m_1, m_2 returns the concatenation of the sequences m_1 and m_2
— a_1 + a_2 returns the union of a_1 and a_2
— ( ) returns an empty sequence
— { } returns an empty set
— "" returns an empty string
— attribute( n, s ) returns an attribute with name n and value s
— element( n, cx, a, m ) returns an element with name n, context cx, attributes a and mixed sequence m as children
— max( ct_1, ct_2 ) returns the maximum of ct_1 and ct_2 where the content-types in increasing order are empty( ),
complex( ), simple( )
— normalizeWhiteSpace( s ) returns the string s, with leading and trailing whitespace characters removed, and with
each other maximal sequence of whitespace characters replaced by a single space character
— split( s ) returns a sequence of strings one for each whitespace delimited token of s; each string in the returned
sequence will be non-empty and will not contain any whitespace
— context( u, cx ) returns a context which is the same as cx except that the default namespace is u; if u is the empty
string, then there is no default namespace in the constructed context
— empty( ) returns the empty content-type
— complex( ) returns the complex content-type
— simple( ) returns the simple content-type
— [cx] within the start-tag of a pattern refers to the context of the pattern element
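The string and content-type expressions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not normative text: whitespace is restricted to the four characters #x20, #x9, #xA and #xD from 3.9 (Python's own notion of whitespace is broader, so str.split() without arguments is deliberately avoided), content-types are represented as plain strings, and all function names are hypothetical.

```python
import re

# The four whitespace characters of definition 3.9.
WHITESPACE = "\x20\x09\x0a\x0d"
_WS_RUN = re.compile("[\x20\x09\x0a\x0d]+")

# Content-types in increasing order, as stated for max( ct_1, ct_2 ).
_CONTENT_TYPE_ORDER = ("empty", "complex", "simple")


def normalize_white_space(s):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return _WS_RUN.sub(" ", s.strip(WHITESPACE))


def split(s):
    """Return the non-empty whitespace-delimited tokens of s."""
    return [token for token in _WS_RUN.split(s) if token]


def max_content_type(ct1, ct2):
    """Return the larger of two content-types in the order empty < complex < simple."""
    return max(ct1, ct2, key=_CONTENT_TYPE_ORDER.index)
```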
5 Data model
RELAX NG deals with XML documents representing both schemas and instances through an abstract data model.
XML documents representing schemas and instances shall be well-formed in conformance with W3C XML and shall
conform to the constraints of W3C XML-Names.
An XML document is represented by an element. An element consists of
— a name
— a context
— a set of attributes
— an ordered sequence of zero or more children; each child is either an element or a non-empty string; the sequence
never contains two consecutive strings
A name consists of
— a string representing the namespace URI; the empty string has special significance, representing the absence
of any namespace
— a string representing the local name; this string matches the NCName production of W3C XML-Names
A context consists of
— a base URI
— a namespace map; this maps prefixes to namespace URIs, and also may specify a default namespace URI (as
declared by the xmlns attribute)
An attribute consists of
— a name
— a string representing the value
A string consists of a sequence of zero or more characters, where a character is as defined in W3C XML.
The element for an XML document is constructed from the infoset (see W3C XML-Infoset) of the XML document as
follows. The notation [x] refers to the value of the x property of an information item. An element is constructed from
a document information item by constructing an element from the [document element]. An element is constructed
from an element information item by constructing the name from the [namespace name] and [local name], the context
from the [base URI] and [in-scope namespaces], the attributes from the [attributes], and the children from the [children].
The attributes of an element are constructed from the unordered set of attribute information items by constructing an
attribute for each attribute information item. The children of an element are constructed from the list of child information
items first by removing information items other than element information items and character information items, and
then by constructing an element for each element information item in the list and a string for each maximal sequence
of character information items. An attribute is constructed from an attribute information item by constructing the name
from the [namespace name] and [local name], and the value from the [normalized value]. When constructing the name
of an element or attribute from the [namespace name] and [local name], if the [namespace name] property is not
present, then the name is constructed from an empty string and the [local name]. A string is constructed from a
sequence of character information items by constructing a character from the [character code] of each character
information item.
It is possible for there to be multiple distinct infosets for a single XML document. This is because XML parsers are
not required to process all DTD declarations or expand all external parsed general entities. Amongst these multiple
infosets, there is exactly one infoset for which [all declarations processed] is true and which does not contain any
unexpanded entity reference information items. This is the infoset that is the basis for defining the RELAX NG data
model.
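As an informal illustration of the data model described in this clause, the following Python sketch builds a simplified element from a document parsed with the standard library's xml.etree.ElementTree. It is a sketch under stated assumptions rather than a conforming construction: the context (base URI and namespace map) is omitted because ElementTree does not expose it on parsed elements, and the class and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    uri: str    # namespace URI; "" represents the absence of a namespace
    local: str  # local name (an NCName)


@dataclass
class Element:
    name: Name
    attributes: dict  # maps Name to the attribute's string value
    children: list    # each child is an Element or a non-empty str


def _to_name(tag):
    # ElementTree writes qualified names as "{uri}local".
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return Name(uri, local)
    return Name("", tag)


def from_etree(node):
    """Build a data-model element (without context) from an ElementTree node."""
    children = []
    if node.text:
        children.append(node.text)
    for child in node:
        children.append(from_etree(child))
        if child.tail:
            children.append(child.tail)
    attributes = {_to_name(key): value for key, value in node.attrib.items()}
    return Element(_to_name(node.tag), attributes, children)
```

For example, from_etree(ET.fromstring('<a xmlns="urn:example" b="1">hi<c/>there</a>')) yields an element named (urn:example, a) with one attribute named ("", b) and three children: a string, an element and a string.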
6 Full syntax
The following grammar in EBNF notation summarizes the syntax of RELAX NG. Although the notation is based on
the XML representation of a RELAX NG schema as a sequence of characters, the grammar operates at the data
model level. For example, although the syntax uses <text/>, an instance or schema can use <text></text>
instead, because they both represent the same element at the data model level. All elements shown in the grammar
are qualified with the namespace URI:
http://relaxng.org/ns/structure/1.0
The symbols QName and NCName are defined in W3C XML-Names. The anyURI symbol indicates a string that, after
escaping of disallowed values as described in Section 5.4 of W3C XLink, is a URI reference as defined in IETF RFC
2396 (as modified by IETF RFC 2732). The symbol string matches any string.
In addition to the attributes shown explicitly, any element can have an ns attribute and any element can have a
datatypeLibrary attribute. The ns attribute can have any value. The value of the datatypeLibrary attribute
shall match the anyURI symbol as described in the previous paragraph; in addition, it shall not use the relative form
of URI reference and shall not have a fragment identifier; as an exception to this, the value may be the empty string.
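The constraints on the datatypeLibrary value can be approximated with a short check. The Python sketch below is illustrative only: it assumes the value has already been escaped as described in Section 5.4 of W3C XLink, it does not verify full conformance to the anyURI symbol, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_acceptable_datatype_library(value):
    """Approximate the datatypeLibrary constraints stated above."""
    if value == "":
        return True   # the empty string is explicitly allowed
    if "#" in value:
        return False  # a fragment identifier is not allowed
    # A URI reference that is not relative starts with a scheme.
    return urlsplit(value).scheme != ""
```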
Any element can also have foreign attributes in addition to the attributes shown in the grammar. A foreign attribute is
an attribute with a name whose namespace URI is neither the empty string nor the RELAX NG namespace URI. Any
element that cannot have string children (that is, any element other than value, param and name) may have foreign
child elements in addition to the child elements shown in the grammar. A foreign element is an element with a name
whose namespace URI is not the RELAX NG namespace URI. There are no constraints on the relative position of
foreign child elements with respect to other child elements.
Any element can also have as children strings that consist entirely of whitespace characters, where a whitespace
character is one of #x20, #x9, #xD or #xA. There are no constraints on the relative position of whitespace string
children with respect to child elements.
Leading and trailing whitespace is allowed for value of each name, type and combine attribute and for the content
of each name element.
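pattern ::=
<element name="QName"> pattern+ </element>
| <element> nameClass pattern+ </element>
| <attribute name="QName"> [pattern] </attribute>
| <attribute> nameClass [pattern] </attribute>
| <group> pattern+ </group>
| <interleave> pattern+ </interleave>
| <choice> pattern+ </choice>
| <optional> pattern+ </optional>
| <zeroOrMore> pattern+ </zeroOrMore>
| <oneOrMore> pattern+ </oneOrMore>
| <list> pattern+ </list>
| <mixed> pattern+ </mixed>
| <ref name="NCName"/>
| <parentRef name="NCName"/>
| <empty/>
| <text/>
| <value [type="NCName"]> string </value>
| <data type="NCName"> param* [exceptPattern] </data>
| <notAllowed/>
| <externalRef href="anyURI"/>
| <grammar> grammarContent* </grammar>
param ::=
<param name="NCName"> string </param>
exceptPattern ::=
<except> pattern+ </except>
grammarContent ::=
start
| define
| <div> grammarContent* </div>
| <include href="anyURI"> includeContent* </include>
includeContent ::=
start
| define
| <div> includeContent* </div>
start ::=
<start [combine="method"]> pattern </start>
define ::=
<define name="NCName" [combine="method"]> pattern+ </define>
method ::=
choice
| interleave
nameClass ::=
<name> QName </name>
| <anyName> [exceptNameClass] </anyName>
| <nsName> [exceptNameClass] </nsName>
| <choice> nameClass+ </choice>
exceptNameClass ::=
<except> nameClass+ </except>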
An alternative compact syntax is described in Annex C.
7 Simplification
7.1 General
The full syntax given in the previous clause is transformed into a simpler syntax by applying the following transformation
rules in order. The effect shall be as if each rule were applied to all elements in the schema before the next rule is
applied. A transformation rule may also specify constraints that shall be satisfied by a correct schema. The
transformation rules are applied at the data model level. Before the transformations are applied, the schema is parsed
into an element in the data model.
7.2 Annotations
Foreign attributes and elements are removed.
NOTE It is safe to remove xml:base attributes at this stage because xml:base attributes are used in determining the [base
URI] of an element information item, which is in turn used to construct the base URI of the context of an element. Thus, after a
document has been parsed into an element in the data model, xml:base attributes can be discarded.
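The removal of annotations can be sketched in Python as follows. This is only an illustration, not part of the specification: it assumes the schema has been parsed with xml.etree.ElementTree, which stores namespaced names as '{uri}local', and the helper names namespace_of and strip_annotations are hypothetical. Note that ElementTree discards the text that follows a removed child; in a schema that matches the syntax of Clause 6 this text can only be whitespace.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

def namespace_of(name):
    """Return the namespace URI of an ElementTree name ('{uri}local' or 'local')."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""

def strip_annotations(elem):
    """Remove foreign attributes and foreign child elements, recursively."""
    for attr in list(elem.attrib):
        # Foreign attribute: namespace is neither empty nor the RELAX NG namespace.
        if namespace_of(attr) not in ("", RNG_NS):
            del elem.attrib[attr]
    for child in list(elem):
        # Foreign element: namespace is not the RELAX NG namespace.
        if namespace_of(child.tag) != RNG_NS:
            elem.remove(child)
        else:
            strip_annotations(child)
    return elem
```

Because xml:base belongs to the XML namespace, it is treated as a foreign attribute and removed along with all other annotations.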
7.3 Whitespace
For each element other than value and param, each child that is a string containing only whitespace characters is
removed.
Leading and trailing whitespace characters are removed from the value of each name, type and combine attribute
and from the content of each name element.
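A minimal sketch of this step, again assuming an xml.etree.ElementTree representation in which string children are held in the text and tail fields of elements. The function names are hypothetical, and the sketch assumes that foreign elements have already been removed as described in 7.2.

```python
import xml.etree.ElementTree as ET

WHITESPACE = " \t\r\n"  # #x20, #x9, #xD and #xA

def local(tag):
    """Return the local part of an ElementTree name."""
    return tag.rsplit("}", 1)[-1]

def strip_whitespace(elem):
    """Apply the whitespace rules to elem and all of its descendants."""
    name = local(elem.tag)
    if name not in ("value", "param"):
        # Drop string children that consist only of whitespace.
        if elem.text is not None and elem.text.strip(WHITESPACE) == "":
            elem.text = None
        for child in elem:
            if child.tail is not None and child.tail.strip(WHITESPACE) == "":
                child.tail = None
    # Trim the name, type and combine attributes.
    for attr in ("name", "type", "combine"):
        if attr in elem.attrib:
            elem.set(attr, elem.get(attr).strip(WHITESPACE))
    # Trim the content of each name element.
    if name == "name" and elem.text is not None:
        elem.text = elem.text.strip(WHITESPACE)
    for child in elem:
        strip_whitespace(child)
    return elem
```

The content of value and param elements is left untouched, since whitespace there can be significant to the datatype.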
7.4 datatypeLibrary attribute
The value of each datatypeLibrary attribute is transformed by escaping disallowed characters as specified in
Section 5.4 of W3C XLink.
For any data or value element that does not have a datatypeLibrary attribute, a datatypeLibrary attribute
is added. The value of the added datatypeLibrary attribute is the value of the datatypeLibrary attribute of
the nearest ancestor element that has a datatypeLibrary attribute, or the empty string if there is no such ancestor.
Then, any datatypeLibrary attribute that is on an element other than data or value is removed.
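The inheritance of the datatypeLibrary attribute can be illustrated with a short recursive walk. This is a sketch under stated assumptions: the schema is an xml.etree.ElementTree element, the escaping of disallowed characters is assumed to have been done already, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Return the local part of an ElementTree name."""
    return tag.rsplit("}", 1)[-1]

def propagate_datatype_library(elem, inherited=""):
    """Push datatypeLibrary down to data and value elements; remove it elsewhere."""
    # The element's own attribute wins; otherwise use the nearest ancestor's
    # value, or the empty string when no ancestor has the attribute.
    current = elem.attrib.get("datatypeLibrary", inherited)
    if local(elem.tag) in ("data", "value"):
        elem.set("datatypeLibrary", current)
    else:
        elem.attrib.pop("datatypeLibrary", None)
    for child in elem:
        propagate_datatype_library(child, current)
    return elem
```

After this step only data and value elements carry a datatypeLibrary attribute, and each of them carries exactly one.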
7.5 type attribute of value element
For any value element that does not have a type attribute, a type attribute is added with a value of token and the
value of the datatypeLibrary attribute is changed to the empty string.
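As an illustration only, the defaulting rule could be written as below, assuming an xml.etree.ElementTree element; the function name default_value_type is hypothetical.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Return the local part of an ElementTree name."""
    return tag.rsplit("}", 1)[-1]

def default_value_type(elem):
    """Give every value element without a type attribute the built-in token type."""
    for value in elem.iter():
        if local(value.tag) == "value" and "type" not in value.attrib:
            value.set("type", "token")
            # token is taken from the built-in library, identified by the empty string.
            value.set("datatypeLibrary", "")
    return elem
```

A value element that already has a type attribute keeps both that attribute and its datatypeLibrary attribute unchanged.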
7.6 href attribute
The value of the href attribute on an externalRef or include element is first transformed by escaping disallowed
characters as specified in Section 5.4 of W3C XLink. The URI reference is then resolved into an absolute form as
described in Section 5.2 of IETF RFC 2396 using the base URI from the context of the element that bears the href
attribute.
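The escaping and resolution steps can be approximated with the Python standard library. The sketch below is not a conforming implementation: urllib.parse.urljoin follows later URI resolution rules rather than IETF RFC 2396 exactly, the set of characters left unescaped is this example's reading of Section 5.4 of W3C XLink, and the name resolve_href is hypothetical.

```python
from urllib.parse import quote, urldefrag, urljoin

# Characters left as they are: the reserved characters and unreserved marks of
# RFC 2396, plus "#", "%", "[" and "]". ASCII letters and digits are never
# escaped by quote(). Everything else, including non-ASCII characters, is
# percent-encoded from its UTF-8 bytes.
ALLOWED = ";/?:@&=+$,-_.!~*'()#%[]"

def resolve_href(href, base_uri):
    """Return (absolute URI without fragment, fragment) for an href value."""
    escaped = quote(href, safe=ALLOWED)
    absolute = urljoin(base_uri, escaped)
    uri, fragment = urldefrag(absolute)
    return uri, fragment

uri, fragment = resolve_href("my schema.rng", "http://example.com/schemas/main.rng")
print(uri)  # http://example.com/schemas/my%20schema.rng
```

The fragment is returned separately so that a caller can reject it when the media type of the retrieved resource does not define how fragment identifiers are interpreted.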
The value of the href attribute is used to construct an element (as specified in Clause 5). This shall be done as
follows. The URI reference consists of the URI itself and an optional fragment identifier. The resource identified by
the URI is retrieved. The result is a MIME entity (see IETF RFC 2045): a sequence of bytes labeled with a MIME
media type (see IETF RFC 2046). The media type determines how an element is constructed from the MIME entity
and optional fragment identifier. When the media type is application/xml or text/xml, the MIME entity shall be
parsed as an XML document in accordance with the applicable RFC (at the time of writing IETF RFC 3023) and an
element constructed from the result of the parse as specified in Clause 5. In particular, the charset parameter shall
be handled as specified by the RFC. This specification does not define the handling of media types other than
application/xml and text/xml. The href attribute shall not include a fragment identifier unless the registration
of the media type of the resource identified by the attribute defines the interpretation of fragment identifiers for that
media type.
NOTE IETF RFC 3023 does not define the interpretation of fragment identifiers for application/xml or text/xml.
7.7 externalRef element
An externalRef element is transformed as follows. An element is constructed using the URI reference that is the
value of the href attribute as specified in 7.6. This element shall match the syntax for pattern. The element is transformed
by recursively applying the rules from this subclause and from previous subclauses of this clause. This shall not
result in a loop. In other words, the transformation of the referenced element shall not require the dereferencing of
an externalRef element with an href attribute with the same value.
Any ns attribute on the externalRef element is transferred to the referenced element if the referenced element
does not already have an ns attribute. The externalRef element is then replaced by the referenced element.
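The replacement of externalRef elements, including the loop check and the transfer of the ns attribute, might be sketched as follows. The sketch assumes xml.etree.ElementTree elements and href values that have already been resolved as described in 7.6. The load parameter is a hypothetical caller-supplied function that retrieves an href and returns a freshly parsed element; the rules of the previous subclauses and the check that the referenced element matches the syntax for pattern are omitted.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Return the local part of an ElementTree name."""
    return tag.rsplit("}", 1)[-1]

def expand_external_refs(elem, load, active=()):
    """Return elem with every externalRef replaced by the element it references.

    active holds the href values that are currently being dereferenced.
    """
    if local(elem.tag) == "externalRef":
        href = elem.get("href")
        if href in active:
            raise ValueError("externalRef loop through " + href)
        referenced = expand_external_refs(load(href), load, active + (href,))
        # Transfer ns only when the referenced element has none of its own.
        if "ns" in elem.attrib and "ns" not in referenced.attrib:
            referenced.set("ns", elem.get("ns"))
        referenced.tail = elem.tail
        return referenced
    for index, child in enumerate(list(elem)):
        elem[index] = expand_external_refs(child, load, active)
    return elem
```

Tracking only the hrefs on the current chain of references means that the same resource may be referenced from several places in a schema, as long as no reference leads back to itself.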
7.8 include element
An include element is transformed as follows. An element is constructed using the URI reference that is the value
of the href attribute as specified in 7.6. This element shall be a grammar element, matching the syntax for grammar.
This grammar element is transformed by recursively applying the rules from this subclause and from previous
subclauses of this clause. This shall not result in a loop. In other words, the transformation of the grammar element
shall not require the dereferencing of an include element with an href attribute with the same value.
Define the components of an element to be the children of the element together with the components of any div child
elements. If the include element has a start component, then the grammar element shall have at least one start
component; it is then transformed by removing all start components. If the include element has a define
component, then the grammar element shall have at least one define component with the same name; it is then
transformed by removing all such define components.
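The notion of components and the removal of overridden definitions can be illustrated as below. This is a sketch, not a definitive implementation: it assumes xml.etree.ElementTree elements, a grammar element that has already been transformed recursively, and name attributes that have already been trimmed as described in 7.3. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Return the local part of an ElementTree name."""
    return tag.rsplit("}", 1)[-1]

def components(elem):
    """Yield (parent, child) for each child of elem, descending into div children."""
    for child in list(elem):
        yield elem, child
        if local(child.tag) == "div":
            yield from components(child)

def apply_overrides(include, grammar):
    """Remove from grammar the start and define components that include overrides."""
    overriding = [child for _, child in components(include)]
    override_start = any(local(c.tag) == "start" for c in overriding)
    override_names = {c.get("name") for c in overriding if local(c.tag) == "define"}
    found_start = False
    found_names = set()
    # Collect the components first so that removal does not disturb iteration.
    for parent, child in list(components(grammar)):
        kind = local(child.tag)
        if kind == "start" and override_start:
            found_start = True
            parent.remove(child)
        elif kind == "define" and child.get("name") in override_names:
            found_names.add(child.get("name"))
            parent.remove(child)
    if override_start and not found_start:
        raise ValueError("include overrides start, but the grammar has no start")
    missing = override_names - found_names
    if missing:
        raise ValueError("no define to override for: " + ", ".join(sorted(missing)))
    return grammar
```

An override that has nothing to replace is reported as an error, which reflects the requirement that the grammar element have at least one matching component.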
The include element is transformed into a div element. The attributes of the div element are the attributes of the
include element other than the href attribute. The children of the div element are the grammar element (after
...
