Codes for the representation of names of languages

ISO 639-3:2007 provides a code, published by the Registration Authority of ISO 639-3, consisting of language code elements comprising three-letter language identifiers for the representation of languages. The language identifiers according to ISO 639-3:2007 were devised for use in a wide range of applications, especially in computer systems, where there is potential need to support a large number of the languages that are known to have ever existed. Whereas ISO 639-1 and ISO 639-2 are intended to focus on the major languages of the world that are most frequently represented in the total body of the world's literature, ISO 639-3:2007 attempts to provide as complete an enumeration of languages as possible, including living, extinct, ancient and constructed languages, whether major or minor, written or unwritten. As a result, ISO 639-3:2007 deals with a very large number of lesser-known languages. Languages designed exclusively for machine use, such as computer-programming languages, are not included in this code, nor are reconstructed languages.

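Codes for the representation of names of languages - Part 3: Three-letter code for comprehensive coverage of languages (Slovenian title: Kode za predstavljanje imen jezikov - 3. del: Tričrkovna koda za celovito predstavitev jezikov)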

General Information

Status: Published
Publication Date: 04-Feb-2007
Current Stage: 9020 - International Standard under periodical review
Start Date: 15-Apr-2021

Standards Content (sample)

INTERNATIONAL STANDARD ISO 639-3
First edition 2007-02-01
Codes for the representation of names of languages - Part 3: Alpha-3 code for comprehensive coverage of languages
Codes pour la représentation des noms de langues - Partie 3: Code alpha-3 pour un traitement exhaustif des langues
Reference number: ISO 639-3:2007(E)
© ISO 2007
© ISO 2007

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Three-letter language code
4.1 Form of the language identifier
4.2 Denotation of the language identifier
4.3 Documentation of the intended denotation of identifiers
4.4 Relationship between ISO 639-2 and ISO 639-3
4.5 Registration Authority and maintenance of the code
4.6 Application of language identifiers
4.7 Scripts and regions
5 Language code tables
Annex A (normative) Procedures for the Registration Authority and Registration Authorities Advisory Committee for ISO 639
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 639-3 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 2, Terminographical and lexicographical working methods.

ISO 639 consists of the following parts, under the general title Codes for the representation of names of languages:
— Part 1: Alpha-2 code
— Part 2: Alpha-3 code
— Part 3: Alpha-3 code for comprehensive coverage of languages
The following parts are under preparation:
— Part 4: Implementation guidelines and general principles for language coding
— Part 5: Alpha-3 code for language families and groups

— Part 6: Alpha-4 representation for comprehensive coverage of language variation

Introduction

ISO 639 provides three language codes for the representation of names of languages: one is a two-letter code (ISO 639-1) and two others are three-letter codes (ISO 639-2 and ISO 639-3). ISO 639-1 was devised primarily for use in terminology, lexicography and linguistics. ISO 639-2 was devised primarily for use in terminology and bibliography; it represents all languages contained in ISO 639-1 and in addition other languages and language collections of interest for those primary applications. ISO 639-3 was devised to provide a comprehensive set of identifiers for all languages for use in a wide range of applications, including linguistics, lexicography and internationalization of information systems. It attempts to represent all known languages.

The three-letter codes in ISO 639-2 and ISO 639-3 are complementary and compatible. The two codes have been devised for different purposes. The set of individual languages listed in ISO 639-2 is a subset of those listed in ISO 639-3. The codes differ in that ISO 639-2 includes code elements representing some individual languages and also collections of languages, while ISO 639-3 includes code elements for all known individual languages but not for collections of languages. Overall, the set of individual languages listed in ISO 639-3 is much larger than the set of individual languages listed in ISO 639-2.

The languages represented in ISO 639-1 are a subset of those represented in ISO 639-2; every language code element in the two-letter code has a corresponding language code element in ISO 639-2, but not necessarily vice versa. Likewise, elements other than collections listed in ISO 639-2 are a subset of those listed in ISO 639-3; each non-collective element in ISO 639-2 is included in ISO 639-3, but not necessarily vice versa. The denotation represented by alpha-3 identifiers included in both ISO 639-2 and ISO 639-3 is the same in each part, and the denotation represented by alpha-2 identifiers in ISO 639-1 is the same as that represented by the corresponding alpha-3 identifiers in ISO 639-2 and ISO 639-3.

All three language codes are to be considered as open lists.

The large number of languages in the initial inventory of ISO 639-3 beyond those already included in ISO 639-2 was derived primarily from Ethnologue [1], with additional ancient, historic or artificial languages obtained from Linguist List [2], [3].

This part of ISO 639 also includes guidelines for the creation of language code elements and their use in some applications.
INTERNATIONAL STANDARD ISO 639-3:2007(E)
Codes for the representation of names of languages —
Part 3:
Alpha-3 code for comprehensive coverage of languages
1 Scope

This part of ISO 639 provides a code, published by the Registration Authority of ISO 639-3, consisting of language code elements comprising three-letter language identifiers for the representation of languages. The language identifiers according to this part of ISO 639 were devised for use in a wide range of applications, especially in computer systems, where there is potential need to support a large number of the languages that are known to have ever existed. Whereas ISO 639-1 and ISO 639-2 are intended to focus on the major languages of the world that are most frequently represented in the total body of the world's literature, this part of ISO 639 attempts to provide as complete an enumeration of languages as possible, including living, extinct, ancient and constructed languages, whether major or minor, written or unwritten. As a result, this part of ISO 639 deals with a very large number of lesser-known languages. Languages designed exclusively for machine use, such as computer-programming languages, are not included in this code, nor are reconstructed languages.

Knowledge of the world's languages at any given time is never complete or perfect. Additional language identifiers may be created for this list when it becomes apparent that there is a linguistic variety that is deemed to be distinct from other languages in accordance with the definitions in Clause 3 and their elaboration in Clause 4. In addition, the denotation of existing identifiers may be revised or identifiers may become deprecated when it becomes apparent that they do not accurately reflect actual language distinctions. In all such changes, careful consideration is given to minimizing adverse effects on existing implementations.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 3166-1, Codes for the representation of names of countries and their subdivisions - Part 1: Country codes

ISO 15924, Information and documentation - Codes for the representation of names of scripts

3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1
code

data transformed or represented in different forms according to a pre-established set of rules

3.2
code element
individual entry in a code table
3.3
language identifier
language symbol
symbol that uniquely identifies a particular language

NOTE 1 In the language code in this part of ISO 639, each language identifier is composed of three letters.

NOTE 2 In this part of ISO 639, each language identifier represents the various language names used to designate a particular language.

3.4
name
reference name
appellation
linguistic expression used to designate an individual concept

NOTE 1 In this part of ISO 639, a language name is used to designate the concept of a particular language.

NOTE 2 In this part of ISO 639, names used to designate a language may be expressions taken from one or more specified source languages, such as English or French. It is not guaranteed, however, that a complete set of names from any particular language will be provided, or that the source language for any name will be indicated.

NOTE 3 In the initial code table for this part of ISO 639, the names used for many languages will be names used in Ethnologue [1]. In subsequent maintenance of this part of ISO 639, these names may be revised.

NOTE 4 In this part of ISO 639, a language name is considered normative insofar as it designates a particular language. The actual form of a name is not immutable.

NOTE 5 In this part of ISO 639, reference names may include parenthetic information not generally used to designate a given language in order to differentiate between distinct languages that have identical names. See 4.3.

3.5
language code element
code element (3.2) in a language code table

NOTE In the language code table published by the Registration Authority of ISO 639-3 (see 4.5), each language code element consists of a language identifier and one or more language names.

3.6
scope
attribute of a language code element (3.5) that pertains to the breadth of language varieties to which it corresponds, and to the nature of the relationship between that language code element and other language code elements

NOTE For the purposes of this part of ISO 639, language code elements have one of four scopes: individual language, macrolanguage, collection or special purpose. See 4.2.

3.7
individual language code element
language code element (3.5) with a scope (3.6) representing an individual language

NOTE The language represented by an individual language code element is considered distinct from those represented by any other individual language code element; thus, there is no correspondence between different individual language code elements. The notion of individual language is explained further in 4.2.2.

3.8
macrolanguage code element
language code element (3.5) with a scope (3.6) representing multiple, closely-related individual languages that are deemed in some usage contexts to be a single language

NOTE Every macrolanguage code element has a normative correspondence to the individual language code elements representing the individual languages encompassed by the macrolanguage. This normative relationship between macrolanguage code elements and individual language code elements is documented in the code tables included in this part of ISO 639. The notion of macrolanguage is explained further in 4.2.3.

3.9
collective language code element
language code element (3.5) with a scope (3.6) representing a group of individual languages that are not deemed to be one language in any usage context

NOTE The language code in this part of ISO 639 does not include collective language code elements.
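
The definitions in 3.5 and 3.6 can be read as a small data model. The following Python sketch is illustrative only and is not part of the standard: the names Scope and LanguageCodeElement are hypothetical, and the actual layout of the code table published by the Registration Authority may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Scope(Enum):
    """The four scopes listed in the NOTE to 3.6 (hypothetical type name)."""
    INDIVIDUAL = "individual language"
    MACROLANGUAGE = "macrolanguage"
    COLLECTION = "collection"
    SPECIAL_PURPOSE = "special purpose"


@dataclass(frozen=True)
class LanguageCodeElement:
    """One entry of a language code table (3.5): identifier, names, scope."""
    identifier: str          # three-letter language identifier (3.3)
    names: Tuple[str, ...]   # one or more language names (3.4)
    scope: Scope             # scope attribute (3.6)

    def __post_init__(self) -> None:
        # The NOTE to 3.5 requires at least one language name per element.
        if not self.names:
            raise ValueError("a language code element needs at least one name")
```

The record is frozen because an identifier, once assigned, is not expected to change; names are held in a tuple so that one element can carry several of them.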

4 Three-letter language code
4.1 Form of the language identifier

The language identifiers consist of a sequence of three letters each taken from the following set of 26 letters of the Latin alphabet in lower case: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z. No diacritical marks or modified characters are used.

Language identifiers are not intended to be an abbreviation for a name of the language, but to serve as a device to identify a given language uniquely. With thousands of languages, many pairs of which have similar names, it is not possible to provide identifiers that resemble a language name in every case. In many cases, language identifiers do bear some resemblance to a name for the language, but this is not guaranteed. Many languages have alternate names used by different internal or external communities. In such cases, the form of the language identifier does not imply that a name resembling the language identifier is considered to be preferred.

To ensure continuity and stability, the identifier for any given language shall not be changed, though the names listed in relation to an identifier may change. On occasion, given compelling reasons, a code element may become deprecated. When a code element is deprecated, the identifier for that code element shall not be reassigned. (See 4.5.2 for details on maintenance of the code.)

When adapting this part of ISO 639 to languages that are written using non-Latin scripts (e.g. the Cyrillic alphabet), language identifiers shall be formed using the Latin alphabet according to the principles of this part of ISO 639.
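
The form rule in 4.1 (exactly three letters, lower case, a to z only, no diacritics) lends itself to a mechanical check. The following Python sketch is illustrative; the function name has_valid_form is hypothetical. It tests only the form of a string, not whether the identifier is actually assigned in the code table.

```python
import re

# Exactly three lower-case letters from the basic Latin alphabet a-z.
_IDENTIFIER_FORM = re.compile(r"[a-z]{3}")


def has_valid_form(identifier: str) -> bool:
    """Return True if the string has the form required by 4.1."""
    # fullmatch rejects extra characters, upper case and letters with diacritics.
    return _IDENTIFIER_FORM.fullmatch(identifier) is not None
```

A form check of this kind is a cheap first filter before a lookup in the code table maintained by the Registration Authority.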
4.2 Denotation of the language identifier
4.2.1 General

A language identifier represents one or more language names, each of which designates the same language. The ultimate objects of identification are languages themselves; language names are the formal means by which the languages denoted by language identifiers are designated.

Every language corresponds to some range of variation in linguistic expression. In this part of ISO 639, then, it is assumed that language identifiers generally denote some range of language varieties. The range of varieties that are denoted can have three different scopes: individual language, macrolanguage or collection. ISO 639 includes identifiers for certain special-purpose categories, such as “undetermined language”, which do not directly denote any range of language varieties. In this part of ISO 639, these are treated as having a special scope, “special purpose”. Thus, every entry in this part of ISO 639 is considered to have one of four scopes: individual language, macrolanguage, collection or special purpose.

Languages that are represented in ISO 639 can be of various types: living languages, ancient languages, artificially constructed languages, etc.

This part of ISO 639 provides identifiers for languages of various types and with various scopes. The following subclauses (4.2.2 to 4.2.10) provide further explanation regarding assignment of identifiers in this part of ISO 639 for different scopes or for different types of languages.
4.2.2 Individual languages

In this part of ISO 639, most identifiers are assumed to denote distinct individual languages. Furthermore, it is a goal for this part of ISO 639 to provide an identifier for every distinct human language that has been documented, whether living or extinct, and whether its primary modality is spoken, written or signed.

There is no one definition of “language” that is agreed upon by all and appropriate for all purposes. As a result, there can be disagreement, even among speakers or linguistic experts, as to whether two varieties represent dialects of a single language or two distinct languages. For this part of ISO 639, judgements regarding when two varieties are considered to be the same or different languages are based on a number of factors, including linguistic similarity, intelligibility, a common literature, the views of speakers concerning the relationship between language and identity, and other factors. The following basic criteria are followed.

⎯ Two related varieties are normally considered varieties of the same language if speakers of each variety have inherent understanding of the other variety (that is, can understand based on knowledge of their own variety without needing to learn the other variety) at a functional level.

⎯ Where spoken intelligibility between varieties is marginal, the existence of a common literature or of a common ethnolinguistic identity with a central variety that both understand can be strong indicators that they should nevertheless be considered varieties of the same language.

⎯ Where there is enough intelligibility between varieties to enable communication, the existence of well-established, distinct ethnolinguistic identities can be a strong indicator that they should nevertheless be considered to be different languages.

Some of the distinctions made on this basis may not be considered appropriate by some users or for certain applications. These basic criteria are nevertheless thought to best fit the intended range of applications (see 4.6).

4.2.3 Macrolanguages

Other parts of ISO 639 have included identifiers designated as “individual language identifiers” that correspond in a one-to-many manner with individual language identifiers in this part of ISO 639. For instance, this part of ISO 639 contains over 30 identifiers designated as individual language identifiers for distinct varieties of Arabic, while ISO 639-1 and ISO 639-2 each contain only one identifier for Arabic, “ar” and “ara” respectively, which are designated as individual language identifiers in those parts of ISO 639. It is assumed here that the single identifiers for Arabic in ISO 639-1 and ISO 639-2 correspond to the many identifiers collectively for distinct varieties of Arabic in this part of ISO 639.

In this example, it may appear that the single identifiers in ISO 639-1 and ISO 639-2 should be designated as collective language identifiers. That is not assumed here, however. In various parts of the world, there are clusters of closely-related language varieties that, based on the criteria discussed in 4.2.2, can be considered individual languages, yet in certain usage contexts a single language identity for all is needed. Typical situations in which this need can occur include the following.

⎯ There is one variety that is more developed and that tends to be used for wider communication by speakers of various closely-related languages; as a result, there is a perceived common linguistic identity across these languages. For instance, there are several distinct spoken Arabic languages, but Standard Arabic is generally used in business and media across all of these communities, and is also an important aspect of a shared ethno-religious unity. As a result, a perceived common linguistic identity exists.

⎯ There is a common written form used for multiple closely-related languages. For instance, multiple Chinese languages share a common written form.

⎯ There is a transitional socio-linguistic situation in which sub-communities of a single language community are diverging, creating a need for some purposes to recognize distinct languages while, for other purposes, a single common identity is still valid. For instance, in some contexts, it is necessary to make a distinction between Bosnian, Croatian and Serbian languages, yet there are other contexts in which these distinctions are not discernible in language resources that are in use.

Where such situations exist in this part of ISO 639, an identifier for the single, common language identity is considered to be a macrolanguage identifier.

Macrolanguages are distinguished from language collections in that the individual languages that correspond to a macrolanguage must be very closely related, and there must be some domain in which only a single language identity is recognized.
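
The one-to-many correspondence between a macrolanguage identifier and its individual language identifiers can be held as a simple mapping. The following Python sketch is illustrative only. The table is a deliberately small and incomplete subset: apart from “ara”, which appears above, the identifiers are taken from the published code tables rather than from this text, and the function names are hypothetical. An application should load the full mapping from the Registration Authority's tables.

```python
from typing import Optional

# Illustrative, incomplete subset of macrolanguage -> individual languages.
MACROLANGUAGE_MEMBERS = {
    "ara": {"arb", "arz"},         # Arabic: Standard Arabic, Egyptian Arabic
    "zho": {"cmn", "yue"},         # Chinese: Mandarin Chinese, Yue Chinese
    "hbs": {"bos", "hrv", "srp"},  # Serbo-Croatian: Bosnian, Croatian, Serbian
}


def macrolanguage_of(identifier: str) -> Optional[str]:
    """Return the macrolanguage that encompasses an individual language, if any."""
    for macro, members in MACROLANGUAGE_MEMBERS.items():
        if identifier in members:
            return macro
    return None


def same_language_identity(a: str, b: str) -> bool:
    """True if both identifiers fall under one identity at macrolanguage level."""
    # An identifier without a macrolanguage stands for itself.
    return (macrolanguage_of(a) or a) == (macrolanguage_of(b) or b)
```

An application that needs the single common identity described above would compare identifiers with same_language_identity, while one that needs to distinguish, say, Bosnian from Croatian would compare the identifiers directly.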
4.2.4 Dialects

The linguistic varieties denoted by each of the identifiers in this part of ISO 639 are assumed to be distinct languages and not dialects of some language, even though for some purposes some users may consider a variety listed in this part of ISO 639 to be a “dialect” rather than a “language” (see 4.2.2 and 4.2.3). In this standard, the term dialect refers to any sub-variety of a language such as might be based on geographic region, age, gender, social class, time period, or the like.

The dialects of a language are included within the denotation represented by the identifier for that language. Thus, each language identifier represents the complete range of all the spoken or written varieties of that language, including any standardized form.

For applications in which it is necessary to identify dialects, a separate standard may be developed that provides identifiers for dialects, or that combines identifiers from this or other parts of ISO 639 with other distinguishing identificational qualifiers. See 4.7 for further discussion.
4.2.5 Collective language code elements

Whereas ISO 639-2 includes identifiers for collections of languages and also uses three-letter identifiers, this part of ISO 639 provides identifiers for individual languages and macrolanguages only.

4.2.6 Special-purpose language code elements

ISO 639 includes identifiers for certain special-purpose concepts, such as “undetermined language”. Unlike code elements with other scopes, special-purpose code elements do not directly denote any range of language varieties. Rather, they are provided to satisfy various special-purpose requirements in applications. For example, if an application requires that every record in a database be assigned an ISO 639 language identifier, the availability of the identifier “und”, denoting “undetermined language”, allows that application requirement to be met even if the relevant language for a given record has not yet been determined or is impossible to determine.

One special-purpose code element in this part of ISO 639 and also in ISO 639-2 is “mul”, denoting “multiple languages”. This would be used to declare that a given information object includes content in multiple languages or is in some other way applicable to multiple languages. In many applications, however, information will be organized in a way that assumes that each use of a language identifier makes reference to no more than one language. Hence, the use of “mul” will not be appropriate in many applications.
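
The database example above can be sketched in code. The following Python function is illustrative only: the name language_field and the policy of collapsing several languages to “mul” are assumptions for the sake of the example. As noted above, an application that expects one language per identifier may need to store the individual identifiers instead.

```python
from typing import Iterable

UNDETERMINED = "und"  # special-purpose identifier: undetermined language
MULTIPLE = "mul"      # special-purpose identifier: multiple languages


def language_field(detected: Iterable[str]) -> str:
    """Choose the single identifier to store for a record."""
    unique = set(detected)
    if not unique:
        # Nothing determined yet (or impossible to determine): use "und".
        return UNDETERMINED
    if len(unique) == 1:
        return next(iter(unique))
    # Content in more than one language: declare "mul".
    return MULTIPLE
```

With this policy every record receives exactly one identifier, which satisfies a mandatory-field requirement without guessing a language.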

4.2.7 Extinct, ancient and historic languages

This part of ISO 639 includes identifiers that denote extinct languages as well as living languages. The criteria for identifying distinct languages in the case of varieties that have gone extinct in recent times are as defined above. In the case of ancient languages, a criterion based on intelligibility would be ideal, but in the final analysis, identifiers will be assigned to ancient languages which have a distinct literature and are treated distinctly by the scholarly community. In order to qualify for inclusion in this code, the language must have an attested literature or be well-documented as a language known to have been spoken by some particular community at some point in history; it may not be a reconstructed language inferred from historical-comparative analysis. The code also includes identifiers that denote historic languages that are considered to be distinct from any modern languages that may be descended from them; for instance, Old English and Middle English. Here, too, the criterion is that the language must have a literature that is treated distinctly by the scholarly community.
4.2.8 Constructed languages

This part of ISO 639 includes identifiers that denote constructed (or artificial) languages that meet the following criteria:
⎯ the language has a body of literature read by members of some community;
⎯ the language is designed for the purpose of human communication.

Specifically excluded are reconstructed languages and computer programming languages.

4.2.9 Scripts

A single language identifier is provided for a language even though the language may be written in more than one script. See 4.7 for further discussion.
4.2.10 Local-use identifiers

Identifiers qaa through qtz are reserved for local use. These identifiers may be used locally, but may not be used in interchange except by private agreement between parties.
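
The local-use range can be tested mechanically, for example to stop such identifiers from leaving a system through an interchange format. The following Python sketch is illustrative; the function name is_local_use is hypothetical. The range qaa through qtz is the set of identifiers whose first letter is q, whose second letter lies between a and t, and whose third letter is any of a to z.

```python
import re

# qaa..qtz: 'q', then a letter a-t, then any letter a-z.
_LOCAL_USE = re.compile(r"q[a-t][a-z]")


def is_local_use(identifier: str) -> bool:
    """Return True if the identifier lies in the range reserved for local use."""
    return _LOCAL_USE.fullmatch(identifier) is not None
```

An exporter could call is_local_use on every identifier and reject or remap matches unless the parties have a private agreement covering them.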
4.3 Documentation of the intended denotation of identifiers

This part of ISO 639 provides a code table consisting of a set of language code elements. This table is published and maintained by the Registration Authority of ISO 639-3 (ISO 639-3/RA). For more information, see 4.5.

The normative content of each language code element consists of two parts: a language identifier, and one or more language names that determine a particular language (see 3.3 and 3.4). Although the names are normative with respect to this part of ISO 639 in the form in which they appear, this is only insofar as they designate particular languages. The use of language names within this part of ISO 639 does not imply that these names or the particular spellings that are used have any special status within the language communities or any other domain of usage. The names provided serve only as the normative documentation of the particular language denoted by each identifier. The names are not immutable: a name may be revised in the course of maintaining the language code table so long as the language that is designated remains unchanged.

A language identifier must be associated in the code table with at least one name that uniquely designates the given language being identified. In the case of two or more distinct languages that have identical names, the reference names used to designate these languages will include parenthetic information

...

SLOVENIAN STANDARD
SIST ISO 639-3:2008
1 December 2008
Kode za predstavljanje imen jezikov - 3. del: Tričrkovna koda za celovito predstavitev jezikov
Codes for the representation of names of languages - Part 3: Alpha-3 code for comprehensive coverage of languages
Codes pour la représentation des noms de langues - Partie 3: Code alpha-3 pour un traitement exhaustif des langues
This Slovenian standard is identical to: ISO 639-3:2007
ICS:
01.140.20 Information sciences
SIST ISO 639-3:2008 en,fr

2003-01. Slovenian Institute for Standardization. Reproduction of this standard in whole or in part is not permitted.
INTERNATIONAL STANDARD ISO 639-3
First edition
2007-02-01
Codes for the representation of names of
languages —
Part 3:
Alpha-3 code for comprehensive
coverage of languages
Codes pour la représentation des noms de langues —
Partie 3: Code alpha-3 pour un traitement exhaustif des langues
Reference number
ISO 639-3:2007(E)
ISO 2007
PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but

shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In

downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat

accepts no liability in this area.
Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation

parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In

the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO 2007

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,

electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or

ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Three-letter language code
4.1 Form of the language identifier
4.2 Denotation of the language identifier
4.3 Documentation of the intended denotation of identifiers
4.4 Relationship between ISO 639-2 and ISO 639-3
4.5 Registration Authority and maintenance of the code
4.6 Application of language identifiers
4.7 Scripts and regions
5 Language code tables
Annex A (normative) Procedures for the Registration Authority and Registration Authorities Advisory Committee for ISO 639
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies

(ISO member bodies). The work of preparing International Standards is normally carried out through ISO

technical committees. Each member body interested in a subject for which a technical committee has been

established has the right to be represented on that committee. International organizations, governmental and

non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the

International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards

adopted by the technical committees are circulated to the member bodies for voting. Publication as an

International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent

rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 639-3 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content

resources, Subcommittee SC 2, Terminographical and lexicographical working methods.

ISO 639 consists of the following parts, under the general title Codes for the representation of names of

languages:
— Part 1: Alpha-2 code
— Part 2: Alpha-3 code
— Part 3: Alpha-3 code for comprehensive coverage of languages
The following parts are under preparation:
— Part 4: Implementation guidelines and general principles for language coding
— Part 5: Alpha-3 code for language families and groups

— Part 6: Alpha-4 representation for comprehensive coverage of language variation

Introduction

ISO 639 provides three language codes for the representation of names of languages: one is a two-letter code

(ISO 639-1) and two others are three-letter codes (ISO 639-2 and ISO 639-3). ISO 639-1 was devised

primarily for use in terminology, lexicography and linguistics. ISO 639-2 was devised primarily for use in

terminology and bibliography; it represents all languages contained in ISO 639-1 and in addition other

languages and language collections of interest for those primary applications. ISO 639-3 was devised to

provide a comprehensive set of identifiers for all languages for use in a wide range of applications, including

linguistics, lexicography and internationalization of information systems. It attempts to represent all known

languages.

The three-letter codes in ISO 639-2 and ISO 639-3 are complementary and compatible. The two codes have

been devised for different purposes. The set of individual languages listed in ISO 639-2 is a subset of those

listed in ISO 639-3. The codes differ in that ISO 639-2 includes code elements representing some individual

languages and also collections of languages, while ISO 639-3 includes code elements for all known individual

languages but not for collections of languages. Overall, the set of individual languages listed in ISO 639-3 is

much larger than the set of individual languages listed in ISO 639-2.

The languages represented in ISO 639-1 are a subset of those represented in ISO 639-2; every language

code element in the two-letter code has a corresponding language code element in ISO 639-2, but not

necessarily vice versa. Likewise, elements other than collections listed in ISO 639-2 are a subset of those

listed in ISO 639-3; each non-collective element in ISO 639-2 is included in ISO 639-3, but not necessarily vice

versa. The denotation represented by alpha-3 identifiers included in both ISO 639-2 and ISO 639-3 is the

same in each part, and the denotation represented by alpha-2 identifiers in ISO 639-1 is the same as that

represented by the corresponding alpha-3 identifiers in ISO 639-2 and ISO 639-3.
All three language codes are to be considered as open lists.

The large number of languages in the initial inventory of ISO 639-3 beyond those already included in

ISO 639-2 was derived primarily from Ethnologue [1], with additional ancient, historic or artificial languages

obtained from Linguist List [2], [3].

This part of ISO 639 also includes guidelines for the creation of language code elements and their use in

some applications.
INTERNATIONAL STANDARD ISO 639-3:2007(E)
Codes for the representation of names of languages —
Part 3:
Alpha-3 code for comprehensive coverage of languages
1 Scope

This part of ISO 639 provides a code, published by the Registration Authority of ISO 639-3, consisting of

language code elements comprising three-letter language identifiers for the representation of languages. The

language identifiers according to this part of ISO 639 were devised for use in a wide range of applications,

especially in computer systems, where there is potential need to support a large number of the languages that

are known to have ever existed. Whereas ISO 639-1 and ISO 639-2 are intended to focus on the major

languages of the world that are most frequently represented in the total body of the world's literature, this part

of ISO 639 attempts to provide as complete an enumeration of languages as possible, including living, extinct,

ancient and constructed languages, whether major or minor, written or unwritten. As a result, this part of

ISO 639 deals with a very large number of lesser-known languages. Languages designed exclusively for

machine use, such as computer-programming languages, are not included in this code, nor are

reconstructed languages.

Knowledge of the world's languages at any given time is never complete or perfect. Additional language

identifiers may be created for this list when it becomes apparent that there is a linguistic variety that is deemed

to be distinct from other languages in accordance with the definitions in Clause 3 and their elaboration in

Clause 4. In addition, the denotation of existing identifiers may be revised or identifiers may become

deprecated when it becomes apparent that they do not accurately reflect actual language distinctions. In all

such changes, careful consideration is given to minimize adverse effects on existing implementations.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated

references, only the edition cited applies. For undated references, the latest edition of the referenced

document (including any amendments) applies.

ISO 3166-1, Codes for the representation of names of countries and their subdivisions — Part 1: Country

codes

ISO 15924, Information and documentation — Codes for the representation of names of scripts

3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1
code

data transformed or represented in different forms according to a pre-established set of rules

3.2
code element
individual entry in a code table
3.3
language identifier
language symbol
symbol that uniquely identifies a particular language

NOTE 1 In the language code in this part of ISO 639, each language identifier is composed of three letters.

NOTE 2 In this part of ISO 639, each language identifier represents the various language names used to designate a

particular language.
3.4
name
reference name
appellation
linguistic expression used to designate an individual concept

NOTE 1 In this part of ISO 639, a language name is used to designate the concept of a particular language.

NOTE 2 In this part of ISO 639, names used to designate a language may be expressions taken from one or more

specified source languages, such as English or French. It is not guaranteed, however, that a complete set of names from

any particular language will be provided, or that the source language for any name will be indicated.

NOTE 3 In the initial code table for this part of ISO 639, the names used for many languages will be names used in

Ethnologue [1]. In subsequent maintenance of this part of ISO 639, these names may be revised.

NOTE 4 In this part of ISO 639, a language name is considered normative insofar as it designates a particular

language. The actual form of a name is not immutable.

NOTE 5 In this part of ISO 639, reference names may include parenthetic information not generally used to designate

a given language in order to differentiate between distinct languages that have identical names. See 4.3.

3.5
language code element
code element (3.2) in a language code table

NOTE In the language code table published by the Registration Authority of ISO 639-3 (see 4.5), each language

code element consists of a language identifier and one or more language names.
3.6
scope

attribute of a language code element (3.5) that pertains to the breadth of language varieties to which it

corresponds, and to the nature of the relationship between that language code element and other language

code elements

NOTE For the purposes of this part of ISO 639, language code elements have one of four scopes: individual

language, macrolanguage, collection or special purpose. See 4.2.
3.7
individual language code element

language code element (3.5) with a scope (3.6) representing an individual language

NOTE The language represented by an individual language code element is considered distinct from those

represented by any other individual language code element; thus, there is no correspondence between different individual

language code elements. The notion of individual language is explained further in 4.2.2.

3.8
macrolanguage code element

language code element (3.5) with a scope (3.6) representing multiple, closely-related individual languages

that are deemed in some usage contexts to be a single language

NOTE Every macrolanguage code element has a normative correspondence to the individual language code

elements representing the individual languages encompassed by the macrolanguage. This normative relationship between

macrolanguage code elements and individual language code elements is documented in the code tables included in this

part of ISO 639. The notion of macrolanguage is explained further in 4.2.3.
3.9
collective language code element

language code element (3.5) with a scope (3.6) representing a group of individual languages that are not

deemed to be one language in any usage context

NOTE The language code in this part of ISO 639 does not include collective language code elements.

4 Three-letter language code
4.1 Form of the language identifier

The language identifiers consist of a sequence of three letters each taken from the following set of 26 letters

of the Latin alphabet in lower case: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z. No diacritical

marks or modified characters are used.
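
The form rule above can be checked mechanically. The following is a minimal sketch in Python, assuming a hypothetical helper name is_well_formed_identifier that is not part of this standard. It tests only the form of an identifier (three lower-case letters from a to z); it does not tell whether the identifier is actually assigned in the code table.

```python
import re

# Exactly three letters drawn from the 26 lower-case Latin letters a-z,
# with no diacritical marks or modified characters (see 4.1).
_ALPHA3 = re.compile(r"[a-z]{3}")


def is_well_formed_identifier(code: str) -> bool:
    """Return True if `code` has the form of an ISO 639-3 language identifier.

    Hypothetical helper: it checks the form only, not whether the
    identifier is assigned to a language.
    """
    return _ALPHA3.fullmatch(code) is not None
```

Upper-case input such as "ARA" is rejected here rather than normalized; an application that wants to accept it could lower-case the input first.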

Language identifiers are not intended to be an abbreviation for a name of the language, but to serve as a

device to identify a given language uniquely. With thousands of languages, many pairs of which have similar

names, it is not possible to provide identifiers that resemble a language name in every case. In many cases,

language identifiers do bear some resemblance to a name for the language, but this is not guaranteed. Many

languages have alternate names used by different internal or external communities. In such cases, the form of

the language identifier does not imply that a name resembling the language identifier is considered to be

preferred.

To ensure continuity and stability, the identifier for any given language shall not be changed, though the

names listed in relation to an identifier may change. On occasion, given compelling reasons, a code element

may become deprecated. When a code element is deprecated, the identifier for that code element shall not be

reassigned. (See 4.5.2 for details on maintenance of the code.)

When adapting this part of ISO 639 to languages that are written using non-Latin scripts (e.g. the Cyrillic

alphabet), language identifiers shall be formed using the Latin alphabet according to the principles of this part

of ISO 639.
4.2 Denotation of the language identifier
4.2.1 General

A language identifier represents one or more language names, each of which designates the same language.

The ultimate objects of identification are languages themselves; language names are the formal means by

which the languages denoted by language identifiers are designated.

Every language corresponds to some range of variation in linguistic expression. In this part of ISO 639, then, it

is assumed that language identifiers generally denote some range of language varieties. The range of

varieties that are denoted can have three different scopes: individual language, macrolanguage or collection.

ISO 639 includes identifiers for certain special-purpose categories, such as “undetermined language”, which

do not directly denote any range of language varieties. In this part of ISO 639, these are treated as having a

special scope, “special purpose”. Thus, every entry in this part of ISO 639 is considered to have one of four

scopes: individual language, macrolanguage, collection or special purpose.
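
As an illustration, the four scopes and the structure of a language code element (an identifier plus one or more names, see 3.5) could be modelled as follows. This is a sketch only: the names Scope and LanguageCodeElement are hypothetical, and the standard does not prescribe any data structure.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """The four scopes a code element can have in this part of ISO 639."""
    INDIVIDUAL = "individual language"
    MACROLANGUAGE = "macrolanguage"
    COLLECTION = "collection"
    SPECIAL_PURPOSE = "special purpose"


@dataclass(frozen=True)
class LanguageCodeElement:
    """Hypothetical record type: one identifier, one or more names, one scope."""
    identifier: str
    names: tuple[str, ...]
    scope: Scope

    def __post_init__(self) -> None:
        # A code element must carry at least one name (see 3.5).
        if not self.names:
            raise ValueError("a language code element needs at least one name")


# Example entry; the name string is taken from the text above for
# illustration and is not necessarily the reference name in the code table.
undetermined = LanguageCodeElement(
    "und", ("undetermined language",), Scope.SPECIAL_PURPOSE
)
```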

Languages that are represented in ISO 639 can be of various types: living languages, ancient languages,

artificially constructed languages, etc.

This part of ISO 639 provides identifiers for languages of various types and with various scopes. The following

subclauses (4.2.2 to 4.2.10) provide further explanation regarding assignment of identifiers in this part of

ISO 639 for different scopes or for different types of languages.
4.2.2 Individual languages

In this part of ISO 639, most identifiers are assumed to denote distinct individual languages. Furthermore, it is

a goal for this part of ISO 639 to provide an identifier for every distinct human language that has been

documented, whether living or extinct, and whether its primary modality is spoken, written or signed.

There is no one definition of “language” that is agreed upon by all and appropriate for all purposes. As a result,

there can be disagreement, even among speakers or linguistic experts, as to whether two varieties represent

dialects of a single language or two distinct languages. For this part of ISO 639, judgements regarding when

two varieties are considered to be the same or different languages are based on a number of factors,

including linguistic similarity, intelligibility, a common literature, the views of speakers concerning the

relationship between language and identity, and other factors. The following basic criteria are followed.

⎯ Two related varieties are normally considered varieties of the same language if speakers of each variety

have inherent understanding of the other variety (that is, can understand based on knowledge of their

own variety without needing to learn the other variety) at a functional level.

⎯ Where spoken intelligibility between varieties is marginal, the existence of a common literature or of a

common ethnolinguistic identity with a central variety that both understand can be strong indicators that

they should nevertheless be considered varieties of the same language.

⎯ Where there is enough intelligibility between varieties to enable communication, the existence of well-

established, distinct ethnolinguistic identities can be a strong indicator that they should nevertheless be

considered to be different languages.

Some of the distinctions made on this basis may not be considered appropriate by some users or for certain

applications. These basic criteria are thought to best fit the intended range of applications, however (see 4.6).

4.2.3 Macrolanguages

Other parts of ISO 639 have included identifiers designated as “individual language identifiers” that

correspond in a one-to-many manner with individual language identifiers in this part of ISO 639. For instance,

this part of ISO 639 contains over 30 identifiers designated as individual language identifiers for distinct

varieties of Arabic, while ISO 639-1 and ISO 639-2 each contain only one identifier for Arabic, “ar” and “ara”

respectively, which are designated as individual language identifiers in those parts of ISO 639. It is assumed

here that the single identifiers for Arabic in ISO 639-1 and ISO 639-2 correspond to the many identifiers

collectively for distinct varieties of Arabic in this part of ISO 639.

In this example, it may appear that the single identifiers in ISO 639-1 and ISO 639-2 should be designated as

collective language identifiers. That is not assumed here, however. In various parts of the world, there are

clusters of closely-related language varieties that, based on the criteria discussed in 4.2.2, can be considered

individual languages, yet in certain usage contexts a single language identity for all is needed. Typical

situations in which this need can occur include the following.

⎯ There is one variety that is more developed and that tends to be used for wider communication by

speakers of various closely-related languages; as a result, there is a perceived common linguistic identity

across these languages. For instance, there are several distinct spoken Arabic languages, but Standard

Arabic is generally used in business and media across all of these communities, and is also an important

aspect of a shared ethno-religious unity. As a result, a perceived common linguistic identity exists.

⎯ There is a common written form used for multiple closely-related languages. For instance, multiple

Chinese languages share a common written form.

⎯ There is a transitional socio-linguistic situation in which sub-communities of a single language community

are diverging, creating a need for some purposes to recognize distinct languages while, for other

purposes, a single common identity is still valid. For instance, in some contexts, it is necessary to make a

distinction between Bosnian, Croatian and Serbian languages, yet there are other contexts in which these

distinctions are not discernable in language resources that are in use.

Where such situations exist in this part of ISO 639, an identifier for the single, common language identity is

considered to be a macrolanguage identifier.

Macrolanguages are distinguished from language collections in that the individual languages that correspond

to a macrolanguage must be very closely related, and there must be some domain in which only a single

language identity is recognized.
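
The one-to-many correspondence between a macrolanguage identifier and its individual language identifiers can be sketched as a lookup table. The mapping below is a small illustrative excerpt, not the normative table: the identifiers "hbs", "bos", "hrv", "srp", "arb" and "arz" do not appear in the text above and are assumed here from the published code table, and the function names are hypothetical. The Registration Authority's table remains the authoritative source.

```python
from typing import Optional

# Illustrative excerpt only; the normative mapping is the code table
# maintained by the Registration Authority.
MACROLANGUAGE_MEMBERS = {
    "hbs": {"bos", "hrv", "srp"},  # Bosnian, Croatian, Serbian
    "ara": {"arb", "arz"},         # partial: the text cites over 30 varieties
}


def macrolanguage_of(identifier: str) -> Optional[str]:
    """Return the macrolanguage that encompasses `identifier`, if any."""
    for macro, members in MACROLANGUAGE_MEMBERS.items():
        if identifier in members:
            return macro
    return None


def matches(requested: str, actual: str) -> bool:
    """True if `actual` is `requested` itself or one of its individual languages."""
    return actual == requested or actual in MACROLANGUAGE_MEMBERS.get(requested, set())
```

With this sketch, a resource tagged "arz" satisfies a request for "ara", but a resource tagged "ara" does not satisfy a request for "arz", since the macrolanguage identifier is less specific.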
4.2.4 Dialects

The linguistic varieties denoted by each of the identifiers in this part of ISO 639 are assumed to be distinct

languages and not dialects of some language, even though for some purposes some users may consider a

variety listed in this part of ISO 639 to be a “dialect” rather than a “language” (see 4.2.2 and 4.2.3). In this

standard, the term dialect refers to any sub-variety of a language such as might be based on geographic

region, age, gender, social class, time period, or the like.

The dialects of a language are included within the denotation represented by the identifier for that language.

Thus, each language identifier represents the complete range of all the spoken or written varieties of that

language, including any standardized form.

For applications in which it is necessary to identify dialects, a separate standard may be developed that

provides identifiers for dialects, or that combines identifiers from this or other parts of ISO 639 with other

distinguishing identificational qualifiers. See 4.7 for further discussion.
4.2.5 Collective language code elements

Whereas ISO 639-2 includes identifiers for collections of languages and also uses three-letter identifiers, this

part of ISO 639 provides identifiers for individual languages and macrolanguages only.

4.2.6 Special-purpose language code elements

ISO 639 includes identifiers for certain special-purpose concepts, such as “undetermined language”. Unlike

code elements with other scopes, special-purpose code elements do not directly denote any range of

language varieties. Rather, they are provided to satisfy various special-purpose requirements in applications.

For example, if an application requires that every record in a database be assigned an ISO 639 language

identifier, the availability of the identifier “und”, denoting “undetermined language”, allows that application

requirement to be met even if the relevant language for a given record has not yet been determined or is

impossible to determine.

One special-purpose code element in this part of ISO 639 and also in ISO 639-2 is “mul”, denoting “multiple

languages”. This would be used to declare that a given information object includes content in multiple

languages or is in some other way applicable to multiple languages. In many applications, however,

information will be organized in a way that assumes that each use of a language identifier makes reference to

no more than one language. Hence, the use of “mul” will not be appropriate in many applications.
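
The database example above can be sketched in a few lines. This is one possible policy under stated assumptions, not a rule from the standard: the function name language_field is hypothetical, and falling back to "mul" is only sensible in applications that allow a record to refer to more than one language.

```python
from typing import Iterable

UNDETERMINED = "und"  # "undetermined language"
MULTIPLE = "mul"      # "multiple languages"


def language_field(languages: Iterable[str]) -> str:
    """Pick the single identifier to store for a record.

    `languages` holds the identifiers determined for the record's
    content; it may be empty if nothing could be determined.
    """
    distinct = set(languages)
    if not distinct:
        return UNDETERMINED
    if len(distinct) == 1:
        return next(iter(distinct))
    return MULTIPLE
```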

4.2.7 Extinct, ancient and historic languages

This part of ISO 639 includes identifiers that denote extinct languages as well as living languages. The criteria

for identifying distinct languages in the case of varieties that have gone extinct in recent times are as defined

above. In the case of ancient languages, a criterion based on intelligibility would be ideal, but in the final

analysis, identifiers will be assigned to ancient languages which have a distinct literature and are treated

distinctly by the scholarly community. In order to qualify for inclusion in this code, the language must have an

attested literature or be well-documented as a language known to have been spoken by some particular

community at some point in history; it may not be a reconstructed language inferred from historical-

comparative analysis. The code also includes identifiers that denote historic languages that are considered to

be distinct from any modern languages that may be descended from them; for instance, Old English and

Middle English. Here, too, the criterion is that the language must have a literature that is treated distinctly by

the scholarly community.
4.2.8 Constructed languages

This part of ISO 639 includes identifiers that denote constructed (or artificial) languages that meet the

following criteria:
⎯ the language has a body of literature read by members of some community;
⎯ the language is designed for the purpose of human communication.

Specifically excluded are reconstructed languages and computer programming languages.

4.2.9 Scripts

A single language identifier is provided for a language even though the language may be written in more than

one script. See 4.7 for further discussion.
4.2.10 Local-use identifiers

Identifiers qaa through qtz are reserved for local use. These identifiers may be used locally, but may not be

used in interchange except by private agreement between parties.
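
An application that exchanges data may want to detect local-use identifiers before interchange. The sketch below assumes a hypothetical helper name is_local_use; it treats the range qaa through qtz as the letter q, followed by a letter from a to t, followed by any letter from a to z.

```python
import re

# qaa through qtz: 'q', then a-t, then a-z (20 x 26 = 520 identifiers).
_LOCAL_USE = re.compile(r"q[a-t][a-z]")


def is_local_use(identifier: str) -> bool:
    """Return True if `identifier` falls in the reserved local-use range."""
    return _LOCAL_USE.fullmatch(identifier) is not None
```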
4.3 Documentation of the intended denotation of identifiers

This part of ISO 639 provides a code table consisting of a set of language code elements. This table is

published and maintained by the Registration Authority of ISO 639-3 (ISO 639-3/RA). For more information,

see 4.5.
The normative content of each language code element consists of two parts
...

INTERNATIONAL STANDARD ISO 639-3
First edition
2007-02-01
Codes pour la représentation des noms
de langues —
Partie 3:
Code alpha-3 pour un traitement
exhaustif des langues
Codes for the representation of names of languages —
Part 3: Alpha-3 code for comprehensive coverage of languages
Reference number
ISO 639-3:2007(F)
ISO 2007
PDF – Exonération de responsabilité

Le présent fichier PDF peut contenir des polices de caractères intégrées. Conformément aux conditions de licence d'Adobe, ce fichier

peut être imprimé ou visualisé, mais ne doit pas être modifié à moins que l'ordinateur employé à cet effet ne bénéficie d'une licence

autorisant l'utilisation de ces polices et que celles-ci y soient installées. Lors du téléchargement de ce fichier, les parties concernées

acceptent de fait la responsabilité de ne pas enfreindre les conditions de licence d'Adobe. Le Secrétariat central de l'ISO décline toute

responsabilité en la matière.
Adobe est une marque déposée d'Adobe Systems Incorporated.

Les détails relatifs aux produits logiciels utilisés pour la création du présent fichier PDF sont disponibles dans la rubrique General Info

du fichier; les paramètres de création PDF ont été optimisés pour l'impression. Toutes les mesures ont été prises pour garantir

l'exploitation de ce fichier par les comités membres de l'ISO. Dans le cas peu probable où surviendrait un problème d'utilisation,

veuillez en informer le Secrétariat central à l'adresse donnée ci-dessous.
© ISO 2007

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or the ISO member body in the country of the requester.

ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Three-letter language code
4.1 Form of the language identifiers
4.2 Denotation of language identifiers
4.3 Documentation of the intended denotation of identifiers
4.4 Relationship between ISO 639-2 and ISO 639-3
4.5 Registration authorities and maintenance of the code
4.6 Application of language identifiers
4.7 Scripts and regions
5 Language code tables
Annex A (normative) Procedures for the Registration Authority and the Registration Authorities Advisory Committee for ISO 639
Bibliography
Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of intellectual property rights or similar rights. ISO shall not be held responsible for failing to identify any such rights or to give notice of their existence.

ISO 639-3 was prepared by Technical Committee ISO/TC 37, Terminology and other language and content resources, Subcommittee SC 2, Terminographical and lexicographical working methods.

ISO 639 consists of the following parts, under the general title Codes for the representation of names of languages:
⎯ Part 1: Alpha-2 code
⎯ Part 2: Alpha-3 code
⎯ Part 3: Alpha-3 code for comprehensive coverage of languages
The following parts are under preparation:
⎯ Part 4: Implementation guidelines and general principles for language coding
⎯ Part 5: Alpha-3 code for language families and groups
⎯ Part 6: Alpha-4 representation for comprehensive coverage of language variation
Introduction

ISO 639 provides three language codes for the representation of names of languages: one is a two-letter code (ISO 639-1) and the other two are three-letter codes (ISO 639-2 and ISO 639-3). ISO 639-1 was devised primarily for use in terminology, lexicography and linguistics. ISO 639-2 was devised primarily for use in terminology and bibliography. It covers all the languages contained in ISO 639-1, together with all the other languages and language groups used by these applications. ISO 639-3 was devised to provide a comprehensive set of language identifiers for use in a wider range of applications, including linguistics, lexicography and the internationalization of information systems. It attempts to represent all known languages.

The three-letter codes in ISO 639-2 and ISO 639-3 are complementary and compatible. The two codes were devised for different purposes. The set of individual languages listed in ISO 639-2 is a subset of those listed in ISO 639-3. The codes differ in that ISO 639-2 includes code elements representing some individual languages and also groups of languages, whereas ISO 639-3 includes code elements for all known individual languages but not for groups of languages. Overall, the set of individual languages listed in ISO 639-3 is much larger than the set of individual languages listed in ISO 639-2.

The languages listed in ISO 639-1 are a subset of the languages listed in ISO 639-2; every code element in the two-letter language code has a corresponding code element in ISO 639-2, but not necessarily vice versa. Likewise, the non-collective code elements listed in ISO 639-2 are a subset of those listed in ISO 639-3; every non-collective code element in ISO 639-2 is included in ISO 639-3, but not necessarily vice versa. The denotation represented by alpha-3 identifiers included in both ISO 639-2 and ISO 639-3 is the same in each part, and the denotation represented by alpha-2 identifiers in ISO 639-1 is the same as that represented by the corresponding alpha-3 identifiers in ISO 639-2 and ISO 639-3.

All three language codes are to be considered open lists.

The large number of languages in the initial inventory of ISO 639-3, beyond those already included in ISO 639-2, was derived from Ethnologue [1], with the addition of ancient, historic or artificial languages obtained from Linguist List [2], [3].

This part of ISO 639 also includes guidelines for the creation of language code elements and their use in a number of applications.
INTERNATIONAL STANDARD ISO 639-3:2007(F)
Codes for the representation of names of languages
Part 3: Alpha-3 code for comprehensive coverage of languages
1 Scope

This part of ISO 639 provides a code, published by the Registration Authority of ISO 639-3, consisting of language code elements comprising three-letter language identifiers for the representation of languages. The language identifiers according to this part of ISO 639 were devised for use in a wide range of applications, especially in computer systems, where there is potential need to support the large number of languages that are known to exist. Whereas ISO 639-1 and ISO 639-2 focus on the major languages of the world that are most frequently represented in the total body of the world's literature, this part of ISO 639 attempts to provide as complete an enumeration of languages as possible, including living, extinct, ancient and constructed languages, whether major or minor, written or unwritten. As a result, this part of ISO 639 deals with a very large number of lesser-known languages. Languages designed exclusively for machine use, such as computer-programming languages, and reconstructed languages are not included in this code.

Knowledge of the world's languages at any given time is never complete or perfect. Additional language identifiers may be created when it becomes apparent that there is a linguistic variety that is deemed to be distinct from other languages, in accordance with the definitions in Clause 3 and their elaboration in Clause 4. In addition, the denotation of existing identifiers may be revised, or identifiers may be deprecated, when it becomes apparent that they do not accurately reflect actual language distinctions. When changes are made, great care will be taken to minimize adverse effects on existing implementations.
2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO 3166-1, Codes for the representation of names of countries and their subdivisions - Part 1: Country codes

ISO 15924, Information and documentation - Codes for the representation of names of scripts

3 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

3.1
code

set of data transformed or represented in different forms according to a pre-established set of rules
3.2
code element
individual entry in a code table
3.3
language identifier
symbol that uniquely identifies a particular language

NOTE 1 In the language code described in this part of ISO 639, each language identifier is composed of three letters.

NOTE 2 In this part of ISO 639, each language identifier represents the various language names used to designate a particular language.
3.4
name
reference name
appellation
linguistic expression used to designate an individual concept

NOTE 1 In this part of ISO 639, a language name is used to designate the concept of a particular language.

NOTE 2 In this part of ISO 639, the names used to designate a language may be expressions from one or more specified source languages, such as English or French. However, there is no guarantee that a complete set of names from any particular language will be provided, or that the source language for any given name will be indicated.

NOTE 3 In the initial code table for this part of ISO 639, the names used for many languages are derived from Ethnologue [1]. These names may change in future revisions of this part of ISO 639.

NOTE 4 In this part of ISO 639, a language name is considered normative insofar as it designates a particular language. The actual form of a name is not fixed.

NOTE 5 In this part of ISO 639, reference names may include parenthetical information that is not customarily used to designate a given language, in order to distinguish between distinct languages that have the same name (see 4.3).
3.5
language code element
code element (3.2) in a language code table

NOTE In the language code table published by the Registration Authority of ISO 639-3 (see 4.5), each language code element consists of a language identifier and one or more language names.

3.6
scope

attribute of a language code element (3.5) that pertains to the range of language varieties to which it corresponds, and to the nature of the relationship between that language code element and other language code elements

NOTE For the purposes of this part of ISO 639, language code elements have one of the following four scopes: individual language, macrolanguage, collection or special (see 4.2).

3.7
individual language code element

language code element (3.5) with a scope (3.6) representing an individual language

NOTE The language represented by an individual language code element is considered distinct from the languages represented by any other individual language code element; hence, there is no overlap between different individual language code elements. See 4.2.2 for a more detailed explanation of the notion of individual language.
3.8
macrolanguage code element

language code element (3.5) with a scope (3.6) representing multiple, closely related individual languages that are deemed to be a single language in some usage contexts

NOTE Each macrolanguage code element has a normative mapping to the individual language code elements representing the individual languages encompassed by the macrolanguage. This normative mapping between macrolanguage code elements and individual language code elements is documented in the code tables included in this part of ISO 639. See 4.2.3 for a more detailed explanation of the notion of macrolanguage.

3.9
collective language code element

language code element with a scope (3.6) representing a group of individual languages that are not deemed to be a single language in any usage context

NOTE The language code given in this part of ISO 639 does not include collective language code elements.

4 Three-letter language code
4.1 Form of the language identifiers

The language identifiers consist of a sequence of three letters, each taken from the 26 lowercase letters of the Latin alphabet: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, excluding any diacritical marks and modified characters.
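
The form rule in 4.1 can be checked mechanically. The following is a minimal Python sketch; the name `is_well_formed` is an assumption introduced for illustration. It tests only the form of an identifier, not whether the identifier is actually assigned in the code table.

```python
import re

# Exactly three letters from the 26 lowercase Latin letters a-z.
# Without re.IGNORECASE, [a-z] matches only the unaccented ASCII letters,
# so diacritics and modified characters are rejected.
_IDENTIFIER_FORM = re.compile(r"[a-z]{3}")


def is_well_formed(identifier: str) -> bool:
    """Return True if identifier has the form required by 4.1."""
    return _IDENTIFIER_FORM.fullmatch(identifier) is not None
```

A form check of this kind is a cheap first filter; deciding whether a well-formed identifier denotes a language still requires a lookup in the published code table.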

Language identifiers are not intended to be abbreviations of the language name but to serve as symbols that uniquely identify a given language. With thousands of languages, many pairs of which have similar names, it is not possible to provide identifiers that resemble the language name in every case. In many cases the language identifiers bear some resemblance to the language name, but this is not guaranteed. Many languages have variant names used by different communities inside or outside the language community. In such cases, the form of the language identifier does not imply that a name resembling the identifier is considered preferred.

To ensure continuity and stability of the code, the identifier for a given language shall not be changed, even though the names listed with an identifier may change. Occasionally, for compelling reasons, a code element may be deprecated. In that case, the identifier for that code element shall not be reassigned. (See 4.5.2 for more information on maintenance of the code.)
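
The stability rules above (names may change, identifiers may not, and a deprecated identifier is never reassigned) can be sketched as a small registry. This is an illustrative Python model only; the class `IdentifierRegistry` and its methods are assumptions and do not describe how the Registration Authority maintains the code.

```python
class IdentifierRegistry:
    """Toy registry enforcing the stability rules of 4.1."""

    def __init__(self):
        self._active = {}       # identifier -> list of names
        self._deprecated = set()  # retired identifiers, never reused

    def assign(self, identifier, names):
        # An identifier that is in use or has ever been deprecated
        # cannot be assigned again.
        if identifier in self._active or identifier in self._deprecated:
            raise ValueError(f"{identifier!r} cannot be (re)assigned")
        self._active[identifier] = list(names)

    def rename(self, identifier, names):
        # The names listed with an identifier may change;
        # the identifier itself stays the same.
        if identifier not in self._active:
            raise KeyError(identifier)
        self._active[identifier] = list(names)

    def deprecate(self, identifier):
        # Removes the code element from active use and retires its identifier.
        del self._active[identifier]
        self._deprecated.add(identifier)

    def names(self, identifier):
        return list(self._active[identifier])
```

Keeping deprecated identifiers in a separate set, rather than deleting them outright, is what makes the "shall not be reassigned" rule enforceable.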

When this part of ISO 639 is adapted for languages that use writing systems other than the Latin script (for example, Cyrillic), the language identifiers shall be formed using the Latin alphabet in accordance with the principles of this part of ISO 639.
4.2 Denotation of language identifiers
4.2.1 General

A language identifier represents one or more language names, all of which designate the same language. The ultimate objects of identification are the languages themselves; language names are the formal means by which the languages denoted by the language identifiers are designated.

Every language corresponds to some range of variation in linguistic expression. In this part of ISO 639, it is therefore assumed that language identifiers generally denote some range of language varieties. The range of varieties denoted can have three different scopes: individual language, macrolanguage or collection.

ISO 639 includes identifiers for certain special-purpose categories, such as "undetermined language", that do not directly denote a range of language varieties. In this part of ISO 639, they are treated as having the scope "special". Thus, every entry in this part of ISO 639 is considered to have one of four scopes: individual language, macrolanguage, collection or special.

The languages represented in ISO 639 can be of various types: living languages, ancient languages, constructed languages, etc.

This part of ISO 639 provides identifiers for languages of different types and scopes. Subclauses 4.2.2 to 4.2.10 explain in more detail how identifiers are assigned in this part of ISO 639 for the different scopes and for different types of languages.
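
The four scopes named in 4.2.1, together with the structure of a language code element given in 3.5 (one identifier plus one or more names), can be modeled roughly as follows. This is a Python sketch under stated assumptions: the class and attribute names are illustrative and are not taken from the published code tables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Scope(Enum):
    INDIVIDUAL = "individual language"
    MACROLANGUAGE = "macrolanguage"
    COLLECTION = "collection"
    SPECIAL = "special"


@dataclass(frozen=True)
class LanguageCodeElement:
    identifier: str          # three-letter language identifier
    names: Tuple[str, ...]   # one or more language names
    scope: Scope

    def denotes_language_varieties(self) -> bool:
        # Entries with the "special" scope do not directly denote
        # a range of language varieties.
        return self.scope is not Scope.SPECIAL


# Two entries mentioned in the text: "ara" (Arabic) and "und" (undetermined).
arabic = LanguageCodeElement("ara", ("Arabic",), Scope.MACROLANGUAGE)
undetermined = LanguageCodeElement("und", ("Undetermined",), Scope.SPECIAL)
```

Making the scope an explicit attribute lets an application decide, for example, to reject special entries where a real language is required.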
4.2.2 Individual languages

In this part of ISO 639, most identifiers are assumed to denote distinct individual languages. Furthermore, one goal of this part of ISO 639 is to provide an identifier for every distinct human language that has been documented, whether living or extinct, and whether primarily spoken, written or signed.

There is no single definition of "language" that is agreed upon by all and appropriate for all purposes. As a result, there can be disagreement, even among speakers or linguistic experts, as to whether two varieties represent dialects of a single language or two distinct languages. For this part of ISO 639, judgments as to whether two varieties are considered to be the same language or different languages are based on a number of factors, including linguistic similarity, intelligibility, a common literature, the views of speakers concerning the relationship between language and identity, and other factors. The following basic criteria are followed:

⎯ Two related varieties are normally considered varieties of the same language if speakers of each variety have inherent understanding of the other variety (that is, they can understand it on the basis of knowledge of their own variety, without needing to learn the other variety) at a functional level.

⎯ Where spoken intelligibility between varieties is marginal, the existence of a common literature or of a common ethnolinguistic identity with a central variety that both understand can be a strong indicator that they should nevertheless be considered varieties of the same language.

⎯ Where there is enough intelligibility between varieties to enable communication, the existence of well-established distinct ethnolinguistic identities can be a strong indicator that they should nevertheless be considered different languages.

Some of the distinctions made on this basis may not be considered appropriate by some users or for certain applications. These basic criteria are nevertheless thought to best fit the intended range of applications (see 4.6).
4.2.3 Macrolanguages

Other parts of ISO 639 include identifiers designated as "individual language identifiers" that correspond in a one-to-many manner with individual language identifiers in this part of ISO 639. For instance, this part of ISO 639 contains over 30 identifiers designated as individual language identifiers for distinct varieties of Arabic, while ISO 639-1 and ISO 639-2 each contain only one identifier for Arabic, "ar" and "ara" respectively, which are designated as "individual language identifiers" in those parts of ISO 639. It is assumed here that the single identifiers for Arabic in ISO 639-1 and ISO 639-2 correspond collectively to the many identifiers for varieties of Arabic in this part of ISO 639.

In this example, it may appear that the single identifiers in ISO 639-1 and ISO 639-2 should be designated as "collective language identifiers". That is not what is assumed here. In various parts of the world, there are clusters of closely related language varieties that, based on the criteria discussed in 4.2.2, can be considered individual languages, yet in certain usage contexts a single language identity for all of them is needed. Typical situations in which this need can arise include the following:

⎯ There is one variety that is more developed and that tends to be used for wider communication by speakers of various closely related languages; as a result, there is a perceived common linguistic identity across these languages. For instance, there are several distinct spoken Arabic languages, but Standard Arabic is generally used in business and media across all of these communities, and is also an important aspect of a shared ethno-religious unity. As a result, there is a perceived common linguistic identity.

⎯ There is a common written form used for multiple closely related languages. For instance, multiple Chinese languages share a common written form.

⎯ There is a transitional sociolinguistic situation in which sub-communities of a single language community are diverging, creating a need for some purposes to recognize distinct languages while, for other purposes, a single common identity is still valid. For instance, in some contexts it is necessary to make a distinction between the Bosnian, Croatian and Serbian languages, yet in other contexts these distinctions are not discernible in the language resources that are in use.

Where such situations exist, an identifier for the single, common language identity is considered in this part of ISO 639 to be a macrolanguage identifier.

Macrolanguages are distinguished from language collections in that the individual languages that correspond to a macrolanguage must be very closely related, and there must be some domain in which only a single language identity is recognized.
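
The one-to-many correspondence between a macrolanguage identifier and its individual language identifiers can be illustrated with a lookup table. The Python sketch below uses a small illustrative subset of mappings for the three examples discussed above (Arabic, Chinese, and Bosnian/Croatian/Serbian); it is not the normative mapping, which is the one documented in the code tables. The names `MACROLANGUAGE_MEMBERS` and `same_language_identity` are assumptions.

```python
# Illustrative subset only; the normative mapping is in the published code tables.
MACROLANGUAGE_MEMBERS = {
    "ara": frozenset({"arb", "arz"}),         # Arabic: Standard Arabic, Egyptian Arabic
    "zho": frozenset({"cmn", "yue"}),         # Chinese: Mandarin Chinese, Yue Chinese
    "hbs": frozenset({"bos", "hrv", "srp"}),  # Serbo-Croatian: Bosnian, Croatian, Serbian
}

# Reverse index: individual language identifier -> macrolanguage identifier.
MEMBER_TO_MACROLANGUAGE = {
    member: macro
    for macro, members in MACROLANGUAGE_MEMBERS.items()
    for member in members
}


def same_language_identity(a: str, b: str) -> bool:
    """True if a and b are the same identifier or share a macrolanguage.

    Models a usage context in which only the single, common
    language identity matters.
    """
    return MEMBER_TO_MACROLANGUAGE.get(a, a) == MEMBER_TO_MACROLANGUAGE.get(b, b)
```

An application that needs the finer distinction simply compares the individual identifiers directly; one that needs the common identity compares after mapping, as `same_language_identity` does.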
4.2.4 Dialects

The linguistic varieties denoted by each of the identifiers in this part of ISO 639 are assumed to be distinct languages and not dialects of a language, even though, for some purposes, some users may consider a variety listed in this part of ISO 639 to be a "dialect" rather than a "language" (see 4.2.2 and 4.2.3). In this part of ISO 639, the term dialect denotes any sub-variety of a language, such as one based on geographic region, age, sex, social class, time period or a similar factor.

The dialects of a language are included within the denotation represented by the identifier for that language. Thus, each language identifier represents the complete range of all the spoken or written varieties of that language, including any standardized form.

For applications that need to identify dialects, a separate standard may be developed that provides dialect identifiers or that combines identifiers from this part or other parts of ISO 639 with other distinguishing identification qualifiers. See 4.7 for more details.

4.2.5 Collective language code elements

While ISO 639-2 includes identifiers for collections of languages and also uses three-letter identifiers, this part of ISO 639 provides identifiers only for individual languages and macrolanguages.
4.2.6 Special language code elements

ISO 639 includes identifiers for certain special concepts, such as "undetermined language". Unlike code elements with other scopes, these code elements do not directly denote a range of language varieties, but are provided to meet various specific needs in applications.

For example, if an application requires every record in a database to carry an ISO 639 language identifier, the identifier "und", denoting "undetermined language", makes it possible to classify records whose language cannot be determined.

Another special language code element, in this part of ISO 639 and also in ISO 639-2, is "mul", denoting "multiple languages". This code element makes it possible to declare that an information object includes content in multiple languages or applies in some way to multiple languages. In many applications, however, information is organized in a way that presupposes that each language identifier refers to only one language. The use of "mul" will therefore be inappropriate in many applications.
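
The use of the special identifiers "und" and "mul" described above can be sketched for a database whose records must each carry exactly one language identifier. This is a minimal Python illustration; the function name `record_language` and the idea of collapsing a list of detected languages are assumptions, not requirements of this part of ISO 639.

```python
from typing import Iterable, Optional

UNDETERMINED = "und"  # language cannot be determined
MULTIPLE = "mul"      # content in multiple languages


def record_language(detected: Iterable[Optional[str]]) -> str:
    """Collapse the languages found in one record to a single identifier."""
    known = {code for code in detected if code}
    if not known:
        # Nothing could be identified: classify the record as undetermined.
        return UNDETERMINED
    if len(known) == 1:
        return next(iter(known))
    # More than one distinct language: declare multiple languages.
    return MULTIPLE
```

As the text cautions, "mul" discards which languages are present; an application that needs that information would store the full list of identifiers instead of collapsing it.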
4.2.7 Extinct, ancient and historic languages

This part of ISO 639 includes identifiers that denote extinct languages as well as living languages. The criteria for identifying distinct languages in the case of recently extinct varieties are as defined above. In the case of ancient languages, a criterion based on intelligibility

...
