ISO/IEC 10646-1:1993/Cor 2:1998
Information technology — Universal Multiple-Octet Coded Character Set (UCS) — Part 1: Architecture and Basic Multilingual Plane — Technical Corrigendum 2
Technologies de l'information — Jeu universel de caractères codés à plusieurs octets — Partie 1: Architecture et table multilingue — Rectificatif technique 2
INTERNATIONAL STANDARD ISO/IEC 10646-1:1993
TECHNICAL CORRIGENDUM 2
Published 1998-07-01
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION • МЕЖДУНАРОДНАЯ ОРГАНИЗАЦИЯ ПО СТАНДАРТИЗАЦИИ • ORGANISATION INTERNATIONALE DE NORMALISATION
INTERNATIONAL ELECTROTECHNICAL COMMISSION • МЕЖДУНАРОДНАЯ ЭЛЕКТРОТЕХНИЧЕСКАЯ КОМИССИЯ • COMMISSION ÉLECTROTECHNIQUE INTERNATIONALE
Information technology — Universal Multiple-Octet Coded
Character Set (UCS) —
Part 1:
Architecture and Basic Multilingual Plane
TECHNICAL CORRIGENDUM 2
Technologies de l’information — Jeu universel de caractères codés à plusieurs octets —
Partie 1: Architecture et table multilingue
RECTIFICATIF TECHNIQUE 2
Technical Corrigendum 2 to International Standard ISO/IEC 10646-1:1993 was prepared by Joint Technical
Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 2, Coded character sets.
4 Definitions
Replace 4.2 (block) with:
4.2 block: A contiguous range of code positions to which a set of characters that share common characteristics,
such as script, are allocated. A block cannot overlap another block. One or more of the code positions within a
block may have no character allocated to it.
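As an informal illustration, not part of the normative text, the non-overlap requirement in 4.2 can be checked mechanically. The sketch below is only one possible approach. The function name `find_overlaps` and the sample block names and ranges are hypothetical and are not taken from A.2.

```python
from typing import List, Tuple

# (name, first code position, last code position), both ends inclusive
Block = Tuple[str, int, int]

def find_overlaps(blocks: List[Block]) -> List[Tuple[str, str]]:
    """Return pairs of block names whose code-position ranges overlap."""
    ordered = sorted(blocks, key=lambda b: b[1])
    overlaps = []
    for i, (name_a, _first_a, last_a) in enumerate(ordered):
        for name_b, first_b, _last_b in ordered[i + 1:]:
            if first_b > last_a:
                break  # sorted by first position: no later block can overlap
            overlaps.append((name_a, name_b))
    return overlaps

# Hypothetical sample data, for illustration only.
adjacent = [("X", 0x0100, 0x017F), ("Y", 0x0180, 0x024F)]
clashing = adjacent + [("Z", 0x0170, 0x0190)]

print(find_overlaps(adjacent))  # []
print(find_overlaps(clashing))  # [('X', 'Z'), ('Z', 'Y')]
```

Adjacent blocks that merely touch (one ends at 017F, the next starts at 0180) are not reported, since they share no code position.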
Renumber 4.11 to 4.15 as 4.12 to 4.16.
Renumber 4.16 to 4.37 as 4.18 to 4.39 (see Amendment 1).
After 4.10 (code table) insert new definition as follows:
4.11 collection: A set of coded characters which is numbered and named and which consists of those coded
characters whose code positions lie within one or more identified ranges.
NOTE - If any of the identified ranges include code positions to which no character is allocated, the repertoire of the collection will change
if an additional character is assigned to any of those positions at a future amendment of this part of ISO/IEC 10646. However it is
intended that the collection number and name will remain unchanged in future editions of this part of ISO/IEC 10646.
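As an informal illustration, not part of the normative text, a collection as defined in 4.11 can be modelled as a number, a name, and a list of identified ranges. The sketch below uses the ranges given for collections 13 and 16 in Annex A. The names `COLLECTIONS` and `in_collection` are hypothetical. The function tests only whether a code position lies within the identified ranges; whether a character is actually allocated at that position is a separate question that this sketch does not answer.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (first, last) code positions

# Ranges copied from Annex A; single positions are written as one-element ranges.
COLLECTIONS: Dict[int, Tuple[str, List[Range]]] = {
    13: ("HEBREW EXTENDED", [(0x0590, 0x05CF), (0x05EB, 0x05FF)]),
    16: ("DEVANAGARI", [(0x0900, 0x097F), (0x200C, 0x200C), (0x200D, 0x200D)]),
}

def in_collection(code_position: int, number: int) -> bool:
    """True if code_position lies within an identified range of the collection."""
    _name, ranges = COLLECTIONS[number]
    return any(first <= code_position <= last for first, last in ranges)

print(in_collection(0x05EB, 13))  # True: inside the second range
print(in_collection(0x05D0, 13))  # False: falls between the two ranges
print(in_collection(0x200D, 16))  # True: listed as a single position
```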
ICS 35.040 Ref. No. ISO/IEC 10646-1:1993/Cor.2:1998(E)
Descriptors: data processing, information interchange, text processing, graphic characters, character sets, representation of characters,
coded character sets, architecture.
© ISO/IEC 1998
Printed in Switzerland
After 4.16 (default state) insert new definition as follows:
4.17 fixed collection: A collection in which every code position within the identified range(s) has a character
allocated to it, and which is intended to remain unchanged in future editions of this part of ISO/IEC 10646.
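As an informal illustration, not part of the normative text, the first condition in 4.17 (every code position within the identified ranges has a character allocated) can be tested against an allocation table. The sketch below uses the range given for collection 12 in Annex A. The function name `is_fully_allocated` and both allocation sets are hypothetical. The second condition, that the collection is intended to remain unchanged in future editions, is a statement of intent and cannot be checked in code.

```python
from typing import Iterable, Set, Tuple

Range = Tuple[int, int]  # inclusive (first, last) code positions

def is_fully_allocated(ranges: Iterable[Range], allocated: Set[int]) -> bool:
    """True if every code position in every range has a character allocated."""
    return all(
        cp in allocated
        for first, last in ranges
        for cp in range(first, last + 1)
    )

# Collection 12 BASIC HEBREW (05D0 - 05EA) is marked fixed in Annex A.
basic_hebrew = [(0x05D0, 0x05EA)]

# Hypothetical allocation tables, for illustration only.
all_allocated = set(range(0x05D0, 0x05EA + 1))
one_gap = all_allocated - {0x05E0}

print(is_fully_allocated(basic_hebrew, all_allocated))  # True
print(is_fully_allocated(basic_hebrew, one_gap))        # False
```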
19 Block names
Replace entire clause with:
Named blocks of contiguous code positions are specified within a plane for the purpose of allocation of characters
sharing some common characteristic, such as script. The blocks specified within the BMP are listed in A.2 of
Annex A, and are illustrated in Figures 3 and 4 (see Amendment 5).
Annex A
Replace Annex A with revised text as on the following pages.
Annex A
(normative)
Collections of graphic characters for subsets
A.1 Collections of coded graphic characters
The collections listed below are ordered by collection number. An * in the “positions” column indicates that the collection is a fixed collection.
See Note 2 for an alphabetically-ordered index of the principal terms used in the names of these collections.
NOTE 1 - Use of implementation levels 1 and 2 restricts the repertoire of some character collections (see 23.4). Collections which include combining characters are 7, 10, 13 to 26, 35, 49, 50, 63, 65 and 72.
The following collections are from the Basic Multilingual Plane.
Collection number and name                     Positions
1 BASIC LATIN                                  0020 - 007E *
2 LATIN-1 SUPPLEMENT                           00A0 - 00FF *
3 LATIN EXTENDED-A                             0100 - 017F *
4 LATIN EXTENDED-B                             0180 - 024F
5 IPA EXTENSIONS                               0250 - 02AF
6 SPACING MODIFIER LETTERS                     02B0 - 02FF
7 COMBINING DIACRITICAL MARKS                  0300 - 036F
8 BASIC GREEK                                  0370 - 03CF
9 GREEK SYMBOLS AND COPTIC                     03D0 - 03FF
10 CYRILLIC                                    0400 - 04FF
11 ARMENIAN                                    0530 - 058F
12 BASIC HEBREW                                05D0 - 05EA *
13 HEBREW EXTENDED                             0590 - 05CF
                                               05EB - 05FF
14 BASIC ARABIC                                0600 - 065F
15 ARABIC EXTENDED                             0660 - 06FF
16 DEVANAGARI                                  0900 - 097F
                                               200C, 200D
17 BENGALI                                     0980 - 09FF
                                               200C, 200D
18 GURMUKHI                                    0A00 - 0A7F
                                               200C, 200D
19 GUJARATI                                    0A80 - 0AFF
                                               200C, 200D
20 ORIYA                                       0B00 - 0B7F
                                               200C, 200D
21 TAMIL                                       0B80 - 0BFF
                                               200C, 200D
22 TELUGU                                      0C00 - 0C7F
                                               200C, 200D
23 KANNADA                                     0C80 - 0CFF
                                               200C, 200D
24 MALAYALAM                                   0D00 - 0D7F
                                               200C, 200D
25 THAI                                        0E00 - 0E7F
26 LAO                                         0E80 - 0EFF
27 BASIC GEORGIAN                              10D0 - 10FF
28 GEORGIAN EXTENDED                           10A0 - 10CF
29 HANGUL JAMO                                 1100 - 11FF
30 LATIN EXTENDED ADDITIONAL                   1E00 - 1EFF
31 GREEK EXTENDED                              1F00 - 1FFF
32 GENERAL PUNCTUATION                         2000 - 206F
33 SUPERSCRIPTS AND SUBSCRIPTS                 2070 - 209F
34 CURRENCY SYMBOLS                            20A0 - 20CF
35 COMBINING DIACRITICAL MARKS FOR SYMBOLS     20D0 - 20FF
36 LETTERLIKE SYMBOLS                          2100 - 214F
37 NUMBER FORMS                                2150 - 218F
38 ARROWS                                      2190 - 21FF
39 MATHEMATICAL OPERATORS                      2200 - 22FF
40 MISCELLANEOUS TECHNICAL                     2300 - 23FF
41 CONTROL PICTURES                            2400 - 243F
42 OPTICAL CHARACTER RECOGNITION               2440 - 245F
43 ENCLOSED ALPHANUMERICS                      2460 - 24FF
44 BOX DRAWING                                 2500 - 257F *
45 BLOCK ELEMENTS                              2580 - 259F
46 GEOMETRIC SHAPES                            25A0 - 25FF
47 MISCELLANEOUS SYMBOLS                       2600 - 26FF
48 DINGBATS                                    2700 - 27BF
49 CJK SYMBOLS AND PUNCTUATION                 3000 - 303F
50 HIRAGANA                                    3040 - 309F
51 KATAKANA                                    30A0 - 30FF
52 BOPOMOFO                                    3100 - 312F
53 HANGUL COMPATIBILITY JAMO                   3130 - 318F
54 CJK MISCELLANEOUS                           3190 - 319F
55 ENCLOSED CJK LETTERS AND MONTHS             3200 - 32FF
56 CJK COMPATIBILITY                           3300 - 33FF
57 [deleted at AMD.5]
58 [deleted at AMD.5]
59 [deleted at AMD.5]
60 CJK UNIFIED IDEOGRAPHS                      4E00 - 9FFF
61 PRIVATE USE AREA                            E000 - F8FF
62 CJK COMPATIBILITY IDEOGRAPHS                F900 - FAFF
63 ALPHABETIC PRESENTATION FORMS               FB00 - FB4F
64 ARABIC PRESENTATION FORMS-A                 FB50 - FDFF
65 COMBINING HALF MARKS                        FE20 - FE2F
66 CJK COMPATIBILITY FORMS                     FE30 - FE4F
67 SMALL FORM VARIANTS                         FE50 - FE6F
68 ARABIC PRESENTATION FORMS-B                 FE70 - FEFE
69 HALFWIDTH AND FULLWIDTH FORMS               FF00 - FFEF
70 SPECIALS                                    FFF0 - FFFD
71 HANGUL SYLLABLES                            AC00 - D7A3 *
72 BASIC TIBETAN                               0F00 - 0FBF
The following collections specify characters used for alternate formats and script-specific formats. See annex D for more information.
200 ZERO-WIDTH B
...
205 CHARACTER SHAPING SELECTORS                206A - 206D
206 NUMERIC SHAPE SELECTORS                    206E - 206F
The following specify collections which are the union of particular collections defined above.
250 GENERAL FORMAT CHARACTERS                  Collections 200 - 203
251 SCRIPT-SPECIFIC FORMAT CHARACTERS          Collections 204 - 206
The following specify other collections.
270 COMBINING CHARACTERS                       characters specified in annex B.1
271 COMBINING CHARACTERS B-2                   characters specified in annex B.2
[299 BMP FIRST EDITION]                        see A.3 *
300 BMP                                        0000 - D7FF
                                               E000 - FFFD
301 BMP-AMD.7                                  see A.3 *
The following collections are outside the Basic Multilingual Plane.
400 PRIVATE USE PLANES                         G=00, P=0F, 10 & E0 - FF
500 PRIVATE USE GROUPS                         G=60 - 7F
NOTE 2 - The principal terms (keywords) used in the collection names shown above are listed below in alphabetical order. The entry for a term shows the collection number of every collection whose name includes the term. These terms do not provide a complete cross-reference to all the collections where characters sharing a particular attribute, such as script name, may be found. Although most of the terms identify an attribute of the characters within the collection, some characters that possess that attribute may be present in other collections whose numbers do not appear in the entry for that term.
Alphabetic     63
Alphanumeric   43
Arabic         14 15 64 68
...