Information technology — Universal Multiple-Octet Coded Character Set (UCS) — Part 1: Architecture and Basic Multilingual Plane — Amendment 1: Transformation Format for 16 planes of group 00 (UTF-16)

Technologies de l'information — Jeu universel de caractères codés à plusieurs octets — Partie 1: Architecture et table multilingue — Amendement 1: Format de transformation pour 16 tables du groupe 00 (UTF-16)

General Information

Status
Withdrawn
Publication Date
23-Oct-1996
Withdrawal Date
23-Oct-1996
Current Stage
9599 - Withdrawal of International Standard
Completion Date
05-Oct-2000

Standard
ISO/IEC 10646-1:1993/Amd 1:1996 - Transformation Format for 16 planes of group 00 (UTF-16)
English language
6 pages

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 10646-1
First edition
1993-05-01
AMENDMENT 1
1996-10-15
Information technology - Universal Multiple-Octet Coded Character Set (UCS) -
Part 1:
Architecture and Basic Multilingual Plane
AMENDMENT 1: Transformation Format for 16 planes of group 00 (UTF-16)
Technologies de l'information - Jeu universel de caractères codés à plusieurs octets -
Partie 1: Architecture et table multilingue
AMENDEMENT 1: Format de transformation pour 16 tables du groupe 00 (UTF-16)
Reference number
ISO/IEC 10646-1:1993/Amd.1:1996(E)

---------------------- Page: 1 ----------------------
ISO/IEC 10646-1:1993/Amd.1:1996 (E)
Contents
Page
Foreword ..... iii
Introduction ..... iv
2 Conformance ..... 1
4 Definitions ..... 1
5 General structure of the UCS ..... 1
7 Special features of the UCS ..... 1
8 The Basic Multilingual Plane ..... 2
9 Other planes ..... 2
9.1 Planes reserved for future standardization ..... 2
9.2 Planes accessible by UTF-16 ..... 2
11 Private Use groups and planes ..... 2
14.1 Two-octet BMP form ..... 2
Annexes
A Collections of graphic characters for subsets ..... 3
F The use of "signatures" to identify UCS ..... 3
M External references to character repertoires ..... 3
Q Transformation format for 16 planes of group 00 (UTF-16) ..... 4
Q.1 Specification of UTF-16 ..... 4
Q.2 Notation ..... 4
Q.3 Mapping between UCS-4 form and UTF-16 form ..... 4
Q.4 Mapping between UTF-16 form and UCS-4 form ..... 5
Q.5 Identification of UTF-16 ..... 5
Q.6 Unpaired RC-elements: Interpretation by receiving devices ..... 5
Q.7 Receiving devices, advisory notes ..... 5
© ISO/IEC 1996
All rights reserved. Unless otherwise specified, no part of this publication may be
reproduced or utilized in any form or by any means, electronic or mechanical,
including photocopying and microfilm, without permission in writing from the
publisher.
ISO/IEC Copyright Office • Case postale 56 • CH-1211 Genève 20 • Switzerland
Printed in Switzerland

---------------------- Page: 2 ----------------------
Foreword
ISO (the International Organization for Standardization) and IEC (the
International Electrotechnical Commission) form the specialized system for
worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical
committees established by the respective organization to deal with particular
fields of technical activity. ISO and IEC technical committees collaborate in
fields of mutual interest. Other international organizations, governmental and
non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint
technical committee, ISO/IEC JTC 1. Draft International Standards adopted
by the joint technical committee are circulated to national bodies for voting.
Publication as an International Standard requires approval by at least 75 % of
the national bodies casting a vote.
Amendment 1 to International Standard ISO/IEC 10646-1:1993 was prepared
by Joint Technical Committee ISO/IEC JTC 1, Information technology.

---------------------- Page: 3 ----------------------
Introduction
ISO/IEC 10646 specifies the Universal Multiple-Octet Coded Character
Set (UCS). It is applicable to the representation, transmission,
interchange, processing, storage, input and presentation of the written form of
the languages (scripts) of the world as well as additional symbols.
This amendment to ISO/IEC 10646 specifies an additional transformation
format, UTF-16. UTF-16 is a coded representation that permits over a
million graphic characters of the UCS to be represented in a form which is
compatible with the two-octet BMP form.
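
The "over a million" figure can be checked with a little arithmetic. The sketch below assumes the zone boundaries given in Annex Q of the amendment (high-half zone D800 to DBFF, low-half zone DC00 to DFFF), which are not reproduced in this sample; the constant names are illustrative only.

```python
# Assumed boundaries (from Annex Q of the amendment, not shown in this sample).
HIGH_HALF = range(0xD800, 0xDC00)  # cells usable as the first RC-element of a pair
LOW_HALF = range(0xDC00, 0xE000)   # cells usable as the second RC-element of a pair

# Every (high, low) combination addresses one code position outside the BMP.
pairs = len(HIGH_HALF) * len(LOW_HALF)

# The same count expressed as 16 planes of 256 x 256 cells each.
planes_01_to_10 = 16 * 256 * 256

print(pairs, planes_01_to_10)  # 1048576 1048576
```

The two counts agree, which is why the pairs cover exactly the 16 planes named in the title of the amendment.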

---------------------- Page: 4 ----------------------
Information technology - Universal Multiple-Octet Coded
Character Set (UCS) -
Part 1:
Architecture and Basic Multilingual Plane
AMENDMENT 1: Transformation Format for 16 planes of
group 00 (UTF-16)
2 Conformance
Clause 2 applies with the text of 2.2 a) amended to read:
a) all the coded representations of graphic characters within that CC-data-element conform to clauses 6 and 7, to an identified form chosen from clause 14 or Annex Q, and to an identified implementation level chosen from clause 15;

4 Definitions
Clause 4 applies with the following additions and amendments:
Renumber 4.21 - 4.22 as 4.22 - 4.23
Renumber 4.23 - 4.27 as 4.25 - 4.29
Renumber 4.28 - 4.31 as 4.31 - 4.34
Renumber 4.32 - 4.33 as 4.36 - 4.37
4.21 high-half zone: a set of cells reserved for use in UTF-16 (see Annex Q); an RC-element corresponding to any of these cells may be used as the first of a pair of RC-elements which represents a character from a plane other than the BMP.
4.24 low-half zone: a set of cells reserved for use in UTF-16 (see Annex Q); an RC-element corresponding to any of these cells may be used as the second of a pair of RC-elements which represents a character from a plane other than the BMP.
4.30 RC-element: a two-octet sequence comprising the R-octet and the C-octet (see 6.2) from the four-octet sequence that corresponds to a cell in the coding space of this coded character set.
4.35 unpaired RC-element: an RC-element in a CC-data-element that is either:
• an RC-element from the high-half zone that is not immediately followed by an RC-element from the low-half zone, or
• an RC-element from the low-half zone that is not immediately preceded by a high-half RC-element from the high-half zone.

5 General structure of the UCS
Clause 5 applies with the text amended as follows.
In the sixth paragraph on page 4, replace:
The 32 planes with Plane-octet values E0 to FF of Group 00 are for Private Use. The 32 groups with Group-octet values 60 to 7F of this coded character set are also for Private Use.
with:
The planes that are reserved for Private Use are specified in clause 11.
and add the following new paragraph at the end of the clause:
A UCS Transformation Format (UTF-16) is specified in Annex Q which can be used to represent characters from 16 planes of group 00, additional to the BMP, in a form that is compatible with the two-octet BMP form.

7 Special features of the UCS
Clause 7 applies with the text in paragraph 2 amended to read:
2. Code positions to which a character is not allocated, except for the positions reserved for Private Use characters or for transformation formats, are reserved for future standardisation and shall not be used for any other purpose. Future editions of ISO/IEC 10646 will not allocate any characters to code positions reserved for Private Use characters or for transformation formats.
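
The definitions of the high-half zone, the low-half zone and the unpaired RC-element (4.21, 4.24 and 4.35) can be illustrated with a small scanner. This is only a sketch: the zone boundaries and the pair-to-UCS-4 arithmetic are taken from Annex Q of the amendment, which is not part of this sample, and the names `scan`, `is_high_half` and `is_low_half` are hypothetical helpers, not terms from the standard.

```python
from typing import Iterator, List, Tuple

# Assumed boundaries (Annex Q of the amendment; not shown in this sample).
HIGH_FIRST, HIGH_LAST = 0xD800, 0xDBFF
LOW_FIRST, LOW_LAST = 0xDC00, 0xDFFF


def is_high_half(rc: int) -> bool:
    return HIGH_FIRST <= rc <= HIGH_LAST


def is_low_half(rc: int) -> bool:
    return LOW_FIRST <= rc <= LOW_LAST


def scan(rc_elements: List[int]) -> Iterator[Tuple[str, int]]:
    """Yield ("char", ucs4_value) or ("unpaired", rc_element) for each item."""
    i = 0
    while i < len(rc_elements):
        rc = rc_elements[i]
        nxt = rc_elements[i + 1] if i + 1 < len(rc_elements) else None
        if is_high_half(rc) and nxt is not None and is_low_half(nxt):
            # Well-formed pair: combine the two 10-bit payloads and add the
            # offset of the first code position of plane 01.
            yield "char", 0x10000 + ((rc - HIGH_FIRST) << 10) + (nxt - LOW_FIRST)
            i += 2
        elif is_high_half(rc) or is_low_half(rc):
            # High-half not followed by a low-half, or low-half not preceded
            # by a high-half: an unpaired RC-element in the sense of 4.35.
            yield "unpaired", rc
            i += 1
        else:
            # Any other RC-element stands for a BMP character on its own.
            yield "char", rc
            i += 1


result = list(scan([0x0041, 0xD800, 0xDC00, 0xDC00, 0xD800]))
```

For the input shown, `result` holds one BMP character (0041), one character from plane 01 (0001 0000), and then two unpaired RC-elements (DC00 and D800). What a receiving device should do with unpaired RC-elements is the subject of Q.6 and is not modelled here.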

---------------------- Page: 5 ----------------------
8 The Basic Multilingual Plane
Clause 8 applies with the text amended as follows, and the diagram replaced with an amended version as shown. Replace:
The Basic Multilingual Plane shall be divided into four zones:
A-zone: code positions 0000 to 4DFF
I-zone: code positions 4E00 to 9FFF
O-zone: code positions A000 to DFFF
R-zone: code positions E000 to FFFD
with:
The Basic Multilingual Plane shall be divided into five zones:
zone      code positions
A-zone:   0000 0000 to 0000 4DFF
I-zone:   0000 4E00 to 0000 9FFF
O-zone:   0000 A000 to 0000 D7FF
S-zone:   0000 D800 to 0000 DFFF
R-zone:   0000 E000 to 0000 FFFD
The amended version of the diagram is as follows:
      00                          FF
00    A-zone (19903 positions)
4E    I-zone (20992 positions)
A0    O-zone (14336 positions)
D8    S-zone (2048 positions)
E0    R-zone (8190 positions)
Replace:
Code positions 0000 to 001F in the BMP are reserved for control characters, and code position 007F is reserved for the character DELETE (see ...
In the last paragraph, after:
The O-zone is reserved for future standardisation.
insert:
The S-zone is reserved for the use of UTF-16 (see Annex Q).

9 Other planes
Clause 9 is amended to read:
9.1 Planes reserved for future standardisation
Planes 11 to DF in Group 00 and planes 00 to FF in Groups 01 to 5F are reserved for future standardisation, and thus those code positions shall not be used for any other purpose.
9.2 Planes accessible by UTF-16
Each code position in planes 01 to 10 of group 00 has a unique mapping to a four-octet sequence in accordance with the UTF-16 form of coded representation (see Annex Q). This form is compatible with the two-octet BMP form of UCS-2 (see 14.1).
Code positions in planes 11 to FF of group 00, or in planes 00 to FF of other groups, do not have a mapping to the UTF-16 form.

11 Private Use groups and planes
Clause 11 applies with the text amended as follows. Replace:
The code positions of 32 planes from Plane E0 to Plane FF of Group 00 shall be for Private Use.
with:
The code positions of Plane 0F and Plane 10, and of the 32 planes from Plane E0 to Plane FF, of Group 00 shall be for Private Use.

14.1 Two-octet BMP form
Clause 14.1 applies with the text amended as follows. At the end of the second paragraph replace:
in 6.2.
with:
in 6.2
...
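
The zone table of clause 8 and the plane assignments of clauses 9 and 11 above can be summarised as two lookup functions. This is a minimal sketch built only from the ranges quoted in this sample; the function names and the returned strings are hypothetical, and groups other than 00 to 5F are deliberately left out because the amended text shown here does not cover them.

```python
from typing import Optional


def bmp_zone(cell: int) -> Optional[str]:
    """Return the zone letter of a BMP code position, per the amended clause 8."""
    if 0x0000 <= cell <= 0x4DFF:
        return "A"
    if 0x4E00 <= cell <= 0x9FFF:
        return "I"
    if 0xA000 <= cell <= 0xD7FF:
        return "O"
    if 0xD800 <= cell <= 0xDFFF:
        return "S"  # reserved for the use of UTF-16
    if 0xE000 <= cell <= 0xFFFD:
        return "R"
    return None  # FFFE and FFFF (or values outside the BMP) are in no zone


def plane_status(group: int, plane: int) -> str:
    """Summarise clauses 9 and 11 for one plane (octet values 00 to FF)."""
    if not (0x00 <= plane <= 0xFF):
        raise ValueError("plane octet out of range")
    if group == 0x00:
        if plane == 0x00:
            return "BMP"
        if plane in (0x0F, 0x10):
            return "private use, accessible by UTF-16"
        if 0x01 <= plane <= 0x0E:
            return "accessible by UTF-16"
        if 0x11 <= plane <= 0xDF:
            return "reserved for future standardisation"
        return "private use"  # planes E0 to FF
    if 0x01 <= group <= 0x5F:
        return "reserved for future standardisation"
    return "not covered by the text quoted in this sample"


def has_utf16_mapping(group: int, plane: int) -> bool:
    """Clause 9.2: only planes 01 to 10 of group 00 map to UTF-16 pairs."""
    return group == 0x00 and 0x01 <= plane <= 0x10
```

Counting the cells for which `bmp_zone` returns "S" gives 2048, which matches the figure in the amended diagram; the boundary between the O-zone and the S-zone (D7FF and D800) is the one place where the amended table differs from the original four-zone division.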
