ISO/IEC 8859-6:1999
Information technology — 8-bit single-byte coded graphic character sets — Part 6: Latin/Arabic alphabet
This part of ISO/IEC 8859 specifies a set of 146 coded graphic characters identified as Latin/Arabic alphabet. This set of coded graphic characters is intended for use in data and text processing applications and also for information interchange. The set contains graphic characters used for general purpose applications in typical office environments in at least the following languages: Arabic, English and Latin. Some of the characters in this set are combining characters (see clause 6).
This set of coded graphic characters may be regarded as a version of an 8-bit code according to ISO/IEC 2022 or ISO/IEC 4873 at level 1. This part of ISO/IEC 8859 may not be used in conjunction with any other parts of ISO/IEC 8859. If coded characters from more than one part are to be used together, by means of code extension techniques, the equivalent coded character sets from ISO/IEC 10367 should be used instead within a version of ISO/IEC 4873 at level 2 or level 3.
The coded characters in this set may be used in conjunction with coded control functions selected from ISO/IEC 6429. However, control functions are not used to create composite graphic symbols from two or more graphic characters (see clause 6).
NOTE – ISO/IEC 8859 is not intended for use with Telematic services defined by ITU-T. If information coded according to ISO/IEC 8859 is to be transferred to such services, it will have to conform to the requirements of those services at the access-point.
Technologies de l'information — Jeux de caractères graphiques codés sur un seul octet — Partie 6: Alphabet latin/arabe
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 8859-6
First edition
1999-01-15
Information technology — 8-bit single-byte
coded graphic character sets —
Part 6:
Latin/Arabic alphabet
Technologies de l'information — Jeux de caractères graphiques codés sur
un seul octet —
Partie 6: Alphabet latin/arabe
Reference number
ISO/IEC 8859-6:1999(E)
ISO/IEC 8859-6:1999 (E)
Contents
Foreword
Introduction
1 Scope
2 Conformance
3 Normative references
4 Definitions
5 Notation, code table and names
6 Specification of the coded character set
7 Identification of the character set
Annex A: Coverage of languages by parts 1 to 10 of ISO/IEC 8859
Annex B: Main differences between ISO 8859-6:1987 and this first edition of this part of ISO/IEC 8859
Annex C: Bibliography
© ISO/IEC 1999
All rights reserved. Unless otherwise specified, no part of this publication may be
reproduced or utilized in any form or by any means, electronic or mechanical,
including photocopying and microfilm, without permission in writing from the publisher.
ISO/IEC Copyright Office Case Postale 56 CH-1211 Genève 20 Switzerland
Printed in Switzerland
Foreword
ISO (the International Organization for Standardization) and IEC (the
International Electrotechnical Commission) form the specialized
system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of
International Standards through technical committees established by
the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of
mutual interest. Other international organizations, governmental and
nongovernmental, in liaison with ISO and IEC, also take part in the
work.
In the field of information technology, ISO and IEC have established
a joint technical committee, ISO/IEC JTC 1. Draft International
Standards adopted by the joint technical committee are circulated to
national bodies for voting. Publication as an International Standard
requires approval by at least 75% of the national bodies casting a
vote.
International Standard ISO/IEC 8859-6 was prepared by Joint
Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 2, Coded character sets.
This edition cancels and replaces ISO 8859-6:1987 which has been
technically revised.
ISO/IEC 8859 consists of the following parts, under the general title
Information technology – 8-bit single-byte coded graphic character
sets:
– Part 1: Latin alphabet No. 1
– Part 2: Latin alphabet No. 2
– Part 3: Latin alphabet No. 3
– Part 4: Latin alphabet No. 4
– Part 5: Latin/Cyrillic alphabet
– Part 6: Latin/Arabic alphabet
– Part 7: Latin/Greek alphabet
– Part 8: Latin/Hebrew alphabet
– Part 9: Latin alphabet No. 5
– Part 10: Latin alphabet No. 6
Annexes A to C of this part of ISO/IEC 8859 are for information only.
Introduction
ISO/IEC 8859 consists of several parts. Each part specifies a set of
up to 191 graphic characters and the coded representation of these
characters by means of a single 8-bit byte. Each set is intended for
use for a particular group of languages.
INTERNATIONAL STANDARD ISO/IEC 8859-6:1999 (E)
Information technology –
8-bit single-byte coded graphic character sets –
Part 6: Latin/Arabic alphabet
1 Scope
This part of ISO/IEC 8859 specifies a set of 146 coded graphic
characters identified as Latin/Arabic alphabet.
This set of coded graphic characters is intended for use in data and
text processing applications and also for information interchange.
The set contains graphic characters used for general purpose
applications in typical office environments in at least the following
languages: Arabic, English and Latin.
Some of the characters in this set are combining characters (see
clause 6).
This set of coded graphic characters may be regarded as a version of
an 8-bit code according to ISO/IEC 2022 or ISO/IEC 4873 at level 1.
This part of ISO/IEC 8859 may not be used in conjunction with any
other parts of ISO/IEC 8859. If coded characters from more than one
part are to be used together, by means of code extension techniques,
the equivalent coded character sets from ISO/IEC 10367 should be used
instead within a version of ISO/IEC 4873 at level 2 or level 3.
The coded characters in this set may be used in conjunction with
coded control functions selected from ISO/IEC 6429. However, control
functions are not used to create composite graphic symbols from two
or more graphic characters (see clause 6).
NOTE – ISO/IEC 8859 is not intended for use with Telematic services
defined by ITU-T. If information coded according to ISO/IEC 8859 is
to be transferred to such services, it will have to conform to the
requirements of those services at the access-point.
2 Conformance
2.1 Conformance of information interchange
A coded-character-data-element (CC-data-element) within coded
information for interchange is in conformance with this part of
ISO/IEC 8859 if all the coded representations of graphic characters
within that CC-data-element conform to the requirements of clause 6.
2.2 Conformance of devices
A device is in conformance with this part of ISO/IEC 8859 if it
conforms to the requirements of 2.2.1, and either or both of 2.2.2
and 2.2.3. A claim of conformance shall identify the document which
contains the description specified in 2.2.1.
2.2.1 Device description
A device that conforms to this part of ISO/IEC 8859 shall be the
subject of a description that identifies the means by which the user
may supply characters to the device, or may recognize them when they
are made available to him, as specified respectively in 2.2.2 and
2.2.3.
2.2.2 Originating devices
An originating device shall allow its user to supply any sequence of
characters from those specified in clause 6, and shall be capable of
transmitting their coded representations within a CC-data-element.
2.2.3 Receiving devices
A receiving device shall be capable of receiving and interpreting any
coded representations of characters that are within a
CC-data-element, and that conform to clause 6, and shall make the
corresponding characters available to its user in such a way that the
user can identify them from among those specified there, and can
distinguish them from each other.
3 Normative references
The following standards contain provisions which, through reference
in this text, constitute provisions of this part of ISO/IEC 8859. At
the time of publication, the editions indicated were valid. All
standards are subject to revision, and parties to agreements based on
this part of ISO/IEC 8859 are encouraged to investigate the
possibility of applying the most recent editions of the standards
indicated below. Members of IEC and ISO maintain registers of
currently valid International Standards.
ISO/IEC 2022:1994, Information technology – Character code structure
and extension techniques.
ISO/IEC 4873:1991, Information technology – ISO 8-bit code for
information interchange – Structure and rules for implementation.
ISO/IEC 8824-1:1995, Information technology – Abstract Syntax
Notation One (ASN.1): Specification of basic notation.
4 Definitions
For the purposes of this part of ISO/IEC 8859 the following
definitions apply:
4.1 bit combination: An ordered set of bits used for the
representation of characters.
4.2 byte: A bit string that is operated upon as a unit.
4.3 character: A member of a set of elements used for the
organization, control, or representation of data.
4.4 code table: A table showing the characters allocated to each bit
combination in a code.
4.5 coded character set; code: A set of unambiguous rules that
establishes a character set and the one-to-one relationship between
the characters of the set and their bit combinations.
4.6 coded-character-data-element (CC-data-element): An element of
interchanged information that is specified to consist of a sequence
of coded representations of characters, in accordance with one or
more identified standards for coded character sets.
4.7 graphic character: A character, other than a control function,
that has a visual representation normally handwritten, printed or
displayed, and that has a coded representation consisting of one or
more bit combinations.
NOTE – In ISO/IEC 8859 a single bit combination is used to represent
each character.
4.8 graphic symbol: A visual representation of a graphic character or
of a control function.
4.9 position: That part of a code table identified by its column and
row coordinates.
5 Notation, code table and names
5.1 Notation
The bits of the bit combinations of the 8-bit code are identified by
b8, b7, b6, b5, b4, b3, b2, and b1, where b8 is the highest-order, or
most-significant bit and b1 is the lowest-order, or least-significant
bit.
The bit combinations may be interpreted to represent numbers in
binary notation by attributing the following weights to the
individual bits:
Bit     b8   b7  b6  b5  b4  b3  b2  b1
Weight  128  64  32  16  8   4   2   1
Using these weights, the bit combinations are identified by notations
of the form xx/yy, where xx and yy are numbers in the range 00 to 15.
The correspondence between the notations of the form xx/yy and the
bit combinations consisting of the bits b8 to b1 is as follows:
– xx is the number represented by b8, b7, b6 and b5 where these bits
are given the weights 8, 4, 2, and 1 respectively.
– yy is the number represented by b4, b3, b2 and b1 where these bits
are given the weights 8, 4, 2, and 1 respectively.
The bit combinations are also identified by notations of the form hk,
where h and k are numbers in the range 0 to F in hexadecimal
notation. The number h is the same as the number xx described above,
and the number k the same as the number yy described above.
5.2 Layout of the code table
An 8-bit code table consists of 256 positions arranged in 16 columns
and 16 rows. The columns and the rows are numbered 00 to 15. In
hexadecimal notation the columns and the rows are numbered 0 to F.
The code table positions are identified by notations of the form
xx/yy, where xx is the column number and yy is the row number. The
column and row numbers are shown at the top and left edges of the
table respectively. The code table positions are also identified by
notations of the form hk, where h is the column number and k is the
row number in hexadecimal notation. The column and row numbers are
shown at the bottom and right edges of the table respectively.
The positions of the code table are in one-to-one correspondence with
the bit combinations of the code. The notation of a code table
position, of the form xx/yy, or of the form hk, is the same as that
of the corresponding bit combination.
5.3 Names and meanings
This part of ISO/IEC 8859 assigns a unique name and a unique
identifier to each graphic character. These names and identifiers
have been taken from ISO/IEC 10646-1 (E). This part of ISO/IEC 8859
also specifies an acronym for each of the characters SPACE, NO-BREAK
SPACE and SOFT HYPHEN. For acronyms only Latin capital letters A to Z
are used. It is intended that the acronyms be retained in all
translations of the text.
Except for SPACE (SP), NO-BREAK SPACE (NBSP) and SOFT HYPHEN (SHY),
this part of ISO/IEC 8859 does not define and does not restrict the
meanings of graphic characters.
This part of ISO/IEC 8859 specifies a graphic symbol for each graphic
character. This symbol is shown in the corresponding position of the
code table. However, this part, or any other part, of ISO/IEC 8859
does not specify a particular style or font design for imaging
graphic characters. Annex B of ISO/IEC 10367 gives further
information on this subject.
5.3.1 SPACE (SP)
A graphic character the visual representation of which consists of
the absence of a graphic symbol.
5.3.2 NO-BREAK SPACE (NBSP)
A graphic character the visual representation of which consists of
the absence of a graphic symbol, for use when a line break is to be
prevented in the text as presented.
5.3.3 SOFT HYPHEN (SHY)
A graphic character that is imaged by a graphic symbol identical
with, or similar to, that representing HYPHEN, for use when a line
break has been established within a word.
6 Specification of the coded character set
This part of ISO/IEC 8859 specifies 146 characters allocated to the
bit combinations of the code table (table 2).
Some of these characters are combining characters. They are
identified in table 1 as such.
NOTE – Combining characters are described in ISO/IEC 2022:1994
subclause 6.3.3.
The coded representation of a combining character shall follow that
of the base character with which it is associated. Any combining
character may be associated with any non-combining character in the
ranges 12/01 to 13/10 and 14/01 to 14/10 (hexadecimal C1 to DA and E1
to EA).
Control functions, such as BACKSPACE or CARRIAGE RETURN, shall not be
used to create composite graphic symbols, which are made up from the
graphic representations of two or more characters.
NOTE – There is only one set of DIGITS in this part. How these will
be imaged, is a matter of local conventions. In the code table,
graphic symbols for the most common styles of writing digits are
given next to each other. In this way data communication between
various Arabic writing countries remains possible without code
conversion.
6.1 Characters of the set and their coded representation
See table 1.
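As an illustration of the notation described in 5.1 and of the ranges quoted in clause 6, the following Python sketch converts between an 8-bit value and its xx/yy position, and tests whether a value lies in the ranges with which a combining character may be associated. The function names are assumptions made for this example and do not appear in ISO/IEC 8859-6.

```python
def to_position(byte: int) -> str:
    """Return the xx/yy notation (column/row) of an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    column = byte >> 4    # bits b8..b5 with weights 8, 4, 2, 1
    row = byte & 0x0F     # bits b4..b1 with weights 8, 4, 2, 1
    return f"{column:02d}/{row:02d}"


def from_position(notation: str) -> int:
    """Return the 8-bit value for a notation of the form xx/yy."""
    column_text, row_text = notation.split("/")
    column, row = int(column_text), int(row_text)
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("xx and yy must be in the range 00 to 15")
    return column * 16 + row


def may_take_combining(byte: int) -> bool:
    """True if the value lies in 12/01..13/10 or 14/01..14/10.

    This only checks the ranges quoted in clause 6; whether the position
    holds a non-combining character has to be looked up in table 1.
    """
    return 0xC1 <= byte <= 0xDA or 0xE1 <= byte <= 0xEA


if __name__ == "__main__":
    for value in (0x20, 0x50, 0xC1, 0xDA):
        # The hk form is the same value written as two hexadecimal digits.
        print(to_position(value), f"{value:02X}", may_take_combining(value))
```

For instance, to_position(0xC1) gives "12/01", and from_position("13/10") gives 218, which is DA in hexadecimal notation.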
Table 1 – Character set, coded representation
Bit combination  Hex  Identifier  Name
02/00            20   U+0020      SPACE
02/01            21   U+0021      EXCLAMATION MARK
02/02            22   U+0022      QUOTATION MARK
02/03            23   U+0023      NUMBER SIGN
02/04            24   U+0024      DOLLAR SIGN
02/05            25   U+0025      PERCENT SIGN
02/06            26   U+0026      AMPERSAND
02/07            27   U+0027      APOSTROPHE
Table 1 (continued)
Bit combination  Hex  Identifier  Name
05/00            50   U+0050      LATIN CAPITAL LETTER P
05/01            51   U+0051      LATIN CAPITAL LETTER Q
05/02            52   U+0052      LATIN CAPITAL LETTER R
05/03            53   U+0053      LATIN CAPITAL LETTER S
05/04            54   U+0054      LATIN CAPITAL LETTER T
05/05            55   U+0055      LATIN CAPITAL LETTER U
05/06            56   U+0056      LATIN CAPITAL LETTER V
05/
...
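The four columns of table 1 can be reproduced for a given byte with a short Python sketch. It assumes that the "iso8859_6" codec shipped with Python agrees with table 1, which this sample does not confirm, and the helper names table_row and defined_graphic_bytes are hypothetical.

```python
import unicodedata

CODEC = "iso8859_6"  # assumption: Python's built-in codec matches table 1


def table_row(byte: int) -> tuple[str, str, str, str]:
    """Return (bit combination, hex, identifier, name) for one byte.

    Raises UnicodeDecodeError if the codec assigns no character to it.
    """
    char = bytes([byte]).decode(CODEC)
    position = f"{byte >> 4:02d}/{byte & 0x0F:02d}"
    return (position, f"{byte:02X}", f"U+{ord(char):04X}", unicodedata.name(char))


def defined_graphic_bytes() -> list[int]:
    """List the bytes in columns 02 to 07 and 10 to 15 that the codec defines.

    Columns 00, 01, 08 and 09 and position 07/15 are skipped because they
    are not used for graphic characters.
    """
    candidates = list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))
    defined = []
    for byte in candidates:
        try:
            bytes([byte]).decode(CODEC)
        except UnicodeDecodeError:
            continue  # position left unassigned by the codec
        defined.append(byte)
    return defined


if __name__ == "__main__":
    for byte in (0x20, 0x27, 0x50):
        print(*table_row(byte))
    print(len(defined_graphic_bytes()), "graphic characters defined by the codec")
```

The count printed on the last line can be compared with the 146 characters stated in clause 1 as a quick consistency check of the codec against the standard.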