Programming languages — C — Amendment 1: C Integrity

Langages de programmation — C — Amendement 1: Intégrité de C

General Information

Status
Withdrawn
Publication Date
29-Mar-1995
Withdrawal Date
29-Mar-1995
Current Stage
9599 - Withdrawal of International Standard
Completion Date
16-Dec-1999
Ref Project

Relations

Effective Date
15-Apr-2008

Buy Standard

Standard
ISO/IEC 9899:1990/Amd 1:1995 - C Integrity
English language
51 pages

Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC 9899
First edition
1990-12-15
AMENDMENT 1
1995-04-01
Programming languages - C
AMENDMENT 1: C Integrity
Langages de programmation - C
AMENDEMENT 1: Intégrité de C
Reference number
ISO/IEC 9899:1990/Amd.1:1995(E)

© ISO/IEC
ISO/IEC 9899:1990/Amd.1:1995(E)
Contents
1 Scope
4 Compliance
6 Language
  6.1.5 Operators
  6.1.6 Punctuators
  6.8.8 Predefined macro names
7 Library
  7.1.1 Definitions of terms
  7.1.2 Standard headers
  7.1.4 Errors
  7.9 Input/output
    7.9.1 Introduction
    7.9.2 Streams
    7.9.3 Files
    7.9.6 Formatted input/output functions
  7.13 Future library directions
    7.13.1 Wide-character classification and mapping utilities <wctype.h>
    7.13.2 Extended multibyte and wide-character utilities <wchar.h>
  7.14 Alternative spellings <iso646.h>
  7.15 Wide-character classification and mapping utilities <wctype.h>
    7.15.1 Introduction
    7.15.2 Wide-character classification utilities
      7.15.2.1 Wide-character classification functions
      7.15.2.2 Extensible wide-character classification functions
    7.15.3 Wide-character mapping utilities
      7.15.3.1 Wide-character case-mapping functions
      7.15.3.2 Extensible wide-character mapping functions
  7.16 Extended multibyte and wide-character utilities <wchar.h>
    7.16.1 Introduction
    7.16.2 Formatted wide-character input/output functions
    7.16.3 Wide-character input/output functions
    7.16.4 General wide-string utilities
      7.16.4.1 Wide-string numeric conversion functions
      7.16.4.2 Wide-string copying functions
      7.16.4.3 Wide-string concatenation functions
      7.16.4.4 Wide-string comparison functions
      7.16.4.5 Wide-string search functions
      7.16.4.6 Wide-character array functions
    7.16.5 The wcsftime function
    7.16.6 Extended multibyte and wide-character conversion utilities
      7.16.6.1 Single-byte wide-character conversion functions
      7.16.6.2 The mbsinit function
      7.16.6.3 Restartable multibyte/wide-character conversion functions
      7.16.6.4 Restartable multibyte/wide-string conversion functions
Annex D: (informative) Library summary
Annex H: (informative) Rationale
Index
© ISO/IEC 1995
All rights reserved. Unless otherwise specified, no part of this publication may be
reproduced or utilized in any form or by any means, electronic or mechanical, including
photocopying and microfilm, without permission in writing from the publisher.
ISO/IEC Copyright Office • Case postale 56 • CH-1211 Genève 20 • Switzerland
Printed in Switzerland

Foreword
ISO (the International Organization for Standardization) and IEC (the International
Electrotechnical Commission) form the specialized system for worldwide
standardization. National bodies that are members of ISO or IEC participate in the
development of International Standards through technical committees established
by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with
ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint
technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the
joint technical committee are circulated to national bodies for voting. Publication
as an International Standard requires approval by at least 75 % of the national
bodies casting a vote.
Amendment 1 to International Standard ISO/IEC 9899:1990 was prepared by
Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee 22, Programming languages, their environments and system
software interfaces.
Annexes A and B of this amendment are for information only.

Introduction
This first amendment to ISO/IEC 9899: 1990 primarily consists of a set of library extensions that provide a
complete and consistent set of utilities for application programming using multibyte and wide characters. It
also contains extensions that provide alternate spellings for certain tokens.
The base standard deliberately chose not to include a complete multibyte and wide-character library. Instead,
it defined just enough support to provide a firm foundation, both in the library and language proper, on which
implementations and programming expertise could grow. Vendors did implement such extensions; this first
amendment reflects the studied and careful inclusion of the best of today’s existing art in this area.
The base standard also chose to provide only minimal support for writing C source code in character sets
that redefine some of the punctuation characters, such as national variants of ISO 646. The alternate spellings
provided here can be used to write many (but not all) tokens that are less readable when expressed in terms of
trigraphs.
This first amendment to ISO/IEC 9899: 1990 is divided into three major subdivisions:
- those additions and changes that affect the preliminary subdivision of ISO/IEC 9899:1990 (clauses 1
through 4);
- those additions and changes that affect the language syntax, constraints, and semantics (ISO/IEC 9899: 1990
clause 6);
- those additions and changes that affect library facilities (ISO/IEC 9899: 1990 clause 7).
Examples are provided to illustrate possible forms of the constructions described. Footnotes are provided
to emphasize consequences of the rules described in that subclause or elsewhere in this first amendment.
References are used to refer both to the base standard and to related subclauses within this document. These
two can be distinguished either by context or are labeled as referring to the base standard (as above). Annex A
summarizes the contents of this first amendment. Annex B provides a rationale.

Programming languages - C
AMENDMENT 1: C Integrity
1 Scope
This amendment defines extensions to ISO/IEC 9899: 1990 that provide a more complete set of multibyte
and wide-character utilities, as well as alternative spellings for certain tokens. Use of these features can help
promote international portability of C programs.
This amendment specifies extensions that affect various clauses of ISO/IEC 9899: 1990:
- To the compliance clause (clause 4), the additional header <iso646.h> is provided by both freestanding
and hosted implementations.
- To the language clause (clause 6), six additional tokens are accepted.
- To the library clause (clause 7), new capabilities are specified for the existing formatted input/output
functions (7.9.6).
- To the library clause (clause 7), the additional header <wctype.h> is provided, which defines a macro,
several types, and many functions, including:
  • wide-character testing functions, iswalnum for example;
  • extensible wide-character classification functions, wctype and iswctype;
  • wide-character case-mapping functions, towlower and towupper;
  • extensible wide-character case-mapping functions, wctrans and towctrans.
- To the library clause (clause 7), the additional header <wchar.h> is provided, which defines several
macros, several types, and many functions, including:
  • formatted wide-character input/output functions, fwprintf for example;
  • wide-character input/output functions, fgetwc for example;
  • wide-string numeric conversion functions, wcstod for example;
  • wide-string general utility functions, wcscpy for example;
  • a wide-string time conversion function, wcsftime;
  • restartable multibyte/wide-character conversion functions, mbrtowc for example;
  • restartable multibyte/wide-string conversion functions, mbsrtowcs and wcsrtombs.
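The following sketch is illustrative only and is not part of the amendment. It shows how the extensible classification and mapping functions named above (wctype, iswctype, wctrans, towctrans) might be used. The helper names count_in_class and upcase_in_place are hypothetical, and the class name "alpha" and mapping name "toupper" are assumed to be accepted in the current locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the wide characters in s that belong to the
   named class (for example "alpha" or "digit"). Returns 0 when the class
   name is not valid in the current locale, in which case wctype yields zero. */
size_t count_in_class(const wchar_t *s, const char *class_name)
{
    wctype_t desc;
    size_t n = 0;

    assert(s != NULL && class_name != NULL);
    desc = wctype(class_name);
    if (desc == (wctype_t)0)
        return 0;
    for (; *s != L'\0'; ++s)
        if (iswctype((wint_t)*s, desc))
            ++n;
    return n;
}

/* Hypothetical helper: map every wide character of s to upper case in place,
   using the extensible mapping interface instead of towupper. */
void upcase_in_place(wchar_t *s)
{
    wctrans_t to_upper;

    assert(s != NULL);
    to_upper = wctrans("toupper");
    for (; *s != L'\0'; ++s)
        *s = (wchar_t)towctrans((wint_t)*s, to_upper);
}
```

For instance, count_in_class(L"ab12c", "alpha") counts the letters of the wide string, and upcase_in_place applied to an array holding L"abc" leaves it holding L"ABC".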
4 Compliance
The description is adjusted so that the standard header <iso646.h> is included in the list of headers that
must be provided by both freestanding and hosted implementations.
Forward References: alternate spellings (7.14).
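As an illustration only (not text of the amendment), the sketch below uses some of the macros that <iso646.h> defines as alternate spellings: and, not_eq, bitand and compl. The function names in_range_except and clear_bits are hypothetical.

```c
#include <iso646.h>

/* Hypothetical helper: nonzero when x lies in [lo, hi] and differs from skip.
   "and" expands to && and "not_eq" expands to !=. */
int in_range_except(int x, int lo, int hi, int skip)
{
    return x >= lo and x <= hi and x not_eq skip;
}

/* Hypothetical helper: clear the bits of value that are set in mask.
   "bitand" expands to & and "compl" expands to ~. */
unsigned clear_bits(unsigned value, unsigned mask)
{
    return value bitand compl mask;
}
```

Because these spellings are ordinary macros, they are available only after the header has been included.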
6 Language
Subclauses 6.1.5 and 6.1.6 of ISO/IEC 9899:1990 are adjusted to include the following six additional tokens.
In all aspects of the language, these six tokens
    <:  :>  <%  %>  %:  %:%:
behave, respectively, the same as these existing six tokens
    [   ]   {   }   #   ##
except for their spelling.1)
1) Thus [ and <: behave differently when "stringized" (see ISO/IEC 9899:1990 subclause 6.8.3.2), but can otherwise be freely
interchanged.
6.1.5 Operators
Syntax
operator: also one of
    <:  :>  %:  %:%:
Constraints
The operators [ ], ( ), and ? : (independent of spelling) shall occur in pairs, possibly separated by
expressions. The operators # and ## (also spelled %: and %:%:, respectively) shall occur in macro-defining
preprocessing directives only.
6.1.6 Punctuators
Syntax
punctuator: also one of
    <:  :>  <%  %>  %:
Constraints
The punctuators [ ], ( ), and { } (independent of spelling) shall occur (after translation phase 4) in pairs,
possibly separated by expressions, declarations, or statements. The punctuator # (also spelled %:) shall occur
in preprocessing directives only.
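The sketch below is illustrative and not part of the amendment. It writes a small function using only the alternate token spellings and also demonstrates the stringizing difference noted in footnote 1. The names STR, sum_ints, left_bracket_digraph and left_bracket are hypothetical, and the translator is assumed to accept the six additional tokens.

```c
%:include <stddef.h>

/* %: plays the role of # both in directives and as the stringizing operator. */
%:define STR(x) %:x

/* Hypothetical helper: sum n ints, written with <% %> for braces
   and <: :> for brackets. */
long sum_ints(const int *a, size_t n)
<%
    long total = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        total += a<:i:>;
    return total;
%>

/* Stringizing preserves the spelling, so these two strings differ. */
const char *left_bracket_digraph(void)
<%
    return STR(<:);
%>

const char *left_bracket(void)
<%
    return STR([);
%>
```

Apart from the strings produced by STR, the translated program is the same as one written with the conventional spellings.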
6.8.8 Predefined macro names
Subclause 6.8.8 is adjusted to include the following macro name defined by the implementation:
__STDC_VERSION__
which expands to the decimal constant 199409L, intended to indicate an implementation conforming to this
amendment.
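A minimal sketch of how a program might test for this macro follows. It is illustrative only. The function name has_amendment1 is hypothetical, and the ">=" comparison rests on the assumption that later revisions of the language define the macro with larger values.

```c
/* Hypothetical helper: returns 1 when the translator defines
   __STDC_VERSION__ with a value of at least 199409L, otherwise 0. */
int has_amendment1(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199409L
    return 1;
#else
    return 0;
#endif
}
```

The defined() test matters because a translator conforming only to ISO/IEC 9899:1990 need not define the macro at all.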
7 Library
Various portions of clause 7 of ISO/IEC 9899: 1990 are adjusted to include the following specifications.
The identifiers with external linkage declared in either <wctype.h> or <wchar.h> which are not already
reserved as identifiers with external linkage by ISO/IEC 9899:1990 are reserved for use as identifiers with
external linkage only if at least one inclusion of either <wctype.h> or <wchar.h> occurs in one or more
of the translation units that constitute the program.2)
7.1.1 Definitions of terms
A wide character is a code value (a binary encoded integer) of an object of type wchar_t that corresponds
to a member of the extended character set.3)
A null wide character is a wide character with code value zero.
A wide string is a contiguous sequence of wide characters terminated by and including the first null wide
character. A pointer to a wide string is a pointer to its initial (lowest addressed) wide character. The length of
a wide string is the number of wide characters preceding the null wide character and the value of a wide string
is the sequence of code values of the contained wide characters, in order.
A shift sequence is a contiguous sequence of bytes within a multibyte string that (potentially) causes a change
in shift state. (See ISO/IEC 9899:1990 subclause 5.2.1.2.) A shift sequence shall not have a corresponding
wide character; it is instead taken to be an adjunct to an adjacent multibyte character.4)
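To make the definition of the length of a wide string concrete, here is a small illustrative sketch (not part of the amendment). The function name wide_length is hypothetical; it simply counts the wide characters that precede the first null wide character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the length of a wide string is the number of wide
   characters preceding the terminating null wide character. */
size_t wide_length(const wchar_t *ws)
{
    size_t n = 0;

    assert(ws != NULL);
    while (ws[n] != L'\0')
        ++n;
    return n;
}
```

Note that the array initialized by a wide string literal such as L"abc" has one more element than this length, because the wide string includes its terminating null wide character.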
7.1.2 Standard headers
The list of standard headers is adjusted to include three new standard headers, <iso646.h>,
<wctype.h>, and <wchar.h>.
2) This behavior differs from those identifiers with external linkage associated with the headers listed in and referenced by ISO/IEC
9899:1990 subclauses 7.1.2 and 7.1.3, which are always reserved. Note that including either of these headers in a translation unit will
affect other translation units in the same program, even though they do not include either header. The Standard C library should not
itself include any of these headers.
3) An equivalent definition can be found in subclause 6.1.3.4 of ISO/IEC 9899:1990.
4) For state-dependent encodings, the values for MB_CUR_MAX and MB_LEN_MAX must thus be large enough to count all the bytes in
any complete multibyte character plus at least one adjacent shift sequence of maximum length. Whether these counts provide for more
than one shift sequence is the implementation's choice.

7.1.4 Errors
The list of macros defined in <errno.h> is adjusted to include a new macro, EILSEQ.
7.9 Input/output
7.9.1 Introduction
The header <wchar.h> declares a number of functions useful for wide-character input and output.
The wide-character input/output functions described in this subclause provide operations analogous to most
of those described in ISO/IEC 9899:1990 subclause 7.9, except that the fundamental units internal to the
program are wide characters. The external representation (in the file) is a sequence of "generalized" multibyte
characters, as described further in subclause 7.9.3, below.
The input/output functions described here and in ISO/IEC 9899:1990 are given the following collective
terms:
- The wide-character input functions - those functions described in this subclause that perform input into
wide characters and wide strings: fgetwc, fgetws, getwc, getwchar, fwscanf, and wscanf.
- The wide-character output functions - those functions described in this subclause that perform output
from wide characters and wide strings: fputwc, fputws, putwc, putwchar, fwprintf, wprintf,
vfwprintf, and vwprintf.
- The wide-character input/output functions - the union of the ungetwc function, the wide-character input
functions, and the wide-character output functions.
- The byte input/output functions - the ungetc function and the input/output functions described in
ISO/IEC 9899:1990 subclause 7.9: fgetc, fgets, fprintf, fputc, fputs, fread, fscanf,
fwrite, getc, getchar, gets, printf, putc, putchar, puts, scanf, vfprintf, and
vprintf.
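As an illustration only (not text of the amendment), the sketch below pairs one wide-character input function with one wide-character output function to copy a stream. The function name copy_wide is hypothetical, and both streams are assumed to be open and not byte-oriented.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: copy wide characters from in to out until end-of-file.
   Returns the number of wide characters copied, or -1 on a read, write,
   or encoding error. */
long copy_wide(FILE *in, FILE *out)
{
    long count = 0;
    wint_t wc;

    while ((wc = fgetwc(in)) != WEOF) {
        if (fputwc((wchar_t)wc, out) == WEOF)
            return -1L;
        ++count;
    }
    return ferror(in) ? -1L : count;
}
```

The program handles wide characters only; the conversion to and from the multibyte representation in the file happens inside fgetwc and fputwc.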
7.9.2 Streams
The definition of a stream is adjusted to include an orientation for both text and binary streams. After a
stream is associated with an external file, but before any operations are performed on it, the stream is without
orientation. Once a wide-character input/output function has been applied to a stream without orientation, the
stream becomes wide-oriented. Similarly, once a byte input/output function has been applied to a stream
without orientation, the stream becomes byte-oriented. Only a call to the freopen function or the fwide
function can otherwise alter the orientation of a stream. (A successful call to freopen removes any
orientation.)5)
Byte input/output functions shall not be applied to a wide-oriented stream; and wide-character input/output
functions shall not be applied to a byte-oriented stream. The remaining stream operations do not affect and are
not affected by a stream's orientation, except for the following additional restrictions:
- Binary wide-oriented streams have the file-positioning restrictions ascribed to both text and binary streams.
- For wide-oriented streams, after a successful call to a file-positioning function that leaves the file position
indicator prior to the end-of-file, a wide-character output function can overwrite a partial multibyte
character; any file contents beyond the byte(s) written are henceforth undefined.
Each wide-oriented stream has an associated mbstate_t object that stores the current parse state of the
stream. A successful call to fgetpos stores a representation of the value of this mbstate_t object as part
of the value of the fpos_t object. A later successful call to fsetpos using the same stored fpos_t value
restores the value of the associated mbstate_t object as well as the position within the controlled stream.
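The orientation rules can be observed with the fwide function, as in the following illustrative sketch (not part of the amendment). The function name demo_orientation is hypothetical. It relies on fwide with a mode argument of zero reporting the current orientation without changing it, and it assumes that tmpfile can create a temporary file.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if a fresh stream starts without orientation
   and becomes wide-oriented after its first wide-character output call,
   0 if not, and -1 if no temporary file could be created. */
int demo_orientation(void)
{
    FILE *f = tmpfile();
    int before, after;

    if (f == NULL)
        return -1;
    before = fwide(f, 0);   /* zero: the stream has no orientation yet */
    fputwc(L'x', f);        /* first operation is a wide-character one */
    after = fwide(f, 0);    /* positive: the stream is now wide-oriented */
    fclose(f);
    return before == 0 && after > 0;
}
```

Applying a byte input/output function such as fputc to the same stream after the fputwc call would violate the restriction stated above.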
5) The three predefined streams stdin, stdout, and stderr are unoriented at program startup.
7.9.3 Files
Although both text and binary wide-oriented streams are conceptually sequences of wide characters, the
external file associated with a wide-oriented stream is a sequence of multibyte characters, generalized as
follows:
- Multibyte encodings within files may contain embedded null bytes (unlike multibyte encodings valid for
use internal to the program).
- A file need not begin nor end in the initial shift state.6)
Moreover, the encodings used for multibyte characters may differ among files. Both the nature and choice of
such encodings are implementation defined.
The wide-character input functions read multibyte characters from the stream and convert them to wide
characters as if they were read by successive calls to the fgetwc function. Each conversion occurs as if by a
call to the mbrtowc function, with the conversion state described by the stream's own mbstate_t object.
The wide-character output functions convert wide characters to multibyte characters and write them to the
stream as if they were written by successive calls to the fputwc function. Each conversion occurs as if by a
call to the wcrtomb function, with the conversion state described by the stream's own mbstate_t object.
An encoding error occurs if the character sequence presented to the underlying mbrtowc function does
not form a valid (generalized) multibyte character, or if the code value passed to the underlying wcrtomb
does not correspond to a valid (generalized) multibyte character. The wide-character input/output functions
and the byte input/output functions store the value of the macro EILSEQ in errno if and only if an encoding
error occurs.
Forward References: the fgetwc function (7.16.3.1), the fputwc function (7.16.3.2), conversion state
(7.16.6), the mbrtowc function (7.16.6.3.2), the wcrtomb function (7.16.6.3.3).
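The two conversions that underlie wide-character input and output can be exercised directly, as in this illustrative sketch (not part of the amendment). The function name round_trips is hypothetical. Each conversion state is an mbstate_t object set to zero, which is assumed here to describe the initial conversion state.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert wc to a multibyte character with wcrtomb and
   back with mbrtowc. Returns 1 if the same wide character comes back, and 0
   if either step reports an encoding error or an incomplete character. */
int round_trips(wchar_t wc)
{
    char buf[MB_LEN_MAX];
    mbstate_t out_state, in_state;
    wchar_t back;
    size_t n, m;

    memset(&out_state, 0, sizeof out_state);  /* initial conversion state */
    memset(&in_state, 0, sizeof in_state);

    n = wcrtomb(buf, wc, &out_state);
    if (n == (size_t)-1)
        return 0;   /* encoding error: wcrtomb has stored EILSEQ in errno */
    m = mbrtowc(&back, buf, n, &in_state);
    if (m == (size_t)-1 || m == (size_t)-2)
        return 0;   /* invalid or incomplete multibyte character */
    return back == wc;
}
```

Which wide characters fail depends on the multibyte encoding of the current locale, so the sketch reports the outcome instead of assuming one.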
7.9.6 Formatted input/output functions
7.9.6.1 The fprintf function
Adjust the description of the qualifiers h, l, and L to include the additional phrases:
an optional l specifying that a following c conversion specifier applies to a wint_t argument; an
optional l specifying that a following s conversion specifier applies to a pointer to a wchar_t
argument;
Replace the description of the c conversion specifier with:
c If no l qualifier is present, the int argument is converted to an unsigned char, and the resulting
character is written. Otherwise, the wint_t argument is converted as if by an ls conversion
specification with no precision and an argument that points to a two-element array of wchar_t, the first
element containing the wint_t argument to the lc conversion specification and the second a null wide
character.
Replace the description of the s conversion specifier with:
s If no l qualifier is present, the argument shall be a pointer to an array of character type.7) Characters
from the array are written up to (but not including) a terminating null character. If the precision is
specified, no more than that many characters are written. If the precision is not specified or is greater
than the size of the array, the array shall contain a null character.
If an l qualifier is present, the argument shall be a pointer to an array of wchar_t type. Wide characters
from the array are converted to multibyte characters (each as if by a call to the wcrtomb function, with
the conversion state described by an mbstate_t object initialized to zero before the first wide character
is converted) up to and including a terminating null wide character. The resulting multibyte characters
are written up to (but not including) the terminating null character (byte). If no precision is specified, the
array shall contain a null wide character. If a precision is specified, no more than that many characters
(bytes) are written (including shift sequences, if any), and the array shall contain a null wide character
if, to equal the multibyte character sequence length given by the precision, the function would need to
access a wide character one past the end of the array. In no case is a partial multibyte character written.8)
6) Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary
stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in
the initial shift state.
7) No special provisions are made for multibyte characters.
8) Redundant shift sequences may result if multibyte characters have a state-dependent encoding.

The above extension is applicable to all the formatted output functions specified in ISO/IEC 9899: 1990.
Examples
The examples are adjusted to include the following:
In this example, multibyte characters do not have a state-dependent encoding, and the multibyte members
of the extended character set each consist of two bytes, the first of which is denoted here by a □ and the second
by an uppercase letter.
Given the following wide string with length seven,
    static wchar_t wstr[] = L"□X□Yabc□Z□W";
the seven calls
    fprintf(stdout, "|1234567890123|\n");
    fprintf(stdout, "|%13ls|\n", wstr);
    fprintf(stdout, "|%-13.9ls|\n", wstr);
    fprintf(stdout, "|%13.10ls|\n", wstr);
    fprintf(stdout, "|%13.11ls|\n", wstr);
    fprintf(stdout, "|%13.15ls|\n", &wstr[2]);
    fprintf(stdout, "|%13lc|\n", wstr[5]);
will print the following seven lines:
    |1234567890123|
    |  □X□Yabc□Z□W|
    |□X□Yabc□Z    |
    |    □X□Yabc□Z|
    |  □X□Yabc□Z□W|
    |      abc□Z□W|
    |           □Z|
Forward References: conversion state (7.16.6), the wcrtomb function (7.16.6.3.3).
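Because the extension applies to all the formatted output functions, the effect of width and precision on an ls conversion can also be tried with sprintf. The sketch below is illustrative only and is not part of the amendment. The function names format_prefix and format_wide_char are hypothetical, and only single-byte members of the character set are used, so that the byte counts do not depend on the multibyte encoding in effect.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: write at most two bytes of ws, right-justified in a
   field of width six, between two bars. out must have room for the result.
   Returns the number of characters written, as sprintf does. */
int format_prefix(char *out, const wchar_t *ws)
{
    return sprintf(out, "|%6.2ls|", ws);
}

/* Hypothetical helper: write one wide character in a field of width three. */
int format_wide_char(char *out, wchar_t wc)
{
    return sprintf(out, "|%3lc|", (wint_t)wc);
}
```

With the wide string L"abcdef", format_prefix writes two bytes of converted text preceded by four spaces of padding, since the precision limits the bytes written and the width counts bytes as well.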
7.9.6.2 The fscanf function
Adjust the description of the qualifiers h, l, and L to include the additional sentences:
The conversion specifiers c, s, and [ shall be preceded by l if the corresponding argument is a pointer
to wchar_t rather than a pointer to a character type.
Replace the definition of directive failure (page 135, lines 34-36, beginning with, "If the length of the input
item is zero.") with:
If the length of the input item is zero, the execution of the directive fails; this condition is a matching
failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which
case it is an input failure.
Replace the description of the s conversion specifier with:
s Matches a sequence of non-white-space characters.9) If no l qualifier is present, the corresponding
argument shall be a pointer to a character array large enough to accept the sequence and a terminating
null character, which will be added automatically.
If an l qualifier is present, the input shall be a sequence of multibyte characters that begins in the initial
shift state. Each multibyte character is converted to a wide character as if by a call to the mbrtowc
function, with the conversion state described by an mbstate_t object initialized to zero before the
first multibyte character is converted. The corresponding argument shall be a pointer to an array of
wchar_t large enough to accept the sequence and the terminating null wide character, which will be
added automatically.
9) No special provisions are made for multibyte characters in the matching rules used by any of the conversion specifiers s, [, or c -
the extent of the input field is still determined on a byte-by-byte basis. The resulting field must nevertheless be a sequence of multibyte
characters that begins in the initial shift state.
Replace the first two sentences of the description of the [ conversion specifier with:

[ Matches a nonempty sequence of characters from a set of expected characters (the scanset). If no l
qualifier is present, the corresponding argument shall be a pointer to a character array large enough to
accept the sequence and a terminating null character, which will be added automatically.
If an l qualifier is present, the input shall be a sequence of multibyte characters that begins in the initial
shift state. Each multibyte character is converted to a wide character as if by a call to the mbrtowc
function, with the conversion state described by an mbstate_t object initialized to zero before the
first multibyte character is converted. The corresponding argument shall be a pointer to an array of
wchar_t large enough to accept the sequence and the terminating null wide character, which will be
added automatically.
Replace the description of the c conversion specifier with:
c Matches a sequence of characters of the number specified by the field width (1 if no field width is present
in the directive). If no l qualifier is present, the corresponding argument shall be a pointer to a character
array large enough to accept the sequence. No null character is added.
If an l qualifier is present, the input shall be a sequence of multibyte characters that begins in the initial
shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the
mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero
before the first multibyte character is converted. The corresponding argument shall be a pointer to the
initial element of an array of wchar_t large enough to accept the resulting sequence of wide characters.
No null wide character is added.
The above extension is applicable to all the formatted input functions specified in ISO/IEC 9899:1990.
Examples
The examples are adjusted to include the following:
In these examples, multibyte characters do have a state-dependent encoding, and multibyte members of the
extended character set consist of two bytes, the first of which is denoted here by a □ and the second by an
uppercase letter, but are only recognized as such when in the alternate shift state. The shift sequences are
denoted by ↑ and ↓, in which the first causes entry into the alternate shift state.
1. After the call:
    #include <stdio.h>
    /*...*/
    char str[50];
    fscanf(stdin, "a%s", str);
with the input line:
    a↑□X□Y↓ bc
str will contain ↑□X□Y↓\0 assuming that none of the bytes of the shift sequences (or of the multibyte
characters, in the more general case) appears to be a single-byte white-space character.
2. In contrast, after the call:
    #include <stdio.h>
    #include <stddef.h>
    /*...*/
    wchar_t wstr[50];
    fscanf(stdin, "a%ls", wstr);
with the same input line, wstr will contain the two wide characters that correspond to □X and □Y and
a terminating null wide character.
3. However, the call:
    #include <stdio.h>
    #include <stddef.h>
    /*...*/
    wchar_t wstr[50];
    fscanf(stdin, "a↑□X↓%ls", wstr);
with the same input line will return zero due to a matching failure against the ↓ sequence in the format
string.
4. Assuming that the first byte of the multibyte character □X is the same as the first byte of the multibyte
character □Y, after the call:
    #include <stdio.h>
    #include <stddef.h>
    /*...*/
    wchar_t wstr[50];
    fscanf(stdin, "a↑□Y↓%ls", wstr);
with the same input line, zero will again be returned, but stdin will be left with a partially consumed
multibyte character.
Forward References: conversion state (7.16.6), the wcrtomb function (7.16.6.3.3).
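The distinction between a successful conversion, a matching failure, and an input failure can be reproduced without any state-dependent encoding. The sketch below is illustrative only and is not part of the amendment. It uses sscanf, to which the extension also applies, with single-byte input. The function name scan_after_a and the field width of 31 are assumptions made for the illustration.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: expect the letter 'a', then read one field of
   non-white-space characters into dst as a wide string. dst must have room
   for at least 32 wide characters. Returns what sscanf returns: 1 on success,
   0 on a matching failure, EOF on an input failure before any conversion. */
int scan_after_a(const char *input, wchar_t *dst)
{
    return sscanf(input, "a%31ls", dst);
}
```

Given "abc def" the call stores the wide string L"bc" and returns 1; given "xbc" it returns zero because the literal a does not match; given an empty string it returns EOF.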
7.13 Future library directions
The list of headers and their reserved identifiers is adjusted to include the following:
7.13.1 Wide-character classification and mapping utilities
Function names that begin with is or to and a lowercase letter (followed by any combination of digits,
letters, and underscore) may be added to the declarations in the <wctype.h> header.
7.13.2 Extended multibyte and wide-character utilities
Function names that begin with wcs and a lowercase letter (followed by any combination of digits, letters,
and underscore) may be added to the declarations in the <wchar.h> header.
Lowercase letters may be added to the conversion spe
...
