How the C programming language bridges the man-machine gap

Doug Ewell doug at ewellic.org
Fri Apr 15 13:02:03 CDT 2022


Marius Spix wrote:

> char literals are not reliable for arithmetic expressions.
> '1' - '0' = 1 may be true for Windows-1252 or EBCDIC systems, but you
> cannot expect that this works in all character sets.

Modern C language specifications (C99 at least) guarantee that you can:

> 5.2.1 Character sets
>
> Both the basic source and basic execution character sets shall have
> the following members: the 26 uppercase letters of the Latin alphabet
>
>    A   B   C   D   E   F   G   H   I   J   K   L   M
>    N   O   P   Q   R   S   T   U   V   W   X   Y   Z
>
> the 26 lowercase letters of the Latin alphabet
>
>    a   b   c   d   e   f   g   h   i   j   k   l   m
>    n   o   p   q   r   s   t   u   v   w   x   y   z
>
> the 10 decimal digits
>
>    0   1   2   3   4   5   6   7   8   9
>
> the following 29 graphic characters
>
>    !   "   #   %   &   '   (   )   *   +   ,   -   .   /   :
>    ;   <   =   >   ?   [   \   ]   ^   _   {   |   }   ~
>
> [...]
>
> In both the source and execution basic character sets, the value of
> each character after 0 in the above list of decimal digits shall be
> one greater than the value of the previous.

I'm not aware of any character set that meets the repertoire requirement but not the digit-sequencing requirement.
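
As a minimal sketch of what that guarantee buys you, digit-to-integer conversion can be written portably along these lines. The names digit_value and parse_decimal are hypothetical helpers made up for illustration, not standard library functions, and the parser deliberately ignores overflow:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 0..9 for a decimal digit, -1 otherwise.
   Relies only on the 5.2.1 rule that '0'..'9' are consecutive and
   ascending in the execution character set. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}

/* Hypothetical helper: parses the leading run of decimal digits in s.
   Stops at the first non-digit. No overflow checking (an assumption
   made to keep the sketch short). */
static int parse_decimal(const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        int d = digit_value(*s);
        if (d < 0)
            break;
        n = n * 10 + d;
    }
    return n;
}
```

With these definitions, parse_decimal("2022") yields 2022 on any conforming implementation, whether its character set is ASCII-based or EBCDIC-based. The same trick does not carry over to letters: the standard makes no contiguity promise for 'a' through 'z'.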

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org




