EBCDIC control characters
Ken Whistler
kenwhistler at sonic.net
Thu Jun 18 19:24:35 CDT 2020
Asmus,
On 6/18/2020 4:55 PM, Asmus Freytag via Unicode wrote:
> The problem with the C/C++ compilers in this regard has always been
> that they attempted to implement the character-set insensitive model,
> which doesn't play well with Unicode, so if you want to compile a
> program where string literals are in Unicode (and not just any 16-bit
> character set) then you can't simply zero-extend. (And if you are
> trying to create a UTF-8 literal, then all bets are off unless you
> have a real conversion).
As I said, daft. ;-)
Anybody who depends on zero extension for embedding Unicode
character literals in an 8859-1 (or any other 8-bit character set)
program text ought to have their head examined. Just because you *can*
do it, and the compilers will cheerily do what the spec says they should
in such cases, doesn't mean that anybody *should* use it. (There is lots
of stuff in C++ that no sane programmer should use.)
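
For anyone following along, here is a rough sketch of the difference between zero-extending and doing a real conversion. The function names (zero_extend, latin1_to_utf8, check_examples) are made up for illustration and are not from any compiler or library. The sketch assumes the source bytes are ISO 8859-1 unless noted; the EBCDIC case uses the byte 0xC1, which is 'A' in the common EBCDIC code pages.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: widen each 8-bit code unit to 16 bits unchanged.
// The result is correct Unicode only if the input really is ISO 8859-1,
// because U+0000..U+00FF mirror that character set.
std::u16string zero_extend(const std::string& bytes) {
    std::u16string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Hypothetical helper: a real conversion from ISO 8859-1 to UTF-8.
// Bytes below 0x80 pass through; the rest become two-byte sequences.
std::string latin1_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}

// Small self-check of the three cases discussed in this thread.
void check_examples() {
    // 8859-1 byte 0xE9 (e with acute) zero-extends to U+00E9: correct.
    assert(zero_extend("\xE9")[0] == 0x00E9);
    // EBCDIC 'A' is 0xC1; zero-extending gives U+00C1 (A with acute),
    // not U+0041, so the result is wrong for any non-8859-1 source.
    assert(zero_extend("\xC1")[0] == 0x00C1);
    // UTF-8 cannot be produced by widening at all: 0xE9 must become
    // the two bytes 0xC3 0xA9.
    assert(latin1_to_utf8("\xE9") == "\xC3\xA9");
}
```

Calling check_examples() runs the three cases. The only one where widening happens to give the right answer is the 8859-1 one, which is the whole problem.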
--Ken