0027, 02BC, 2019, or a new character?

Doug Ewell via Unicode unicode at unicode.org
Thu Jan 25 13:34:16 CST 2018


Philippe Verdy wrote:

> I agree, and still you won't necessarily have to press a dead key to
> have these characters, if you map one key where the Cyrillic letter
> was producing directly the character with its accent. [...]
>
> However, if you can type one key to produce one latin letter with its
> accent, I don't see why it could not use the caron instead of the
> acute above s and c, so that it is also immediately readable in other
> Eastern European languages. [...]

I think it is very likely the Kazakhs, like most people who are not
experts on computers or Unicode, did not consider the distinction
between the physical keyboard (hardware) and the driver that maps
keystrokes to characters (software). And they might consider replacing
software drivers nationwide to be as unfeasible as replacing physical
keyboards. Remember the government of Kazakhstan is probably not
composed of computer experts.

> As a bonus, banning the apostrophe from the alphabet will be a
> security improvement (think about the many cases where ASCII
> apostrophes are used as string delimiters in various programming and
> markup languages

Another fact that they really did not seem to take into account. The
advisers and linguists might have considered this, but not the
decision-maker(s).
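
To make the string-delimiter concern concrete, here is a minimal sketch in
Python using the standard sqlite3 module. The table name, the helper names,
and the sample string "s'a" are all made up for illustration; nothing here
is taken from the Kazakh proposal itself.

```python
import sqlite3


def naive_insert(conn, word):
    # Unsafe: pastes the word between ASCII apostrophes, so an apostrophe
    # inside the word ends the SQL string literal early.
    conn.execute("INSERT INTO words (w) VALUES ('" + word + "')")


def safe_insert(conn, word):
    # Parameter binding: the word never becomes part of the SQL text.
    conn.execute("INSERT INTO words (w) VALUES (?)", (word,))


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (w TEXT)")
    word = "s'a"  # hypothetical text containing U+0027
    try:
        naive_insert(conn, word)
        naive_failed = False
    except sqlite3.OperationalError:
        # The stray apostrophe made the statement malformed.
        naive_failed = True
    safe_insert(conn, word)
    stored = [row[0] for row in conn.execute("SELECT w FROM words")]
    return naive_failed, stored


if __name__ == "__main__":
    print(demo())
```

Properly written software binds parameters and is unaffected, but any code
that builds statements by concatenation would trip over ordinary text.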

> the time of 7-bit ASCII is ended now since long, except in very old
> systems,

And on U.S. English keyboards. (It's true, as Sharma says, that they
didn't specify exactly what they meant by a "standard keyboard," but
they did banish all diacritical marks, so...)

> Even with UTF-8, these Latin letters with accents (from any ISO 8859-*
> subset) will be 2-byte wide, so exactly the same encoding size as
> basic letter+ASCII quote and the encoding size is definitely not an
> issue anywhere (all existing Kazakh Cyrillic letters are already using
> 2-byte encoding in UTF-8, as all their assigned code points values
> were higher than 0x7F but lower than 0x800) [...]
>
> Choosing the ASCII quote for this "apostrophe" will not save
> anything; but the regular Unicode apostrophe U+2019 would need... 3
> bytes after the 1-byte basic Latin letter from ASCII (so it is
> worse!).

I did not see any evidence that this was something they ever considered
or cared about.
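
For anyone who wants to check the byte counts quoted above, a short Python
sketch follows. The helper name utf8_len and the choice of sample strings
are mine, picked only to illustrate the sizes under discussion.

```python
def utf8_len(text):
    # Number of bytes the string occupies when encoded as UTF-8.
    return len(text.encode("utf-8"))


# Illustrative samples: a base letter plus each apostrophe candidate,
# two precomposed Latin letters, and one Cyrillic letter.
SAMPLES = {
    "s + U+0027 (ASCII apostrophe)": "s\u0027",
    "s + U+02BC (modifier letter apostrophe)": "s\u02bc",
    "s + U+2019 (right single quotation mark)": "s\u2019",
    "U+015B (s with acute)": "\u015b",
    "U+0161 (s with caron)": "\u0161",
    "U+0448 (Cyrillic sha)": "\u0448",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(f"{label}: {utf8_len(text)} bytes")
```

The letter plus ASCII apostrophe, either precomposed letter, and the
Cyrillic letter each take 2 bytes; the letter plus U+02BC takes 3, and the
letter plus U+2019 takes 4.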
 
--
Doug Ewell | Thornton, CO, US | ewellic.org



