L2/18-181

James Kass via Unicode unicode at unicode.org
Thu May 17 09:51:54 CDT 2018


William Overington offered a suggestion,

⇒ Maybe people should be helping to get this resolved
⇒ to the satisfaction of all and helping rather than
⇒ criticising.

That's a noble thought, but as long as Assamese continues to be
written using the Eastern Nagari script, which is referred to as
"BENGALI" in the Unicode naming tables, any disunification proposal
will be a non-starter.  Hence the criticism.  We should strive to keep
any criticism constructive rather than derisive.  If I'm not mistaken,
the character naming for this script was inherited from the ISCII
standard, so it was the Indian government's convention.  I believe
most English speakers aware of the script call it Bengali.

https://en.wikipedia.org/wiki/Eastern_Nagari_script
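
For anyone who wants to check the naming point, here is a minimal sketch
using Python's standard unicodedata module.  The choice of U+09F0 and
U+09F1 (listed in the code chart as additions for Assamese) and the helper
name script_prefix are only illustrative assumptions, not anything from a
proposal.

```python
import unicodedata

# Two letters that the code chart lists as additions for Assamese.
# (Illustrative choice; any letter in the block would show the same thing.)
ASSAMESE_LETTERS = ["\u09F0", "\u09F1"]


def script_prefix(ch: str) -> str:
    """Return the first word of a character's formal Unicode name."""
    return unicodedata.name(ch).split()[0]


for ch in ASSAMESE_LETTERS:
    # Prints the code point followed by its formal character name.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Both names begin with "BENGALI", even though the letters serve Assamese.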

⇒  U+E0001 LANGUAGE TAG
⇒
⇒ ...
⇒
⇒ There is a note in the code chart.
⇒
⇒  >> This character is deprecated, and its use is strongly
⇒  discouraged.
⇒
⇒ It does not say by whom it is discouraged though nor why.

The reason people shouldn't use it is because it is deprecated.  It
was originally deprecated because people shouldn't use it.
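
For illustration only, and not as a recommendation, the deprecated
mechanism looks roughly like this: U+E0001 followed by tag characters,
each being an ASCII character offset by 0xE0000.  The function names
to_tag_string and strip_tags and the sample text are my own assumptions.

```python
import unicodedata

LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG (deprecated)


def to_tag_string(lang: str) -> str:
    """Encode an ASCII language code as U+E0001 plus tag characters."""
    # Each tag character is the ASCII code point offset by 0xE0000.
    return LANGUAGE_TAG + "".join(chr(0xE0000 + ord(c)) for c in lang)


def strip_tags(text: str) -> str:
    """Drop every code point in the Tags block (U+E0000..U+E007F)."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


# Sample letters from the Bengali block, prefixed with a tag for "as".
sample = "\u0985\u09B8\u09AE"
tagged = to_tag_string("as") + sample

print(unicodedata.name(tagged[0]))   # name of the leading tag character
print(strip_tags(tagged) == sample)  # the visible text is unchanged
```

Stripping the tag leaves the text itself untouched, since the tag
characters carry no visible content.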

Arguably, a plain-text computer character encoding standard which is
language-neutral does not need a language tagging mechanism.  By
encoding scripts rather than languages, Unicode ensures that the data
is legible in plain text.  If the recipient of an untagged plain-text
file doesn't know the language well enough to recognize it, then a tag
won't help.  If the recipient wants to translate it anyway, various
on-line translators are fairly sophisticated in language
identification.  If that fails, it's a mystery.  Everybody loves a
mystery.


