Richard Wordingham via Unicode
unicode at unicode.org
Thu Aug 17 15:37:27 CDT 2017
On Thu, 17 Aug 2017 18:34:56 +0530
Shriramana Sharma via Unicode <unicode at unicode.org> wrote:
> Thanks for your reply, but how can characters be used portably if they
> are not part of the published standard yet? Or is it that hereafter
> both Unicode Standard + Unicode Emoji Standard will be parallelly
> portable or something like that?
A hypothetical application could truthfully claim to render correctly
every sequence (up to some reasonable length limit) of assigned Unicode
characters from a recent version (e.g. 10.0) while completely ignoring
the Unicode Emoji Standard.
That doesn't mean a great deal though, as Unicode appears not to be a
standard for the encoding of text strings, but merely for the
encoding of characters.
Thus, at the level of indisputable text, in Indic scripts there appears
to be no provision for the ordering of multiple left matras that are
to be stored in logical order (i.e. backing order) after the onset
consonants. (Thus, this is not a problem for the Thai script.)
Fortunately, there is no good evidence that the occurrence of multiple
distinct left matras is anything but a typing error, though I can easily
see how it might be used as a lexicographical convention on the fuzzy
edge of plain text.
In a similar vein, in Malayalam, we get repeats of the 2-part vowel
U+0D4B MALAYALAM VOWEL SIGN OO (see Cibu Johny's report at
but I'm not sure what the legitimate encodings of the example word
കോോോ (typed here as <U+0D15, U+0D4B, U+0D4B, U+0D4B>) are.
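For anyone wanting to inspect that sequence, here is a minimal sketch using
Python's standard unicodedata module. It only lists the code points of the
string as typed above and of its canonical decomposition (NFD); it does not
settle which encoding is the legitimate one. The variable names and the
helper describe() are mine, purely for illustration.

```python
import unicodedata

# The example word as typed in the message: KA followed by three VOWEL SIGN OO.
typed = "\u0d15\u0d4b\u0d4b\u0d4b"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# U+0D4B has a canonical decomposition into two vowel signs, so NFD
# splits each occurrence into its two parts.
nfd = unicodedata.normalize("NFD", typed)

# Recomposing the decomposed form should give back the typed sequence.
nfc = unicodedata.normalize("NFC", nfd)

for label, s in (("typed", typed), ("NFD", nfd), ("NFC", nfc)):
    print(label)
    for line in describe(s):
        print("  " + line)
```

In NFD each U+0D4B becomes <U+0D47, U+0D3E>, so the canonically equivalent
spelling of the typed word is <U+0D15, U+0D47, U+0D3E, U+0D47, U+0D3E,
U+0D47, U+0D3E>. Whether other, non-equivalent orderings of those parts
should also be regarded as legitimate is exactly the open question.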