question about identifying CLDR coverage % for Amharic
Mark Davis ☕️
mark at macchiato.com
Thu Mar 2 04:50:27 CST 2017
Filed on your behalf at http://unicode.org/cldr/trac/ticket/10098
(Surprised that it thought you were spam; are others seeing that?)
Also, would it be possible for you to supply the ordering rules for CLDR?
Longer term, if the rules can be expressed without too much data, I think
the change should be made in the DUCET; no need for that to differ
gratuitously from what is acceptable in Laos.
On Thu, Mar 2, 2017 at 12:24 AM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:
> > I notice a very similar file lo.xml. When did Laos haul up the white
> > flag and more or less adopt the modern Thai collation order for Lao?
>
> As there has been no answer to this question, I presume the surrender
> has not happened. As my ticket submission was rejected as spam, would
> someone kindly file a ticket along these lines:
>
> ==Lao collation is not linguistically correct==
>
> The file collation/lo.xml contains the reckless falsehood "The root
> collation order is valid for this language".
>
> If phonetic Lao syllables were represented by single characters, Lao
> collation would be a simple lexicographic order. It is therefore unable
> to use anything but primary weights.
>
> A Lao syllable may be considered to be composed of onset + vowel + coda
> + tone; the onset and vowel may be interleaved (as in Thai), and the
> tone is represented by a mark following the onset and no later than
> immediately after the vowel. There are two basic ordering schemes for
> single syllables:
>
> 1) <onset-weight><coda-weight><vowel-weight><tone-weight>
> 2) <onset-weight><vowel-weight><coda-weight><tone-weight>
>
> The first is the one most commonly used; the second is closer to the
> CLDR default.
>
> Unlike Thai, the vowel weighting for compound vowel symbols is not
> composed from the individual vowels. For example, part of the ordering
> is:
>
> ເກະ < ເກ < ໂກະ < ໂກ < ເກາະ
>
> However, the current collation yields
> ເກ < ເກະ < ເກາະ < ໂກ < ໂກະ
>
> This ordering is manifestly wrong.
>
> I suggest that the reckless comment be amended to something like, "The
> root collation is of some utility in sorting this language; accurate
> collation appears to require large tables".
>
> Yours faithfully,
>
> Richard Wordingham.
>
Mark
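
The two single-syllable schemes in the quoted message, and the point that
a compound vowel symbol is weighted as one unit, can be sketched roughly
as below. This is only an illustration under stated assumptions, not the
CLDR or ICU implementation: syllables are assumed to be already split
into onset, vowel, coda and tone; the vowel table holds only the five
vowels whose relative order is given in the message; and the consonant
and tone weights are placeholders (code point order), not the real Lao
alphabetical order. All names (Syllable, key_scheme1, key_scheme2,
render) are hypothetical.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    onset: str
    vowel: str      # whole vowel symbol as one unit; "-" marks the onset slot
    coda: str = ""
    tone: str = ""


# Partial vowel order taken from the message; each compound symbol is a
# single unit with a single weight, not a sequence of component vowels.
VOWEL_ORDER = [
    "\u0ec0-\u0eb0",        # e + onset + a
    "\u0ec0-",              # e + onset
    "\u0ec2-\u0eb0",        # o + onset + a
    "\u0ec2-",              # o + onset
    "\u0ec0-\u0eb2\u0eb0",  # e + onset + aa + a
]
VOWEL_WEIGHT = {v: i for i, v in enumerate(VOWEL_ORDER)}


def placeholder_weight(ch: str) -> int:
    """Placeholder only: an absent element sorts first, else code point order."""
    return ord(ch) if ch else 0


def key_scheme1(s: Syllable) -> tuple:
    """Scheme 1: onset, coda, vowel, tone."""
    return (placeholder_weight(s.onset), placeholder_weight(s.coda),
            VOWEL_WEIGHT[s.vowel], placeholder_weight(s.tone))


def key_scheme2(s: Syllable) -> tuple:
    """Scheme 2: onset, vowel, coda, tone."""
    return (placeholder_weight(s.onset), VOWEL_WEIGHT[s.vowel],
            placeholder_weight(s.coda), placeholder_weight(s.tone))


def render(s: Syllable) -> str:
    """Spell a toneless syllable; tone mark placement is not modelled here."""
    return s.vowel.replace("-", s.onset, 1) + s.coda


K = "\u0e81"  # the onset used in the message's examples

# Start from the order the message reports for the current collation.
current = [Syllable(K, VOWEL_ORDER[i]) for i in (1, 0, 4, 2, 3)]
tailored = [render(s) for s in sorted(current, key=key_scheme1)]

# Two made-up closed syllables (not claimed to be real words) on which the
# two schemes disagree, because scheme 1 compares the coda before the vowel.
a = Syllable(K, "\u0ec0-", coda="\u0e99")
b = Syllable(K, "\u0ec2-", coda="\u0e87")
```

With these tables, sorting by key_scheme1 reproduces the partial order
given in the message for the five open syllables. The syllables a and b
sort as b before a under scheme 1 and a before b under scheme 2, but only
because of the placeholder coda weights. A real tailoring would still
need a full vowel table and a way to segment running text into
syllables, which is presumably where the "large tables" come in.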