question about identifying CLDR coverage % for Amharic

Mark Davis ☕️ mark at macchiato.com
Thu Mar 2 04:50:27 CST 2017


Filed on your behalf at http://unicode.org/cldr/trac/ticket/10098

(Surprised that it thought you were spam; are others seeing that?)

Also, would it be possible for you to supply the ordering rules for CLDR?

Longer term, if the rules can be expressed without too much data, I think
the change should be made in the DUCET; no need for that to differ
gratuitously from what is acceptable in Laos.

On Thu, Mar 2, 2017 at 12:24 AM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:

> > I notice a very similar file lo.xml.  When did Laos haul up the white
> > flag and more or less adopt the modern Thai collation order for Lao?
>
> As there has been no answer to this question, I presume the surrender
> has not happened.  As my ticket submission was rejected as spam, would
> someone kindly file a ticket along these lines:
>
> ==Lao collation is not linguistically correct==
>
> The file collation/lo.xml contains the reckless falsehood "The root
> collation order is valid for this language".
>
> If phonetic Lao syllables were represented by single characters, Lao
> collation would be a simple lexicographic order. It is therefore unable
> to use anything but primary weights.
>
> A Lao syllable may be considered to be composed of onset + vowel + coda
> + tone; the onset and vowel may be interleaved (as in Thai), and the
> tone is represented by a mark following the onset and no later than
> immediately after the vowel. There are two basic ordering schemes for
> single syllables:
>
> 1) <onset-weight><coda-weight><vowel-weight><tone-weight>
> 2) <onset-weight><vowel-weight><coda-weight><tone-weight>
>
> The first is the one most commonly used; the second is closer to the
> CLDR default.
>
> Unlike Thai, the vowel weighting for compound vowel symbols is not
> composed from the individual vowels. For example, part of the ordering
> is:
>
> ເກະ < ເກ < ໂກະ < ໂກ < ເກາະ
>
> However, the current collation yields
> ເກ < ເກະ < ເກາະ < ໂກ < ໂກະ
>
> This ordering is manifestly wrong.
>
> I suggest that the reckless comment be amended to something like, "The
> root collation is of some utility in sorting this language; accurate
> collation appears to require large tables".
>
> Yours faithfully,
>
> Richard Wordingham.
>
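
For anyone following along, the two single-syllable schemes in the quoted
ticket text can be sketched roughly as below. This is only an illustration
under stated assumptions, not a proposed tailoring: the Syllable type, the
"x" placeholder for the onset slot, and the helper names are hypothetical,
and the onset, coda, and tone weights are simply code points standing in for
real data. The only ordering taken from the message is the five-vowel
sequence, treated as whole vowel patterns rather than composed from their
parts.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    """Hypothetical pre-parsed syllable; parsing running text is out of scope."""
    onset: str
    vowel: str      # whole vowel pattern, "x" marks where the onset is written
    coda: str = ""
    tone: str = ""


# Vowel patterns in the order given in the quoted ticket text.
# Each compound pattern gets its own weight; nothing is composed from parts.
VOWEL_ORDER = ["ເxະ", "ເx", "ໂxະ", "ໂx", "ເxາະ"]
VOWEL_WEIGHT = {pattern: i + 1 for i, pattern in enumerate(VOWEL_ORDER)}


def placeholder_weight(ch: str) -> int:
    """Assumption: code point order stands in for real onset/coda/tone weights."""
    return ord(ch) if ch else 0


def key_scheme1(s: Syllable) -> tuple:
    """Scheme 1: <onset-weight><coda-weight><vowel-weight><tone-weight>."""
    return (placeholder_weight(s.onset), placeholder_weight(s.coda),
            VOWEL_WEIGHT[s.vowel], placeholder_weight(s.tone))


def key_scheme2(s: Syllable) -> tuple:
    """Scheme 2: <onset-weight><vowel-weight><coda-weight><tone-weight>."""
    return (placeholder_weight(s.onset), VOWEL_WEIGHT[s.vowel],
            placeholder_weight(s.coda), placeholder_weight(s.tone))


def spell(s: Syllable) -> str:
    """Rebuild the written form: onset (plus tone mark) goes in the 'x' slot."""
    return s.vowel.replace("x", s.onset + s.tone) + s.coda


# The five open syllables from the ticket text, deliberately shuffled.
five = [Syllable("ກ", p) for p in ["ໂx", "ເxາະ", "ເx", "ໂxະ", "ເxະ"]]
print(" < ".join(spell(s) for s in sorted(five, key=key_scheme1)))

# A pair where the two schemes disagree: one syllable has a coda, one does not.
a = Syllable("ກ", "ເxາະ")
b = Syllable("ກ", "ໂx", coda="ງ")
print("scheme 1:", [spell(s) for s in sorted([b, a], key=key_scheme1)])
print("scheme 2:", [spell(s) for s in sorted([a, b], key=key_scheme2)])
```

With no coda or tone present, both keys reduce to the vowel-pattern weight,
so the five syllables come out in the order the ticket text gives. The last
pair differs only in whether the coda or the vowel pattern is compared first,
which is the whole difference between the two schemes.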



Mark