From martin_hosken at sil.org  Thu Mar  6 01:31:38 2014
From: martin_hosken at sil.org (Martin Hosken)
Date: Thu, 6 Mar 2014 14:31:38 +0700
Subject: Case Mappings
Message-ID: <20140306143138.63340d7b@sil-mh6>

Dear All,

How would I derive a case mapping from LDML? For example, how would I use tr.xml to derive that lc(I)<>dotless i and uc(i)<>dotted cap I? I realise there is something deep and mysterious going on in 2.5 level collation that is described rather opaquely. Is this where the information is? Or do I need to look somewhere else?

TIA,
Yours,
Martin

From mark at macchiato.com  Thu Mar  6 05:12:32 2014
From: mark at macchiato.com (Mark Davis ☕)
Date: Thu, 6 Mar 2014 12:12:32 +0100
Subject: Case Mappings
In-Reply-To: <20140306143138.63340d7b@sil-mh6>
References: <20140306143138.63340d7b@sil-mh6>
Message-ID: <CAJ2xs_Ei5G3QuwStkS6_WH8ZYHNOvqW5ydqbWyY3+abonZEJvg@mail.gmail.com>

I'm curious as to why you need this, since normally people use the Unicode
properties, optionally plus the locale-specific CLDR casing transforms.


Mark <https://google.com/+MarkDavis>

 *Il meglio è l'inimico del bene*


On Thu, Mar 6, 2014 at 8:31 AM, Martin Hosken <martin_hosken at sil.org> wrote:

> Dear All,
>
> How would I derive a case mapping from LDML? For example, how would I use
> tr.xml to derive that lc(I)<>dotless i and uc(i)<>dotted cap I? I realise
> there is something deep and mysterious going on in 2.5 level collation that
> is described rather opaquely. Is this where the information is? Or do I
> need to look somewhere else?
>
> TIA,
> Yours,
> Martin
> _______________________________________________
> CLDR-Users mailing list
> CLDR-Users at unicode.org
> http://unicode.org/mailman/listinfo/cldr-users
>

From richard.wordingham at ntlworld.com  Thu Mar  6 12:58:53 2014
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Thu, 6 Mar 2014 18:58:53 +0000
Subject: Case Mappings
In-Reply-To: <20140306143138.63340d7b@sil-mh6>
References: <20140306143138.63340d7b@sil-mh6>
Message-ID: <20140306185853.133a961e@JRWUBU2>

On Thu, 6 Mar 2014 14:31:38 +0700
Martin Hosken <martin_hosken at sil.org> wrote:

> How would I derive a case mapping from LDML? For example, how would I
> use tr.xml to derive that lc(I)<>dotless i and uc(i)<>dotted cap I? I
> realise there is something deep and mysterious going on in 2.5 level
> collation that is described rather opaquely. Is this where the
> information is? Or do I need to look somewhere else?

I don't believe one is intended to derive this from collation.  The
full Lithuanian rules are not derivable from the Lithuanian collation
rules.

The simple answer appears to be that the transforms can be found, at
least for CLDR Version 24, in the files:

common/transforms/tr-Lower.xml
common/transforms/tr-Title.xml
common/transforms/tr-Upper.xml

This is based on looking for the data.  I can't work out how to derive
the file names from the LDML Version 24 Part 2 Section 10 or
http://cldr.unicode.org/#TOC-How-to-Use-, which are the locations where
I would look.  
http://cldr.unicode.org/index/downloads does give the hint that the
data might be in common/transforms.  'Lower', 'Title' and 'Upper'
appear to be undocumented targets.
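
For readers who only need the Turkish behaviour asked about above, a
minimal Python sketch follows. It is not derived from the CLDR transform
files; it hard-codes the two dotted/dotless pairs and then falls back on
Python's default case mapping. The names tr_lower and tr_upper are made up
for illustration, and the sequence I + U+0307 COMBINING DOT ABOVE is
deliberately left unhandled.

```python
# Illustrative only: hard-coded Turkish dotted/dotless i handling.
# tr_lower and tr_upper are hypothetical helper names, not a CLDR or ICU API.

# Applied before lowercasing: I -> dotless i (U+0131), dotted capital I (U+0130) -> i.
_TR_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})

# Applied before uppercasing: i -> dotted capital I (U+0130), dotless i (U+0131) -> I.
_TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})


def tr_lower(text: str) -> str:
    """Lowercase text with the Turkish treatment of I and dotted capital I."""
    return text.translate(_TR_LOWER).lower()


def tr_upper(text: str) -> str:
    """Uppercase text with the Turkish treatment of i and dotless i."""
    return text.translate(_TR_UPPER).upper()


if __name__ == "__main__":
    print(tr_lower("ISPARTA"))
    print(tr_upper("istanbul"))
```

Pre-translating the four special characters keeps the rest of the string
on the default Unicode mappings, which is roughly the division of labour
Mark describes: Unicode properties plus a small locale-specific layer.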

Richard.

From mark at macchiato.com  Wed Mar 12 14:03:06 2014
From: mark at macchiato.com (Mark Davis ☕)
Date: Wed, 12 Mar 2014 20:03:06 +0100
Subject: Beta CLDR Spec for v25 (LDML)
Message-ID: <CAJ2xs_ELMoqf6cpwGCK996O094TA+DS=5FjwEc49tBUP5QUgPQ@mail.gmail.com>

There is a beta version of the CLDR specification for version 25, with the
changes listed at:

http://www.unicode.org/reports/tr35/proposed.html#Modifications

If you have any feedback on the new sections, please submit it at
http://unicode.org/cldr/trac/newticket. If you do, please include a link to
the specific section you're commenting on. This is easy to do, since
clicking on any header puts a link to that header into your browser's
address bar.

Mark <https://google.com/+MarkDavis>

*Il meglio è l'inimico del bene*

From cdutro at twitter.com  Thu Mar 27 13:57:17 2014
From: cdutro at twitter.com (Cameron Dutro)
Date: Thu, 27 Mar 2014 11:57:17 -0700
Subject: Territory Codes
Message-ID: <CAFYXrAM2VuDE0tAvgnA_BLZkKZ864wfUVg2HDVS3q2A6Eobf7w@mail.gmail.com>

Hey CLDR users,

Does anyone know what standard CLDR's territory codes adhere to?

-Cameron

From srl at icu-project.org  Thu Mar 27 14:20:08 2014
From: srl at icu-project.org (Steven R. Loomis)
Date: Thu, 27 Mar 2014 14:20:08 -0500
Subject: Territory Codes
In-Reply-To: <CAFYXrAM2VuDE0tAvgnA_BLZkKZ864wfUVg2HDVS3q2A6Eobf7w@mail.gmail.com>
References: <CAFYXrAM2VuDE0tAvgnA_BLZkKZ864wfUVg2HDVS3q2A6Eobf7w@mail.gmail.com>
Message-ID: <60B0DE42-37FF-4CA1-8063-511F1BA75366@icu-project.org>

ISO 3166 territories plus
UN M.49 regions (just as in BCP 47). TR35 should have references.
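
As a rough illustration of the two code shapes mentioned above, here is a
Python sketch that classifies a region code by its form alone: two letters
for an ISO 3166-1 alpha-2 code, three digits for a UN M.49 code, as in the
BCP 47 region subtag syntax. The function name classify_region and its
return labels are invented for this example, and the check does not
confirm that a code is actually assigned.

```python
import re

# Shape of a BCP 47 region subtag: two ASCII letters or three ASCII digits.
_ALPHA2 = re.compile(r"[A-Za-z]{2}")
_M49 = re.compile(r"[0-9]{3}")


def classify_region(code: str) -> str:
    """Return a label for the syntactic form of a region code.

    This is a shape check only (hypothetical helper); it does not consult
    ISO 3166, UN M.49, or the CLDR data to see whether the code exists.
    """
    if _ALPHA2.fullmatch(code):
        return "iso3166-alpha2"
    if _M49.fullmatch(code):
        return "un-m49"
    return "unknown"


if __name__ == "__main__":
    for sample in ("US", "419", "001", "USA"):
        print(sample, classify_region(sample))
```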


Sent from our iPhone.

> On Mar 27, 2014, at 1:57 PM, Cameron Dutro <cdutro at twitter.com> wrote:
> 
> Hey CLDR users,
> 
> Does anyone know what standard CLDR's territory codes adhere to?
> 
> -Cameron

From petercon at microsoft.com  Fri Mar 28 12:02:05 2014
From: petercon at microsoft.com (Peter Constable)
Date: Fri, 28 Mar 2014 17:02:05 +0000
Subject: Adding RUBLE SIGN to keyboard layouts
Message-ID: <6cb5e73482f341178b548f8618ccde38@BL2PR03MB450.namprd03.prod.outlook.com>

CLDR folk:

Has anyone begun to consider how to support the ruble sign in keyboard layouts (for hardware keyboards)?


Thanks
Peter

From mark at macchiato.com  Fri Mar 28 12:06:32 2014
From: mark at macchiato.com (Mark Davis ☕)
Date: Fri, 28 Mar 2014 18:06:32 +0100
Subject: Adding RUBLE SIGN to keyboard layouts
In-Reply-To: <6cb5e73482f341178b548f8618ccde38@BL2PR03MB450.namprd03.prod.outlook.com>
References: <6cb5e73482f341178b548f8618ccde38@BL2PR03MB450.namprd03.prod.outlook.com>
Message-ID: <CAJ2xs_HQpJQF6n9bNXKw18jyRE6EK6yS=2g53RsMtR9p4o8SYQ@mail.gmail.com>

Good question; don't know if the Russians have a standard for where it
goes. For comparison, here are the ru keyboards we currently have in CLDR
(reflecting data publicly available on the platforms):

http://www.unicode.org/cldr/charts/25/keyboards/layouts/ru.html


Mark <https://google.com/+MarkDavis>

 *Il meglio è l'inimico del bene*


On 28 March 2014 18:02, Peter Constable <petercon at microsoft.com> wrote:

> CLDR folk:
>
> Has anyone begun to consider how to support the ruble sign in keyboard
> layouts (for hardware keyboards)?
>
> Thanks
> Peter

From petercon at microsoft.com  Fri Mar 28 12:13:42 2014
From: petercon at microsoft.com (Peter Constable)
Date: Fri, 28 Mar 2014 17:13:42 +0000
Subject: Adding RUBLE SIGN to keyboard layouts
In-Reply-To: <CAJ2xs_HQpJQF6n9bNXKw18jyRE6EK6yS=2g53RsMtR9p4o8SYQ@mail.gmail.com>
References: <6cb5e73482f341178b548f8618ccde38@BL2PR03MB450.namprd03.prod.outlook.com>
 <CAJ2xs_HQpJQF6n9bNXKw18jyRE6EK6yS=2g53RsMtR9p4o8SYQ@mail.gmail.com>
Message-ID: <72fff88eb93a426cb160ce57c0b4dec5@BL2PR03MB450.namprd03.prod.outlook.com>

My understanding is that some discussion of standards has started within Russia, but that there may be an opportunity to influence it. Vlad (cc'd) can clarify.

Within Microsoft, we've been discussing several possibilities and are considering AltGr+8.


Peter



From petercon at microsoft.com  Fri Mar 28 12:42:42 2014
From: petercon at microsoft.com (Peter Constable)
Date: Fri, 28 Mar 2014 17:42:42 +0000
Subject: Adding RUBLE SIGN to keyboard layouts
In-Reply-To: <F910C3D69E742E45BC8FA660FFC1CE1D7CF66BA0@DB3EX14MBXC303.europe.corp.microsoft.com>
References: <6cb5e73482f341178b548f8618ccde38@BL2PR03MB450.namprd03.prod.outlook.com>
 <CAJ2xs_HQpJQF6n9bNXKw18jyRE6EK6yS=2g53RsMtR9p4o8SYQ@mail.gmail.com>,
 <72fff88eb93a426cb160ce57c0b4dec5@BL2PR03MB450.namprd03.prod.outlook.com>
 <F910C3D69E742E45BC8FA660FFC1CE1D7CF66BA0@DB3EX14MBXC303.europe.corp.microsoft.com>
Message-ID: <8c9708cae9d840f9b1690483bffc2d31@BL2PR03MB450.namprd03.prod.outlook.com>

Reposting (Vlad is not a list member so his mail won't get posted).


From: Vladislav Shershulsky
Sent: March 28, 2014 10:23 AM
To: Peter Constable; Mark Davis ☕
Cc: cldr-users; Agustin Da Fieno Delucchi; Michael Kaplan; Jan Nelson
Subject: RE: Adding RUBLE SIGN to keyboard layouts

Peter, I completely agree with your vision of the situation.
If AltGr+8 looks reasonable to everyone, we would have a better chance of convincing Russian experts of this choice.
Vlad

Sent from my Windows Phone

From richard.wordingham at ntlworld.com  Sun Mar 30 07:24:45 2014
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sun, 30 Mar 2014 13:24:45 +0100
Subject: Non-primary Weights of U+FFFE
Message-ID: <20140330132445.43398a4e@JRWUBU2>

Is there any reason that a CLDR-compliant collation algorithm should
particularly care about the non-primary weights of U+FFFE?  So long as
they satisfy the well-formedness conditions, all I can see is that
having unique values *may* simplify sort key formation for reversed
levels.

Richard.

From markus.icu at gmail.com  Sun Mar 30 11:17:44 2014
From: markus.icu at gmail.com (Markus Scherer)
Date: Sun, 30 Mar 2014 09:17:44 -0700
Subject: Non-primary Weights of U+FFFE
In-Reply-To: <20140330132445.43398a4e@JRWUBU2>
References: <20140330132445.43398a4e@JRWUBU2>
Message-ID: <CAN49p6ogEHP+G=vrER=1XNoCE_YPhoLKMSKP-MDFPJ6+mgyg+Q@mail.gmail.com>

On Sun, Mar 30, 2014 at 5:24 AM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:

> Is there any reason that a CLDR-compliant collation algorithm should
> particularly care about the non-primary weights of U+FFFE?  So long as
> they satisfy the well-formedness conditions, all I can see is that
> having unique values *may* simplify sort key formation for reversed
> levels.
>

The non-primary weights need to be greater than the level separator(s) and
less than the weights of CEs that are ignorable on previous levels. It is
also important to generate the special weights on primary to tertiary
levels for shifted CEs, so that alternate=shifted works properly.

In ICU, we have test code that expects the same sort keys generated from
concatenating two strings with U+FFFE vs. calling ucol_mergeSortkeys() on
the two separate sort keys. The latter merges sort keys by copying each
level (separated by byte 01) from each sort key and inserting a byte 02
between the bytes from different sort keys. (See ucol.h:
http://www.icu-project.org/apiref/icu4c/ucol_8h.html)
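
The merging step described above can be sketched in Python as follows.
This only illustrates the byte layout stated in this message (levels
separated by byte 01, a byte 02 between the contributions of the two
keys); it is not ICU's implementation. The function name merge_sort_keys
is invented here, treating a key with fewer levels as having empty levels
is an assumption, and the sample bytes in the usage note are made up
rather than real collation weights.

```python
from itertools import zip_longest

LEVEL_SEPARATOR = b"\x01"   # separates levels inside one sort key
MERGE_SEPARATOR = b"\x02"   # inserted between bytes from different sort keys
TERMINATOR = b"\x00"        # sort keys are assumed to be NUL-terminated


def merge_sort_keys(left: bytes, right: bytes) -> bytes:
    """Merge two sort keys level by level (illustrative sketch only)."""
    # Drop the trailing terminator, then split each key into its levels.
    left_levels = left.rstrip(TERMINATOR).split(LEVEL_SEPARATOR)
    right_levels = right.rstrip(TERMINATOR).split(LEVEL_SEPARATOR)

    # Pair up corresponding levels; a missing level is treated as empty
    # (an assumption made for this sketch).
    merged = [
        l + MERGE_SEPARATOR + r
        for l, r in zip_longest(left_levels, right_levels, fillvalue=b"")
    ]

    # Re-join the merged levels and restore the terminator.
    return LEVEL_SEPARATOR.join(merged) + TERMINATOR


if __name__ == "__main__":
    print(merge_sort_keys(b"\x2a\x01\x05\x00", b"\x2c\x01\x05\x00"))
```

With the made-up keys b"\x2a\x01\x05\x00" and b"\x2c\x01\x05\x00" the
result is b"\x2a\x02\x2c\x01\x05\x02\x05\x00": each level holds the left
key's bytes, a 02, then the right key's bytes.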

markus
-- 
Google Internationalization Engineering

From markus.icu at gmail.com  Sun Mar 30 11:58:34 2014
From: markus.icu at gmail.com (Markus Scherer)
Date: Sun, 30 Mar 2014 09:58:34 -0700
Subject: Non-primary Weights of U+FFFE
In-Reply-To: <CAN49p6ogEHP+G=vrER=1XNoCE_YPhoLKMSKP-MDFPJ6+mgyg+Q@mail.gmail.com>
References: <20140330132445.43398a4e@JRWUBU2>
 <CAN49p6ogEHP+G=vrER=1XNoCE_YPhoLKMSKP-MDFPJ6+mgyg+Q@mail.gmail.com>
Message-ID: <CAN49p6otriz3hg0rg9sSXNsd7qSZFKo8bD3Yzq-jcTFyhrRuXA@mail.gmail.com>

PS: What I am realizing here is that we should be able to use byte 02 as a
lead byte in any non-primary weight. Primary CEs compare greater than
U+FFFE on primary level, and ignorable CEs have high weights and compare
greater than many low weights.

markus

From markus.icu at gmail.com  Sun Mar 30 17:08:06 2014
From: markus.icu at gmail.com (Markus Scherer)
Date: Sun, 30 Mar 2014 15:08:06 -0700
Subject: Non-primary Weights of U+FFFE
In-Reply-To: <CAN49p6otriz3hg0rg9sSXNsd7qSZFKo8bD3Yzq-jcTFyhrRuXA@mail.gmail.com>
References: <20140330132445.43398a4e@JRWUBU2>
 <CAN49p6ogEHP+G=vrER=1XNoCE_YPhoLKMSKP-MDFPJ6+mgyg+Q@mail.gmail.com>
 <CAN49p6otriz3hg0rg9sSXNsd7qSZFKo8bD3Yzq-jcTFyhrRuXA@mail.gmail.com>
Message-ID: <CAN49p6qzhuMptjmw2eUP2Jd2+-YB=VPS268Lu0qxiKvpbJLibQ@mail.gmail.com>

By the way, the ICU locale explorer and its collation demo are updated to
the not-yet-released ICU 53 which includes the new collation code.
http://demo.icu-project.org/icu-bin/locexp?_=root&d_=en&x=col

markus