Re: Chén, Shěn and 沈 pinyin confusion

Tue Sep 13 23:27:37 CDT 2016

From what Markus and Peter said earlier, does anyone know whether 沈 transliterates to Shěn if the Names variant of the Han-Latin transform is invoked?

I think Peter's reply was saying it would, but I was not sure.

I will talk to the Dev team about invoking the Names variant, and have a chat with them about using the pronunciation field as a catch-all fallback.

At the moment, the subject field mapping when viewed as a sorted list seems to be the biggest complaint coming back to me, so invoking the Names variant of the Han-Latin transform may be a quick win while we look into the pronunciation suggestion.
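
For the discussion with the Dev team, here is a rough Python sketch of how the pronunciation-field idea might look. Everything named in it (Record, prefill, sort_key, the small demo table) is hypothetical. The table only stands in for the Names variant of the Han-Latin transform so the snippet runs without ICU, and its three entries are my assumption about what that transform would return. Stripping tone marks before comparing is also a simplification; a real implementation would probably pass the pronunciation to a collator.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical stand-in for the Han-Latin/Names transform (assumed outputs).
# With PyICU installed, something like
#   icu.Transliterator.createInstance("Han-Latin/Names").transliterate
# could be passed to prefill() instead of demo_names_transliterate.
_DEMO_NAMES_TABLE = {"沈": "shěn", "陈": "chén", "王": "wáng"}


def demo_names_transliterate(text: str) -> str:
    """Map each known character to its demo reading; leave others unchanged."""
    return "".join(_DEMO_NAMES_TABLE.get(ch, ch) for ch in text)


@dataclass
class Record:
    name: str
    # User-visible, user-editable pronunciation (pinyin for zh-CN).
    pronunciation: Optional[str] = None


def prefill(record: Record, transliterate: Callable[[str], str]) -> Record:
    """Fill the pronunciation only when the user has not supplied one."""
    if not record.pronunciation:
        record.pronunciation = transliterate(record.name)
    return record


def sort_key(record: Record):
    """Sort by pronunciation with tone marks stripped, falling back to the name."""
    text = record.pronunciation or record.name
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # The marked form is kept as a tie-breaker between identical base letters.
    return (base.casefold(), text)


records: List[Record] = [Record("王"), Record("沈"), Record("陈")]
ordered = sorted(
    (prefill(r, demo_names_transliterate) for r in records), key=sort_key
)
```

The point of keeping pronunciation as a stored field is that a user correction wins over the prefilled value: prefill() leaves any existing value alone, and the sort only ever looks at that field.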

Thanks again.


> On 13 Sep 2016, at 22:47, Markus Scherer <markus.icu at gmail.com> wrote:
> 
> The Names variant of the Han-Latin transform (e.g., via ICU Transliterator) should do this -- as a preprocessing step.
> 
> The CLDR/ICU Collator does not currently offer a tailoring that would do this automatically just while sorting. Adding such a variant would add at least a couple of 100kB to the data size.
> 
> For Chinese and Japanese, I suggest you add a pronunciation field (pinyin for zh-CN, Hiragana for ja); prefill it via the Transliterator, make it visible to the user, let them fix it; sort by that.
> 
> markus