Re: Chén, Shěn and 沈 pinyin confusion
Work
Mr at eibbor.co.uk
Wed Sep 14 01:55:57 CDT 2016
For info, I tried the transformation demo: I selected Names and Names (Variant), pasted 沈 into the input, and got Chén as the output.
Does this mean 沈 will never transform to Shěn, or is there some manual addition I need to make to the contents of the 'Compound 1' text box?
Sent from my iPhone
> On 14 Sep 2016, at 05:27, Work <Mr at eibbor.co.uk> wrote:
>
> From what Markus and Peter said earlier, does anyone know whether 沈 transforms/transliterates to Shěn when the Names variant of the Han-Latin transform is invoked?
>
> I think Peter's reply was saying it would, but I was not sure.
>
> I will talk to the Dev team about invoking the Names variant, and have a chat with the guys about the pronunciation field as a catch-all fallback.
>
> At the minute, the subject field mapping, when viewed as a sorted list, seems to be the big groan coming back at me, so maybe invoking the Names variant of the Han-Latin transform is a quick win while we look into the pronunciation suggestion.
>
> Thanks again.
>
> Sent from my iPhone
>
>> On 13 Sep 2016, at 22:47, Markus Scherer <markus.icu at gmail.com> wrote:
>>
>> The Names variant of the Han-Latin transform (e.g., via ICU Transliterator) should do this -- as a preprocessing step.
>>
>> The CLDR/ICU Collator does not currently offer a tailoring that would do this automatically just while sorting. Adding such a variant would add at least a couple of hundred kB to the data size.
>>
>> For Chinese and Japanese, I suggest you add a pronunciation field (pinyin for zh-CN, Hiragana for ja); prefill it via the Transliterator, make it visible to the user, let them fix it; sort by that.
>>
>> markus
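
The pronunciation-field suggestion above (prefill, let the user fix it, sort by it) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Contact` record, `prefill`, `sort_key`, and `demo_transliterate` are hypothetical names, and the small lookup table stands in for a real transliterator. With PyICU installed, the callable might instead be something like `icu.Transliterator.createInstance("Han-Latin/Names").transliterate`, but that is an assumption and is not exercised here.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Contact:
    name: str                            # name as entered, e.g. in Han characters
    pronunciation: Optional[str] = None  # editable reading, e.g. pinyin


def prefill(contacts: List[Contact], transliterate: Callable[[str], str]) -> None:
    """Fill in missing pronunciations only; never overwrite a user's edit."""
    for c in contacts:
        if not c.pronunciation:
            c.pronunciation = transliterate(c.name)


def sort_key(c: Contact):
    """Compare base letters first, then the tone-marked form as a tie-breaker."""
    reading = c.pronunciation or c.name
    decomposed = unicodedata.normalize("NFD", reading)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), reading)


# Illustrative stand-in for a real transliterator (an assumption, not ICU data).
# It deliberately returns the reading reported in this thread for 沈 so that
# the manual correction step can be shown.
_DEMO_READINGS = {"沈": "Chén", "陈": "Chén", "王": "Wáng"}


def demo_transliterate(text: str) -> str:
    return " ".join(_DEMO_READINGS.get(ch, ch) for ch in text)


contacts = [Contact("王"), Contact("沈"), Contact("陈")]
prefill(contacts, demo_transliterate)

# The user sees the prefilled value and corrects it.
contacts[1].pronunciation = "Shěn"

# A second prefill pass leaves the correction alone.
prefill(contacts, demo_transliterate)

ordered = sorted(contacts, key=sort_key)
print([(c.name, c.pronunciation) for c in ordered])
```

Sorting on a plain key like this is only a rough approximation of locale-aware ordering; a real application would more likely pass the pronunciation field to a collator. The point of the sketch is that the stored, user-corrected reading, and not the raw transliterator output, drives the sort.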
More information about the CLDR-Users
mailing list