Han and Bopomofo in domain labels
Jungshik SHIN (신정식)
jshin1987 at gmail.com
Fri Aug 16 12:47:39 CDT 2024
I second Bill. The issue raised by Henri makes a lot of sense and we need
to consider revising UTS 39 given the usage of Bopomofo (i.e. typically
Han and Bopomofo wouldn't be mixed together in identifiers).
Jungshik
On Fri, Aug 16, 2024, 9:31 AM Bill Poser via Unicode <
unicode at corp.unicode.org> wrote:
> The use of bopomofo in Chinese is not parallel to the use of kana in
> Japanese. Whereas kana are routinely mixed with kanji in Japanese, with,
> e.g., a verb stem written in kanji and the suffixes written in kana, and
> Japanese can be written entirely in kana (e.g. by young children), bopomofo
> does not appear in ordinary Chinese text. It is an ancillary system, used,
> e.g., to give the pronunciation of Chinese characters and is a commonly
> available input method. That doesn't guarantee that it doesn't occur in
> email addresses, though I don't recall seeing it. I'm not sure if it is
> even permitted in the legal name of a company.
>
> On Fri, Aug 16, 2024 at 7:32 AM Martin J. Dürst via Unicode <
> unicode at corp.unicode.org> wrote:
>
>> Hello Henri,
>>
>> I don't know about Chinese and Bopomofo, but for Japanese, there surely
>> are e.g. company names that contain both Kana and Kanji. And company
>> names are one (although of course not the only) use case for domain names.
>>
>> I'm cc'ing Arnt, who is one of the authors of
>>
>> https://www.ietf.org/archive/id/draft-gulbrandsen-smtputf8-nice-addresses-00.html,
>>
>> which is about email addresses (quite a bit related to domain names) and
>> discusses Chinese quite a bit (although it doesn't mention Bopomofo).
>>
>> Regards, Martin.
>>
>> P.S.: draft-gulbrandsen-smtputf8-nice-addresses-00.html is in my view
>> still in a very early stage; I have read through it but still have to
>> write up my comments.
>>
>> On 2024-08-15 18:08, Henri Sivonen via Unicode wrote:
>> > UTS #39 is commonly used as the baseline for detecting IDN spoofs, and UTS
>> > #39 explicitly allows combining Han and Bopomofo. Considering that ㄚ looks
>> > confusable with 丫 and ㄠ looks confusable with 幺, I’m wondering if it’s
>> > appropriate to explicitly allow this combination in the spoof detection
>> > context. Is combining Han and Bopomofo in one domain label something that
>> > occurs commonly enough in domains that aren’t intended to be spoofs for it
>> > being necessary not to treat the script combination as triggering spoof
>> > detection in the domain name context?
>> >
>>
>
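To make the check Henri is asking about concrete, here is a minimal sketch of a per-label test that flags a domain label containing both Han and Bopomofo characters. It is only an illustration, not how UTS #39 or any browser implements spoof detection: the function names are hypothetical, and the code point ranges are a rough stand-in for the Unicode Script and Script_Extensions properties (Bopomofo tone marks, for instance, are not covered). The confusable table holds only the two pairs mentioned in the thread.

```python
# Rough code point ranges (an assumption for this sketch); a real
# implementation would consult the Unicode Script / Script_Extensions data.
BOPOMOFO_RANGES = [(0x3100, 0x312F), (0x31A0, 0x31BF)]
HAN_RANGES = [(0x3400, 0x4DBF), (0x4E00, 0x9FFF), (0xF900, 0xFAFF),
              (0x20000, 0x2FFFF)]

# Only the two pairs cited in the thread; not a complete confusables list.
BOPOMOFO_TO_HAN_LOOKALIKE = {"ㄚ": "丫", "ㄠ": "幺"}


def _in_ranges(ch, ranges):
    """Return True if the character's code point falls in any given range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def mixes_han_and_bopomofo(label):
    """Return True if one label contains both Han and Bopomofo characters."""
    has_han = any(_in_ranges(ch, HAN_RANGES) for ch in label)
    has_bopomofo = any(_in_ranges(ch, BOPOMOFO_RANGES) for ch in label)
    return has_han and has_bopomofo


def lookalikes_in_label(label):
    """List (bopomofo, han_lookalike) pairs found in a mixed-script label."""
    if not mixes_han_and_bopomofo(label):
        return []
    return [(ch, BOPOMOFO_TO_HAN_LOOKALIKE[ch])
            for ch in label if ch in BOPOMOFO_TO_HAN_LOOKALIKE]


if __name__ == "__main__":
    for label in ["中ㄚ", "中丫", "ㄚㄠ", "example"]:
        print(label, mixes_han_and_bopomofo(label), lookalikes_in_label(label))
```

Under these assumptions, a label such as 中ㄚ is flagged while 中丫 (all Han) and ㄚㄠ (all Bopomofo) are not. Whether such a flag should actually trigger spoof handling is exactly the open question in this thread.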