[Unicode] Re: HENTAIGANA LETTER E-1

Fri Jan 8 08:55:36 CST 2016

Garth Wallace wrote:
> On Thu, Jan 7, 2016 at 7:56 AM, suzuki toshiya
> <mpsuzuki at hiroshima-u.ac.jp> wrote:
>> Hi,
>>
>> I'm not a representative of the experts working for the
>> proposal from Japan NB, but I could explain something.
>>
>> 1) "They never took that out?" I'm not sure who you mean
>> by "they" (UTC? JNB?), but it seems that no official document
>> asking for a response from JNB has been submitted in WG2.
>> If UTC sends something officially, JNB would respond,
>> I believe.
> 
> I meant the JNB. I thought they had removed that character from the
> later revised proposals that were posted on the UTC document register,
> but I checked and I had apparently been mistaken.
> 
> The issue is only raised in passing in a footnote in Mr. Lunde's feedback.

I think HENTAIGANA LETTER E-1 was intentionally proposed
to be coded separately, and no official document has been
sent to JNB, so it is still kept as it was before.

>> 2) Difference between HENTAIGANA LETTER E-1 and U+1B001.
>>
>> U+1B001 is a character designed to note an ancient
>> pronunciation, YE, which is extinct in modern Japanese.
>>
>> When standard kana was defined about 100 years ago,
>> the pronunciation YE had already merged into E.
>> Some scholars planned to use a few kana-like characters
>> to note such pronunciations (to discuss ancient Japanese
>> pronunciation), and used some hentaigana-like glyphs for
>> that purpose. As far as I know, there is
>> no wide consensus that the glyph looking like U+1B001 was
>> historically used mainly to note YE when YE and E were
>> still distinguished in Japanese.
> 
> AIUI they simply reused an existing hentaigana to make the
> distinction, rather than making a new kana that just happened to look
> exactly like it.

It is difficult (for me) to judge whether U+1B001 has the same
identity as the similar-looking hentaigana used before kana
standardization. The encoding of U+1B001 was justified by
its unique phonetic value, so its character name is YE, and
that name is normative. Some people may think they can identify
hentaigana by their glyph shapes alone, but others may have
a different view. As the first proposal (L2/15-193) prioritized
the (modern) phonetic value as the first key to identify the
glyph, I think some user communities would want to identify the
glyph by the phonetic value. I am not saying it is the best
solution, but they have their own rationale.

>> On the other hand, JNB's proposal does not include any
>> ancient/extinct pronunciation. Its phonetic coverage
>> is exactly the same as that of modern Japanese. So,
>> the glyph looking like U+1B001 is not designed to note
>> the pronunciation YE. The motivation for JNB to propose
>> hentaigana would be just their shape differences.
>>
>> Therefore, U+1B001 and HENTAIGANA E-1 could be said to be
>> differently designed: their intended usages are different.
>> Please do not think JNB hentaigana experts overlooked
>> U+1B001 and proposed a duplicate encoding. They must
>> have known about it and proposed E-1 anyway.
> 
> It's not unknown for a single character to have more than one
> pronunciation in different contexts.

Is it easy to tell from context how the "unified U+1B001"
should be pronounced (in some cases it must be YE, in some cases
it must be E, and in some cases either YE or E is acceptable)? I
don't have a good connection with the user community of U+1B001,
so I cannot estimate which would be easier (less troublesome for
existing user communities), separation or unification. Do you
have any connection with the user community of U+1B001?

>> However, some WG2 experts suggested unifying them because
>> of the shape similarity. I'm not sure whether the 2 glyphs are
>> indistinguishably similar for hentaigana scholars, but I
>> accept that some people find them hard to distinguish.
>> I cannot distinguish some Latin and Greek letters when
>> they are displayed as single isolated characters.
> 
> We're not talking about different scripts, though. Hentaigana
> are obsolete hiragana (eliminated from modern written Japanese by a
> spelling reform) but they are still hiragana. Latin and Greek, on the
> other hand, are clearly separate but related scripts.

I'm afraid that the count of how many scripts there are in the
set of modern hiragana, U+1B001 and the JNB proposal could
depend on the person. Some people may count only 1, some
may count 2, and some may count 3. If there were already a stable
consensus, it could be used as the rationale to unify, but I
don't think there is one. Anyway, Latin and Greek were not
a good example, I'm sorry.

Regards,
mpsuzuki