["Unicode"] Re: Some questions about Unicode's CJK Unified Ideograph
mpsuzuki at hiroshima-u.ac.jp
Sat May 30 00:46:21 CDT 2015
Please let me ask a slightly off-topic question.
䛩 = ⿰言亞 (not ⿰言亜) is coded at U+46E9. Of course,
亞 and 亜 are generally not unified with each other,
so a separate encoding of ⿰言亜 would be reasonable
(if there is a requirement), but I want to know whether
the Vietnamese user community distinguishes ⿰言亞 and ⿰言亜
semantically. Do you know anything?
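
For anyone who wants to check the code points involved, here is a minimal Python sketch using only the standard unicodedata module. The helper name describe_ids is hypothetical, and the escapes assume the usual code points for the components (言 = U+8A00, 亞 = U+4E9E, 亜 = U+4E9C). It only lists the code points of the two IDS strings and checks that normalization does not fold 亜 into 亞; it says nothing about whether the two forms are semantically distinct.

```python
import unicodedata

def describe_ids(ids):
    """Return 'U+XXXX NAME' for each code point in an IDS string."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in ids]

ENCODED = "\u2FF0\u8A00\u4E9E"    # ⿰言亞, the IDS given above for U+46E9
UNENCODED = "\u2FF0\u8A00\u4E9C"  # ⿰言亜, the form seen in the dictionary

for label, ids in (("encoded", ENCODED), ("unencoded", UNENCODED)):
    print(label, describe_ids(ids))

# 亞 and 亜 are separate code points with no decomposition mapping,
# so no normalization form turns one into the other.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    assert unicodedata.normalize(form, "\u4E9C") != unicodedata.normalize(form, "\u4E9E")
```

The two IDS strings differ only in their last code point, which is why a search for one will not find the other.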
Ken Whistler wrote:
> On 5/29/2015 5:20 PM, gfb hjjhjh wrote:
>> 1. I have seen a Chinese character ⿰言亜 in a Vietnamese dictionary,
>> NHAT DUNG THUONG DAM DICTIONARY.
>> So, a.) http://www.unicode.org/alloc/Pipeline.html shows that
>> CJK Extension E and F have already been accepted, but where can I
>> check those proposals to see if the character is in them or not?
> For Extension E, you can check the following code chart:
> See: U+2C89A..U+2C931 (pp. 54-56 of the pdf) for the relevant
> radical (#149). But I don't see that character in the list of
> Extension E characters.
> Extension F is harder to track down, because it has not yet been
> approved by the UTC, and comes in two pieces, with different
> progression so far in the ISO committee. Perhaps somebody on this list
> who has better access to the relevant documents can let you
> know whether ⿰言亜 can be found in those sets.
>> and b.) it says that to propose a new character, the proposal must
>> include information about someone who would agree to provide a
>> computer font for publishing the standard. Does that mean I have to
>> provide info about someone who is expected to agree to do so, or do I
>> need to contact them for their agreement first? And does that mean I
>> can just put info about someone who is making a free font with full
>> Unicode CJK coverage into the proposal?
> It would require (eventually) provision of a font with correct display
> of just the character proposed -- but in the case of CJK additions, these
> first go through a process of collection and review by the Ideographic
> Rapporteur Group. The best thing to do is to work with a national
> body concerned with CJK characters and ensure that they include
> this character on their list of submissions for IRG review.
>> and c.) just like the question (b), do "names and addresses of
>> appropriate contacts within national body or user organizations"
>> represent Vietnamese government in this case?
> If the character has not been submitted to the IRG for review, it would
> probably be best to work through the Vietnamese national standards
> body. Again, people on this list may be able to provide you the
> correct contact information for them.
>> 2. Are combining characters like U+20DD intended to work with all
>> different types of characters, or is this a problem related to
>> implementation? When I write ゆ⃝ (Japanese Hiragana Letter Yu +
>> Combining Enclosing Circle), the two appear separate in most fonts I
>> use, but if I change the Hiragana Yu to a conventional = sign or some
>> Latin character, most fonts are at least somehow able to put them
>> together. Or is there any better/alternative representation in
>> Unicode that can show Japanese hiragana yu in a circle?
> Combining enclosing marks in principle could work with most characters,
> but in practice most arbitrary combinations do not work very well,
> because they would require very complicated font support.
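
To illustrate what the sequence looks like at the code point level, here is a small Python sketch using the standard unicodedata module. It assumes ゆ is U+3086 and prints the general category and name of each code point, then checks that NFC leaves the sequence as two code points. Whether the circle is drawn around the kana is then entirely a font and rendering question, which the script cannot test.

```python
import unicodedata

YU = "\u3086"      # HIRAGANA LETTER YU
CIRCLE = "\u20DD"  # COMBINING ENCLOSING CIRCLE
sequence = YU + CIRCLE

for ch in sequence:
    # Category 'Lo' is a letter, 'Me' is an enclosing mark.
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# NFC only composes canonical pairs; this sequence has no composed form,
# so it stays as base character + enclosing mark.
composed = unicodedata.normalize("NFC", sequence)
print(len(composed))
```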
>> 4. In CJK Symbols and Punctuation, the proper name mark and book name
>> mark are not included. While there are characters like U+2584, U+FE33,
>> U+FE4F, and U+FE34 in Unicode that are more or less representations
>> of the two symbols, they do not appear below or to the left of typed
>> characters when text flows horizontally/vertically; instead, they
>> occupy their own space, which makes them of little use in daily
>> life. The proper name mark and book name mark can be
>> represented by text editing software and CSS, but those representations
>> are not ideal, and they do match "Criteria for Encoding Symbols". Is it
>> possible to make a new Unicode symbol, or change some current symbol
>> into one that could appear in a suitable place relative to other
>> characters when typed? A property of the symbol is that, in a case like
>> 美國紐約, where 美國 and 紐約 are two different proper names (place names),
>> an underline should go below them without any separation between
>> the characters 美 and 國 or 紐 and 約 (when text is written horizontally),
>> but at the same time the underline should not be linked between 國 and
>> 紐, as 國 is the end of the first place name while 紐 is the start of the
> What you are talking about is, indeed, best handled by text styling
> rather than by individual character encoding. These are various CJK-specific
> underlining styles (for horizontal text layout) or sidelining styles (for
> vertical text layout). It is precisely because these require highlighting
> for ranges of characters (without breaks) that this kind of text decoration is
> handled best by style attributes (or markup), rather than by individual
> combining symbols.
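
As a rough illustration of the range-based approach described here, the following Python sketch marks up each proper name as its own underlined run. The function name underline_ranges and the choice of the HTML u element are assumptions made for this example only; whether two adjacent underlines are drawn with a visible gap between 國 and 紐 depends on the renderer and stylesheet, not on this markup.

```python
from html import escape

def underline_ranges(text, ranges):
    """Wrap each (start, end) slice of text in its own <u> element.

    ranges must be sorted, non-overlapping, half-open index pairs.
    """
    out = []
    pos = 0
    for start, end in ranges:
        out.append(escape(text[pos:start]))                   # text before the run
        out.append("<u>" + escape(text[start:end]) + "</u>")  # the underlined run
        pos = end
    out.append(escape(text[pos:]))                            # trailing text
    return "".join(out)

# 美國 and 紐約 are two separate place names, so they get two separate runs.
print(underline_ranges("美國紐約", [(0, 2), (2, 4)]))
# <u>美國</u><u>紐約</u>
```

Each run covers a whole name without a break inside it, and the boundary between names is carried by the markup rather than by any character in the text.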
> The characters U+FE33, U+FE34, U+FE4F (but not U+2584) are compatibility
> characters only for mapping to old Chinese standards that had individual
> characters encoded for these underlining or sidelining text highlights,
> but which required specialized text layout programs to make any use
> of them.
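
The difference between the three compatibility characters and U+2584 can be seen in the character database. The sketch below uses Python's standard unicodedata module; the helper name compat_mapping is hypothetical. It prints each character's name and its decomposition mapping, if any. The exact output depends on the Unicode version bundled with the Python build.

```python
import unicodedata

def compat_mapping(cp):
    """Return the decomposition mapping for a code point, or '' if it has none."""
    return unicodedata.decomposition(chr(cp))

for cp in (0xFE33, 0xFE34, 0xFE4F, 0x2584):
    mapping = compat_mapping(cp) or "(none)"
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {mapping}")
```

A mapping that starts with a tag in angle brackets is a compatibility decomposition; a character with no mapping, like U+2584, is an ordinary symbol in its own right.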