Gap at U+2FE0

Michel Mariani 4mm4adbfrm4 at tonton-pixel.com
Sat Oct 23 12:40:54 CDT 2021


According to the recent document "Preliminary proposal to add a new provisional kIDS property (Unihan)" <https://www.unicode.org/L2/L2021/21118r-kids-preliminary.pdf>, the U+2FE0 block is still being considered for receiving extra IDCs:

> There is currently an unassigned block of 16 code points immediately before the Ideographic Description Characters block, specifically the range U+2FE0 through U+2FEF, which could be used to encode the fifth IDC.
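
For anyone who wants to check the state of that column locally, here is a minimal sketch using Python's standard unicodedata module. The helper name unassigned_code_points is only illustrative, and the result depends on the Unicode version bundled with the Python build in use, so a future release that assigns any of these code points would change the count.

```python
import unicodedata

# Range quoted from the kIDS proposal: the column just before the
# Ideographic Description Characters block (which starts at U+2FF0).
GAP_START, GAP_END = 0x2FE0, 0x2FEF


def unassigned_code_points(start, end):
    """Return the code points in [start, end] whose General_Category is Cn."""
    return [
        cp for cp in range(start, end + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    ]


gap = unassigned_code_points(GAP_START, GAP_END)
total = GAP_END - GAP_START + 1
print(f"Unicode data version: {unicodedata.unidata_version}")
print(f"{len(gap)} of {total} code points in U+2FE0..U+2FEF are unassigned")
# The first code point after the gap is an assigned IDC.
print(f"U+2FF0 is {unicodedata.name(chr(0x2FF0))}")
```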

BTW, the existence of this still-empty "modest" block of sixteen characters would have been a good opportunity to efficiently encode a set of CJK-specific variation selectors that could be used to represent region-specific CJK character glyphs, inspired by the clever (unofficial) scheme proposed in the PanCJKV IVD Collection <https://github.com/adobe-type-tools/pancjkv-ivd-collection>, which IMHO would be far more acceptable if the specific variation selectors were all as short as possible. After all, one of the original aims of Han Unification was the possibility to "pack" all CJK characters in 16 bits... This is of course no longer relevant now that 93,867 Unihan characters have been assigned in Unicode 14.0, with more to come later...
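
To make the "as short as possible" point concrete, here is a rough sketch comparing the encoded size of a code point in the U+2FE0 column with that of U+E0100 (VARIATION SELECTOR-17, the first of the supplementary variation selectors used by IVD sequences). The helper name encoded_lengths is just for illustration; nothing is assigned at U+2FE0 today, so this only measures encoding length.

```python
def encoded_lengths(cp):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one code point."""
    ch = chr(cp)
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-le")) // 2  # 2 bytes per code unit
    return utf8_bytes, utf16_units


candidates = [
    ("U+2FE0 (BMP gap, hypothetical selector)", 0x2FE0),
    ("U+E0100 (VARIATION SELECTOR-17)", 0xE0100),
]

for label, cp in candidates:
    utf8_bytes, utf16_units = encoded_lengths(cp)
    print(f"{label}: {utf8_bytes} UTF-8 bytes, {utf16_units} UTF-16 code unit(s)")
```

A BMP selector would cost one byte less in UTF-8 and one code unit less in UTF-16 per selected ideograph than a supplementary variation selector.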

For the record, here are the eleven CJK glyph sources referenced so far through the various PDF code charts:

[Table attached as an image: Sources Table.png]



> On 23 Oct 2021, at 05:17, stas via Unicode <unicode at corp.unicode.org> wrote:
> 
> It bothers me that there is still empty space at U+2FE0 (see https://www.unicode.org/roadmaps/bmp/).
> I find it weird considering there are already two one-column blocks in the SMP which are extensions of scripts encoded in the BMP:
> Lisu and UCAS (see https://www.unicode.org/roadmaps/smp/). They would be a perfect fit for that spot.
> 
> This message: https://www.unicode.org/mail-arch/unicode-ml/y2007-m12/0035.html
> mentions a proposal for additional Ideographic Description Characters (I guess it is https://www.unicode.org/L2/L2002/02221-cdp-idc.pdf),
> but almost 20 years have passed and it's still not even mentioned on the roadmap, so I guess
> it has been rejected for good.
> 
> This document: https://www.unicode.org/L2/L2021/21016r-script-adhoc-rept.pdf states:
> A strong case would need to be made to place characters on the BMP, and in our view, the single open
> column at U+2FE0..U+2FEF should be used for characters with a valid case for encoding on the
> BMP. The Kanbun Extended block does not, in our opinion, fit this criterium. Ken Lunde also agrees with
> this view.
> 
> What is a stronger case than an extension of a block already encoded in the BMP? It won't get any better than this.
> It looks like this spot has become a psychological golden place and no new proposal will be good enough for it.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Sources Table.png
Type: image/png
Size: 95653 bytes
Desc: not available
URL: <https://corp.unicode.org/pipermail/unicode/attachments/20211023/c4d35a00/attachment-0001.png>

