table 3-6. UTF-8 bit distribution
Giacomo Catenazzi
cate at cateee.net
Fri Sep 12 01:59:46 CDT 2025
On 2025-09-11 22:58, Dominikus Dittes Scherkl via Unicode wrote:
> Am 11.09.25 um 21:21 schrieb yitin--- via Unicode:
>> https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G27288
>>
>>
>> What is the significance of using different letters (x,y,z,u)
>> for different bits?
>
> Significance? None.
> This is simply to enhance readability: it shows which bits of the
> encoding represent which bits of the scalar value.
Wikipedia uses a similar table, but with code points (so it does not show
a "distribution"; OTOH it is more consistent with the next table).
About the significance: check the table above, and you will see that for
UTF-16 we need to name the bit groups (we need to subtract 1 from one
group), so for consistency it is good to have the same notation for UTF-8
as well.
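To make the "subtract 1" point concrete, here is a minimal sketch of the UTF-16 surrogate-pair encoding, written with the same group letters as the table. The function name utf16_surrogates and the variable names are mine, not from the standard; treat it as an illustration, not a reference implementation.

```python
def utf16_surrogates(scalar: int) -> tuple[int, int]:
    """Split a supplementary scalar value into a UTF-16 surrogate pair.

    Hypothetical helper: the names mirror the letters of the bit
    distribution table (uuuuu, wwww, x), nothing more.
    """
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("expected a scalar value in U+10000..U+10FFFF")
    uuuuu = scalar >> 16             # top five bits, value 1..16
    wwww = uuuuu - 1                 # the subtraction: now fits in four bits
    high_x = (scalar >> 10) & 0x3F   # next six bits
    low_x = scalar & 0x3FF           # low ten bits
    high = 0xD800 | (wwww << 6) | high_x   # 110110wwwwxxxxxx
    low = 0xDC00 | low_x                   # 110111xxxxxxxxxx
    return high, low


if __name__ == "__main__":
    high, low = utf16_surrogates(0x1F600)
    print(f"{high:04X} {low:04X}")
```

Without a name for the uuuuu group there would be no short way to write "wwww = uuuuu - 1", which is why the letters are needed there.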
About readability: I'm impressed that UTF-8 is fully described in two
lines and two tables. Considering the complexities
Note: nowhere do we say anything about bit ordering, so the yyy xxx may
help. If we want to be precise, we may put subscripts (0 to 20) on the
bits. With the new format and technologies this may now be easier (in
real print it may be ugly: either too small or too messy).
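As a sketch of what explicit bit numbering would say, the following code spells out where each group of scalar-value bits lands in the UTF-8 bytes. It assumes bit 0 is the least significant bit, so the four-byte form carries bits 0 to 20. The function name utf8_encode and the single-letter variables are only illustrative.

```python
def utf8_encode(scalar: int) -> bytes:
    """Encode one scalar value, naming the bit groups x, y, z, u.

    Hypothetical helper for illustration; bit 0 is taken to be the
    least significant bit of the scalar value.
    """
    if not 0 <= scalar <= 0x10FFFF or 0xD800 <= scalar <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if scalar < 0x80:
        return bytes([scalar])                      # 0xxxxxxx
    x = scalar & 0x3F                               # bits 0..5
    if scalar < 0x800:
        # Here the y group has only five bits (6..10).
        return bytes([0xC0 | (scalar >> 6),         # 110yyyyy
                      0x80 | x])                    # 10xxxxxx
    y = (scalar >> 6) & 0x3F                        # bits 6..11
    z = (scalar >> 12) & 0x0F                       # bits 12..15
    if scalar < 0x10000:
        return bytes([0xE0 | z,                     # 1110zzzz
                      0x80 | y,                     # 10yyyyyy
                      0x80 | x])                    # 10xxxxxx
    u = scalar >> 16                                # bits 16..20
    return bytes([0xF0 | (u >> 2),                  # 11110uuu
                  0x80 | ((u & 0x03) << 4) | z,     # 10uuzzzz
                  0x80 | y,                         # 10yyyyyy
                  0x80 | x])                        # 10xxxxxx


if __name__ == "__main__":
    print(utf8_encode(0x1F600).hex(" "))
```

Unlike UTF-16, no arithmetic is needed here: the bits are only moved, so the letters serve readability and consistency rather than necessity.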
My comment is just about "distribution". Is it really the best term to use?
cate