Table 3-6. UTF-8 bit distribution

Fri Sep 12 01:59:46 CDT 2025

On 2025-09-11 22:58, Dominikus Dittes Scherkl via Unicode wrote:
> On 11.09.25 at 21:21, yitin--- via Unicode wrote:
>> https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G27288 
>>
>>
>> What is the significance of using different letters (x,y,z,u)
>> for different bits?
>
> Significance? None.
> This is simply to enhance readability: it shows which bits of the 
> encoding represent which bits of the scalar value.

Wikipedia uses a similar table, but organized by code point (so it does 
not show the "distribution"; OTOH it is more consistent with the next 
table).

About the significance: check the table above and you will see that for 
UTF-16 we need to name groups of bits (we need to subtract 1 from one 
group), so for consistency it is good to use the same notation for UTF-8 
as well.
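
To illustrate the "subtract 1 in one group" point, here is a minimal Python sketch of the UTF-16 surrogate-pair computation written in terms of bit groups. The function name `utf16_surrogates` and the variable names `u`, `w`, `x` are my own choices for illustration (loosely following the letters in the bit distribution tables); this is a sketch, not text from the standard.

```python
def utf16_surrogates(cp):
    """Split a supplementary code point into a UTF-16 surrogate pair.

    Illustrative only: the scalar value is viewed as the groups
    uuuuu xxxxxxxxxxxxxxxx, and the high surrogate stores wwww = uuuuu - 1.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only supplementary code points need surrogates")
    u = cp >> 16          # top group, 5 bits, value 1..16
    w = u - 1             # the subtraction: now fits in 4 bits (0..15)
    x = cp & 0xFFFF       # low 16 bits
    high = 0xD800 | (w << 6) | (x >> 10)   # 110110 wwww xxxxxx
    low = 0xDC00 | (x & 0x3FF)             # 110111 xxxxxxxxxx
    return high, low


if __name__ == "__main__":
    hi, lo = utf16_surrogates(0x1F600)
    print(f"{hi:04X} {lo:04X}")
```

Without named groups it would be hard to say which bits the subtraction applies to, which is presumably why the tables use letters at all.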

About readability: I'm impressed that UTF-8 is fully described in two 
lines and two tables, considering the complexities.

Note: nowhere do we say anything about bit ordering, so the yyy xxx may 
help. If we want to be precise, we could put subscripts (0 to 20) on the 
bits. That may be easier now with the new format and technologies; in 
real print it may be ugly, either too small or too messy.
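
As a rough sketch of what explicit bit numbering would express, the Python below encodes a scalar value to UTF-8 and also reports which scalar-value bit each payload bit carries, most significant first, with bit 0 as the least significant. The name `utf8_layout` and the `b<k>` labels are hypothetical, chosen only for this illustration.

```python
def utf8_layout(cp):
    """Return (utf8_bytes, labels) for a Unicode scalar value.

    labels[i] names the scalar-value bits carried by byte i,
    most significant first (b0 is the least significant bit).
    """
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:
        n, lead_prefix, lead_bits = 1, 0x00, 7   # 0xxxxxxx
    elif cp < 0x800:
        n, lead_prefix, lead_bits = 2, 0xC0, 5   # 110yyyyy
    elif cp < 0x10000:
        n, lead_prefix, lead_bits = 3, 0xE0, 4   # 1110zzzz
    else:
        n, lead_prefix, lead_bits = 4, 0xF0, 3   # 11110uuu
    top = lead_bits + 6 * (n - 1)   # number of payload bits in total
    out, labels = [], []
    for i in range(n):
        width = lead_bits if i == 0 else 6
        prefix = lead_prefix if i == 0 else 0x80  # trail bytes: 10xxxxxx
        shift = top - width
        out.append(prefix | ((cp >> shift) & ((1 << width) - 1)))
        labels.append(" ".join(f"b{k}" for k in range(top - 1, shift - 1, -1)))
        top = shift
    return bytes(out), labels


if __name__ == "__main__":
    encoded, labels = utf8_layout(0x20AC)
    print(encoded.hex(" "), labels)
```

The x bits are the low-order bits and always land in the last byte; a four-byte sequence carries 21 payload bits in total.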

My comment is just about "distribution". Is it really the best term to use?

cate