table 3-6. UTF-8 bit distribution
Asmus Freytag
asmusf at ix.netcom.com
Thu Sep 11 16:48:47 CDT 2025
On 9/11/2025 1:49 PM, Jim DeLaHunt via Unicode wrote:
> On 2025-09-11 12:21, yitin--- via Unicode wrote:
>
>> https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G27288
>>
>>
>> What is the significance of using different letters (x,y,z,u)
>> for different bits? I don't see any consistent pattern in
>> the naming. https://www.rfc-editor.org/rfc/rfc3629 just
>> uses x for all of them.
>
> What I like about Table 3-6's notation is that it shows how the bits
> in the various code units (x,y,z,u) correspond to the bits in the
> scalar value. See, for example, the final scalar value:
>
>> 000uuuuu zzzzyyyy yyxxxxxx
> The right-hand part of that row shows that the 'u' bits are encoded in
> the first and second bytes, the 'z' bits are encoded in the second
> byte, the 'y' bits are encoded in the third byte, the 'x' bits are
> encoded in the fourth byte.
>
> The table in section 3 of RFC3629 just shows ranges of scalar values,
> not the bit patterns within the scalar values. Thus it does not
> illustrate as much as the Core Spec illustrates.
>
>
I agree with Jim's discussion of how this adds readability.
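
To make the correspondence concrete, here is a minimal sketch in Python of
the last row of Table 3-6 (scalar value 000uuuuu zzzzyyyy yyxxxxxx). It
covers only the four-byte case. The function name encode_4byte is
hypothetical, and the variable names u, z, y, x are chosen only to mirror
the table's letters; none of them come from the standard.

```python
def encode_4byte(scalar: int) -> bytes:
    """Encode a scalar value in U+10000..U+10FFFF per the last row of Table 3-6.

    Illustrative sketch only: the shorter one-, two- and three-byte rows
    are deliberately not handled.
    """
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("only the four-byte row is sketched here")

    # Split the scalar value 000uuuuu zzzzyyyy yyxxxxxx into its bit groups.
    u = (scalar >> 16) & 0x1F  # uuuuu  (5 bits)
    z = (scalar >> 12) & 0x0F  # zzzz   (4 bits)
    y = (scalar >> 6) & 0x3F   # yyyyyy (6 bits)
    x = scalar & 0x3F          # xxxxxx (6 bits)

    return bytes([
        0xF0 | (u >> 2),               # 11110uuu: top three 'u' bits
        0x80 | ((u & 0x03) << 4) | z,  # 10uuzzzz: low two 'u' bits, then 'z'
        0x80 | y,                      # 10yyyyyy
        0x80 | x,                      # 10xxxxxx
    ])


if __name__ == "__main__":
    # Compare against Python's built-in encoder for one supplementary character.
    cp = 0x1F600
    print(encode_4byte(cp).hex(" "), chr(cp).encode("utf-8").hex(" "))
```

The 'u' bits are the only group split across two bytes, which is the
detail a table that uses x for every bit cannot show.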
A./