get the sourcecode [of UTF-8]

Jim DeLaHunt list+unicode at jdlh.com
Sun Nov 3 16:42:02 CST 2024


Hello, anonymous person:

On 2024-11-02 17:42, A bughunter via Unicode wrote:
>
> Where to get the sourcecode of relevent (version) UTF-8?: in order to 
> checksum text against the specific encoding map (codepage).
>
> from A_bughunter at proton.me

I'm afraid I don't really understand what you are asking here.

UTF-8 is a data format, a way of representing 21-bit Unicode scalar 
values in 1, 2, 3, or 4 bytes (octets). It is defined in section 
2.5.3, "UTF-8" 
<https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-2/#G11165>, 
of the Core Specification of the Unicode Standard. It has not changed 
over time, so it doesn't really have versions.
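
To make the "1, 2, 3, or 4 bytes" point concrete, here is a minimal 
sketch in Python of how one scalar value maps to UTF-8 bytes. The 
function name encode_utf8 is my own invention for illustration, not 
part of any standard library, and this is only one possible way to 
write it, not a reference implementation.

```python
def encode_utf8(scalar: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 (illustrative sketch)."""
    # Scalar values are 0..0x10FFFF, excluding the surrogate range.
    if not 0 <= scalar <= 0x10FFFF or 0xD800 <= scalar <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {scalar:#x}")
    if scalar < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([scalar])
    if scalar < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (scalar >> 6),
                      0x80 | (scalar & 0x3F)])
    if scalar < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (scalar >> 12),
                      0x80 | ((scalar >> 6) & 0x3F),
                      0x80 | (scalar & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (scalar >> 18),
                  0x80 | ((scalar >> 12) & 0x3F),
                  0x80 | ((scalar >> 6) & 0x3F),
                  0x80 | (scalar & 0x3F)])


# Compare against Python's built-in codec, one of the many
# independent implementations of the same format.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    assert encode_utf8(ord(ch)) == ch.encode("utf-8")
```

The same input always yields the same bytes, whichever conforming 
implementation produces them.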

If by "source code" you refer to an implementation of the UTF-8 format, 
then there is no single answer. There are multiple implementations of UTF-8, 
and so multiple independent bodies of "source code".

And there are many things which could be called a "specific encoding map 
(codepage)". I don't know which of those you are referring to.

Does that answer your question?

-- 
.   --Jim DeLaHunt, jdlh at jdlh.com     http://blog.jdlh.com/ 
(http://jdlh.com/)
       multilingual websites consultant, Vancouver, B.C., Canada


