FWD: Re: get the sourcecode [of UTF-8]

A bughunter A_bughunter at proton.me
Tue Nov 5 03:41:01 CST 2024


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

My reply to Oren below. 

from A_bughunter at proton.me


On Monday, November 4th, 2024 at 23:24, orenwatson at tutanota.com <orenwatson at tutanota.com> wrote:

Hi Oren, where did you get this source code?
> I don't know precisely what you're looking for, but here is some simple C source code that converts a Unicode code point to a string in UTF-8.
> 
> > 
> > const char *u8ch_tostr(unsigned ch){
> >         static unsigned char buf[5]; // every utf-8 character is <=4 bytes
> >         buf[4]=0;
> >         if(ch<0200){ // ascii is a subset of utf-8
> >                 buf[3]=ch;
> >                 return buf+3;
> >         }
> >         if(ch<04000){ // characters up to 0x7ff are 2 bytes
> >                 buf[3]=(ch&077) + 0200;
> >                 buf[2]=(ch/0100) + 0300;
> >                 return buf+2;
> >         }
> >         if(ch<0200000){ // characters up to 0xFFFF are 3 bytes
> >                 buf[3]=(ch&077) + 0200;
> >                 buf[2]=(ch/0100&077) + 0200;
> >                 buf[1]=(ch/010000) + 0340;
> >                 return buf+1;
> >         }
> >         buf[3]=(ch&077) + 0200; // all other characters are 4 bytes
> >         buf[2]=(ch/0100&077) + 0200;
> >         buf[1]=(ch/010000&077) + 0200;
> >         buf[0]=(ch/01000000) + 0360;
> >         return buf;
> > }
> > ---
> > Oren Watson (he/him)
> > orenwatson at tutanota.com
> 
> 
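
For comparison, here is a minimal sketch of the same bit layout written with hex masks and shifts instead of octal division. It is not Oren's code: the name u8_encode, the caller-supplied buffer, and the return-0-on-error convention are all assumptions made for illustration. Unlike the quoted function, it rejects surrogates (U+D800..U+DFFF) and values above U+10FFFF, which have no well-formed UTF-8 encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the quoted post): encode one code point
 * into out[0..3]. Returns the number of bytes written (1 to 4), or 0 if
 * cp is a surrogate or lies above U+10FFFF. No terminating NUL is written. */
size_t u8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                    /* U+0000..U+007F: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* U+0080..U+07FF: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* U+0800..U+FFFF: 1110xxxx + 2 trail bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogates are not scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* U+10000..U+10FFFF: 11110xxx + 3 trail bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* beyond the Unicode code space */
}
```

Called as u8_encode(0x20AC, buf), this should fill buf with the three bytes E2 82 AC and return 3. Writing into a caller-supplied buffer avoids the shared static buffer in the quoted version, so results from two calls can be held at the same time.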
-----BEGIN PGP SIGNATURE-----
Version: ProtonMail

wnUEARYKACcFgmcp6CkJkKkWZTlQrvKZFiEEZlQIBcAycZ2lO9z2qRZlOVCu
8pkAABN7AP9rJ8UQcdRkh9uAGaLHL8cQTZ5VCGGSit6NQrkn1+Fj8wEAtqih
Z87XPWQmlkUErjqalF7UO2GkjtxwkRaTwl19IAA=
=iIr1
-----END PGP SIGNATURE-----