FWD: Re: get the sourcecode [of UTF-8]
A bughunter
A_bughunter at proton.me
Tue Nov 5 03:41:01 CST 2024
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
My reply to Oren below.
from A_bughunter at proton.me
On Monday, November 4th, 2024 at 23:24, orenwatson at tutanota.com <orenwatson at tutanota.com> wrote:
Hi Oren, where did you come up with this sourcecode?
> I don't know precisely what you're looking for, but here is some simple C source code that converts a Unicode code point to a string in UTF-8.
>
> >
> > const char *u8ch_tostr(unsigned ch){
> >     // static buffer: the returned string is overwritten by the next call
> >     static unsigned char buf[5]; // every utf-8 character is <=4 bytes, plus a terminating 0
> >     buf[4]=0;
> >     if(ch<0200){ // ascii (below 0x80) is a subset of utf-8
> >         buf[3]=ch;
> >         return (const char *)(buf+3);
> >     }
> >     if(ch<04000){ // characters up to 0x7ff are 2 bytes
> >         buf[3]=(ch&077) + 0200;
> >         buf[2]=(ch/0100) + 0300;
> >         return (const char *)(buf+2);
> >     }
> >     if(ch<0200000){ // characters up to 0xffff are 3 bytes
> >         buf[3]=(ch&077) + 0200;
> >         buf[2]=(ch/0100&077) + 0200;
> >         buf[1]=(ch/010000) + 0340;
> >         return (const char *)(buf+1);
> >     }
> >     buf[3]=(ch&077) + 0200; // all other characters (up to 0x10ffff) are 4 bytes
> >     buf[2]=(ch/0100&077) + 0200;
> >     buf[1]=(ch/010000&077) + 0200;
> >     buf[0]=(ch/01000000) + 0360;
> >     return (const char *)buf;
> > }
> > ---
> > Oren Watson (he/him)
> > orenwatson at tutanota.com
>
>
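For comparison, here is a minimal sketch of the same conversion written with a caller-supplied buffer instead of a static one, so that two results can be held at once. The name u8_encode and its signature are hypothetical and do not come from Oren's message. Unlike the quoted routine, this sketch also refuses surrogates (U+D800..U+DFFF) and values above U+10FFFF, which well-formed UTF-8 does not encode.

```c
#include <stddef.h>

/* Hypothetical helper (not from the quoted message): encodes code point cp
   as UTF-8 into out, which must have room for 4 bytes. Returns the number
   of bytes written, or 0 if cp is a surrogate or above U+10FFFF. No
   terminating 0 is written. */
size_t u8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {                          /* U+0000..U+007F: 1 byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                         /* U+0080..U+07FF: 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                       /* U+0800..U+FFFF: 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                         /* surrogates are rejected */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                     /* U+10000..U+10FFFF: 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                                 /* outside the Unicode range */
}
```

Called with 0x20AC (the euro sign), this writes the three bytes E2 82 AC and returns 3. Shifts and masks are used here in place of the octal divisions in the quoted code; the arithmetic is the same.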
-----BEGIN PGP SIGNATURE-----
Version: ProtonMail
wnUEARYKACcFgmcp6CkJkKkWZTlQrvKZFiEEZlQIBcAycZ2lO9z2qRZlOVCu
8pkAABN7AP9rJ8UQcdRkh9uAGaLHL8cQTZ5VCGGSit6NQrkn1+Fj8wEAtqih
Z87XPWQmlkUErjqalF7UO2GkjtxwkRaTwl19IAA=
=iIr1
-----END PGP SIGNATURE-----