get the sourcecode [of UTF-8]
A bughunter
A_bughunter at proton.me
Fri Nov 8 02:36:03 CST 2024
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
My reply to Steven interspersed.
Please do not reply further unless you intend to answer my originating question, which is concise yet complete, simple, one line, relevant, and on topic.
Where can I get the source code of the relevant version of UTF-8, in order to checksum text against the specific encoding map (codepage)?
from A_bughunter at proton.me
On Friday, November 8th, 2024 at 06:20, Steven R. Loomis <srl295 at gmail.com> wrote:
>
> On Thu, Nov 7, 2024 at 10:35 PM A bughunter via Unicode <unicode at corp.unicode.org> wrote:
>
> > @Julian says "There are no codepages in Unicode. (Or I suppose there is exactly
> > one.)" ( https://corp.unicode.org/pipermail/unicode/2024-November/011116.html )
> >
> > does conflict with
> >
> > @Otto says "UTF-8 is one method (of a handful of standardized methods) to represent Unicode text at the bit level in order to conveniently transfer, or store, it." ( https://corp.unicode.org/pipermail/unicode/2024-November/011132.html )
>
>
> There's no conflict here. UTF-8 is a form of Unicode, it's not a codepage.
>
> > @Jim says he is putting up a post like a seeing eye dog by pasting from my GitHub ( https://www.github.com/freedom-foundation ) "I summarised what I understand of your project as a courtesy to my fellow unicode-list subscribers. "
>
>
> It's helpful to understand what the purpose of your project is. You could checksum across UTF-8 code units, or UTF-32 code units, or UTF-16 code units, or 21-bit scalar values. It's up to you.
>
> > I chose this example because it goes to show that the Unicode Consortium is disappointed. You will probably read in there somewhere that it intended to solve the problem of many codepages; however, you see that it has become something more complicated than the problem it was meant to simplify.
Steven, no: you have listed four codepages here, all of which are incompatible with one another; there is no single codepage as Julian claimed.
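To make the four options concrete, here is a minimal sketch of checksumming the same text over UTF-8, UTF-16, and UTF-32 code units and over scalar values. The function name `checksums`, the choice of SHA-256, the big-endian byte order, and the 4-byte serialization of scalar values are all assumptions made for illustration; nothing in this thread prescribes them.

```python
import hashlib


def checksums(text):
    """Return SHA-256 hex digests of `text` serialized four different ways.

    Hypothetical helper for illustration only.
    """
    # Scalar values have no byte form of their own. As an assumption for
    # this sketch, each one is written as a 4-byte big-endian integer,
    # which yields the same bytes as UTF-32BE.
    scalars = b"".join(ord(ch).to_bytes(4, "big") for ch in text)
    forms = {
        "utf-8": text.encode("utf-8"),
        "utf-16be": text.encode("utf-16-be"),
        "utf-32be": text.encode("utf-32-be"),
        "scalars": scalars,
    }
    return {name: hashlib.sha256(data).hexdigest() for name, data in forms.items()}


if __name__ == "__main__":
    for name, digest in checksums("h\u00e9llo \u20ac").items():
        print(name, digest)
```

The same text gives a different digest under each encoding form, so a checksum is only comparable when both sides state which form (and byte order) it was computed over.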
> History does not bear out your claim. In fact, Unicode has solved exactly what it set out to accomplish. The problem of many codepages now exists mostly due to older implementations and data, plus (very rarely) due to new implementations that choose a difficult path.
>
> -s
-----BEGIN PGP SIGNATURE-----
Version: ProtonMail
wnUEARYKACcFgmctzXAJkKkWZTlQrvKZFiEEZlQIBcAycZ2lO9z2qRZlOVCu
8pkAAE8GAQDew0BFa3W22qr7iDGyqduXp6FqeQbff6WOl74t4v91bQD9EoXy
1E6kMS6E6Ouq7uy82sFm1EGgw0pO/BuRXcR0VgA=
=XJs0
-----END PGP SIGNATURE-----