get the sourcecode [of UTF-8]
Arthur Rosendahl
arthur at reutenauer.eu
Tue Nov 5 08:09:09 CST 2024
On Tue, Nov 05, 2024 at 01:18:59PM +0000, A bughunter via Unicode wrote:
> On Tuesday, November 5th, 2024 at 08:59, Arthur Rosendahl via Unicode <unicode at corp.unicode.org> wrote:
>> I think that’s what the OP means. He has a UTF-8-encoded string
>> which he wants to map to a sequence of code points. That’s my guess
>> anyway.
>
> This is pretty much the reverse of what I have asked for: reverse engineering an UTF-8 string in order to re-create the sourcecode I have asked for. You shouldn't have to reverse engineer.
Do you then mean that you have a sequence of Unicode code points that
you want to convert to UTF-8? In that case, the C code by Oren is what
you need (for a single code point).
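
For illustration, here is a minimal sketch of what such a conversion
could look like for a single code point. It is not Oren's code, which
is not reproduced in this message; the function name utf8_encode and
its return convention are assumptions made for this example. It follows
the byte layout the standard defines for UTF-8 (1 to 4 bytes) and
rejects surrogates and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode scalar value as UTF-8.
 * Writes up to 4 bytes into out and returns the number of bytes
 * written, or 0 if cp is a surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp <= 0x7F) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {     /* surrogates are not scalar values */
        return 0;
    }
    if (cp <= 0xFFFF) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* beyond the Unicode code space */
}
```

With this sketch, U+20AC (the euro sign) comes out as the three bytes
E2 82 AC. A sequence of code points would be handled by calling the
function once per code point and appending the bytes.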
> If you do not have sourcecode for UTF-8 then it is more than likely the standard does sit on the sidelines disconnected from whatever is being called Unicode and UTF-8 actually software in use. You shouldn't have to reverse engineer the software to contrast it against the Unicode standard it purports to have been.
It sounds like you’re confusing a standard with its implementation.
> Originating Question
>
> Where to get the sourcecode of relevent (version) UTF-8?: in order to checksum text against the specific encoding map (codepage).
If you’re going to keep repeating that, you should be aware that it is
incomprehensible, even without the typo and the weird wording.
Arthur