get the sourcecode [of UTF-8]
Alexis
flexibeast at gmail.com
Tue Nov 5 07:23:14 CST 2024
A bughunter via Unicode <unicode at corp.unicode.org> writes:
> Generally we put the standard into a computer language such as C.
> Therefore the Unicode V.16 standard of UTF-8 should also be the
> sourcecode of the implementation these converge making them
> synonymous at the convergence.
UTF-8 is an _encoding_ of Unicode, a specification of how to
represent Unicode at the bit level. An _encoding_ is something
different from _source code_. Source code is programming language
text that gets translated - interpreted or compiled - into machine
language. UTF-8 is not a programming language. It's a way of
saying "This Unicode code point is encoded in UTF-8 with the
following bit pattern." If you'd like an introduction to how
Unicode code points - like code point 65 for 'A' - are encoded by
UTF-8, you might find this section of the relevant Wikipedia page
helpful:
https://en.wikipedia.org/wiki/UTF-8#Description
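To make "encoded with the following bit pattern" concrete, here is a minimal sketch in Python of the mapping from a code point to UTF-8 bytes. It is only an illustration: the names utf8_encode and bit_pattern are made up for this example and are not part of any standard or library, and the euro sign (code point 0x20AC) is just an arbitrary second test value.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 bytes (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        # Out-of-range values and surrogates have no UTF-8 form.
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def bit_pattern(data: bytes) -> str:
    """Show each byte as eight binary digits (hypothetical helper)."""
    return " ".join(f"{byte:08b}" for byte in data)


print(bit_pattern(utf8_encode(65)))      # 'A'  -> 01000001
print(bit_pattern(utf8_encode(0x20AC)))  # euro -> 11100010 10000010 10101100
```

Note that this is one possible program that follows the encoding rules. The rules themselves are the bit layouts in the comments, not the Python text around them.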
There is no piece of software that's the 'reference
implementation' of UTF-8, because UTF-8 is not a specification
for, say, a software library providing certain functionality.
Again, UTF-8 is a scheme for representing Unicode code points at
the bit level. Programming languages and their libraries provide
functionality for converting to and from UTF-8.
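As one example of such functionality, Python's built-in str.encode and bytes.decode methods convert between text and UTF-8 bytes. The sample string here is an arbitrary choice for illustration:

```python
text = "A\u20ac"  # 'A' followed by the euro sign, code point 0x20AC

# Text (a sequence of code points) to UTF-8 bytes.
encoded = text.encode("utf-8")

# UTF-8 bytes back to text.
decoded = encoded.decode("utf-8")

print(encoded)          # b'A\xe2\x82\xac'
print(decoded == text)  # True
```

Other languages offer their own equivalents. Each is an independent implementation of the same encoding rules.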
It's _Unicode_ that has versions; UTF-8 basically does not.
Alexis.