Interesting UTF-8 decoder
Mark Davis ☕️ via Unicode
unicode at unicode.org
Mon Oct 9 06:16:03 CDT 2017
The blog post points out that the input buffer needs to be padded with 3 null
bytes as a precondition.
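
To illustrate the precondition, here is a minimal sketch of one way a caller
might satisfy it. It assumes the decoder always reads 4 bytes starting at the
current position, so a 1-byte character at the very end of the data causes a
read of up to 3 bytes past it. The helper name utf8_pad_copy is hypothetical
and is not taken from the blog post.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the blog post): copy `len` bytes of UTF-8
 * into a freshly allocated buffer followed by 3 zero bytes, so that a
 * decoder which unconditionally reads 4 bytes at a time stays inside the
 * allocation. Returns NULL on allocation failure or size overflow.
 * The caller frees the result with free(). */
static unsigned char *utf8_pad_copy(const unsigned char *src, size_t len)
{
    unsigned char *buf;

    assert(src != NULL || len == 0);

    if (len > (size_t)-1 - 3)   /* len + 3 would overflow size_t */
        return NULL;

    buf = malloc(len + 3);
    if (buf == NULL)
        return NULL;

    if (len > 0)
        memcpy(buf, src, len);
    memset(buf + len, 0, 3);    /* the 3 bytes of zero padding */
    return buf;
}
```

Copying is the simplest option but costs an allocation per string. Code that
controls its own allocations could instead reserve the 3 extra bytes when the
string is first created.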
Mark <https://twitter.com/mark_e_davis>
On Mon, Oct 9, 2017 at 10:57 AM, J Decker via Unicode <unicode at unicode.org>
wrote:
> that's interesting; however it will segfault if the string ends on a
> memory allocation boundary. will have to make sure strings are always
> allocated with 3 extra bytes.
>
> 2017-10-09 1:37 GMT-07:00 Martin J. Dürst via Unicode <unicode at unicode.org>:
>
>> A friend of mine sent me a pointer to
>> http://nullprogram.com/blog/2017/10/06/, a branchless UTF-8 decoder.
>>
>> Regards, Martin.
>>
>
>