Why Work at Encoding Level?

Mark Davis ☕️ mark at macchiato.com
Tue Oct 20 22:37:54 CDT 2015

> A good Unicode string in a programming language

Yes, that would be great, no question. It isn't, however, the case in most
programming languages (measured by the amount of software written in them).
The original question that started these threads was how to handle isolated
surrogates. If you are lucky enough to be only ever using programming
languages that prevent that from ever happening, then the question is moot
for you. If you're not, the question is relevant.
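
To make the non-throwing option concrete, here is a minimal sketch, assuming
the input is a plain sequence of 16-bit code units and that substituting
U+FFFD is an acceptable policy for the application. The function name
scalar_values is hypothetical and used only for illustration; it is not
taken from any library.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def scalar_values(units):
    """Map UTF-16 code units to Unicode scalar values.

    Well-formed surrogate pairs are combined into one supplementary
    code point. Each isolated surrogate is replaced by U+FFFD instead
    of raising an exception (an assumed policy, one of several possible).
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        is_lead = 0xD800 <= u <= 0xDBFF
        if is_lead and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Lead followed by trail: a valid pair.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Lead without trail, or trail without lead: isolated surrogate.
            out.append(REPLACEMENT)
            i += 1
        else:
            # Ordinary BMP code unit, already a scalar value.
            out.append(u)
            i += 1
    return out
```

Under this policy, the input [0x41, 0xD83D, 0x42] yields
[0x41, 0xFFFD, 0x42], so the caller keeps running and the damage stays
visible in the output rather than taking the app down.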


On Tue, Oct 20, 2015 at 6:47 PM, Daniel Bünzli <daniel.buenzli at erratique.ch>
wrote:

> On Wednesday, 21 October 2015 at 02:23, Mark Davis ☕️ wrote:
> > But more fundamentally, there may not be "excuses" for such software,
> > but it happens anyway. Pretending it doesn't makes for unhappy customers.
> > For example, you don't want to be throwing an exception when one is
> > encountered, when that could cause an app to fail.
>
> It does happen at the input layer but it doesn't make any sense to bother
> the programmers with this once the IO boundary has been crossed and
> decoding errors handled.  A good Unicode string in a programming language
> should at least operate at the scalar value level and these notions of
> Unicode n-bit strings should definitively be killed (maybe it would have
> inspired hopeless designers of recent programming languages to actually
> make better choices on that topic).
> Best,
> Daniel