"A Programmer's Introduction to Unicode"
khaledhosny at eglug.org
Mon Mar 13 16:10:11 CDT 2017
On Mon, Mar 13, 2017 at 07:18:00PM +0000, Alastair Houghton wrote:
> On 13 Mar 2017, at 17:55, J Decker <d3ck0r at gmail.com> wrote:
> > I liked the Go implementation of a character type, the rune type, which is a code point, and strings that return runes by index.
> > https://blog.golang.org/strings
> IMO, returning code points by index is a mistake. It over-emphasises
> the importance of the code point, which helps to continue the notion
> in some developers’ minds that code points are somehow “characters”.
> It also leads to people unnecessarily using UCS-4 as an internal
> representation, which seems to have very few advantages in practice
> over UTF-16.
But there are many text operations that require access to Unicode code
points. Take text layout, for example: mapping characters to glyphs and
back has to operate on code points. The idea that you never need to work
with code points is too simplistic.
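
To make the layout point concrete, here is a minimal sketch in Go (the
language mentioned earlier in the thread). It assumes a hypothetical
character-to-glyph table named cmap with made-up glyph IDs; a real font's
table and the shaping steps that follow it are far more involved. The only
thing it is meant to show is that the lookup is keyed by code point, not by
byte or UTF-16 code unit.

```go
package main

import "fmt"

// cmap is a hypothetical stand-in for a font's character-to-glyph table.
// The glyph IDs below are invented for illustration only.
var cmap = map[rune]uint16{
	'a':      68,
	'\u00e9': 201, // LATIN SMALL LETTER E WITH ACUTE
	'\u0628': 530, // ARABIC LETTER BEH
}

// glyphsFor maps each code point of s to a glyph ID. Code points missing
// from the table map to 0, the conventional "missing glyph" ID.
func glyphsFor(s string) []uint16 {
	var glyphs []uint16
	// Ranging over a string decodes its UTF-8 bytes into code points (runes).
	for _, r := range s {
		glyphs = append(glyphs, cmap[r]) // a missing key yields the zero value
	}
	return glyphs
}

func main() {
	s := "a\u00e9\u0628"
	fmt.Println(len(s))       // length in bytes, not code points
	fmt.Println(glyphsFor(s)) // one glyph ID per code point
}
```

Note that the string is stored as UTF-8 and decoded on the fly, so needing
code points for the lookup does not by itself force UCS-4 as the internal
representation.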