"A Programmer's Introduction to Unicode"

Richard Wordingham richard.wordingham at ntlworld.com
Mon Mar 13 16:47:04 CDT 2017


On Mon, 13 Mar 2017 23:10:11 +0200
Khaled Hosny <khaledhosny at eglug.org> wrote:
 
> But there are many text operations that require access to Unicode code
> points. Take for example text layout, as mapping characters to glyphs
> and back has to operate on code points. The idea that you never need
> to work with code points is too simplistic.

There are advantages to interpreting and operating on text as though it
were in Normalization Form D (NFD).  However, there are still cases where
one needs to address a fraction of a character, such as word boundaries
in Sanskrit, though I think such locations are liable to be specified in
a language-specific form.  U+093E DEVANAGARI VOWEL SIGN AA can contain a
word boundary in at least four ways.
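
As a rough illustration of the NFD point, here is a minimal sketch using
Python's standard unicodedata module.  The helper names code_points and
nfd_code_points are invented for this example.  It shows that NFD splits
precomposed characters into their parts, but that U+093E has no canonical
decomposition, so normalization alone does not expose anything smaller
than the code point.

```python
import unicodedata


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def nfd_code_points(text):
    """Normalize text to NFD, then list its code points (hypothetical helper)."""
    return code_points(unicodedata.normalize("NFD", text))


# U+00E9 LATIN SMALL LETTER E WITH ACUTE splits into base letter + accent.
print(nfd_code_points("\u00E9"))        # ['U+0065', 'U+0301']

# U+0958 DEVANAGARI LETTER QA splits into KA + NUKTA.
print(nfd_code_points("\u0958"))        # ['U+0915', 'U+093C']

# KA + VOWEL SIGN AA is unchanged: U+093E has no canonical decomposition.
print(nfd_code_points("\u0915\u093E"))  # ['U+0915', 'U+093E']
print(repr(unicodedata.decomposition("\u093E")))  # ''
```

So a boundary that falls inside U+093E cannot be expressed as an offset
between code points, even in NFD; it would need some other, probably
language-specific, representation.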

Richard.

