"A Programmer's Introduction to Unicode"

Manish Goregaokar manish at mozilla.com
Mon Mar 13 17:26:00 CDT 2017


Do you have examples of AA being split by a word boundary that way, or
pointers to further reading? I think I know what you're referring to,
but would love to read more about it.
-Manish


On Mon, Mar 13, 2017 at 2:47 PM, Richard Wordingham
<richard.wordingham at ntlworld.com> wrote:
> On Mon, 13 Mar 2017 23:10:11 +0200
> Khaled Hosny <khaledhosny at eglug.org> wrote:
>
>> But there are many text operations that require access to Unicode code
>> points. Take for example text layout, as mapping characters to glyphs
>> and back has to operate on code points. The idea that you never need
>> to work with code points is too simplistic.
>
> There are advantages to interpreting and operating on text as though it
> were in form NFD.  However, there are still cases where one needs
> fractions of a character, such as word boundaries in Sanskrit, though I
> think the locations are liable to be specified in a language-specific
> form.  U+093E DEVANAGARI VOWEL SIGN AA can have a word boundary in it
> in at least 4 ways.
>
> Richard.
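
For anyone following along, here is a minimal Python sketch of the two
points quoted above: working at the code point level, and the fact that
NFD gives no finer unit than U+093E itself. The word नास्ति (na + asti,
where the two short "a" vowels merge into AA by sandhi) is my own
illustrative choice and an assumption about the kind of case meant; it
is not an example taken from the thread. The helper name `describe` is
likewise hypothetical.

```python
import unicodedata


def describe(text):
    """Return (code point, name, general category) for each code point of NFD(text)."""
    nfd = unicodedata.normalize("NFD", text)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in nfd
    ]


# नास्ति written with escapes so the code points are explicit:
# NA, VOWEL SIGN AA, SA, VIRAMA, TA, VOWEL SIGN I
word = "\u0928\u093E\u0938\u094D\u0924\u093F"

rows = describe(word)
for row in rows:
    print(*row)

# U+093E has no canonical decomposition, so NFD leaves it as one code point.
# A boundary that falls "inside" the AA cannot be expressed as a code point index.
aa_is_atomic = unicodedata.normalize("NFD", "\u093E") == "\u093E"
print("AA unchanged by NFD:", aa_is_atomic)
```

If the sandhi reading is right, any offset into `word` lands either
before or after U+093E, never between the two vowels it stands for,
which is why such a boundary would need some language-specific
representation.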

