Unicode of Death 2.0
Manish Goregaokar via Unicode
unicode at unicode.org
Sat Feb 17 14:54:57 CST 2018
Heh, I wasn't aware of the word "phala-form", though that seems
Interesting observation about the vowel glyphs, I'll mention this in the
post. Initially I missed this because I hadn't realized that the Bengali o
vowel crashed (which made me discount this).
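
The pattern from the earlier message quoted further down (<consonant,
virama, consonant, zwnj, vowel> with the listed exceptions) can be sketched
as a small checker. This is only an illustration under assumptions: the
function name, the code point ranges, and the choice to treat "vowel" as a
dependent vowel sign are not from the thread, and the ra/ro rule is read as
applying to Devanagari and Bengali only. It describes the shape of the
strings, not what the renderer does.

```python
ZWNJ = 0x200C

# Per-script tables. The ranges are approximations picked for illustration
# (assumption, not from the thread); they include a few unassigned code points.
SCRIPTS = {
    "Devanagari": {
        "consonants": range(0x0915, 0x093A),
        "virama": 0x094D,
        "vowel_signs": range(0x093E, 0x094D),
        "joining": {0x0930},               # ra
        "excluded_vowels": set(),
        "ra_like": {0x0930},
    },
    "Bengali": {
        "consonants": range(0x0995, 0x09BA),
        "virama": 0x09CD,
        "vowel_signs": range(0x09BE, 0x09CD),
        "joining": {0x09AF, 0x09B0},       # ya ("jo") and ra ("ro")
        "excluded_vowels": {0x09CB, 0x09CC},  # o and au
        "ra_like": {0x09B0, 0x09F0},       # ra and ra with a line through it
    },
    "Telugu": {
        "consonants": range(0x0C15, 0x0C3A),
        "virama": 0x0C4D,
        "vowel_signs": range(0x0C3E, 0x0C4D),
        "joining": range(0x0C15, 0x0C3A),  # all Telugu consonants
        "excluded_vowels": {0x0C48},       # ai
        "ra_like": set(),                  # assumption: ra/ro rule not applied
    },
}


def matches_reported_pattern(text: str) -> bool:
    """Return True if any 5-code-point window fits the quoted description."""
    cps = [ord(ch) for ch in text]
    for i in range(len(cps) - 4):
        c1, virama, c2, joiner, vowel = cps[i:i + 5]
        for t in SCRIPTS.values():
            first_ok = c1 in t["consonants"] or c1 in t["ra_like"]
            if not (first_ok and virama == t["virama"] and joiner == ZWNJ):
                continue
            if c2 not in t["joining"] or vowel not in t["vowel_signs"]:
                continue
            if vowel in t["excluded_vowels"]:
                continue
            # If the second consonant is ra/ro, the first must not be ra/ro.
            if c2 in t["ra_like"] and c1 in t["ra_like"]:
                continue
            return True
    return False
```

For example, matches_reported_pattern("\u0c1c\u0c4d\u0c1e\u200c\u0c3e")
(Telugu ja, virama, nya, ZWNJ, vowel sign aa) fits the pattern, while the
same string ending in the Telugu ai sign (U+0C48) does not.
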
On Sat, Feb 17, 2018 at 12:22 PM, Philippe Verdy <verdy_p at wanadoo.fr> wrote:
> I would have liked that your invented term of "left-joining consonants"
> took the usual name "phala forms" (to represent RA or JA/JO after a virama,
> generally named "raphala" or "japhala/jophala").
> And the reason this bug does not occur with some vowels is that these are
> vowels in two parts, which are first decomposed into two separate glyphs
> reordered in the buffer of glyphs, while other vowels do not need this
> prior mapping and keep their initial direct mapping from their codepoints
> in fonts. This means it has to do with the way the ZWNJ handling looks for
> the glyphs of the vowels in the glyph buffer and not in the initial
> codepoint buffer: there's some desynchronization, and more probably an
> uninitialized data field (for the lookup made in handling ZWNJ) if no vowel
> decomposition was done (the same data field is correctly initialized when
> it is the first consonant which takes an alternate form before a virama,
> like in most Indic consonant clusters, because a glyph buffer is created).
> Now we have some hints about why the bug does not occur in Kannada or
> Khmer: a glyph buffer is always created, but there was some shortcut made
> in Devanagari, Bengali, and Telugu to allow processing clusters faster
> without always having to create a glyph buffer (to allow reordering glyphs
> before positioning them), and working directly on the codepoint streams.
> So it seems related to the fact that OpenType fonts do not need to include
> rules for glyph substitution, but the PHALA forms are represented without
> any glyph substitution, by mapping directly the phala forms in a separate
> table for the consonants. Because there's been no code-to-glyph
> substitution, the glyph buffer is not created, but then when processing the
> ZWNJ, it looks for data in a glyph buffer that has still not been initialized
> (and this is specific to the renderers implemented by Apple in iOS and
> MacOS). This bug does not occur if another text rendering engine is used
> (e.g. in non-Apple web browsers).
> 2018-02-16 19:44 GMT+01:00 Manish Goregaokar <manish at mozilla.com>:
>> FWIW I dissected the crashing strings, it's basically all <consonant,
>> virama, consonant, zwnj, vowel> sequences in Telugu, Bengali, Devanagari
>> where the consonant is suffix-joining (ra in Devanagari, jo and ro in
>> Bengali, and all Telugu consonants), the vowel is not Bengali au or o /
>> Telugu ai, and if the second consonant is ra/ro the first one is not also
>> ra/ro (or ro-with-line-through-it).
>> On Thu, Feb 15, 2018 at 10:58 AM, Philippe Verdy via Unicode <
>> unicode at unicode.org> wrote:
>>> That's probably not a bug in Unicode but in the MacOS/iOS text renderers
>>> with some fonts using advanced composition features.
>>> Similar bugs could as well affect the new advanced features added in Windows or
>>> Android to support multicolored emojis, variable fonts, contextual glyph
>>> transforms, style variants, or more font formats (not just OpenType); the
>>> bug may also be in the graphic renderer (incorrect clipping when drawing
>>> the glyph into the glyph cache, with buffer overflows possibly caused by
>>> incorrectly computed splines), and it could be in the display driver (or in
>>> the hardware accelerator having some limitations on the complexity of
>>> multipolygons to fill and to antialias), causing some infinite recursion
>>> loop, or too deep recursion exhausting the stack limit;
>>> Finally the bug could be in the OpenType hinting engine moving some
>>> points outside the clipping area (the math theory may say that such
>>> placement of a point outside the clipping area may be impossible, but
>>> various mathematical simplifications and shortcuts are used to simplify or
>>> accelerate the rendering, at the price of some quirks). Even the SVG
>>> standard (in constant evolution) could be affected as well in its
>>> There are tons of possible bugs here.
>>> 2018-02-15 18:21 GMT+01:00 James Kass via Unicode <unicode at unicode.org>:
>>>> This article:
>>>> The single Unicode symbol referred to in the article results from a
>>>> string of Telugu characters. The article doesn't list or display the
>>>> characters, so Mac users can visit the above link. A link in one of
>>>> the comments leads to a page which does display the characters.