Unicode of Death 2.0

Manish Goregaokar via Unicode unicode at unicode.org
Sun Feb 18 01:57:43 CST 2018

Ah, looking at that, the OpenType `pstf` feature seems relevant, though I
cannot get it to crash with Gurmukhi (where the consonant ya is a postform).


On Sat, Feb 17, 2018 at 4:40 PM, Philippe Verdy <verdy_p at wanadoo.fr> wrote:

> An interesting read:
> https://docs.microsoft.com/fr-fr/typography/script-development/bengali#reor
> 2018-02-18 1:30 GMT+01:00 Philippe Verdy <verdy_p at wanadoo.fr>:
>> My opinion about this bug is that Apple's text renderer dynamically
>> allocates a glyphs buffer only when needed (lazily), but a test is missing
>> for the lazy construction of this buffer (which is not needed for most
>> texts not needing glyph substitutions or reordering when a single accessor
>> from the code point can find the glyph data directly by lookup in font
>> tables) and this is causing a null pointer exception at run time.
>> The bug occurs effectively when processing the vowel that occurs after
>> the ZWNJ, if the code assumes that there's a glyphs buffer already
>> constructed for the cluster, in order to place the vowel over the correct
>> glyph (which may have been reordered in that buffer).
>> Microsoft's text renderer and other engines do not delay the
>> construction of the glyphs buffer, which can be reused for processing the
>> rest of the text, provided it is correctly reset after processing a cluster.
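As an editorial illustration of the hypothesis quoted above (a lazily built glyph buffer that is read without a check), here is a minimal Python sketch. The `Cluster` class, its method names, and the toy rule that a glyph ID equals its code point are all assumptions made for this example; nothing here reflects the actual code of Apple's or any other renderer.

```python
from typing import List, Optional


class Cluster:
    """Hypothetical shaping state for one cluster (illustration only)."""

    def __init__(self, codepoints: List[int]) -> None:
        self.codepoints = codepoints
        # Built lazily: stays None until some step needs to edit glyphs.
        self.glyphs: Optional[List[int]] = None

    def ensure_glyphs(self) -> List[int]:
        """Create the glyph buffer on first use and return it."""
        if self.glyphs is None:
            # Toy direct mapping (assumption): glyph ID == code point.
            self.glyphs = list(self.codepoints)
        return self.glyphs

    def substitute(self, index: int, glyph: int) -> None:
        """A substitution step; this is what normally creates the buffer."""
        self.ensure_glyphs()[index] = glyph

    def base_glyph_unchecked(self) -> int:
        """Find the glyph a following vowel attaches to, assuming the
        buffer already exists. This models the missing test."""
        return self.glyphs[-1]  # fails if no substitution ran earlier

    def base_glyph_checked(self) -> int:
        """Same lookup, but the buffer is created if it is missing."""
        return self.ensure_glyphs()[-1]


cluster = Cluster([0x0C1C, 0x0C4D, 0x0C1E])
try:
    cluster.base_glyph_unchecked()
except TypeError as exc:
    print("unchecked lookup failed:", exc)
print("checked lookup:", hex(cluster.base_glyph_checked()))
```

In Python the unchecked lookup raises a `TypeError`; in a language without such checks the equivalent would be a null pointer dereference. The eager strategy attributed above to other engines corresponds to calling `ensure_glyphs()` before any positioning step.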
>> 2018-02-17 21:54 GMT+01:00 Manish Goregaokar <manish at mozilla.com>:
>>> Heh, I wasn't aware of the word "phala-form", though that seems
>>> Bengali-specific?
>>> Interesting observation about the vowel glyphs, I'll mention this in the
>>> post. Initially I missed this because I hadn't realized that the bengali o
>>> vowel crashed (which made me discount this).
>>> Thanks!
>>> -Manish
>>> On Sat, Feb 17, 2018 at 12:22 PM, Philippe Verdy <verdy_p at wanadoo.fr>
>>> wrote:
>>>> I would have preferred that your invented term "left-joining consonants"
>>>> be replaced by the usual name "phala forms" (to represent RA or JA/JO after a
>>>> virama, generally named "raphala" or "japhala/jophala").
>>>> And the reason this bug does not occur with some vowels is that these are
>>>> vowels in two parts, which are first decomposed into two separate glyphs
>>>> reordered in the buffer of glyphs, while other vowels do not need this
>>>> prior mapping and keep their initial direct mapping from their codepoints
>>>> in fonts. This means that it has to do with the way the ZWNJ looks for the
>>>> glyphs of the vowels in the glyphs buffer and not in the initial codepoints
>>>> buffer: there's some desynchronization, and more probably an uninitialized
>>>> data field (for the lookup made in handling ZWNJ) if no vowel decomposition
>>>> was done (the same data field is correctly initialized when it is the first
>>>> consonant which takes an alternate form before a virama, like in most
>>>> Indic consonant clusters, because a glyph buffer is created).
>>>> Now we have some hints about why the bug does not occur in Kannada or
>>>> Khmer: a glyph buffer is always created, but there was some shortcut made
>>>> in Devanagari, Bengali, and Telugu to allow processing clusters faster
>>>> without always having to create a glyphs buffer (to allow reordering glyphs
>>>> before positioning them), and working directly on the codepoints streams.
>>>> So it seems related to the fact that OpenType fonts do not need to
>>>> include rules for glyph substitution, but the PHALA forms are represented
>>>> without any glyph substitution, by mapping directly the phala forms in a
>>>> separate table for the consonants. Because there's been no code-to-glyph
>>>> substitution, the glyph buffer is not created, but then when processing the
>>>> ZWNJ, it looks for data in a glyph buffer that has still not been initialized
>>>> (and this is specific to the renderers implemented by Apple in iOS and
>>>> MacOS). This bug does not occur if another text rendering engine is used
>>>> (e.g. in non-Apple web browsers).
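The "vowels in two parts" in the message above can be checked at the character level. The vowel signs named elsewhere in this thread as not triggering the bug (Bengali o and au, Telugu ai) each have a canonical decomposition in the Unicode Character Database, which Python's standard `unicodedata` module exposes. The sketch below only inspects those character properties; the helper name `canonical_parts` is an assumption of this example, and a decomposition in Unicode data is not evidence of what any particular renderer does internally.

```python
import unicodedata
from typing import List

# Vowel signs named in the thread as not triggering the bug.
NON_CRASHING = {
    "Bengali o": "\u09CB",
    "Bengali au": "\u09CC",
    "Telugu ai": "\u0C48",
}


def canonical_parts(ch: str) -> List[str]:
    """Return the NFD form of ch as a list of 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


for name, ch in NON_CRASHING.items():
    print(name, canonical_parts(ch))

# For comparison, a one-part vowel sign: Bengali aa has no decomposition.
print("Bengali aa", canonical_parts("\u09BE"))
```

Each of the three listed vowel signs splits into two code points under NFD, while Bengali aa stays a single code point, which is consistent with the two-part explanation given above.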
>>>> 2018-02-16 19:44 GMT+01:00 Manish Goregaokar <manish at mozilla.com>:
>>>>> FWIW I dissected the crashing strings, it's basically all <consonant,
>>>>> virama, consonant, zwnj, vowel> sequences in Telugu, Bengali, Devanagari
>>>>> where the consonant is suffix-joining (ra in Devanagari, jo and ro in
>>>>> Bengali, and all Telugu consonants), the vowel is not Bengali au or o /
>>>>> Telugu ai, and if the second consonant is ra/ro the first one is not also
>>>>> ra/ro (or ro-with-line-through-it).
>>>>> https://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string/
>>>>> -Manish
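The dissection above can be turned into a rough checker, for example to flag such sequences before they reach an affected renderer. The following Python sketch is an interpretation of that description, not code from the blog post: the code point ranges, the `SCRIPTS` table, and the function name `fits_description` are assumptions chosen for illustration, and the ra/ro exclusion is applied only to Devanagari and Bengali.

```python
ZWNJ = 0x200C

# Per-script data derived from the description above. The exact code
# point ranges are assumptions of this sketch.
SCRIPTS = [
    {   # Devanagari
        "consonants": range(0x0915, 0x093A),
        "virama": 0x094D,
        "vowel_signs": range(0x093E, 0x094D),
        "suffix_joining": {0x0930},            # ra
        "safe_vowels": set(),
        "ra": 0x0930,
        "excluded_first": {0x0930},
    },
    {   # Bengali
        "consonants": set(range(0x0995, 0x09BA)) | {0x09F0},
        "virama": 0x09CD,
        "vowel_signs": range(0x09BE, 0x09CD),
        "suffix_joining": {0x09AF, 0x09B0},    # ya ("jo"), ra ("ro")
        "safe_vowels": {0x09CB, 0x09CC},       # o, au
        "ra": 0x09B0,
        "excluded_first": {0x09B0, 0x09F0},    # ro, ro with a line through it
    },
    {   # Telugu: every consonant is treated as suffix-joining
        "consonants": range(0x0C15, 0x0C3A),
        "virama": 0x0C4D,
        "vowel_signs": range(0x0C3E, 0x0C4D),
        "suffix_joining": range(0x0C15, 0x0C3A),
        "safe_vowels": {0x0C48},               # ai
        "ra": None,                            # no ra exclusion (assumption)
        "excluded_first": set(),
    },
]


def fits_description(text: str) -> bool:
    """True if text contains <consonant, virama, consonant, ZWNJ, vowel>
    meeting the conditions listed in the message above."""
    cps = [ord(ch) for ch in text]
    for i in range(len(cps) - 4):
        c1, virama, c2, joiner, vowel = cps[i:i + 5]
        if joiner != ZWNJ:
            continue
        for s in SCRIPTS:
            if (c1 in s["consonants"]
                    and virama == s["virama"]
                    and c2 in s["suffix_joining"]
                    and vowel in s["vowel_signs"]
                    and vowel not in s["safe_vowels"]
                    and not (c2 == s["ra"] and c1 in s["excluded_first"])):
                return True
    return False


# Devanagari ka + virama + ra + ZWNJ + aa, written with escapes only.
print(fits_description("\u0915\u094D\u0930\u200C\u093E"))
```

All test strings are written as escape sequences so that the source itself contains no rendered cluster.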
>>>>> On Thu, Feb 15, 2018 at 10:58 AM, Philippe Verdy via Unicode <
>>>>> unicode at unicode.org> wrote:
>>>>>> That's probably not a bug in Unicode but in the MacOS/iOS text renderers,
>>>>>> with some fonts using advanced composition features.
>>>>>> Similar bugs could as well affect the new advanced features added in Windows
>>>>>> or Android to support multicolored emojis, variable fonts, contextual glyph
>>>>>> transforms, style variants, or more font formats (not just OpenType); the
>>>>>> bug may also be in the graphic renderer (incorrect clipping when drawing
>>>>>> the glyph into the glyph cache, with buffer overflows possibly caused by
>>>>>> incorrectly computed splines), and it could be in the display driver (or in
>>>>>> the hardware accelerator having some limitations on the complexity of
>>>>>> multipolygons to fill and to antialias), causing some infinite recursion
>>>>>> loop, or too deep recursion exhausting the stack limit.
>>>>>> Finally the bug could be in the OpenType hinting engine moving some
>>>>>> points outside the clipping area (the math theory may say that such
>>>>>> placement of a point outside the clipping area may be impossible, but
>>>>>> various mathematical simplifications and shortcuts are used to simplify or
>>>>>> accelerate the rendering, at the price of some quirks). Even the SVG
>>>>>> standard (in constant evolution) could be affected as well in its
>>>>>> implementation.
>>>>>> There are tons of possible bugs here.
>>>>>> 2018-02-15 18:21 GMT+01:00 James Kass via Unicode <
>>>>>> unicode at unicode.org>:
>>>>>>> This article:
>>>>>>> https://techcrunch.com/2018/02/15/iphone-text-bomb-ios-mac-crash-apple/?ncid=mobilenavtrend
>>>>>>> The single Unicode symbol referred to in the article results from a
>>>>>>> string of Telugu characters.  The article doesn't list or display the
>>>>>>> characters, so Mac users can visit the above link.  A link in one of
>>>>>>> the comments leads to a page which does display the characters.