Suppressing Ligation of Spacing Marks
unicode at lindenbergsoftware.com
Wed Nov 9 07:10:34 CST 2016
The part of the Universal Shaping Engine specification that deals with ZWNJ is a bit unclear, but I read it to mean that ZWNJ should not cause the insertion of a dotted circle if the character following it has general category Mn or Mc.
The USE specification says: “The zero-width non-joiner is used to prevent a fusion of two characters. It continues a preceding cluster but causes a cluster break after itself when the following character is not a mark character (gc=Mn or gc=Mc).”
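The quoted rule can be sketched as a small predicate. This is only an illustration of the sentence above, not code from any USE implementation; the function name zwnj_breaks_cluster is made up for this example, and the general categories come from whatever Unicode version Python's unicodedata module ships with.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

def zwnj_breaks_cluster(following: str) -> bool:
    """Hypothetical helper: True if a ZWNJ followed by `following`
    causes a cluster break after itself, per the quoted sentence.

    The break happens only when the following character is NOT a
    mark character, i.e. its general category is neither Mn nor Mc.
    """
    return unicodedata.category(following) not in ("Mn", "Mc")

# A letter after ZWNJ starts a new cluster; a mark does not.
print(zwnj_breaks_cluster("\u1A36"))  # TAI THAM LETTER NA (a letter)
print(zwnj_breaks_cluster("\u1A63"))  # TAI THAM VOWEL SIGN AA (a spacing mark)
```

Under this reading, a mark after ZWNJ stays in the preceding cluster, so nothing about the ZWNJ itself would call for a dotted circle.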
The specification does not say how this character should be handled in cluster validation. I assume first that the statement about the combining grapheme joiner also applies to ZWNJ: “CGJ has been omitted from the above schema in order to avoid unnecessary complexity”. I further interpret the little the spec does say about ZWNJ to imply that it should be allowed before any character with general category Mn or Mc, without affecting the validity of the cluster. Inserting a dotted circle would be equivalent to causing a cluster break, which the spec rules out when the following character has general category Mn or Mc.
U+1A63 has gc=Mc, so it shouldn’t be preceded by a dotted circle in the sequence <NA, ZWNJ, SIGN AA, …>. Note that I omitted the first “…” from the sequence you provided, because an intervening character might trigger the dotted circle.
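The general categories involved can be checked quickly, assuming a Python build whose unicodedata module covers Tai Tham (added in Unicode 5.2). The helper name describe is invented for this sketch, and the sequence is the shortened <NA, ZWNJ, SIGN AA> without the elided parts.

```python
import unicodedata

# <NA, ZWNJ, SIGN AA>, with the "..." parts of the original sequence left out
SEQUENCE = ["\u1A36", "\u200C", "\u1A63"]

def describe(seq):
    """Hypothetical helper: (code point, name, general category) per character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.category(c))
        for c in seq
    ]

for code_point, name, gc in describe(SEQUENCE):
    print(code_point, gc, name)
```

If SIGN AA is reported as Mc, the ZWNJ before it should continue the cluster that NA started rather than break it.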
So this may just be a bug in the implementation of the USE that you’re using. I see this bug in Safari (CoreText), but not in Firefox (HarfBuzz); I haven’t tried Edge. Which one are you using?
> On Nov 8, 2016, at 18:09 , Richard Wordingham <richard.wordingham at ntlworld.com> wrote:
> Should it be possible to suppress the ligation of a base character and
> a visually following spacing mark in plain text?
> The example I have in mind is the sequence <U+1A36 TAI THAM LETTER NA,
> U+1A63 TAI THAM VOWEL SIGN AA>. It may be desirable to suppress the
> ligation because both ligands have subscript consonants. However, if
> I write <NA, ..., ZWNJ, SIGN AA, ...>, the Universal Shaping Engine
> decides that the ZWNJ triggers a new syllable, and inserts a dotted
> circle before SIGN AA. (The dotted circle after SIGN AA results from a
> failure to read the proposal for the Lanna script as it was then