Why aren't the emoji modifiers GCB=Extend?

Ken Whistler kenwhistler at att.net
Fri Jun 19 17:24:26 CDT 2015


This results from the fact that the fallback behavior for the modifiers is
simply as independent pictographic blorts, i.e. the color swatch images.
That is also related to why they are treated as gc=Sk symbol modifiers,
rather than as combining marks or format characters.

If you *support* emoji modifier sequences, then yes, you should treat
them as single grapheme clusters for editing -- but their behavior is
then more akin to ligatures or conjuncts than to combining character
sequences. You need additional, specific knowledge about these
sequences -- it doesn't just fall out from a *default* implementation
of the UAX #29 rules for grapheme clusters.
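
To illustrate the kind of tailoring described above, here is a minimal
Python sketch, not a definitive implementation. It assumes you already
have the default grapheme clusters as a list of strings and merges an
emoji modifier into the preceding cluster only when that cluster is a
base you actually support. The names EMOJI_MODIFIERS, SUPPORTED_BASES
and merge_modifier_sequences are hypothetical, and the base set is a
two-character illustrative subset, not the real list.

```python
# Emoji modifiers (Fitzpatrick types): U+1F3FB..U+1F3FF.
EMOJI_MODIFIERS = {chr(cp) for cp in range(0x1F3FB, 0x1F3FF + 1)}

# ASSUMPTION: illustrative subset only (THUMBS UP SIGN, WAVING HAND SIGN).
# A real implementation would use the full set of emoji modifier bases
# that its fonts and renderer support.
SUPPORTED_BASES = {"\U0001F44D", "\U0001F44B"}


def merge_modifier_sequences(clusters):
    """Merge each emoji modifier into a directly preceding supported base.

    `clusters` is a list of strings, each one a default grapheme cluster.
    A modifier that does not follow a supported base stays a cluster of
    its own, matching the fallback display as a separate swatch.
    """
    merged = []
    for cluster in clusters:
        if cluster in EMOJI_MODIFIERS and merged and merged[-1] in SUPPORTED_BASES:
            merged[-1] += cluster
        else:
            merged.append(cluster)
    return merged


# Stand-in for default segmentation: every code point here happens to be
# its own default cluster, so list() is good enough for the demo.
demo = merge_modifier_sequences(list("a\U0001F44D\U0001F3FDb"))
# demo == ["a", "\U0001F44D\U0001F3FD", "b"]
```

The point of keeping this as a separate pass is that the knowledge
about which sequences are supported lives outside the default rules,
much as it would for ligatures or conjuncts.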


On 6/19/2015 1:51 PM, Karl Williamson wrote:
> Someone writing code using Unicode 8 found that the FITZPATRICK 
> modifiers are considered separate graphemes from what they modify.  
> This is surprising, and seems contrary to not only the concept of a 
> grapheme cluster, but the spirit of tr51  2.2.3 "A supported emoji 
> modifier sequence should be treated as a single grapheme cluster for 
> editing purposes"
