Extended grapheme cluster stability

Martinho Fernandes via Unicode unicode at unicode.org
Tue May 22 07:43:23 CDT 2018


On 22.05.18 12:51, Martinho Fernandes via Unicode wrote:

> Hello,
>
> None of the *_Break properties are stable, as far as I can see in
> https://www.unicode.org/policies/stability_policy.html. If I understand
> correctly, this means that, at least in theory, it is possible that in
> Unicode version X a sequence of characters AB forms an extended grapheme
> cluster, i.e. A × B in the notation used in the algorithm description
> and in the test data, but then in Unicode version X+1, that changes to A
> ÷ B.
>
> Am I reading this correctly or is this not possible? Or is it possible
> in theory but not in practice? Or maybe it has happened before?
>
Hmm, to answer my own question: yes, this has happened before. In
Unicode 8 there were no breaks between regional indicators at all. As of
Unicode 9 there are no breaks "between regional indicator (RI) symbols if
there is an odd number of RI characters before the break point", so a
run of four RIs that used to be a single cluster now breaks in the
middle (× became ÷). It has also happened in the other direction, from
break to no break, when emoji ZWJ sequences were introduced.
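
As a rough illustration of the regional indicator change, here is a
minimal Python sketch. It models only the RI rule and treats every other
boundary as a break, so it is not a full implementation of the
segmentation algorithm. The names is_regional_indicator, ri_break_marks,
flag and the pair_rule switch are made up for this example.

```python
def is_regional_indicator(ch):
    # Regional indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def ri_break_marks(text, pair_rule):
    """Return one mark per adjacent pair of code points.

    "×" means no break, "÷" means break. Only the RI rule is modelled;
    any boundary not between two RIs is reported as a break.
    pair_rule=False: Unicode 8 style, never break between two RIs.
    pair_rule=True:  Unicode 9 style, no break only if an odd number
                     of RIs precedes the break point.
    """
    marks = []
    ri_run = 0  # consecutive RIs immediately before the break point
    for prev, cur in zip(text, text[1:]):
        ri_run = ri_run + 1 if is_regional_indicator(prev) else 0
        if is_regional_indicator(prev) and is_regional_indicator(cur):
            no_break = (ri_run % 2 == 1) if pair_rule else True
        else:
            no_break = False
        marks.append("×" if no_break else "÷")
    return marks


def flag(code):
    # Hypothetical helper: map "DE" to the two RI code points for D and E.
    return "".join(chr(0x1F1E6 + ord(c) - ord("A")) for c in code)


four_ris = flag("DE") + flag("FR")
print(ri_break_marks(four_ris, pair_rule=False))  # ['×', '×', '×']
print(ri_break_marks(four_ris, pair_rule=True))   # ['×', '÷', '×']
```

The middle boundary is the one that changes from × to ÷ between the two
variants, which is exactly the kind of instability asked about above.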

-- 
Martinho

