Transform rule syntax clarifications
Richard Wordingham via CLDR-Users
cldr-users at unicode.org
Sat Nov 16 20:37:24 CST 2019
On Sat, 16 Nov 2019 13:18:00 -0800
Cameron Dutro via CLDR-Users <cldr-users at unicode.org> wrote:
> The other bits of syntax you've mentioned are from the Unicode Set
> specification, which you can find in UTS #35
> <https://unicode.org/reports/tr35/#Unicode_Sets>. Unicode Sets are
> like regex character classes, but as you've noticed, there are a
> couple of special operations they support that regexes don't.
> Specifically, the "-" operator is the symmetric difference
> <https://en.wikipedia.org/wiki/Symmetric_difference> between the two
> operands (UTS 35 says "asymmetric difference," but I don't think
> that's a thing - I can't find any definition of it online).
It very much is a thing! The asymmetric difference A - B is ordinary set
difference: the elements of A that are not in B. In this particular case,
$accent_minus = [[$accent]-[$iotasub$macron]];
is probably the same as the symmetric difference because, judging from
the names, I think everything in the second set is also in the first set.
That does not hold in general, though: [abcd] - [abef] is [cd], not the
symmetric difference [cdef].
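
To make the distinction concrete, here is a minimal sketch that uses plain
Python sets as a stand-in for Unicode Sets. This is an assumption made for
illustration only: it is not the ICU or CLDR implementation, and the
variable names are hypothetical.

```python
# Illustration only: plain Python sets stand in for Unicode Sets.
first = set("abcd")
second = set("abef")

# Asymmetric (ordinary) difference: elements of first that are not in second.
asymmetric = first - second

# Symmetric difference: elements that are in exactly one of the two sets.
symmetric = first ^ second

print(sorted(asymmetric))  # ['c', 'd']
print(sorted(symmetric))   # ['c', 'd', 'e', 'f']

# When the second set is a subset of the first, the two results coincide,
# which is what is suspected for the $accent_minus rule above.
subset = set("ab")
assert first - subset == first ^ subset == set("cd")
```

The two operations agree only when nothing in the second operand lies
outside the first.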
Richard.