Janusz S. Bień
jsbien at mimuw.edu.pl
Thu Sep 15 14:56:32 CDT 2016
On Thu, Sep 15 2016 at 21:27 CEST, eliz at gnu.org writes:
> Isn't "grapheme cluster" the definition you are looking for?
I don't think so.
On Thu, Sep 15 2016 at 21:27 CEST, leoboiko at namakajiri.net writes:
> Isn't the Swift "character" and the "textel" merely the same thing as
> what Unicode already named "grapheme clusters"? (Well, technically UAX
> #29 defines them as "user-perceived characters", but then says
> grapheme clusters approximate user-perceived characters.)
> And, indeed, Swift "Characters" are explicitly defined as "extended
> grapheme clusters" (also from UAX #29):
> Such a notion is indeed needed, but it has been always there.
>  http://unicode.org/reports/tr29/
Perhaps I don't properly understand the rather obscure definitions, like
    An extended grapheme cluster is the same as a legacy grapheme
    cluster, with the addition of some other characters.
but:
1. Graphemes, if I understand correctly, are language dependent; textels
are not.
2. Textel "ń" means both U+0144 and <U+006E,U+0301>, so it is a notion
on a higher abstraction level than a grapheme cluster.
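Point 2 can be illustrated with a minimal Python sketch. It assumes that
"means both" may be read as Unicode canonical equivalence, which is only
one possible interpretation of the textel notion; the helper name
same_textel is hypothetical and not taken from any library.

```python
import unicodedata

# The two encodings of "ń" mentioned above.
precomposed = "\u0144"       # LATIN SMALL LETTER N WITH ACUTE
decomposed = "\u006E\u0301"  # LATIN SMALL LETTER N + COMBINING ACUTE ACCENT


def same_textel(a: str, b: str) -> bool:
    """Hypothetical helper: assume two strings denote the same textel
    when they are canonically equivalent, i.e. equal after NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# At the code-point level the two sequences differ, in content and length.
print(precomposed == decomposed)
print(len(precomposed), len(decomposed))

# After normalization they compare equal.
print(same_textel(precomposed, decomposed))
```

Code-point comparison keeps the two encodings distinct, while comparison
after normalization treats them as one unit, which is the higher level of
abstraction meant in point 2.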
Moreover, I don't want to call <U+006E,U+0301> (LATIN SMALL LETTER N,
COMBINING ACUTE ACCENT) an extended grapheme cluster for at least two
reasons:
1. There is nothing extended in it.
2. U+0301 is not a grapheme according to Polish linguistics terminology.
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsbien at uw.edu.pl, jsbien at mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/