What constitutes an abstract character?

Fuqiao Xue xfq.free at gmail.com
Sun Jun 14 19:47:37 CDT 2020


Hi Corentin,

The term "abstract character" is ambiguous and can have multiple
definitions. Depending on what you need, it can refer to the visual
(i.e., grapheme), logical (i.e., code point), or byte-level (i.e.,
code unit) representation of a given piece of text.
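
To make the three levels concrete, here is a minimal Python sketch. The
helper name count_units is made up for illustration, and the grapheme
count is only a rough approximation (it skips code points with a
non-zero combining class); proper segmentation follows the rules in
UAX #29, which this does not implement.

```python
import unicodedata


def count_units(text):
    """Return (code points, UTF-8 code units, UTF-16 code units,
    approximate graphemes) for text. Hypothetical helper for illustration."""
    # A Python str is a sequence of code points, so len() counts them.
    code_points = len(text)
    # UTF-8 code units are 8 bits each, so one byte is one unit.
    utf8_units = len(text.encode("utf-8"))
    # UTF-16 code units are 16 bits each; "utf-16-le" emits no BOM.
    utf16_units = len(text.encode("utf-16-le")) // 2
    # Approximation only: treat every code point whose canonical
    # combining class is 0 as the start of a user-perceived character.
    approx_graphemes = sum(1 for ch in text if unicodedata.combining(ch) == 0)
    return code_points, utf8_units, utf16_units, approx_graphemes


# "e" followed by U+0301 COMBINING ACUTE ACCENT, displayed as one "é"
print(count_units("e\u0301"))     # (2, 3, 2, 1)
# U+1F600, a single code point outside the Basic Multilingual Plane
print(count_units("\U0001F600"))  # (1, 4, 2, 1)
```

The same piece of text gives a different "character" count at each
level, which is why the level has to be stated explicitly.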

FYI - W3C developed a Character Model document, which includes some
guidelines on "characters" and may be useful to you:
https://www.w3.org/TR/charmod/

Cheers,

Fuqiao

On Mon, Jun 15, 2020 at 8:01, Corentin via Unicode <unicode at unicode.org> wrote:
>
> Hello
> While trying to define suitable semantics for the lexing of C++, we seem to fail to agree on the definition of abstract characters.
>
> Notably:
> - Would diacritical marks considered in isolation be abstract characters?
> - What about Hangul jamo and other marks that are not found in isolation in their respective scripts, variation selectors, etc.?
>
> I guess another way to phrase my question is: does every assigned code point represent, on its own, an abstract character?
>
> My understanding is that this is not the case, but I am eager to be enlightened.
>
> Thanks,
>
> Corentin


