Compatibility decomposables that are not compatibility characters

Monica Merchant monicamerchant1 at gmail.com
Thu Feb 17 06:18:56 CST 2022


Hello,

I have a question about the last two examples at the bottom of page 27 of
Section 2.3, Compatibility Characters
<https://www.unicode.org/versions/Unicode14.0.0/ch02.pdf>:

*Example 1*

> By way of contrast, some compatibility decomposable characters, such as
> modifier letters used in phonetic orthographies, for example, U+02B0
> modifier letter small h, are not considered to be compatibility
> characters. They would have been accepted for encoding in the standard on
> their own merits, regardless of their need for mapping to IPA. A large
> number of compatibility decomposable characters like this are actually
> distinct symbols used in specialized notations, whether phonetic or
> mathematical. In such cases, their compatibility mappings express their
> historical derivation from styled forms of standard letters.



*Example 2*

> Other compatibility decomposable characters are widely used characters
> serving essential functions. U+00A0 no-break space is one example. In
> these and similar cases, such as fixed-width space characters, the
> compatibility decompositions define possible fallback representations.


The first example illustrates the case where a *compatibility decomposable
character* is *not* a *compatibility character* (i.e. a character that
would not have been encoded except for round-tripping with a source
standard): The Spacing Modifier Letters (U+02B0-U+02FF) and Mathematical
Alphanumeric Symbols (U+1D400-U+1D7FF) are not compatibility characters
because, although they resemble rich text variants of ordinary letters,
they are actually distinct symbols and therefore would have been accepted
for encoding on their own merits (as opposed to being encoded solely for
round-tripping).
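
The "compatibility decomposable" half of this distinction can be checked mechanically. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper names `compat_tag` and `describe` are made up for illustration and are not part of any Unicode API.

```python
import unicodedata

def compat_tag(ch):
    """Return the compatibility tag (e.g. '<super>') of ch's decomposition
    mapping, or None if ch has no mapping or only a canonical one."""
    mapping = unicodedata.decomposition(ch)
    if mapping.startswith("<"):
        return mapping.split(" ", 1)[0]
    return None

def describe(cp):
    """Format a code point with its character name and raw decomposition field."""
    ch = chr(cp)
    mapping = unicodedata.decomposition(ch) or "(none)"
    return f"U+{cp:04X} {unicodedata.name(ch)}: {mapping}"

# Characters mentioned in the two quoted examples.
for cp in (0x02B0, 0x1D400, 0x00A0, 0x2002):
    print(describe(cp))
```

The tag only records what kind of mapping a character has. It does not say whether the character is a compatibility character in the sense of Section 2.3, which is exactly the gap the question is about.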

However, I'm confused by the second example. In particular, I'm not sure
whether no-break space (*U+00A0*) and the fixed-width space characters
(*U+2000-U+200A*) are
compatibility characters or not. They are described as "serving essential
functions", which I read as meaning that they would have been encoded even
if it weren't for round-tripping, in which case they would not be
considered as compatibility characters. Is this correct? If so, are they
essential because they facilitate the typesetting of text-based markup like
HTML (where formatting must be specified in plain text)? No-break space is
also essential in that it is used to display standalone nonspacing marks (page
267 <https://www.unicode.org/versions/Unicode14.0.0/ch06.pdf>).
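
The "fallback representations" wording in Example 2 can be illustrated with compatibility normalization. This is a minimal sketch, assuming Python's standard `unicodedata` module; `compat_fallback` is a hypothetical helper name, and using NFKC here is one possible reading of "fallback", not something the quoted text prescribes.

```python
import unicodedata

def compat_fallback(text):
    """Apply compatibility normalization (NFKC), which replaces characters
    by their compatibility decompositions where they have one."""
    return unicodedata.normalize("NFKC", text)

# NO-BREAK SPACE and EM SPACE both fall back to an ordinary U+0020 SPACE.
print(repr(compat_fallback("10\u00A0km")))
print(repr(compat_fallback("a\u2003b")))

# Canonical normalization (NFC) leaves NO-BREAK SPACE untouched.
print(repr(unicodedata.normalize("NFC", "10\u00A0km")))
```

The fallback loses the no-break and fixed-width behaviour, which is consistent with reading these characters as serving a function that a plain space does not.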

I apologise if this is an obvious question and would be grateful for any
guidance, as most resources only mention compatibility characters in
passing.


Thank you,

Monica