Re: “plain text styling”…
Asmus Freytag
asmusf at ix.netcom.com
Tue Jan 10 00:07:22 CST 2023
On 1/9/2023 9:09 PM, Doug Ewell via Unicode wrote:
>> 3. Emoji vs text presentation.
To me, that's more clearly pseudo-encoding than some of the other things
now possible with emoji. That's because the wrong presentation is nearly
always really wrong, so there's no common fallback.
Add to that the fact that the introduction of the wrong default made
existing applications and texts suddenly fail, and you have one of the
worst blunders in Unicode's encoding history.
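For concreteness, here is a minimal Python sketch of how the two presentations are requested at the code point level, using the standard variation selectors U+FE0E (text) and U+FE0F (emoji). The helper name with_presentation is hypothetical, and the sketch does not check whether a given base character actually has a standardized variation sequence.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

def with_presentation(base: str, emoji: bool) -> str:
    """Append the variation selector that requests the given presentation.

    Hypothetical helper: it does not verify that `base` has a
    standardized emoji variation sequence.
    """
    return base + (VS16 if emoji else VS15)

heart = "\u2764"  # HEAVY BLACK HEART, a dingbat that predates emoji
text_heart = with_presentation(heart, emoji=False)
emoji_heart = with_presentation(heart, emoji=True)

# Same base character, two different code point sequences.
print([f"U+{ord(c):04X}" for c in text_heart])
print([f"U+{ord(c):04X}" for c in emoji_heart])
```

Without either selector, the rendering falls back to the default presentation for that character, which is where the breakage described above comes from.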
>>
>> 4. "Extreme" ligaturing involving emoji ZWJ sequences, regional tags
>> becoming flags, and other pseudo-encoding.
> I would actually consider things like bold, italics, and color to be less of an affront to “plain text” than an emoji presentation form or a sequence that adds up to “woman firefighter with medium-dark skin tone.” Granted ECMA-48 can be used for effects that are less plain-texty than bold, italics, and color.
>
In some ways most of the emoji sequences are really more akin to making
new characters by adding diacritic marks, or making new shapes in
context, the way shapes fuse in Indic conjuncts.
A skintone in some sense has more similarity to a diacritic on a vowel;
just because it's not a mark, but a shade, doesn't erase the similarity.
The whole visual design space for emoji is different. While color is
simply an attribute on text, skintone hews closer to a semantic
component in the way it works.
The same goes for other colors as well: a "black cat" and a generic
kitty occupy distinct, if overlapping, semantic spaces, and they do so on
the level of an individual symbol.
The concept of semantic ligatures, like the female astronaut, is
interesting. It's a departure from purely graphical constructs like
stacks, conjuncts and ligatures. But while most Latin ligatures are
optional, many conjuncts are not, and using a fallback will alter
meaning, again on the individual grapheme level.
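As a rough illustration of the sequences discussed above, the following Python sketch lists the components of two emoji ZWJ sequences and shows what remains when the joiner is not honored. The function names components and fallback are hypothetical, and the fallback shown is only one plausible behavior of a renderer that lacks the ligature.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER, asks the renderer to fuse its neighbors

def components(sequence: str) -> list[tuple[str, str]]:
    """Return (code point, character name) for each code point in the sequence."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in sequence]

def fallback(sequence: str) -> list[str]:
    """Split at ZWJ: the separate pieces shown when the ligature is unsupported."""
    return sequence.split(ZWJ)

# WOMAN + skin tone modifier + ZWJ + FIRE ENGINE
woman_firefighter = "\U0001F469\U0001F3FE\u200D\U0001F692"
# CAT + ZWJ + BLACK LARGE SQUARE
black_cat = "\U0001F408\u200D\u2B1B"

for code_point, name in components(woman_firefighter):
    print(code_point, name)

# The fallback for "black cat" is a cat followed by a black square,
# which no longer carries the same meaning.
print(fallback(black_cat))
```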
Formatting / styling to me is distinguished by being something that's
conceptually always applied to a run of text, and usually not to runs of
length one. The main exception to that was mathematical notation, and we
opted to make a principled exception, precisely because semantic mapping
to highly specific shapes for an individual symbol is not, or should not
be, the task of "styling".
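The mathematical exception can be seen in a small Python sketch: a bold mathematical letter is a character in its own right, and only compatibility normalization (NFKC here) folds it back to the plain letter. The variable names are illustrative only.

```python
import unicodedata

math_bold_a = "\U0001D400"  # a single code point, not "A" plus a style

# The character has its own name in the Standard.
print(unicodedata.name(math_bold_a))

# Compatibility normalization discards the distinction and yields plain "A".
folded = unicodedata.normalize("NFKC", math_bold_a)
print(folded)
```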
Flag sequences and the like are true examples of pseudo-encoding: they
introduce a scheme that maps arbitrary code point sequences to a
symbol in a way that depends on definitions maintained outside the
Unicode Standard. It's the clearest case of injecting another character
set (or a lego system for representing one) into the Standard that I've seen.
We could have done the same with three-letter codes for currency
symbols, but we didn't, and that marks the difference.
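To make the mechanism concrete, here is a minimal Python sketch of how a two-letter region code maps onto a pair of regional indicator symbols. The function name flag_sequence is hypothetical. The mapping is purely mechanical; whether a given pair is displayed as a flag depends on region definitions maintained outside the Unicode Standard and on font support.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A

def flag_sequence(region_code: str) -> str:
    """Map a two-letter region code to its pair of regional indicator symbols.

    Hypothetical helper: it checks only the shape of the code, not whether
    the region exists, since that is defined outside the Standard.
    """
    code = region_code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError(f"expected two ASCII letters, got {region_code!r}")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code)

germany = flag_sequence("DE")
print([f"U+{ord(c):04X}" for c in germany])

# Any letter pair produces a well-formed sequence, valid region or not.
print([f"U+{ord(c):04X}" for c in flag_sequence("ZZ")])
```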
A./