Re: “plain text styling”…

Kent Karlsson kent.b.karlsson at bahnhof.se
Thu Jan 12 10:57:44 CST 2023



> On 10 Jan 2023, at 07:07, Asmus Freytag via Unicode <unicode at corp.unicode.org> wrote:
> 
> On 1/9/2023 9:09 PM, Doug Ewell via Unicode wrote:
>>> 3. Emoji vs text presentation.
> to me that's more clearly pseudo-encoding than some of the other things now possible with emoji. It's because the wrong presentation is nearly always really wrong, so there's no common fallback.
> 
> And add to that, that the introduction of the wrong default made existing applications and texts suddenly fail, and you have one of the worst blunders in Unicode's encoding history.

I currently try to stay out of emoji stuff. Mostly. But I must point out that labelling the emoji for mushrooms as “food” or “vegetables” is highly inappropriate: some mushrooms are poisonous, and some are deadly poisonous.
> […]
> Formatting / styling to me is distinguished by something that's conceptually always applied to a run of text, and usually not on runs of length one.
> 
Technically, yes, but conceptually no. Styling can for the most part be thought of as applying to individual characters.

This is in contrast to such things as bidi, which, even without bidi controls, just based on the bidi categories of individual characters, must be seen as applying to runs of characters, due to the resulting reordering. (Sorry for the long sentence.)
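As a small illustration of the per-character bidi categories mentioned above, here is a sketch using Python's standard unicodedata module. The helper name bidi_classes and the sample string are made up for this example; the reordering itself is done by the Unicode Bidirectional Algorithm over whole runs and is not shown here.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each code point in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: Latin a, Hebrew alef, Arabic alef, digit 1
sample = "a\u05d0\u06271"
classes = [cls for _, cls in bidi_classes(sample)]
print(classes)  # ['L', 'R', 'AL', 'EN']
```

Each character carries its own category, but the visual order only follows once the categories of the neighbouring characters are taken into account.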
> The main exception to that was mathematical notation, and we opted to make a principled exception, precisely because semantic mapping to highly specific shapes for an individual symbol is not, or should not be, the task of "styling".
> 
1. That styling(!) is lost when normalizing to NFKD or NFKC.
2. MathML still considers it a styling. LaTeX has always considered it a styling.
3. It is not general enough. (See my proposal on math expression representation.)
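Point 1 can be checked with a short sketch using Python's standard unicodedata module. The variable names are made up for this example; U+1D400 MATHEMATICAL BOLD CAPITAL A is just one of the mathematical alphanumeric symbols.

```python
import unicodedata

bold_a = "\U0001D400"  # MATHEMATICAL BOLD CAPITAL A

# Compatibility normalization maps the symbol to plain LATIN CAPITAL LETTER A.
nfkc = unicodedata.normalize("NFKC", bold_a)
nfkd = unicodedata.normalize("NFKD", bold_a)

# Canonical normalization leaves it untouched.
nfc = unicodedata.normalize("NFC", bold_a)

print(nfkc == "A", nfkd == "A", nfc == bold_a)  # True True True
print(unicodedata.decomposition(bold_a))        # <font> 0041
```

The "<font>" tag in the decomposition mapping marks the bold form as a font variant of the plain letter, which is why the compatibility forms drop it.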

/Kent K
> 
> Flag sequences and the like are true examples of pseudo-coding: they introduce a scheme that maps arbitrary code point sequences to a symbol in a way that depends on definitions maintained outside the Unicode Standard. It's the clearest case of injecting another character set (or a lego system for representing one) into the Standard that I've seen.
> 
> We could have done the same with three-letter codes for currency symbols, but we didn't, and that marks the difference.
> 
> A./
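
For readers unfamiliar with the flag sequences discussed in the quoted text, here is a minimal sketch of how a two-letter region code maps onto a pair of regional indicator symbols. The function name flag_sequence is made up for this example. Which pairs actually display as a flag depends on the region codes valid outside the Unicode Standard and on font support, neither of which this code checks.

```python
import unicodedata

def flag_sequence(region_code):
    """Map a two-letter ASCII region code to regional indicator symbols."""
    base = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
    return "".join(chr(base + ord(c) - ord("A")) for c in region_code.upper())

se = flag_sequence("SE")
print([unicodedata.name(ch) for ch in se])
# ['REGIONAL INDICATOR SYMBOL LETTER S', 'REGIONAL INDICATOR SYMBOL LETTER E']
```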
