Re: “plain text styling”…
Mark E. Shoulson
mark at kli.org
Mon Jan 9 20:13:05 CST 2023
On 1/7/23 06:37, Cristian Secară via Unicode wrote:
> On Thu, 5 Jan 2023 01:53:40 +0100, Kent Karlsson via Unicode wrote:
>
>> More or less regularly there are (informal) requests on this list for
>> encoding (new) control codes or control code sequences for text
>> styling (like bold, italics, text colour, …) also for “plain text”.
> This seems to overlook that a "plain text" subjected to such torment can no longer be called "plain".
That was sort of my question at the outset. It doesn't make sense to
call this "plain text" anymore, when it's formatted and styled. Styling
is almost the *definition* of non-plain text. Unicode is all about plain
text, where characters represent glyphs (or spaces) that represent
text. There are some exceptions to this:
1. Use of ZWJ/ZWNJ to affect shaping/ligaturing. I do not consider the
shaping/ligaturing itself to be an exception; that's just
characters/glyphs affecting one another.
2. BiDi controls, and BiDi in general. (Does strong directionality
count as "formatting," especially with regard to LRM/RLM characters, or
is it "just characters/glyphs affecting one another"? Not sure.) Stuff
like enabling/disabling local digits and whatever is related.
3. Emoji vs text presentation.
4. "Extreme" ligaturing involving emoji ZWJ sequences, regional
indicators becoming flags, and other pseudo-encoding.
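
To make the four items above concrete, here is a minimal Python sketch of the code points involved. The names CONTROLS, flag, and describe are mine, chosen only for illustration; the code points and character names come from the Unicode Character Database as exposed by Python's unicodedata module. It shows what the characters are, not how any renderer treats them.

```python
import unicodedata

# Code points behind the exceptions listed above (numbers match the list).
CONTROLS = {
    "ZWNJ": "\u200C",  # 1. asks adjacent characters not to join/ligate
    "ZWJ": "\u200D",   # 1. asks them to join; also glues emoji sequences (4.)
    "LRM": "\u200E",   # 2. invisible, strongly left-to-right
    "RLM": "\u200F",   # 2. invisible, strongly right-to-left
    "VS15": "\uFE0E",  # 3. requests text presentation
    "VS16": "\uFE0F",  # 3. requests emoji presentation
}

# 3. The same base character with each presentation selector.
HEART_TEXT = "\u2764\uFE0E"
HEART_EMOJI = "\u2764\uFE0F"

# 4. An emoji ZWJ sequence: man, ZWJ, woman, ZWJ, girl.
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F467"


def flag(region):
    """Map a two-letter region code such as "SE" to regional indicator symbols."""
    return "".join(chr(0x1F1E6 + ord(c) - ord("A")) for c in region.upper())


def describe(ch):
    """Return code point, character name, and general category for one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} [{unicodedata.category(ch)}]"


if __name__ == "__main__":
    for label, ch in CONTROLS.items():
        print(f"{label:5} {describe(ch)}")
    print(len(FAMILY), "code points in the family sequence")
    print([describe(c) for c in flag("SE")])
```

Note that the joiners and directional marks have general category Cf (format), while the variation selectors are Mn (nonspacing mark), so even the UCD does not file all of these under one heading.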
Are there other exceptions? There are probably things with CGJ which
fall into the same category as #1, tweaking the interactions of adjacent
characters/glyphs. Is there really anything like the kind of formatting
you're talking about that we have considered "plain text"? Perhaps #3
is closest.
Mind you, I think improving and upgrading ECMA-48 is a dandy idea, and
your suggestions for it are as good as any I've seen (which is faint
praise because I haven't seen any, but in my own opinion, your
ideas are pretty good). And using it in "text" files is a thing people
have already been doing and will continue to do, though it is a bit of
an abuse of the term "text file." But I still don't really see what it
has to do with Unicode. What would you have Unicode do? Define a whole
set of "formatting commands" as part of the Unicode standard?
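
For anyone who hasn't looked at what ECMA-48 styling in a "text" file amounts to, here is a rough Python sketch using only the existing SGR (Select Graphic Rendition) control function, not any of the proposed extensions. The helper names sgr and strip_sgr are hypothetical, and the regular expression only recognizes SGR sequences in their 7-bit form (ESC [ ... m), not ECMA-48 control functions in general.

```python
import re

CSI = "\x1b["  # 7-bit form of the Control Sequence Introducer


def sgr(*params):
    """Build an SGR control sequence, e.g. sgr(1) for bold."""
    return CSI + ";".join(str(p) for p in params) + "m"


BOLD, ITALIC, RESET = sgr(1), sgr(3), sgr(0)

# A "text file" line carrying its own styling commands.
styled = f"plain {BOLD}bold{RESET} {ITALIC}italic{RESET} plain"

# Matches SGR sequences only: CSI, parameter bytes (digits, ';', ':'), final 'm'.
SGR_RE = re.compile(r"\x1b\[[0-9;:]*m")


def strip_sgr(text):
    """Remove SGR sequences, leaving just the characters that are the text."""
    return SGR_RE.sub("", text)


if __name__ == "__main__":
    print(repr(styled))
    print(strip_sgr(styled))
```

The styling lives entirely in escape sequences layered over the characters, and stripping them recovers the text unchanged, which is why this looks to me like a protocol sitting on top of Unicode rather than part of it.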
I think your ideas are good and I'd support them (mostly), just that
this isn't the place that decides such things.
~mark