Re: “plain text styling”…

Mark E. Shoulson mark at kli.org
Mon Jan 9 20:13:05 CST 2023


On 1/7/23 06:37, Cristian Secară via Unicode wrote:
> On Thu, 5 Jan 2023 01:53:40 +0100, Kent Karlsson via Unicode wrote:
>
>> More or less regularly there are (informal) requests on this list for
>> encoding (new) control codes or control code sequences for text
>> styling (like bold, italics, text colour, …) also for ”plain text”.
> This seems to overlook that a "plain text" subjected to such torment can no longer be called "plain".

That was sort of my question at the outset.  It doesn't make sense to 
call this "plain text" anymore, when it's formatted and styled.  Styling 
is almost the *definition* of non-plain text. Unicode is all about plain 
text, where characters represent glyphs (or spaces) that represent 
text.  There are some exceptions to this:

1. Use of ZWJ/ZWNJ to affect shaping/ligaturing.  I do not consider the 
shaping/ligaturing itself to be an exception; that's just 
characters/glyphs affecting one another.

2. BiDi controls, and BiDi in general.  (Does strong directionality 
count as "formatting," especially with regard to LRM/RLM characters, or 
is it "just characters/glyphs affecting one another"?  Not sure.)  Stuff 
like enabling/disabling local digits and whatever is related.

3. Emoji vs text presentation.

4. "Extreme" ligaturing involving emoji ZWJ sequences, regional 
indicators becoming flags, and other pseudo-encoding.

Are there other exceptions?  There are probably things with CGJ which 
fall into the same category as #1, tweaking the interactions of adjacent 
characters/glyphs.  Is there really anything like the kind of formatting 
you're talking about that we have considered "plain text"?  Perhaps #3 
is closest.
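
To make the four categories above concrete, here is a minimal Python sketch 
that lists the code points in one sample string per category.  The sample 
strings, the labels, and the helper name describe() are my own choices for 
illustration, not anything defined by Unicode; the only library used is the 
standard unicodedata module.

```python
import unicodedata

# Hypothetical sample strings, one per category in the list above.
SAMPLES = {
    "1. ZWNJ between letters": "\u0645\u06CC\u200C\u062E\u0648\u0627\u0647\u0645",
    "2. BiDi mark (RLM)": "abc\u200F!",
    "3. text vs emoji presentation": "\u2764\uFE0E \u2764\uFE0F",
    "4a. emoji ZWJ sequence": "\U0001F468\u200D\U0001F469\u200D\U0001F467",
    "4b. regional indicator pair": "\U0001F1F8\U0001F1EA",
}


def describe(text):
    """Return a (code point, general category, name) tuple per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    for label, sample in SAMPLES.items():
        print(label)
        for code_point, category, name in describe(sample):
            print(f"    {code_point}  {category}  {name}")
```

In this listing the joiners and the BiDi mark come out with general category 
Cf (format), and the presentation selectors with Mn.  None of them carries 
anything like "bold" or "red"; they only adjust how neighbouring characters 
interact or which glyph variant is preferred.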

Mind you, I think improving and upgrading ECMA-48 is a dandy idea, and 
your suggestions for it are as good as any I've seen (which is faint 
praise because I haven't seen any, but even in my own opinion, your 
ideas are pretty good).  And using it in "text" files is a thing people 
have already been doing and will continue to do, though it is a bit of 
an abuse of the term "text file."  But I still don't really see what it 
has to do with Unicode.  What would you have Unicode do?  Define a whole 
set of "formatting commands" as part of the Unicode standard?
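
For anyone who hasn't looked at what ECMA-48 styling in a "text" file 
amounts to, here is a minimal Python sketch.  It uses only the long-standing 
SGR parameters 0 (reset), 1 (bold) and 3 (italic); the helper names sgr(), 
styled() and strip_csi() are made up for this example, and whether a given 
terminal or viewer honours the codes is outside the sketch.

```python
import re

ESC = "\x1b"       # ESCAPE control character
CSI = ESC + "["    # 7-bit form of Control Sequence Introducer

# CSI sequence: parameter bytes 0x30-0x3F, intermediate bytes 0x20-0x2F,
# then one final byte 0x40-0x7E.
CSI_SEQUENCE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def sgr(*params):
    """Build a Select Graphic Rendition sequence such as ESC [ 1 m."""
    return CSI + ";".join(str(p) for p in params) + "m"


def styled(text, *params):
    """Wrap text in the given SGR parameters, then reset with SGR 0."""
    return sgr(*params) + text + sgr(0)


def strip_csi(text):
    """Remove CSI control sequences, leaving the unstyled characters."""
    return CSI_SEQUENCE.sub("", text)


line = "This is " + styled("bold", 1) + " and " + styled("italic", 3) + "."

if __name__ == "__main__":
    print(repr(line))       # the controls are ordinary characters in the string
    print(strip_csi(line))  # what remains once the styling is removed
```

The styling here lives entirely in C0 control characters and ASCII 
parameters that already have code points, which is rather the point: the 
file needs nothing new from Unicode to carry it.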

I think your ideas are good and I'd support them (mostly); it's just that 
this isn't the place that decides such things.

~mark


