Re: “plain text styling”…

Kent Karlsson kent.b.karlsson at bahnhof.se
Thu Jan 12 18:59:28 CST 2023



> On 12 Jan 2023, at 19:23, Cristian Secară via Unicode <unicode at corp.unicode.org> wrote:
> 
> On Thu, 12 Jan 2023 17:57:39 +0100, Kent Karlsson via Unicode wrote:
> 
>>> Just because the ESC in GSM does not work the same way as the ESC
>>> in ECMA-48 does not mean it's not ESC.  
>> 
>> You can call it MAMA if you like (but that would also be confusing).
>> It still works just like SS2, not at all like ESC, not even close
>> (i.e. not even like the ESC of old equipments, like that Cristian
>> referred to).
> 
> Well, it is the 3GPP 23.038 specification [1] that calls it "ESC" (not me or anyone else here).

I know. (I should review the updates made during 2022; the last time I looked closely was in 2020…)

> As for the "not even like the ESC of old equipments" I am not sure how this is *not* similar:
> 
> ESC e  gives €
> ESC <  gives [
> ESC (  gives {
> ... and so on (not that many more, though)

There are other “national language” tables that are more fully populated. This is wildly different from ESC in “old equipment” as well as in ECMA-48.
There, ESC plus its following character(s) produces “controls”, whereas the sequences above “generate” graphic characters (which is the purpose of SS2 and SS3). (Yes, I did suggest, in the referenced paper, using a control sequence for character references… So I am violating the “rule” myself… But I don’t see another way of having character references in ECMA-48 style.)
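
To make the single-shift behaviour concrete, here is a minimal sketch in Python. It is not taken from 23.038: the names ESC, EXTENSION and decode are made up for illustration, the table holds only the three pairs quoted above (the real extension table has more entries), and the fallback for an unlisted character is my assumption about reasonable receiver behaviour. The point is only that the ESC affects exactly one following character and yields a graphic character, never a control function.

```python
ESC = "\x1b"  # the code position that 23.038 calls "ESC"

# Illustrative subset only: the three pairs quoted in this thread.
EXTENSION = {"e": "€", "<": "[", "(": "{"}


def decode(text):
    """Resolve ESC + one character the way a single shift would."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == ESC and i + 1 < len(text):
            nxt = text[i + 1]
            # Assumed fallback: with no extension entry, keep the
            # character from the default table unchanged.
            out.append(EXTENSION.get(nxt, nxt))
            i += 2  # the shift covers exactly one following character
        else:
            out.append(ch)  # a trailing lone ESC is passed through as-is
            i += 1
    return "".join(out)


encoded = "10" + ESC + "e"
print(decode(encoded), len(encoded), len(decode(encoded)))
```

Note that the encoded form is one unit longer than the displayed text for every shifted character, which is the low-level character count issue raised earlier in the thread.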

> While not mentioned anywhere in the specification, in terms of SS that should probably be SS1 (only with the ESC ESC sequence as SS2).

Note that there is no SS1 (nor any SS0). But we do have SS2 and SS3 (both invalid to use with Unicode of course).

/Kent K

> Anyway, this strictly GSM-specific discussion has become off topic by now; what I wanted to say initially was that *in certain cases* – even if not that many, as far as I can imagine – a ~plain text styling may mislead ordinary users when the physical (low-level) character count matters for something presumed to be strictly plain (as opposed to higher levels of text styling, where even a few dozen extra characters can go unnoticed, usually due to the nature of the target application).
> 
> Cristi
> 
> [1] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=745
> 
> -- 
> Cristian Secară
> https://www.secarica.ro
> 



