“plain text styling”…

Cristian Secară liste at secarica.ro
Tue Jan 10 19:05:27 CST 2023


On Sun, 8 Jan 2023 15:15:21 +0100, Kent Karlsson via Unicode wrote:

> The point is that the ”protocol” is at plain text level. That is why
> ECMA-48 styling can work for applications like terminal emulators,
> where higher-level protocols, like HTML, are out of the question.

By human convention, yes. From an abstract technical perspective, whatever protocol and syntax are used, in the end it comes down to just an ON/.../OFF switch.

> The SMS (and cell broadcast) 7-bit character encodings (there is a
> handful of them) all have just four ”control codes”: CR, LF, FF, and
> SS2 (misnamed(!) as ESC). There is no ESC character nor any CSI
> character.

Actually, the GSM 7 bit default alphabet contains the CR, LF and ESC codes, placed at their "traditional" hex positions (i.e. 0x0D, 0x0A and 0x1B respectively). A single ESC is used to 'trigger' the extension of the GSM 7 bit default alphabet or a character from a national language single shift table. It is the extension of the GSM 7 bit default alphabet where a 0x1B 0x0A sequence generates 0x0C code (FF, i.e. Form Feed, aka Page Break) and where a 0x1B 0x1B sequence generates another 0x1B code (SS2, which is "reserved for the extension to another extension table").
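To make the ESC mechanism concrete, here is a minimal Python sketch of how a receiver might resolve ESC pairs in a stream of already unpacked septets. The function name, the tuple tags and the handling of a dangling ESC are my own assumptions for illustration; only a subset of the extension table is listed, and no national language shift tables are considered.

```python
ESC = 0x1B

# Subset of the GSM 7 bit default alphabet extension table (3GPP TS 23.038).
GSM_EXTENSION = {
    0x0A: "\f",      # Form Feed (page break)
    0x14: "^",
    0x28: "{",
    0x29: "}",
    0x2F: "\\",
    0x3C: "[",
    0x3D: "~",
    0x3E: "]",
    0x40: "|",
    0x65: "\u20ac",  # euro sign
}

def resolve_escapes(septets):
    """Yield ('base', code), ('ext', char) or ('ss2', None) per logical unit.

    'base' codes still need a lookup in the default alphabet table,
    which is not reproduced here.
    """
    it = iter(septets)
    for code in it:
        if code != ESC:
            yield ("base", code)
            continue
        nxt = next(it, None)
        if nxt is None:
            # Assumption: a dangling ESC at the end is passed through as-is.
            yield ("base", ESC)
        elif nxt == ESC:
            # 0x1B 0x1B: reserved for the extension to another extension table.
            yield ("ss2", None)
        elif nxt in GSM_EXTENSION:
            yield ("ext", GSM_EXTENSION[nxt])
        else:
            # Unknown extension code: fall back to the default alphabet entry.
            yield ("base", nxt)

units = list(resolve_escapes([0x0D, ESC, 0x0A, ESC, 0x65, ESC, ESC]))
```

Here the 0x1B 0x0A pair comes out as a single Form Feed unit, while CR stays an ordinary base-table code.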

> So SMS and cell broadcast messages are out of scope for that simple
> reason.

This is probably useless and out of the question in 2023 for practical reasons, but, in theory, future revisions of the 3GPP TS 23.038 standard could place whatever characters might be needed in those reserved-for-future-expansion positions.

*
Back on topic: funny how quickly the not-so-distant past is forgotten. In the late 1980s and early 1990s I used a lot of "plain text styling", extensively and with great success, on at least two impact printers (one being a Citizen 120D+, which I still have today). In direct print mode (as opposed to graphics mode), there were a lot of font style modifiers for the printed result (well, a lot for that time), triggered with ESC or CTRL sequences.

Examples:
ESC E / ESC F > sets / cancels emphasized print
ESC G / ESC H > sets / cancels doublestrike print
ESC 4 / ESC 5 > sets / cancels italic character (Epson only)
CTRL-O / CTRL-R > sets / cancels compressed print
ESC k 0 > sets Courier character pitch
ESC k 1 > sets Citizen Display character pitch
etc.
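As a rough illustration of the list above, this is how such a styled line could be assembled as raw bytes in Python before being sent to the printer. The STYLES table and the styled() helper are hypothetical names of mine; the byte values are just the codes listed above (CTRL-O is 0x0F, CTRL-R is 0x12), and actual support varies by printer model.

```python
ESC = b"\x1b"

# (on, off) byte sequences taken from the list of examples above.
STYLES = {
    "emphasized":   (ESC + b"E", ESC + b"F"),
    "doublestrike": (ESC + b"G", ESC + b"H"),
    "italic":       (ESC + b"4", ESC + b"5"),   # Epson only
    "compressed":   (b"\x0f", b"\x12"),         # CTRL-O / CTRL-R
}

def styled(text, style, encoding="ascii"):
    """Wrap text in the ON/OFF codes of the given style (hypothetical helper)."""
    on, off = STYLES[style]
    return on + text.encode(encoding) + off

# One printable line mixing plain and styled runs, ending with CR LF.
line = (b"Plain, " + styled("emphasized", "emphasized")
        + b" and " + styled("compressed", "compressed") + b"\r\n")
```

The resulting bytes would then be written unchanged to the printer port; nothing in them is anything other than text plus ON/OFF switches.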

Then, in the word processor I used at the time, these codes were allocated to visual control letters or symbols specific to that word processor and ready to be inserted, where required, during text editing.

This is what code-controlled printing looked like in 8-bit computing (Z80-based):
https://www.secarica.ro/misc/text_print_style_via_ctrl_codes_-_tw_cpc.png
https://www.secarica.ro/misc/text_print_style_via_ctrl_codes_-_tw_zxs.png

Even if such a text was no longer "plain", for me it was just "text", with no particular type designation and no desire to give it one. In today's text editors, a text containing such escape codes will display some random garbage in those places, but the codes can easily be removed (or even converted to whatever modern-day styling syntax) with a Python script or something similar.
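A minimal sketch of such a script, assuming the text has already been read into a Python string and contains only the printer codes from my examples. The Markdown-like output markers are an arbitrary choice for illustration, and I accept both the byte values 0/1 and the ASCII digits after ESC k, since I am not sure which form a given file would contain.

```python
import re

# Matches the codes from the examples: ESC k n, ESC E/F/G/H/4/5, CTRL-O, CTRL-R.
CODES = re.compile(r"\x1bk[01\x00\x01]|\x1b[EFGH45]|[\x0f\x12]")

# Hypothetical conversion table; codes not listed here are simply dropped.
MARKUP = {
    "\x1bE": "**", "\x1bF": "**",   # emphasized
    "\x1bG": "**", "\x1bH": "**",   # doublestrike
    "\x1b4": "*",  "\x1b5": "*",    # italic
}

def strip_codes(text):
    """Remove all known printer codes, leaving only the text."""
    return CODES.sub("", text)

def to_markup(text):
    """Replace known printer codes with Markdown-like markers."""
    return CODES.sub(lambda m: MARKUP.get(m.group(0), ""), text)

sample = "a \x1bEb\x1bF \x1b4c\x1b5 \x1bk1d"
plain = strip_codes(sample)
marked = to_markup(sample)
```

Codes specific to other printers or word processors would need their own entries in the pattern and the table.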

Cristi

-- 
Cristian Secară
https://www.secarica.ro


