The control codes of the 1976 teletext specification are a brilliant solution, given the boundary condition

Thu Jan 13 17:25:13 CST 2022

> On 12 Jan 2022, at 17:53, William_J_G Overington via Unicode <unicode at corp.unicode.org> wrote:
> 
> In the document
> 
> https://www.unicode.org/L2/L2022/22013-c0-c1-stability.pdf
> 
> Kent Karlsson writes:
> 
>> But there are some character encodings that are “a bit crazy” when it comes to control codes. They override all of C0 (or C1) with something that cannot even be regarded as pure control characters. In particular Teletext ...
> 
> In my opinion, the control codes of the 1976 teletext specification are a brilliant solution, given the boundary condition that existed at the time.

Key: 1976. Back then, “all there was” was 7-bit code pages. (7-bit, not 8-bit; the 8th bit was then commonly used for parity, as it still is in the Teletext protocol.)
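
As a rough illustration of the 7-bit-plus-parity layout, here is a minimal Python sketch. It assumes odd parity with the parity bit in the most significant position (my understanding of how Teletext character bytes are sent, not something stated in the quoted text), and the function names are hypothetical.

```python
def add_odd_parity(code7: int) -> int:
    """Return the 8-bit byte for a 7-bit code.

    Assumption: odd parity, with the parity bit stored in bit 7.
    """
    if not 0 <= code7 <= 0x7F:
        raise ValueError("expected a 7-bit value")
    ones = bin(code7).count("1")
    # Set the parity bit only when the 7 data bits hold an even number of 1s,
    # so that the full byte always holds an odd number of 1s.
    parity = 0 if ones % 2 == 1 else 1
    return code7 | (parity << 7)


def strip_odd_parity(byte8: int) -> int:
    """Check odd parity and return the 7-bit payload; raise on a parity error."""
    if bin(byte8 & 0xFF).count("1") % 2 != 1:
        raise ValueError("parity error")
    return byte8 & 0x7F
```

With the 8th bit spent this way, only 128 code positions remain for both graphic characters and controls, which is the constraint the 1976 design had to work within.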

> In order to work, a teletext-equipped television set needed enough solid state memory, to which data could be written and from which data could be read, to store a whole teletext page.

Again, 1976. And that design has been carried forward, with backwards compatibility, until now. However, that is far from making that solution appropriate today, especially not in a Unicode context.

> At the time such solid state memory was expensive, so using two kilobytes of memory rather than one kilobyte of memory would have added significant cost to each teletext-equipped television set.

Sure. 1976.
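
To make the one-versus-two-kilobyte budget concrete, here is a small back-of-the-envelope sketch in Python. The page geometry (40 columns by 24 rows, one byte per cell) is an assumption on my part and is not given in the quoted text; the names are hypothetical.

```python
COLUMNS = 40   # assumed: character cells per row
ROWS = 24      # assumed: rows per page
KIB = 1024     # one kilobyte of page memory


def page_bytes(planes: int) -> int:
    """Bytes needed for one page with `planes` bytes stored per cell."""
    return COLUMNS * ROWS * planes


# Controls stored inline, in the same cells as the text: one byte per cell.
inline = page_bytes(1)
# Hypothetical alternative: a separate attribute byte alongside every cell.
separate = page_bytes(2)

print(inline, separate, inline <= KIB < separate)
```

Under those assumptions an inline design fits in one kilobyte, while a separate attribute plane would not.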

> So the decision was made to design the specification such that one kilobyte of solid state memory would be sufficient to store a complete teletext page.
> 
> I was told that originally the BBC (British Broadcasting Corporation) and the IBA (Independent Broadcasting Authority) had each developed a prototype system of a text-based information system of its own and that the best features of each were included in the agreed common teletext technical specification.

I don’t know the historical details, but one precursor to Teletext was apparently “Videotex” (no “t” at the end…).

> In an era where personal computing was only starting and computers were mostly in businesses, universities and polytechnics, and mostly in monochrome and just text-based, colourful teletext with its graphics was very futuristic and often on view displaying a multipage (a teletext page on a fixed page number yet such that there were a number of different page displays broadcast in sequence, changing, say, every thirty seconds) in the window display of a shop.

That’s fine (1976).

However, that is far from making that solution appropriate today, especially not in a Unicode context. I would agree that not everything needs to be turned into HTML, however popular it is (HTML/CSS is great, but needlessly heavy-handed for, say, archiving Teletext pages). Though ECMA-48 is also quite old, it still has solutions for colouring and other styling that 1) are compatible with Unicode, 2) are quite popular (to an extent) in all terminal emulators (though ECMA-48 is not limited to terminal emulators), 3) can handle “later” extensions to Teletext w.r.t. colours and styling (though certain extensions to ECMA-48 would be needed for converting certain Teletext features, esp. subtitling), and 4) are viable outside of the Teletext protocol as well (the Teletext triple-function “controls” have no business outside of the Teletext protocol).
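
As a hedged sketch of what such a conversion could look like, the Python below turns one Teletext row into text with ECMA-48 SGR colour sequences. It is a deliberate simplification and not a complete converter: it assumes that codes 0x01 to 0x07 are the alphanumeric colour controls (red through white), that each control occupies a cell shown as a space and affects the cells after it, and that every row starts as white on black. It treats 0x20 to 0x7E as ASCII, which is only an approximation of the Teletext character sets, and it blanks all other controls (mosaics, flash, double height and so on). The function names are hypothetical.

```python
ESC = "\x1b"


def sgr(*params: int) -> str:
    """Build an ECMA-48 SGR (Select Graphic Rendition) control sequence."""
    return f"{ESC}[{';'.join(str(p) for p in params)}m"


def teletext_row_to_ecma48(row: bytes) -> str:
    """Convert one row of Teletext bytes to text with SGR colour sequences."""
    out = [sgr(37, 40)]  # assumed row default: white foreground, black background
    for b in row:
        code = b & 0x7F  # drop the parity bit
        if 0x01 <= code <= 0x07:
            # The control takes up a cell (shown as a space); the new
            # foreground colour applies to the cells that follow it.
            out.append(" ")
            out.append(sgr(30 + code))
        elif 0x20 <= code <= 0x7E:
            out.append(chr(code))  # approximation: treated as ASCII
        else:
            out.append(" ")  # other controls are not handled in this sketch
    out.append(sgr(0))  # reset styling at the end of the row
    return "".join(out)


# Example: a red "Hi" followed by a green "!".
print(repr(teletext_row_to_ecma48(bytes([0x01, 0x48, 0x69, 0x02, 0x21]))))
```

The point of the sketch is that the styling ends up in SGR sequences that take no cell of their own, while the text itself is ordinary characters, so nothing Teletext-specific needs to live in C0/C1.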

Getting back to what I wrote in https://www.unicode.org/L2/L2022/22013-c0-c1-stability.pdf: It is super-highly inappropriate to treat C0/C1 in Unicode as a private-use area, which some have proposed.

/Kent Karlsson

> William Overington
> 
> Wednesday 12 January 2022
