Teletext separated mosaic graphics

Kent Karlsson kent.b.karlsson at bahnhof.se
Sat Oct 3 19:25:30 CDT 2020



> On 3 Oct 2020, at 20:54, Doug Ewell via Unicode <unicode at unicode.org> wrote:
> 
> Harriet Riddle wrote:
> 
>> It's worth pointing out that the control codes for showing mosaic
>> characters as separated are also used in at least some formats to
>> switch alphabetical characters to underlined display.
>> 
>> See for example the definitions for SPL and STL here:
>> https://www.itscj.ipsj.or.jp/iso-ir/056.pdf (that document details the
>> C1 control codes for Data Syntax 2 Serial Videotex—which would seem to
>> be the Teletext set but as a C1 set, and as such with CSI rather than
>> ESC).
> 
> Applications of any sort that are compliant with ISO/IEC 6429 (ECMA-48, ANSI X3.64) should understand ESC [ as a synonym for CSI.

Teletext is not compliant with ECMA-48 (unless converted).

>> Essentially, the expectation seems to be that an emphasised variant of
>> a font would display mosaic characters separated, while a regular
>> variant of a font would display them connected.
> 
> We still haven't written the Technical Note for using the Legacy Symbols -- that's largely on me -- but as far as teletext is concerned, the recommended practice is to translate teletext control codes directly onto the Basic Latin space. For example:
> 
> - "contiguous graphics" becomes U+0019
> - "separated graphics" becomes U+001A
> - "double height" becomes U+000D
> - "end box" becomes U+000A

That would be an extremely bad idea (as well as being completely non-compliant with ECMA-48, if that is still the approach, as I think it should be).

> There is no conflict with the normal meanings of U+000D and U+000A because teletext does not use these to separate lines.

I don't know how Teletext is represented in DVB or IP-TV; but those digital representations of TV images do not use the traditional "analog" representation of TV images, and hence cannot have the "analog" representation of "rows" (lines) of text in Teletext. (And yes, Teletext does work fine with IP-TV.)

Note also that Teletext is rife with "code page switching". ESC toggles between a primary and a secondary charset (for text). In a control part of the Teletext protocol one sets the charsets for text (options include various "national variants" of ISO/IEC 646, as well as Greek, Hebrew and Arabic (visual order, preshaped)).

Toggling between separated and contiguous "mosaics" is also best seen as a switch between charsets. Regarding it as styling is odd, since this particular styling would only apply to a few very rarely used characters, and the change is not one that is recognized as styling elsewhere. In addition, other, similar "mosaic" characters in separated and contiguous forms have already been encoded as separate characters.
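
To make the "charset switch" reading concrete, here is a minimal Python sketch. It is an illustration only, not taken from the Teletext specification or from any Unicode document. It assumes the usual Level 1 layout, where mosaic codes occupy 0x20-0x3F and 0x60-0x7F and the six cells are carried in bits 0-4 and bit 6, and it maps the contiguous set onto the block sextants at U+1FB00..U+1FB3B. The function names and the "contiguous"/"separated" tags are hypothetical; the separated set is only tagged here, not mapped to code points.

```python
def mosaic_to_unicode(code: int) -> str:
    """Map a 7-bit Teletext mosaic code to a contiguous Unicode character."""
    if not (0x20 <= code <= 0x3F or 0x60 <= code <= 0x7F):
        raise ValueError("not a mosaic code: 0x%02X" % code)
    # Bits 0-4 carry five of the cells; bit 6 carries the sixth (bottom right).
    pattern = (code & 0x1F) | ((code & 0x40) >> 1)
    # Four patterns coincide with older characters and are absent
    # from the sextant range: empty, left half, right half, full block.
    special = {0: " ", 21: "\u258C", 42: "\u2590", 63: "\u2588"}
    if pattern in special:
        return special[pattern]
    # The sextant range is ordered by pattern value, skipping 21 and 42.
    offset = pattern - 1 - (pattern > 21) - (pattern > 42)
    return chr(0x1FB00 + offset)


def select_mosaic(code: int, separated: bool):
    """Treat contiguous/separated as a choice of charset (hypothetical tags)."""
    charset = "separated" if separated else "contiguous"
    return charset, mosaic_to_unicode(code)
```

With this view, `select_mosaic(0x21, True)` and `select_mosaic(0x21, False)` carry the same cell pattern but name different charsets, much as ESC selects between the primary and secondary text charsets.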

Even the colour controls in Teletext switch between text and mosaics (and in addition are usually displayed as a space, as is the norm in Teletext for "control" characters).
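
The following Python sketch shows how such inline controls behave as state switches within one row. It is a simplification under stated assumptions, not a conforming decoder: it uses the Level 1 code values (0x01-0x07 alphanumeric colours, 0x11-0x17 mosaic colours, 0x19 contiguous, 0x1A separated), and it ignores hold graphics, sizing, flash, boxing, backgrounds, and the exact cell at which each control takes effect. The `Cell` type and `decode_row` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

COLOURS = ["black", "red", "green", "yellow", "blue", "magenta", "cyan", "white"]


@dataclass
class Cell:
    code: int      # 7-bit code to display (0x20 where a control code sat)
    colour: str    # foreground colour in effect
    charset: str   # "alpha", "contiguous" or "separated"


def decode_row(row: bytes) -> List[Cell]:
    # Assumed start-of-row defaults: white text, contiguous mosaics.
    colour, mosaic, separated = "white", False, False
    cells = []
    for byte in row:
        code = byte & 0x7F  # drop the parity bit
        if code < 0x20:
            if 0x01 <= code <= 0x07:      # alphanumeric colour: back to text
                colour, mosaic = COLOURS[code], False
            elif 0x11 <= code <= 0x17:    # mosaic colour: switch to mosaics
                colour, mosaic = COLOURS[code - 0x10], True
            elif code == 0x19:
                separated = False
            elif code == 0x1A:
                separated = True
            # Every control code still occupies a cell, shown as a space.
            cells.append(Cell(0x20, colour, "alpha"))
        else:
            # Codes 0x40-0x5F stay alphabetic even in mosaic mode.
            is_mosaic = mosaic and not (0x40 <= code <= 0x5F)
            if is_mosaic:
                charset = "separated" if separated else "contiguous"
            else:
                charset = "alpha"
            cells.append(Cell(code, colour, charset))
    return cells
```

For example, decoding `bytes([0x12, 0x1A, 0x21, 0x19, 0x21])` yields three space cells for the controls, with the first 0x21 tagged "separated" and the second tagged "contiguous".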

Part of the Teletext protocol specifies how to set/unset bold/italic/underline. That is not inline in the text, however; it is "out-of-line", elsewhere in the protocol (in a control part). Colouring, certain sizing, blink, conceal, and "boxing" (used for (optional) subtitling and news flash messages) are inline. Note that Teletext is still often used for subtitling.

Most of Teletext styling can be converted to ECMA-48 styling as is. Some others will need an extension of ECMA-48 to be representable in that framework.
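
As a rough illustration of the "as is" part, the Python sketch below maps a few Level 1 inline attribute codes onto standard ECMA-48 SGR parameters: the seven alphanumeric colours (0x01-0x07) onto SGR 31-37, flash (0x08) onto SGR 5, steady (0x09) onto SGR 25, and conceal (0x18) onto SGR 8. The function name is hypothetical, and the sketch ignores differences in scope (for instance, where a Teletext attribute ends versus when an SGR attribute is reset). It returns None for every other code without claiming which of those would need an ECMA-48 extension.

```python
from typing import Optional

CSI = "\x1b["  # 7-bit form of CSI, i.e. ESC [


def teletext_attr_to_sgr(code: int) -> Optional[str]:
    """Return an SGR control sequence for a Teletext attribute code, or None."""
    if 0x01 <= code <= 0x07:
        # Red, green, yellow, blue, magenta, cyan, white: the order is the
        # same in Teletext and in SGR 31-37.
        return "%s%dm" % (CSI, 30 + code)
    if code == 0x08:
        return CSI + "5m"    # flash -> slowly blinking
    if code == 0x09:
        return CSI + "25m"   # steady -> not blinking
    if code == 0x18:
        return CSI + "8m"    # conceal -> concealed characters
    return None              # not mapped in this sketch
```

A converter built on this would still have to emit a space for each control position to preserve the column layout, since Teletext controls occupy a cell and SGR sequences do not.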

These days Teletext is often displayed on things that are not analog (or even digital) TVs; you can find web pages displaying Teletext texts, as well as mobile phone (or tablet) apps that display Teletext texts. They need not all convert those pages to an image before displaying them in the web page/app… (Though one may want to have a partial conversion to HTML rather than to ECMA-48; but HTML would not handle "box" (at all) nor blink (since that is deprecated in HTML), …)

/Kent K


> In general, a teletext application should treat control codes the way teletext would treat them, and should not try to mix C0 and teletext interpretations. This also means Rob's scenario:
> 
>> It also means a simple text-only file of just the characters won't
>> recreate a screen as the control codes to switch between contiguous/
>> separated won't work.
> 
> may not be well-conceived; the file should probably not be "text" in the sense that its lines end with some combination of CR and/or LF, unless there is an intermediate translation step.
> 
> --
> Doug Ewell, CC, ALB | Thornton, CO, US | ewellic.org
> 
> 
> 
