Teletext separated mosaic graphics
kent.b.karlsson at bahnhof.se
Mon Oct 12 17:38:38 CDT 2020
> On 11 Oct 2020, at 08:12, Asmus Freytag via Unicode <unicode at unicode.org> wrote:
> The simple solution would be to try to contact the people behind some of these websites and simply ask how they do it. That might provide useful answers to whether any encoding solution will be taken up by implementers.
As for the production of Teletext pages, my guess is that there will be no change in tooling or encoding for as long as Teletext pages are going to be produced (except for updates to use newly allocated Unicode characters). The production is, admittedly, waning, and may stop in a few years. At least for ”news” pages; subtitling may go on much longer.
What may be of more interest is archiving. Digital archives preserving ”old” data have faced, and still face, at least two issues. One is the physical media themselves: tape and diskette formats, their readers, and now also more ”modern” storage media become outdated, so keeping ”old” data requires transferring it to new storage media. The other is encoding: text has had to be converted from various ”old” encodings to ”modern” ones, nowadays usually a Unicode encoding.
I don’t know if anyone is actively trying to archive Teletext pages (in a manner that keeps them retrievable in the future). But if anyone is, they face a text encoding issue, not just for the ordinary text but also for the styling of the text. The storage formats (likely vendor specific) will likely become outdated; the ”broadcast formats/protocols” (that we are discussing) are already quaint and incompatible with modern computers. Saving the pages as HTML (including the linkage between pages) may be sufficient for quite some time.
It is possible to convert Teletext pages to use an ECMA-48-based format (using some extensions); that would make the pages directly displayable on terminal emulators (assuming that the terminal emulator implements the extensions…). Teletext pages do, after all, have a ”look” that is close to the ”look” of terminal emulators… Or be displayable in text editors that are ECMA-48 enabled… This would be closer in concept to Teletext for the styling controls than HTML is. I have a suggestion for extensions to ECMA-48 styling that covers Teletext styling capabilities. But those are just suggestions from me; a proof of concept, and not wide-spread implementations.
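As a rough illustration of the conversion idea, here is a minimal Python sketch that turns one Teletext row into text with standard ECMA-48 SGR colour sequences. It covers only the alphanumeric colour codes (0x00 to 0x07) and uses only the standard SGR foreground colours 30 to 37; it does not reproduce the suggested ECMA-48 extensions, which are not spelled out in this message. The function names are hypothetical, and treating codes 0x20 to 0x7E as ASCII is a simplifying assumption (real Teletext G0 sets have national variants).

```python
ESC = "\x1b"


def sgr(*params: int) -> str:
    """Build an ECMA-48 SGR (Select Graphic Rendition) control sequence."""
    return f"{ESC}[{';'.join(str(p) for p in params)}m"


def teletext_row_to_ecma48(row: bytes) -> str:
    """Convert one Teletext row to text with SGR colour controls (sketch).

    Assumptions: only alphanumeric colour codes are interpreted; all other
    control codes (and 0x7F) are shown as a plain space; 0x20..0x7E are
    treated as ASCII, ignoring national character subsets.
    """
    out = [sgr(37)]  # every Teletext row starts as white alphanumerics
    for code in row:
        code &= 0x7F  # drop the parity bit if it is still present
        if code <= 0x07:
            # Alphanumeric colour code: it occupies one cell, shown as a
            # space, and the new colour applies to the cells after it.
            # Teletext colour n (black..white) lines up with SGR 30 + n.
            out.append(" " + sgr(30 + code))
        elif code < 0x20 or code == 0x7F:
            out.append(" ")  # not handled in this sketch
        else:
            out.append(chr(code))
    out.append(sgr(0))  # reset styling at the end of the row
    return "".join(out)


if __name__ == "__main__":
    # 0x01 = alpha red, 0x02 = alpha green
    print(teletext_row_to_ecma48(b"\x01RED\x02GREEN"))
```

Mosaic graphics, double height, flashing and the other Teletext attributes are where extensions to ECMA-48 would come in; they are deliberately left out here.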
> On 10/10/2020 3:02 PM, Kent Karlsson via Unicode wrote:
>> Here are a few more web sites showing Teletext pages from various European TV channels.
>> THE LIST IS SURELY FAR FROM COMPLETE, it is just a sample. But it does show that Teletext
>> is commonly displayed as web pages, not just via TV channels (whether "analog" or DVB).
>> I haven't seen these combined with web versions of TV channels, but that would surely
>> be possible to combine. That would be especially useful for optional subtitling, where
>> Teletext is still much used, as a useful accessibility feature.
>> I have no prediction of how long any channels will continue to produce Teletext content.
>> But optional subtitling seems to "survive" longer.
>> I do not know what source format(s) may be used, but it is surely not HTML *nor* close
>> to the Teletext protocol. But see the Teletext page edit tool referenced below.
>> Spain, RTVE:
>> https://www.rtve.es/television/teletexto/100/
>> Sweden, SVT:
>> https://texttv.nu/ (also as iOS app, same name)
>> https://www.svt.se/svttext/web/pages/100.html
>> Iceland, RÚV:
>> http://textavarp.is/sida/100
>> Denmark, DR:
>> https://www.dr.dk/cgi-bin/fttx1.exe/100
>> Norway, NRK:
>> https://www.nrk.no/tekst-tv/100/
>> Finland, YLE:
>> https://yle.fi/aihe/tekstitv
>> Switzerland, SRF:
>> https://www.teletext.ch/
>> Croatia, HRT:
>> https://teletekst.hrt.hr/
>> https://www.greektvidents.com/Teletext_ERTEXT.shtml
>> And more; I have not done a complete survey!
>> There are also several apps for iOS and for Android that display Teletext content from
>> various (TV channel) providers.
>> What the source format is for the Teletext pages as produced today, I don't know. But I would
>> guess that it is likely "plain text" files, with Teletext specific markup, that is then converted to
>> 1) Teletext analog format, 2) Teletext DVB format, 3) HTML. But that is just my guess.
>> Again note that Teletext is still commonly used for optional subtitles. (DVB subtitles, a "bitmapped"
>> format (i.e. the subtitles are sent as images, not text), do not seem to be used much. At least, I
>> haven't seen it.) This requires timing, which is not part of the Teletext protocol, but must be in
>> the source in order to control when a subtitle is output as Teletext for optional display.
>> "But a teletext application for a modern computer is not "normal use." It is reasonable
>> for a non-standard application like this to interpret characters from U+0000 to U+001F
>> as the corresponding ISO 646 characters would be in teletext."
>> is very false.
>> Further, the "object" overrides in the Teletext *protocol*, in several levels, "objects"
>> prioritized depending on "implementation level", can specify:
>> 1) Bold, italic, underline, proportional font.
>> 2) More colours (but only 16 levels per red/green/blue, no transparency though).
>> 3) Character substitutions (likely replacing spaces) to be able to display characters from "G3".
>> These cannot be handled by "retaining" the ill-designed control codes of Teletext anyway.
>> Have an urge to edit your own Teletext pages? Here’s the web page for doing just that:
>> https://zxnet.co.uk/teletext/editor
>> You can save your page in a handful of formats (plus as image). I haven’t analyzed these formats,
>> but presumably they are storage formats actually used for ”real” Teletext pages that are converted
>> to be transmitted (”analog” (outdated) or DVB) or given as web pages (HTML, but no ”separated
>> mosaic” characters, since they are not yet allocated in Unicode; could use small images though...).
>> /Kent K