Teletext separated mosaic graphics
Kent Karlsson
kent.b.karlsson at bahnhof.se
Tue Oct 13 14:05:05 CDT 2020
A final example (semi-faithful text, not an image):
<!DOCTYPE html>
<html lang="es" data-vsp="2.17.1" data-jsdomain="https://js2.rtve.es"><head prefix>
<title>Teletexto El Tiempo - 301 | RTVE.es</title>
<meta name="Description" content="Accede a la página 301 de El Tiempo del teletexto de RTVE donde encontrarás toda la información que necesitas aquí, en RTVE.es" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="viewport" content="width=device-width,initial-scale=1.0,maximum-scale=2.0,user-scalable=1" />
<link rel="stylesheet" href="https://css2.rtve.es/css/rtve.2017.rtve/teletexto/teletexto.desktp.css" type="text/css">
<link media="all" rel="stylesheet" href="https://css2.rtve.es/css/rtve.2015/rtve.compacts/portada.desktp.css" type="text/css">
</head>
<body class="el_tiempo " id="bodyElem" data-uidtm=""
data-app="/mod_pf_teletexto">
……
…...
<li class="list_item" data-ng-href="/television/teletexto/tiempo/302/1/"><a href="/television/teletexto/tiempo/302/1/" title="Predicción hoy/mañana y mapa "><span class="list_item_title">Predicción hoy/mañana y mapa </span></a><a href="/television/teletexto/tiempo/302/1/" title="Predicción hoy/mañana y mapa " class="list_item_link">302 a 304</a></li>
<li class="list_item" data-ng-href="/television/teletexto/tiempo/305/"><a href="/television/teletexto/tiempo/305/" title="El Tiempo en España por CC.AA"><span class="list_item_title">El Tiempo en España por CC.AA</span></a><a href="/television/teletexto/tiempo/305/" title="El Tiempo en España por CC.AA" class="list_item_link">305</a></li>
……
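
As a rough illustration of the linking seen in these snippets (three-digit page references wrapped in anchors), here is a minimal Python sketch. It is not the code of either site: the function name link_page_refs, the "/NNN" href pattern (borrowed from the SVT markup quoted below) and the 100-899 page range are assumptions made for this example only.

```python
import html
import re

# Assumption: a page reference is a standalone three-digit number in 100-899.
# The lookarounds keep longer digit runs such as "2020" from matching.
PAGE_REF = re.compile(r"(?<!\d)([1-8]\d{2})(?!\d)")


def link_page_refs(row: str) -> str:
    """Escape one row of Teletext text, then wrap page numbers in links."""
    # Escape first so that "<", ">" and "&" in the row cannot break the markup.
    escaped = html.escape(row, quote=False)
    # Hypothetical URL scheme: page NNN lives at "/NNN".
    return PAGE_REF.sub(r'<a href="/\1">\1</a>', escaped)


if __name__ == "__main__":
    print(link_page_refs(" Norge: Ryssland låg bakom cyberattack 136"))
```

A naive pattern like this links every standalone three-digit number, so a price or a temperature in the page text would become a link as well.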
> On 13 Oct 2020, at 19:31, Kent Karlsson <kent.b.karlsson at bahnhof.se> wrote:
>
> Just to give one example, an HTML(5) code snippet from one of those Teletext sites:
>
>
> <!DOCTYPE html>
> <html lang="sv" class="page-multiple">
> <head>
> <title>SVT Text TV</title>
> <meta content='width=device-width, initial-scale=1.0, maximum-scale=5.0' id='viewport' name='viewport' />
> ….
> ….
> <span class="C"> </span>
> <span class="Y"> Stortingsledamöters e-post hackades </span>
> <h1 class="Y DH"> Norge: Ryssland låg bakom cyberattack </h1>
> <span class="Y"> <a href="/136">136</a></span><span class="Y"> </span>
> <span class="Y"> </span>
> <span class="C"> Flydde från polisen - </span>
> <span class="C"> </span><span class="C"> dog efter balkongfall </span>
> <span class="C"> <a href="/118">118</a> </span>
> <span class="C"> </span>
> <h1 class="Y DH"> Ronaldo har testats positivt för covid</h1>
> <span class="Y"> Fixstjärnan missar matchen mot Sverige</span>
> <span class="Y"> <a href="/300">300</a></span><span class="Y"> </span>
> ….
>
>
>
> Note that three-digit numbers are (usually) converted into links to the referenced Teletext page (Teletext page numbers have three digits, starting at 100).
>
>
>
> /Kent K
>
>
>> On 13 Oct 2020, at 18:45, Asmus Freytag <asmusf at ix.netcom.com> wrote:
>>
>> How do existing websites represent teletext?
>> A./
>>
>>
>> On 10/13/2020 9:34 AM, Kent Karlsson via Unicode wrote:
>>>
>>>> On 13 Oct 2020, at 16:30, William_J_G Overington via Unicode <unicode at unicode.org> wrote:
>>>>
>>>> I am now thinking that the best solution for encoding the teletext control characters using just already existing Unicode characters is to use the Escape format listed in the PDF document linked from the post by Harriet Riddle.
>>>>
>>> That is out of the question for several reasons.
>>>
>>> 1) ECMA-48 specifies such escape sequences as aliases for the ECMA-48 C1 control codes (formally for 7-bit encodings, but in practice not limited to them). This suggestion is thus incompatible with ECMA-48. (And promoting anything incompatible with ECMA-48 is a bad idea, even though Unicode/10646 does not require compatibility with it.)
>>>
>>> 2) That ”solution” does not in any way remove the gross ill-designedness of the Teletext ”control” codes (most of them do three things in one go: colour change, code page change, display as SPACE or as ”kept” ”mosaic” character).
>>>
>>> 3) That ”solution” still cannot handle the ”object” format overrides in Teletext (more colours, bold, italics, underline, proportional font; also G3 character substitutions, but those fall under encoding conversion, not under styling). These overrides are a horrendous idea, the only excuse for which is compatibility with the original Teletext ”controls”, which are left untouched in ”advanced” Teletext. The ”object” overrides are in a control section of the Teletext protocol.
>>>
>>> /Kent Karlsson
>>>
>>>
>>>
>>>> https://www.itscj.ipsj.or.jp/iso-ir/056.pdf
>>>>
>>>>
>>>>
>>>> https://corp.unicode.org/pipermail/unicode/2020-October/009048.html
>>>>
>>>>
>>>> This appears to be what is used in the export format named viewdata from the editor that Kent Karlsson mentioned.
>>>>
>>>>
>>>> https://zxnet.co.uk/teletext/editor
>>>>
>>>>
>>>>
>>>> https://corp.unicode.org/pipermail/unicode/2020-October/009071.html
>>>>
>>>>
>>>> If one then uses a specially made OpenType font, one can arrange for each such two-character escape sequence to be displayed as one of the glyph designs that I mentioned in the following post, by using the OpenType liga feature.
>>>>
>>>>
>>>> https://corp.unicode.org/pipermail/unicode/2020-October/009047.html
>>>>
>>>>
>>>>
>>>>> For example, Alphanumerics Green would have a visible glyph of an A above a G on a pale.
>>>>>
>>>> This morning I tried making a test font with a visible glyph for the Escape character and a liga glyph substitution for Escape followed by capital A.
>>>>
>>>> I made the font using the High-Logic FontCreator program and tested it in the Serif Affinity Publisher program, producing a PDF document.
>>>>
>>>> I was hoping to be able to copy the substituted glyph from the PDF, paste it into WordPad, and recover the underlying two-character sequence. However, I could only get the capital A back. Maybe I did not get the technique quite right, so it might still be possible to recover the underlying sequence from a PDF, but that requires further investigation.
>>>>
>>>> William Overington
>>>>
>>>> Tuesday 13 October 2020
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>