<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">This is getting too off-topic. But just two small remarks. (After this I will not comment more on SMS stuff in this thread.)<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On 12 Jan 2023, at 20:05, Harriet Riddle via Unicode &lt;<a href="mailto:unicode@corp.unicode.org" class="">unicode@corp.unicode.org</a>&gt; wrote:</div><div class=""><div class="">…</div></div></blockquote><br class=""><blockquote type="cite" class=""><div class=""><div class="">From an ECMA-35 perspective, it doesn't really matter if 0x1B in Teletext and GSM is (a) ESC with a different behaviour to that specified in ECMA-35 or (b) something other than ESC. Since ECMA-35 explicitly reserves 0x1B for ESC and forbids C0 sets from redefining it, and also defines the behaviour of ESC including the general structure of ESC sequences (which ECMA-48 conforms to), either is equally non-conformant. In the case of GSM, it is further non-conformant by encoding glyphs over the CL area, which is reserved for C0 controls.<br class=""></div></div></blockquote><div><br class=""></div>There is <b class="">no notion</b> of C0, G0, etc. in these 7-bit charsets. 
But the 7-bit charsets do have a ”secondary codepage” (by another name) and are prepared for having a ”tertiary codepage” (but that is not (yet) used).</div><div><br class=""><blockquote type="cite" class=""><div class=""><div class="">---<br class=""><br class=""><blockquote type="cite" class="">That’s what I said (though I said SMS and cell broadcast 7-bit charsets; GSM (2G) is somewhat outdated, we're (mostly) on 4G and 5G now).<br class=""></blockquote><br class=""><br class="">And yet, when I open my (Android 6.0) SMS app, with an active 4G connection, in the UK, and type a ' (ASCII apostrophe) character, it reports I have 159 characters remaining until it has to send a multi-part SMS. When I delete that character and type a ~ (tilde) instead, it reports only 158 characters remaining. When I delete that and type a ` (backtick), it reports only 69 characters remaining. And as one might have guessed, if I delete that and paste in a 𐐔, it reports 68 characters remaining.<br class=""><br class="">The amount of text that fits in 1120 bits under either GSM 7-bit (if within its repertoire) or UTF-16 (otherwise) is still a relevant metric, it seems.<br class=""></div></div></blockquote><div><br class=""></div><div>Backwards compatibility is of course a big issue here. If no new-fangled extension is used, everything should work as before, also for ”old” user equipment (usually mobile phones), both w.r.t. the charsets and w.r.t. the protocol itself. 
If something new-fangled is used, ”old” equipment may display ”mojibake”.</div><div><br class=""></div><div>And, if the text cannot be represented in one of the 7-bit charsets (there are now several), a switch to ”UCS-2” (3GPP still does not call it ”UTF-16BE”…) can be done (though the 3GPP standards do not require that; it is application-defined).</div><div><br class=""></div><div>/Kent K</div><br class=""><blockquote type="cite" class=""><div class=""><div class="">--Har.<br class=""></div></div></blockquote></div><br class=""></body></html>