<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> </head> <body><div class="auto-created-dir-div" dir="auto" style="unicode-bidi: embed;">Doug Ewell wrote:<div><p><br></p><p><span style="display: inline;">> (It had nothing to do with explicit selection of font styles or sizes via "quasi-control characters," whatever those are.)</span></p><p><span style="display: inline;"><br></span></p><p><span style="display: inline;">Actually, it was me who used the phrase "quasi-control character".</span></p><p><span style="display: inline;"><br></span></p><p><span style="display: inline;"><span style="">https://corp.unicode.org/pipermail/unicode/2021-September/009549.html</span><br></span></p><p><span style="display: inline;"><span style=""><br></span></span></p><p>I know hardly anything about CJK encoding; I am trying to learn.</p><p><br></p><p>A quasi-control character would be a character that is encoded as an ordinary text character and can be displayed using a glyph. However, a software system could also (or instead) treat it as a control character, if that is what the end user prefers and he or she has such a system available.</p><p><br></p><p>For example, there could be a quasi-control character whose displayable glyph is a capital A and a capital G arranged in pale, with the A above the G, all within a portrait-orientation rectangle, and whose meaning is "Alphanumerics Green". It could be used in a Unicode plain text representation of a teletext page (that is, a teletext page in English, French, German, etc.; I am not referring to a quasi-control character for CJK in this example). So in many uses the glyph would be displayed and would give the human reader an indication of the intended display. 
In a specialist software application, however, the quasi-control character could be acted upon: the subsequent text would be displayed in green, and a space would be displayed for the quasi-control character rather than its glyph.</p><p><br></p><p>So I am simply wondering whether a quasi-control character that indicates the difference in font style would solve the problem being discussed in the context of CJK, if a plain text solution is needed.</p><p><br></p><p>> <span style="display: inline !important;">If you really need language tagging, to choose a font or render punctuation or perform spell-checking or text-to-speech or some other process, then use language tagging.</span></p><p><br></p><p>But alas, U+E0001 has been deprecated.</p><p><br></p><p>> https://www.unicode.org/charts/PDF/UE0000.pdf</p><p><br></p><p>quote from that document</p><p><br></p><p>The use of tag characters to convey language tags is strongly&#x0A;discouraged.</p><p><br></p><p>Tag identifiers&#x0A;E0001  LANGUAGE TAG</p><p><br></p><p> • This character is deprecated, and its use is&#x0A;strongly discouraged.<br></p><p><br></p><p>end quote</p><p><br></p><p>Should U+E0001 LANGUAGE TAG become undeprecated?</p><p><br></p><p>>></p><table cellspacing="0" cellpadding="0"><tbody><tr><td>In analog era anyone can just write a new characters in ways they<br>desire and spread it around, and if the usage picked up then it would<br>become part of the language, but it's impossible to do the same<br>through Unicode.</td></tr></tbody></table><p><span style="display: inline !important;"><br>> Nor through any of the Chinese or Japanese national standards. 
This is a fact of life with standardized character sets in general, and has nothing to do with Han unification.</span><br></p><p><br></p><p>Well, in theory a system could be introduced that solves that problem, using a technique similar to the one that has been proposed for QID emoji, but as a separate system managed directly by Unicode Inc. Indeed, there could be more than one such system: one (or maybe more than one?) for CJK glyphs, another for Latin-style characters, and another for other systems. Registration would be more or less automatic and fairly prompt, with only mild moderation by Unicode Inc. Such systems would combine the freedoms of the Private Use Areas with some of the precision of regular Unicode encoding as regards interoperability. That could be a major step forward in the development and application of Unicode.</p><p><br></p><p>William Overington</p><p><br></p><p>Tuesday 7 September 2021</p><p><br></p><p><br></p></div></div> </body></html>