Ecma-48 proposed styling controls update updated & math expression representation proposal update

Kent Karlsson kent.b.karlsson at bahnhof.se
Sat Jan 6 18:46:47 CST 2024


> On 6 Jan 2024, at 14:49, William_J_G Overington via Unicode <unicode at corp.unicode.org> wrote:
> 
> 
> 
> It is often difficult to convey the tone of a post in an email, so I begin by saying that this is not in any way critical, that I know very little about this topic yet I am trying to learn, that I would be grateful if you regard these comments and questions as if an informal chat over cups of whatever in a common room somewhere someplace and of me trying to be helpful if I can.
> What exactly are you trying to achieve please? For example, as well as keeping readers of this mailing list informed, for which I thank you, are you trying to persuade a specific committee somewhere to change a specific existing standard?

Well, the ECMA-48 committee is surely disbanded, and trying to resurrect it would likely be futile. So my proposals will be freestanding proposals “hanging in the air” (or, rather, on GitHub). A bit unfortunate perhaps, but that’s how it is. But you are welcome to pick suggestions from them anyway… I’ve tried to follow several of the styling updates already present in implementations (while not covering “everything”, since that would be out of scope for the proposal).

However, it would be great if Unicode at least had better character properties for C0/C1 characters, rather than the completely wrong properties Unicode now has for them.
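
To see what properties are currently assigned, one can query a Unicode database directly. The following is only a sketch using Python's standard `unicodedata` module; the helper name `describe` and the choice of sample characters are mine, and the values printed depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata

def describe(cp: int) -> tuple[str, str, str]:
    """Return (code point label, General_Category, Bidi_Class) for one code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.category(ch), unicodedata.bidirectional(ch))

# A few C0 and C1 controls, labelled with their ECMA-48 abbreviations
# (the selection is illustrative, not exhaustive).
SAMPLES = {"ESC": 0x1B, "NBH": 0x83, "NEL": 0x85, "CSI": 0x9B}

for label, cp in SAMPLES.items():
    print(label, *describe(cp))
```

All C0/C1 characters share the single general category Cc, so whatever distinctions ECMA-48 makes between them are not visible at that level.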

> Sometime somewhere I was advised and I have added my own thoughts that the way to improve one's chances of getting something - whatever it is - done is to write a letter on no more than one side of A4 specifically starting with a request to do something specific or consider doing something specific, on the basis that a one side of A4 document has more chance than a longer letter of being read than being put on the side "for when I am not so busy" which in practice may never arrive, and to make it clear what you are wanting done, so that if the recipient of the letter is minded to be as helpful as possible to you then it is actually clear as to what you want done. I appreciate that with the letter there needs to be the detailed document and I also appreciate that this mailing list may not be to where you would send such a letter.
> 
> I started to have a look through your document and I noticed that you mention teletext. I was involved with teletext, mostly in the 1970s, yet I am still interested so could you say what you are suggesting please? In particular, are you suggesting a way to store in a file suitable for use in a Unicode context the teletext colour codes for both teletext alphanumerics and teletext graphics?

As I mention, Teletext is still in use, and there is a standard for it, implemented in every TV set. I do not know how TV companies store the text, but they likely use some proprietary representation, which is then converted to “raw” Teletext. The example I give is mostly there to show that the styling available in Teletext is covered.

> I am an end user of software programs and not a developer and my experience of programming is mostly in advising undergraduates on electrical and electronic engineering courses and on an information systems engineering course who were learning to write scientific programs, and I do not have detailed knowledge of the underlying systems software. As a result I am somewhat wary of having control codes other than the basic few used for carriage return and line feed as trying to use them in say, WordPad, can be problematic.
> 
> So I am wondering if it could be helpful to have a format as well where each of the control codes in what you are doing could be replaced on a round-trip-is-possible basis with plane 14 tag characters so as to produce a file format that could be suitable for a Unicode environment. I appreciate that it is possible that this suggestion might be unsuitable for some reason, but I mention it in case it might be useful.
> 
> Perhaps in a Unicode text system a good solution would be for Unicode/ISO/IEC 10646 to have some (not yet encoded) non-printing codes added in plane 14 that are treated as not control codes in most uses yet can be treated as control codes in specific situations. This would mean that a file containing them would not contain Unicode control codes so could be stored and shared as a text file, yet when applied to specific equipment or specific software packages could be treated as if containing control codes.

I’d strongly suggest that “tag characters” be deprecated. But that is a different topic, as is deprecating the property ‘default-ignorable’. But considering any character code as ‘non-printing’ when it is not interpreted is a bad idea, and that is sort of covered. (In Linux it is common to display uninterpreted characters as a “hex box”, and that is fine. Not exactly SUB/REPLACEMENT CHARACTER, but still fine.)

And the C0/C1 characters should not be regarded as magically different from other Unicode characters. Some of them have even been duplicated as non-C0/C1 characters, which I don’t think was all that good. Just look at LS and PS, which are basically unused. Everyone is still using LF/CRLF. And NBH even got duplicated twice.
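
As a small illustration of the LS/PS duplication, here is a sketch in Python (the variable names are mine, and this shows only how one language runtime treats these characters, not how text in the wild is actually written). LF is a C0 control, while LINE SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029) have their own general categories, yet all three act as line boundaries for `str.splitlines`.

```python
import unicodedata

LF, LS, PS = "\n", "\u2028", "\u2029"

# General categories: LF is a control (Cc); LS and PS are separators (Zl, Zp).
categories = [unicodedata.category(ch) for ch in (LF, LS, PS)]
print(categories)

# Python's str.splitlines treats all three as line boundaries.
text = f"one{LF}two{LS}three{PS}four"
lines = text.splitlines()
print(lines)
```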

/Kent K

> William Overington
> 
> Saturday 6 January 2024
> 
> 