My suggestions for Unicode based math expression format(s)

Kent Karlsson kent.b.karlsson at bahnhof.se
Tue Dec 27 17:58:28 CST 2022


Let me answer by forwarding my reply to William’s message from a few days ago.

((NLFs are not treated ideally, but that seems to be a common bug; so there are some spurious extra line breaks below)):


> Forwarded message:
> 
> From: Kent Karlsson <kent.b.karlsson at bahnhof.se>
> Subject: Re: My suggestions for Unicode based math expression format(s)
> Date: 13 December 2022 16:20:42 CET
> To: William_J_G Overington <wjgo_10009 at btinternet.com>
> 
> 
> 
> Sent from my iPhone
> 
>> On 13 Dec 2022, at 13:33, William_J_G Overington <wjgo_10009 at btinternet.com> wrote:
>> 
>>  
>> Hi
>> 
>> 
>> I have never used the various existing packages that you mention.
>> 
>> 
>> 
>> May I make three observations please?
>> 
>> 
>> 
>> 1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.
>> 
> There is. Two of them.
>> 
>> 2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.
>> 
> That is a question for someone other than me…
>> 
>> 3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of mainframe visual display unit terminals at the time.
>> 
>> 
>> 
>> Some of my suggestions might be relevant here.
>> 
>> 
>> 
>> I suggested using :pom: to express a plus or minus sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.
>> 
> I think that it would not be a good idea to request or expect that of any font. Any character escapes should be interpreted before any font is involved.
>> 
>> I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.
>> 
>> 
>> 
>> I
>> I
>> I
>> 
>> 
>> 
>> that then allows upper and lower limits to be expressed for definite integrals.
>> 
>> 
>> 
>> For example
>> 
>> 
>> 
>> I t=1
>> I exp(-t).dt
>> I t=0
>> 
>> 
>> 
>> Then summation could be expressed as follows.
>> 
>> 
>> 
>> S n=5
>> S n^2
>> S n=1
>> 
>> 
>> 
>> and product similarly using three P characters.
>> 
>> 
>> 
>> P
>> P
>> P
>> 
>> 
>> 
> nroff with eqn could output to a typewriter-like (fixed-width) terminal, which needed several daisy wheels, some of them with mathematical characters. The terminal was capable of partial line feeds up and down, but the daisy wheels had to be changed by hand. The output was a bit crude, but bore a fair resemblance to normal mathematical typesetting. (troff produced better output, for the typesetting machines of the day.) This was before TeX came along.
> 
> /Kent K
> 
> 
>> I hope this helps.
>> 
>> 
>> 
>> Best regards,
>> 
>> 
>> 
>> William Overington
>> 
>> 
>> 
>> Tuesday 13 December 2022
>> 
>> ------ Original Message ------
>> From: "Kent Karlsson via Unicode" <unicode at corp.unicode.org>
>> To: unicode at corp.unicode.org
>> Sent: Tuesday, 2022 Dec 13 At 11:36
>> Subject: My suggestions for Unicode based math expression format(s)
>> 
>> (Hoping that this goes through ok; I did have some problems with the sum sign when copying this text…)
>> 
>> 
>> 
>> I've devised a new format (or rather, several) for representing math expressions.
>> Why, you may wonder... Isn't MathML the answer to everything math? Well, not quite.
>> 
>>  
>> 
>> More than 20 years after the first version of MathML, it is still not a great
>> success. I think there are several reasons for that. One is the obvious: it is too
>> verbose. Another is that (largely due to the verbosity) one really needs authoring
>> tools to be able to write any math expression in the MathML representation. The
>> advantage of TeX math (or even old eqn) expressions is that users can with relative
>> ease type the expression they want on the keyboard. Ordinary cut-paste-modifyViaKeyboard
>> works. Authoring tools are less straightforward to use. Further, not everything is
>> HTML (or XML). One may even want to have math expressions in what is otherwise plain
>> text; for instance for cut and paste, losing styling per se (colour, bold/..., size)
>> but not the math expressions.
>> 
>>  
>> 
>> But what about typability, directly from the keyboard, without using a special authoring
>> tool? Are eqn or TeX the only options? Well, there are AsciiMath and UnicodeMath...
>> However, those do common parenthesis parsing that is undesirable, among other things.
>> And, apart from UnicodeMath, they were created long before Unicode, so they are not well
>> adapted to using Unicode characters.
>> 
>>  
>> 
>> OMML (Office Math ML, also XML based) is just as verbose as MathML, if not worse.
>> 
>>  
>> 
>> Using {} (a convention borrowed from TeX; and using \{ and \} for literal {}) and some
>> other special "mark-down" and character escape inspired notations, we can make a surface
>> form of a math expression representation (encoding if you like) that is typable on a Latin based
>> keyboard; except that ∑ and π in the example here may need some further escape notation,
>> like \sum, \pi, to be fully keyboard typable (similarly to TeX, eqn, UnicodeMath, etc.). Not-so-common
>> symbols will still need to be picked from some kind of menu, or use Unicode character escapes,
>> \uxxxx, \Uxxxxxx. Here is an example, using the same expression as is used as the lead example
>> in the MathML Core specification; it looks a little bit like TeX, intentionally, due to the selection
>> of {}^_ as meta-characters for certain math expression controls, but isn't TeX:
>> 
>>  
>> 
>> ${∑$/{n=1}$\{+∞}{1\/n^2}={π^2\/6}}
>> 
>>  
>> 
>> There is also an HTML/XML compatible form proposed, which is fully equivalent in expressivity
>> to the other forms/variants proposed. It is not MathML, but it uses XML tags,
>> so it is a bit longer than the above (read "me" as "math expression"):
>> 
>>  
>> 
>> <me>∑<blw/><me>n=1</me><abv/><me>+∞</me><me>1<dv/>n<rsp/>2</me>=<me>π<rsp/>2<dv/>6</me></me>
>> 
>>  
>> 
>> Or with some more whitespace/linebreaks:
>> 
>> <me>
>>                        ∑<blw/><me>n=1</me><abv/><me>+∞</me>   <me>1<dv/>n<rsp/>2</me>
>>                        =
>>                        <me>π<rsp/>2<dv/>6</me>
>> </me>
>> 
>>  
>> 
>> This shows that having math expressions in an XML compatible format does not have to be
>> heavy-footed. There are several key reasons for this relative light-footedness. The reasons
>> include using: default styles, short tag/attribute names (for the XML variant) and short
>> controls/markup for the other variants, and the use of a level of structural parsing,
>> uncommon for XML (but otherwise common, also for math, in e.g. eqn and TeX). Details are in
>> the spec referenced below.
>> 
>>  
>> 
>> It also shows that equivalent representations can be even more light-footed than the XML/HTML
>> variant, as well as the possibility of having variant surface representations that fit
>> at least some other contexts (than XML/HTML).
>> 
>>  
>> 
>> In addition, the representations (all variants) can still be general enough to allow RTL math
>> expressions in a reliable way (in particular, reliable direction of arrows, which in math expressions
>> almost always refer to the left and right side "arguments", not an external physical direction),
>> as well as chemical reaction formulas (math-like, not graphical) and the like. Re. arrows: see
>> https://www.unicode.org/L2/L2022/22026r-non-bidi-mirroring.pdf.
>> 
>>  
>> 
>> You can find the proposed format(s) specification at
>> https://github.com/kent-karlsson/control/blob/main/math-layout-controls-2022-C.pdf.
>> 
>>  
>> 
>> There is absolutely no claim that this covers everything w.r.t. math expressions;
>> very likely it does not. But it does cover more than I set out to cover. There is no attempt
>> to be compatible with MathML (sorry, but that would have killed the idea).
>> 
>>  
>> 
>> Comments are welcome.
>> 
>>  
>> 
>> Happy Lucia!*
>> 
>> /Kent Karlsson
>> 
>>  
>> 
>> * https://en.wikipedia.org/w/index.php?title=Saint_lucia%27s_day

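The forwarded message mentions named escapes such as \sum and \pi and the Unicode character escapes \uxxxx and \Uxxxxxx, and the reply to William argues that such escapes should be interpreted before any font is involved. A minimal Python sketch of a text-level pass of that kind follows. It assumes that each x stands for one hexadecimal digit; the NAMED table and the function name resolve_escapes are hypothetical, and the linked specification, not this sketch, defines the real escape set. Other backslash sequences such as \/ and \{ are deliberately left untouched.

```python
import re

# Hypothetical table of named escapes; only \sum and \pi appear in the message.
NAMED = {"sum": "\u2211", "pi": "\u03c0"}

# \uxxxx (4 hex digits), \Uxxxxxx (6 hex digits), or a backslash followed by letters.
ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{6})|\\([A-Za-z]+)")


def resolve_escapes(text: str) -> str:
    """Replace recognised escapes with the characters they denote."""

    def substitute(match: re.Match) -> str:
        hex4, hex6, name = match.groups()
        if hex4 or hex6:
            code_point = int(hex4 or hex6, 16)
            if code_point > 0x10FFFF:
                # Not a Unicode code point: keep the escape as typed.
                return match.group(0)
            return chr(code_point)
        # Unknown names are kept as typed rather than guessed at.
        return NAMED.get(name, match.group(0))

    return ESCAPE.sub(substitute, text)


# The example expression, typed with escapes only.
print(resolve_escapes(r"${\sum$/{n=1}$\{+\u221E}{1\/n^2}={\pi^2\/6}}"))
```

Run on the escaped spelling above, this should print the example expression with ∑, ∞ and π in place of the escapes, while the $/, $\{ and \/ controls pass through unchanged.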

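The XML-compatible form in the forwarded message can be read by any ordinary XML parser. The Python sketch below parses both the compact spelling and the spelling with extra whitespace and checks that they give the same nested structure. It assumes that whitespace between elements is insignificant (the message presents the two spellings as the same expression), and it does not interpret what the blw, abv, dv and rsp tags mean; the function name to_nested is hypothetical.

```python
import xml.etree.ElementTree as ET

COMPACT = (
    "<me>∑<blw/><me>n=1</me><abv/><me>+∞</me>"
    "<me>1<dv/>n<rsp/>2</me>=<me>π<rsp/>2<dv/>6</me></me>"
)

SPACED = """<me>
    ∑<blw/><me>n=1</me><abv/><me>+∞</me>   <me>1<dv/>n<rsp/>2</me>
    =
    <me>π<rsp/>2<dv/>6</me>
</me>"""


def to_nested(elem):
    """Return text runs and empty tags as strings, nested <me> groups as lists."""
    items = []
    if elem.text and elem.text.strip():
        items.append(elem.text.strip())
    for child in elem:
        if child.tag == "me":
            items.append(to_nested(child))
        else:
            items.append(f"<{child.tag}/>")
        # Text that follows a child element is stored in its .tail attribute.
        if child.tail and child.tail.strip():
            items.append(child.tail.strip())
    return items


compact_tree = to_nested(ET.fromstring(COMPACT))
spaced_tree = to_nested(ET.fromstring(SPACED))
print(compact_tree)
print(compact_tree == spaced_tree)
```

The nested lists only mirror the grouping of the me elements; a real implementation would follow the structural parsing rules in the specification.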
> On 27 Dec 2022, at 10:11, William_J_G Overington via Unicode <unicode at corp.unicode.org> wrote:
> 
> 
> Hi
> 
> I have never used the various existing packages that have been mentioned.
> 
> May I make three observations please?
> 
> 1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.
> 
> 2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.
> 
> 3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of mainframe visual display unit terminals at the time.
> 
> Some of my suggestions might be relevant here.
> 
> I suggested using :pom: to express a 'plus or minus' sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.
> 
> I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.
> 
> I
> I
> I
> 
> that then allows upper and lower limits to be expressed for definite integrals.
> 
> For example
> 
> I t=1
> I exp(-t).dt
> I t=0
> 
> Then summation could be expressed as follows.
> 
> S n=5
> S n^2
> S n=1
> 
> and product similarly using three P characters.
> 
> P
> P
> P
> 
> This system could be used to some extent immediately without any additional software being needed. An OpenType font could be used to substitute a 'plus or minus' sign for :pom: and for other symbols. Hopefully software could be written to substitute the three capital I letters with a single integral sign.
> 
> I hope this helps.
> 
> Best regards,
> 
> William Overington
> 
> Tuesday 27 December 2022

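William's 7-bit ASCII notation, :pom: for the plus-or-minus sign and three stacked I, S or P letters for integral, summation and product, could be handled at the text level, in line with the suggestion in the reply that escapes be interpreted before any font is involved. The Python sketch below is one possible reading, not an agreed format: the function names are hypothetical, only :pom: is taken from the message, and the order of upper limit, body and lower limit follows the examples in the message.

```python
import re

# Only :pom: comes from the message; other names would be added the same way.
NAMED_SYMBOLS = {"pom": "\u00b1"}  # PLUS-MINUS SIGN

# Stacked letters from the message and the operator each one stands for.
STACKED = {"I": "\u222b", "S": "\u2211", "P": "\u220f"}  # integral, sum, product


def expand_named(text):
    """Replace :name: with its symbol; unknown names are left as typed."""
    return re.sub(
        r":([a-z]+):",
        lambda m: NAMED_SYMBOLS.get(m.group(1), m.group(0)),
        text,
    )


def read_stacked(lines):
    """Read three lines that all start with the same stacking letter.

    Returns (symbol, upper, body, lower), or None if the lines do not
    form a stack in the sense of the examples in the message.
    """
    if len(lines) != 3:
        return None
    letters = {line[:1] for line in lines}
    if len(letters) != 1:
        return None
    letter = letters.pop()
    if letter not in STACKED:
        return None
    parts = []
    for line in lines:
        rest = line[1:]
        # The letter must stand alone or be followed by a space.
        if rest and not rest.startswith(" "):
            return None
        parts.append(rest.strip())
    upper, body, lower = parts
    return STACKED[letter], upper, body, lower


print(expand_named("x = (-b :pom: sqrt(b^2 - 4ac)) / 2a"))
print(read_stacked(["I t=1", "I exp(-t).dt", "I t=0"]))
```

Returning the parts as a tuple leaves open how a renderer would lay them out, which the message does not specify.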