# My suggestions for Unicode based math expression format(s)

Kent Karlsson kent.b.karlsson at bahnhof.se
Tue Dec 27 17:58:28 CST 2022

Let me answer by forwarding my reply to William’s message from a few days ago.

((NLFs are not treated ideally, but that seems to be a common bug; so there are some spurious extra line breaks below)):

> Forwarded message:
>
> From: Kent Karlsson <kent.b.karlsson at bahnhof.se>
> Subject: Re: My suggestions for Unicode based math expression format(s)
> Date: 13 December 2022 16:20:42 CET
> To: William_J_G Overington <wjgo_10009 at btinternet.com>
>
>
>
> Sent from my iPhone
>
>> On 13 Dec 2022, at 13:33, William_J_G Overington <wjgo_10009 at btinternet.com> wrote:
>>
>>
>> Hi
>>
>>
>> I have never used the various existing packages that you mention.
>>
>>
>>
>> May I make three observations please?
>>
>>
>>
>> 1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.
>>
> There is. Two of them.
>>
>> 2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.
>>
> That is a question for someone other than me…
>>
>> 3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of the mainframe visual display unit terminals of the time.
>>
>>
>>
>> Some of my suggestions might be relevant here.
>>
>>
>>
>> I suggested using :pom: to express a plus or minus sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.
>>
> I think that it would not be a good idea to request or expect that of any font. Any character escapes should be interpreted before any font is involved.
>>
>> I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.
>>
>>
>>
>> I
>>
>> I
>>
>> I
>>
>>
>>
>> that then allows upper and lower limits to be expressed for definite integrals.
>>
>>
>>
>> For example
>>
>>
>>
>> I t=1
>>
>> I exp(-t).dt
>>
>> I t=0
>>
>>
>>
>> Then summation could be expressed as follows.
>>
>>
>>
>> S n=5
>>
>> S n^2
>>
>> S n=1
>>
>>
>>
>> and product similarly using three P characters.
>>
>>
>>
>> P
>>
>> P
>>
>> P
>>
>>
>>
> nroff with eqn did output to a typewriter-like (fixed-width) terminal, which needed multiple daisy wheels, some of them with mathematical characters. This terminal was capable of partial line up/down movement. But changing daisy wheels had to be done by hand. The output was a bit crude, but did have a fair semblance of normal mathematical typesetting. (troff produced better output, for typesetting machines of the day.) This was before TeX came along.
>
> /Kent K
>
>
>> I hope this helps.
>>
>>
>>
>> Best regards,
>>
>>
>>
>> William Overington
>>
>>
>>
>> Tuesday 13 December 2022
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> ------ Original Message ------
>> From: "Kent Karlsson via Unicode" <unicode at corp.unicode.org>
>> To: unicode at corp.unicode.org
>> Sent: Tuesday, 2022 Dec 13 At 11:36
>> Subject: My suggestions for Unicode based math expression format(s)
>>
>> (Hoping that this goes through ok; I did have some problems with the sum sign when copying this text…)
>>
>>
>>
>> I've devised a (or rather, several) new format(s) for representing math expressions.
>>
>> Why, you may wonder... Isn't MathML the answer to everything math? Well, not quite.
>>
>>
>>
>> More than 20 years after the first version of MathML, it is still not a great
>>
>> success. I think there are several reasons for that. One is the obvious: it is too
>>
>> verbose. Another is (much due to the verbosity) that one really needs authoring
>>
>> tools to be able to write any math expression in the MathML representation. The
>>
>> advantage of TeX math (or even old eqn) expressions is that users can with relative
>>
>> ease type the expression they want on the keyboard. Ordinary cut-paste-modifyViaKeyboard
>>
>> works. Authoring tools are less straightforward to use. Further, not everything is
>>
>> HTML (or XML). One may even want to have math expressions in what is otherwise plain
>>
>> text; for instance for cut and paste, losing styling per se (colour, bold/..., size)
>>
>> but not the math expressions.
>>
>>
>>
>> But what about typability, directly from the keyboard, without using a special authoring
>>
>> tool? Are eqn or TeX the only options? Well, there is AsciiMath and UnicodeMath...
>>
>> However, those do common parenthesis parsing that is undesirable, among other things.
>>
>> And, apart from UnicodeMath, they were created long before Unicode, so they are not well
>>
>> adapted to using Unicode characters.
>>
>>
>>
>> OMML (Office Math ML, also XML based) is just as verbose as MathML, if not worse.
>>
>>
>>
>> Using {} (a convention borrowed from TeX; and using \{ and \} for literal {}) and some
>>
>> other special "mark-down" and character escape inspired notations, we can make a surface
>>
>> form of a math expression representation (encoding if you like) that is typable on a Latin based
>>
>> keyboard; except that ∑ and π in the example here may need some further escape notation,
>>
>> like \sum, \pi, to be fully keyboard typable (similarly to TeX, eqn, UnicodeMath, etc.). Not-so-common
>>
>> symbols will still need to be picked from some kind of menu, or use Unicode character escapes,
>>
>> \uxxxx, \Uxxxxxx. Here is an example, using the same expression as is used as the lead example
>>
>> in the MathML Core specification; it looks a little bit like TeX, intentionally, due to the selection
>>
>> of {}^_ as meta-characters for certain math expression controls, but isn't TeX:
>>
>>
>>
>> ${∑$/{n=1}\$\{+∞}{1\/n^2}={π^2\/6}}
>>
>>
>>
>> There is also an HTML/XML compatible form proposed, that is fully equivalent in expressivity
>>
>> with the other forms/variants proposed. Though it is not MathML, it does use XML tags,
>>
>> so it is a bit longer than the above (read "me" as "math expression"):
>>
>>
>>
>> <me>∑<blw/><me>n=1</me><abv/><me>+∞</me><me>1<dv/>n<rsp/>2</me>=<me>π<rsp/>2<dv/>6</me></me>
>>
>>
>>
>> Or with some more whitespace/linebreaks:
>>
>> <me>
>>
>>                        ∑<blw/><me>n=1</me><abv/><me>+∞</me>   <me>1<dv/>n<rsp/>2</me>
>>
>>                        =
>>
>>                        <me>π<rsp/>2<dv/>6</me>
>>
>> </me>
>>
>>
>>
>> This shows that having math expressions in an XML compatible format does not need to have
>>
>> clay feet. There are several key reasons for this relative light-footedness. The reasons
>>
>> include using: default styles, short tag/attribute names (for the XML variant) and short
>>
>> controls/markup for the other variants, and the use of a level of structural parsing,
>>
>> uncommon for XML (but otherwise common, also for math, in e.g. eqn and TeX). Details in
>>
>> the spec referenced below.
>>
>>
>>
>> It also shows that equivalent representations can be even more light-footed than the XML/HTML
>>
>> variant, as well as the possibility of having variant surface representations that fit with
>>
>> at least some other contexts (than XML/HTML).
>>
>>
>>
>> In addition, the representations (all variants) can still be general enough to allow RTL math
>>
>> expressions in a reliable way (in particular, reliable direction of arrows, which in math expressions
>>
>> almost always refer to the left and right side "arguments", not an external physical direction),
>>
>> as well as chemical reaction formulas (math-like, not graphical) and the like. Re. arrows: see
>>
>> https://www.unicode.org/L2/L2022/22026r-non-bidi-mirroring.pdf.
>>
>>
>>
>> You can find the proposed format(s) specification at
>>
>> https://github.com/kent-karlsson/control/blob/main/math-layout-controls-2022-C.pdf.
>>
>>
>>
>> There is absolutely no claim that this covers everything w.r.t. math expressions;
>>
>> very likely it does not. But it does cover more than I set out to cover. There is no attempt
>>
>> to be compatible with MathML (sorry, but that would have killed the idea).
>>
>>
>>
>>
>>
>>
>> Happy Lucia!*
>>
>> /Kent Karlsson
>>
>>
>>
>> * https://en.wikipedia.org/w/index.php?title=Saint_lucia%27s_day
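
The forwarded message mentions \sum and \pi as possible keyboard-typable escapes, plus \uxxxx and \Uxxxxxx for arbitrary characters, and the reply notes that escapes should be interpreted before any font is involved. A minimal Python sketch of such a text-level pass follows. The function name, the name table and the digit counts (4 and 6 hex digits, read off the placeholders) are assumptions for illustration, not taken from the referenced spec.

```python
import re

# Hypothetical name table: the message gives \sum and \pi only as examples.
NAMED = {"sum": "\u2211", "pi": "\u03C0"}

# Assumed syntax: backslash + u + 4 hex digits, backslash + U + 6 hex digits,
# or backslash + a run of ASCII letters. Other escapes (\{ \} \/) are not touched.
ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{6})|([A-Za-z]+))")

def expand_escapes(text: str) -> str:
    """Replace character escapes with the characters they denote."""
    def repl(m):
        hex4, hex6, name = m.groups()
        if name is not None:
            # Unknown names are left exactly as written.
            return NAMED.get(name, m.group(0))
        code = int(hex4 or hex6, 16)
        # Values beyond the Unicode range are left as written.
        return chr(code) if code <= 0x10FFFF else m.group(0)
    return ESCAPE.sub(repl, text)

print(expand_escapes(r"\sum 1\/n^2 = \pi^2\/6"))
```

Because the pass works on the character string alone, the result is the same whatever font later renders it.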

> On 27 Dec 2022, at 10:11, William_J_G Overington via Unicode <unicode at corp.unicode.org> wrote:
>
>
> Hi
>
> I have never used the various existing packages that have been mentioned.
>
> May I make three observations please?
>
> 1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.
>
> 2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.
>
> 3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of the mainframe visual display unit terminals of the time.
>
> Some of my suggestions might be relevant here.
>
> I suggested using :pom: to express a 'plus or minus' sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.
>
> I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.
>
> I
> I
> I
>
> that then allows upper and lower limits to be expressed for definite integrals.
>
> For example
>
> I t=1
> I exp(-t).dt
> I t=0
>
> Then summation could be expressed as follows.
>
> S n=5
> S n^2
> S n=1
>
> and product similarly using three P characters.
>
> P
> P
> P
>
> This system could be used to some extent immediately without any additional software being needed. An OpenType font could be used to substitute a 'plus or minus' sign for :pom: and for other symbols. Hopefully software could be written to substitute the three capital I letters with a single integral sign.
>
> I hope this helps.
>
> Best regards,
>
> William Overington
>
> Tuesday 27 December 2022
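
The substitution William hopes software could do (a sign for :pom:, a single integral sign for three stacked I letters) can be sketched as a plain-text pass in Python. Only :pom: and the letters I, S and P come from his message; the function names, the tables and the one-line output form `∫[lower .. upper] body` are hypothetical choices made for this illustration.

```python
import re

# Hypothetical tables; only :pom: and the letters I, S, P appear in the message.
TOKENS = {"pom": "\u00B1"}                                # plus-or-minus sign
BIG_OPS = {"I": "\u222B", "S": "\u2211", "P": "\u220F"}   # integral, sum, product

def replace_tokens(text):
    """Replace :name: tokens found in TOKENS; unknown tokens stay as written."""
    return re.sub(r":([a-z]+):", lambda m: TOKENS.get(m.group(1), m.group(0)), text)

def collapse_stacks(lines):
    """Fold three consecutive lines starting with the same I, S or P into one.

    Top line: upper limit. Middle line: body. Bottom line: lower limit.
    """
    out, i = [], 0
    while i < len(lines):
        group = lines[i:i + 3]
        letter = lines[i][:1]
        if (letter in BIG_OPS and len(group) == 3
                and all(g == letter or g.startswith(letter + " ") for g in group)):
            upper, body, lower = (g[1:].strip() for g in group)
            limits = f"[{lower} .. {upper}]" if lower or upper else ""
            out.append(f"{BIG_OPS[letter]}{limits} {body}".strip())
            i += 3
        else:
            out.append(lines[i])
            i += 1
    return out

print(collapse_stacks(["I t=1", "I exp(-t).dt", "I t=0"]))
print(replace_tokens("x = (-b :pom: sqrt(b^2 - 4ac)) / 2a"))
```

One limitation of this sketch: three consecutive lines of ordinary prose that each begin with "I " would also be folded, so a real tool would need some way to mark where a math expression starts and ends.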
