From pgcon6 at msn.com Mon Dec 5 12:46:22 2022
From: pgcon6 at msn.com (Peter Constable)
Date: Mon, 5 Dec 2022 18:46:22 +0000
Subject: Public review issues: work on source code security
Message-ID:

As work progresses on the next version of Unicode, around this time we typically start seeing proposed updates for the various Unicode Standard Annexes (UAXes) and Unicode Technical Standards (UTSes) that are released in sync with the new Unicode version. This cycle is no different: check out the Public Review Issues page for 11 PRIs that have recently been posted.

https://www.unicode.org/review/

The closing date for these PRIs is January 3. (Btw, you can also see the timeline for upcoming alpha and beta reviews for the next version on the Beta Review Status page: https://www.unicode.org/versions/beta.html.)

I want to draw particular attention to the PRIs for various docs that are being worked on in relation to source code security. The main doc is a new specification, UTS #55, Unicode Source Code Handling. Please check out PRI #466:

https://www.unicode.org/review/pri466/

There are also proposed updates related to this for these other UAXes / UTSes:

* UAX #9, Unicode Bidirectional Algorithm: see PRI #460 (https://www.unicode.org/review/pri460/)
* UAX #14, Unicode Line Breaking Algorithm: see PRI #461 (https://www.unicode.org/review/pri461/)
* UAX #31, Unicode Identifiers and Syntax: see PRI #462 (https://www.unicode.org/review/pri462/)
* UTS #39, Unicode Security Mechanisms: see PRI #463 (https://www.unicode.org/review/pri463/)

There will also be a PRI coming soon for a related small change in UTS #51, Unicode Emoji.

We welcome review and feedback - please submit feedback by January 3. (That's needed so that working groups have time to prepare recommended changes for the next UTC meeting.)

Peter Constable, UTC Chair

From kent.b.karlsson at bahnhof.se Tue Dec 13 05:36:15 2022
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Tue, 13 Dec 2022 12:36:15 +0100
Subject: My suggestions for Unicode based math expression format(s)
Message-ID: <8B168E84-2EA2-403A-AE80-232C7ED58C08@bahnhof.se>

(Hoping that this goes through ok; I did have some problems with the sum sign when copying this text...)

I've devised a (or rather, several) new format(s) for representing math expressions. Why, you may wonder... Isn't MathML the answer to everything math? Well, not quite.

After more than 20 years since the first version of MathML, it is still not a great success. I think there are several reasons for that. One is the obvious: it is too verbose. Another is that (much due to the verbosity) one really needs authoring tools to be able to write any math expression in the MathML representation. The advantage of TeX math (or even old eqn) expressions is that users can with relative ease type the expression they want on the keyboard. Ordinary cut-paste-modifyViaKeyboard works. Authoring tools are less straightforward to use. Further, not everything is HTML (or XML). One may even want to have math expressions in what is otherwise plain text; for instance for cut and paste, losing styling per se (colour, bold/..., size) but not the math expressions.

But what about typability, directly from the keyboard, without using a special authoring tool? Are eqn or TeX the only options? Well, there is AsciiMath and UnicodeMath... However, those do common parenthesis parsing that is undesirable, among other things. And, apart from UnicodeMath, they were created long before Unicode, so they are not well adapted to using Unicode characters.

OMML (Office Math ML, also XML based) is just as verbose as MathML, if not worse.

Using {} (a convention borrowed from TeX; and using \{ and \} for literal {}) and some other special "mark-down" and character escape inspired notations, we can make a surface form of a math expression representation (encoding if you like) that is typable on a Latin based keyboard; except that ∑ and π in the example here may need some further escape notation, like \sum, \pi, to be fully keyboard typable (similarly to TeX, eqn, UnicodeMath, etc.). Not-so-common symbols will still need to be picked from some kind of menu, or use Unicode character escapes, \uxxxx, \Uxxxxxx. Here is an example, using the same expression as is used as the lead example in the MathML Core specification; it looks a little bit like TeX, intentionally, due to the selection of {}^_ as meta-characters for certain math expression controls, but isn't TeX:

${∑$/{n=1}$\{+∞}{1\/n^2}={π^2\/6}}

There is also an HTML/XML compatible form proposed, that is fully equivalent in expressivity with the other forms/variants proposed. It is not MathML, but it does use XML tags, so it is a bit longer than the above (read "me" as "math expression"):

∑n=1+∞1n2=π26

Or with some more whitespace/linebreaks:

∑n=1+∞ 1n2
=
π26

This shows that having math expressions in an XML compatible format does not need to have clay feet. There are several key reasons for this relative light-footedness. The reasons include using: default styles, short tag/attribute names (for the XML variant) and short controls/markup for the other variants, and the use of a level of structural parsing, uncommon for XML (but otherwise common, also for math, in e.g. eqn and TeX). Details in the spec referenced below.

It also shows that equivalent representations can be even more light-footed than the XML/HTML variant, as well as the possibility of having variant surface representations that fit with at least some other contexts (than XML/HTML).

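As an illustration of the character escapes mentioned earlier in this message (\uxxxx, \Uxxxxxx, and names such as \sum and \pi), here is a minimal Python sketch of how such escapes might be expanded before any further parsing. The escape table, the function name expand_escapes, and the digit counts (four hex digits after \u, six after \U) are assumptions made for this example; the referenced specification may define them differently.

```python
import re

# Hypothetical table of named escapes; only \sum and \pi are mentioned in the message.
NAMED_ESCAPES = {"sum": "\u2211", "pi": "\u03C0"}

# \Uxxxxxx (six hex digits), \uxxxx (four hex digits), or \name (ASCII letters).
ESCAPE = re.compile(r"\\(?:U([0-9A-Fa-f]{6})|u([0-9A-Fa-f]{4})|([A-Za-z]+))")


def expand_escapes(text: str) -> str:
    """Replace recognised escapes with the characters they denote."""

    def replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        if hex_digits is not None:
            code_point = int(hex_digits, 16)
            # Leave out-of-range values untouched rather than failing.
            return chr(code_point) if code_point <= 0x10FFFF else match.group(0)
        # Unknown names are left as they were typed.
        return NAMED_ESCAPES.get(match.group(3), match.group(0))

    return ESCAPE.sub(replace, text)
```

Escapes that are not letters, such as \{ or \/, are deliberately left alone here, since they belong to the markup layer rather than to character escaping.
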
In addition, the representations (all variants) can still be general enough to allow RTL math expressions in a reliable way (in particular, reliable direction of arrows, which in math expressions almost always refer to the left and right side "arguments", not an external physical direction), as well as chemical reaction formulas (math-like, not graphical) and the like. Re. arrows: see https://www.unicode.org/L2/L2022/22026r-non-bidi-mirroring.pdf.

You can find the proposed format(s) specification at https://github.com/kent-karlsson/control/blob/main/math-layout-controls-2022-C.pdf.

There is absolutely no claim that this covers everything w.r.t. math expressions; very likely it does not. But it does cover more than I set out to cover. There is no attempt to be compatible with MathML (sorry, but that would have killed the idea).

Comments are welcome.

Happy Lucia!*
/Kent Karlsson

* https://en.wikipedia.org/w/index.php?title=Saint_lucia%27s_day

From asmusf at ix.netcom.com Thu Dec 15 01:23:19 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Wed, 14 Dec 2022 23:23:19 -0800
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <8B168E84-2EA2-403A-AE80-232C7ED58C08@bahnhof.se>
References: <8B168E84-2EA2-403A-AE80-232C7ED58C08@bahnhof.se>
Message-ID: <0c3dd1f7-7a1e-3ccd-771e-3512d75dd73b@ix.netcom.com>

On 12/13/2022 3:36 AM, Kent Karlsson via Unicode wrote:
> (Hoping that this goes through ok; I did have some problems with the sum sign when copying this text...)
>
> I've devised a (or rather, several) new format(s) for representing math expressions.
> Why, you may wonder... Isn't MathML the answer to everything math? Well, not quite.
>

Looks like it came through.

The real audience for this would be people creating / editing mathematical / technical / scientific papers, not the character encoders (although there's some small overlap).

I think it is valuable to explore the solution space, but have you heard from people actively using mathematical notation (in whatever form it is currently supported) to find out what they are looking for?

What would you have to offer to make your new mousetrap one for which people would be willing to leave investment in established technologies behind?

I'm sure you've thought of all of that already.

A./

From valgrec at gmail.com Sat Dec 24 12:28:02 2022
From: valgrec at gmail.com (Valeria Greco)
Date: Sat, 24 Dec 2022 19:28:02 +0100
Subject: New registration and a request for a new smiley
Message-ID: <0E56DA9D-9A60-420E-AE4B-BC0B4128015C@gmail.com>

Hi everyone,

I'm very happy to subscribe to this mailing list. I would like to know whether the Consortium plans to create a Nativity emoji. I can't find anything in the emoji requests.

Merry Christmas to everyone,
Valeria

From kent.b.karlsson at bahnhof.se Sun Dec 25 14:48:56 2022
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Sun, 25 Dec 2022 21:48:56 +0100
Subject: what is the purpose of U+2012 ?
In-Reply-To:
References: <20221127005124.00003aa9@secarica.ro> <5BBE1A39-72CC-4CE5-BAFA-322F7C713FFA@fn.de> <20221127151002.00007781@secarica.ro>
Message-ID:

I would not say that it is all that useless today.

FIGURE DASH may well be used to indicate an unknown (digit) value, esp. in a tabular setting. One may be tempted to use ? instead, but actually using a dash is more common, and less clumsy. And then having the dash be of digit width (using an otherwise proportional font, but with fixed width (ASCII) digits) makes it look even better.

Unknown values (not yet planned, or not yet foreseeable, or, for that matter, values that could not be retrieved from a database/computation/similar) are quite common.

/Kent K

> On 30 Nov 2022, at
20:14, Asmus Freytag via Unicode wrote:
>
> On 11/30/2022 10:46 AM, Jörg Knappen via Unicode wrote:
>> this character
>> seems somewhat redundant.
> Which is neither here nor there. It's been on the books for over 30 years and before that presumably found in earlier character sets.
>
> The presumption would be that there are documents that use it, and that they depend on the properties as stated. That means some fonts will continue to support it, allowing users to decide to use it for new documents, which then strengthens the need to have continuing font support.
>
> That said, it may reflect a typographic practice that isn't widely used today, but perhaps still more widely used than many a symbol from an archaic script?
>
> A./

From kent.b.karlsson at bahnhof.se Mon Dec 26 09:10:18 2022
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Mon, 26 Dec 2022 16:10:18 +0100
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <0c3dd1f7-7a1e-3ccd-771e-3512d75dd73b@ix.netcom.com>
References: <8B168E84-2EA2-403A-AE80-232C7ED58C08@bahnhof.se> <0c3dd1f7-7a1e-3ccd-771e-3512d75dd73b@ix.netcom.com>
Message-ID: <1159F2F5-55B4-4A82-A369-F9180CB8B10F@bahnhof.se>

> On 15 Dec 2022, at 08:23, Asmus Freytag via Unicode wrote:
>
> On 12/13/2022 3:36 AM, Kent Karlsson via Unicode wrote:
>> (Hoping that this goes through ok; I did have some problems with the sum sign when copying this text...)
>>
>> I've devised a (or rather, several) new format(s) for representing math expressions.
>> Why, you may wonder... Isn't MathML the answer to everything math? Well, not quite.
>>
> Looks like it came through.
>
> The real audience for this would be people creating / editing mathematical / technical / scientific papers, not the character encoders (although there's some small overlap).
>
> I think it is valuable to explore the solution space, but have you heard from people actively using mathematical notation (in whatever form it is currently supported) to find out what they are looking for?
>
> What would you have to offer to make your new mousetrap one for which people would be willing to leave investment in established technologies behind?
>
> I'm sure you've thought of all of that already.
>
> A./

MathML has been "around" for more than 20 years. It hasn't been a great hit. And I think I know why: it is way too verbose.

So what about other ways of getting "math on the web"? There is one that is popular: integrating LaTeX math expressions with HTML web pages. But the integration was, and still is, very much a workaround. The result is (smallish) images with math expressions in the web page. Not really what one would like to see. I guess MathML is supposed to remedy that, but MathML is still too verbose. And "authoring tools", such as that in MS Word, are quite tedious to use (and the result does not always look good). I know, MS Word has three different ways of representing math expressions: UnicodeMath, TeX math expressions, and OMML, which is XML based but is not MathML.

Imagine writing a math textbook, with tens of math expressions per page, and hundreds of pages. Using MathML or OMML, even with authoring tools, would be tedious indeed. Ok, maybe that is not the target for MathML or OMML. But still, why should there be vastly different representations for shorter texts (like a web page) vs. longer texts (like a math textbook)? They should at least be perfectly equivalent. There should be a format that works both for the web (HTML) and other contexts (some other markup, even plain text), one that is light-footed, and equivalent in all "surface forms" (HTML, some other markup, even plain text).

Further, in a Unicode context handling bidi properly is almost a requirement.
Out of the mentioned formats, it is, IIUC, only MathML that attempts to cover bidi. And it is not done very well; at least it is not reliable, especially not w.r.t. arrows. It should be noted that for bidi there are different conventions for when to "mirror" a math subexpression. E.g., for division it is not always reversed (RTL). Indeed, how does one make a horizontal division textually reversed? And for subtraction it would be very confusing to "swap the arguments", since the operator is left/right symmetric. So it must be up to the author to decide which arguments to "swap"; no blind textual reversal. That goes also for which direction "operators" (including arrows) are to be "mirrored". One cannot blindly mirror math operators just because the context is RTL. Indeed, arrows (and arrow-like characters) are treated especially handwavingly in the bidi algorithm specification.

And how to avoid the heavy-footedness of MathML and OMML? Much of that is due to the XML-based convention to "fully bracket" everything. But if one ignores that convention, then we can have a notation that is compatible with XML, and still be more "light-footed", almost like TeX or UnicodeMath. But then an extra level of parsing is needed, a level akin to the parsing done for eqn, TeX or UnicodeMath. That is unusual, but I don't think it is contrary to what is allowed in XML.

The solution I've presented "goes back to basics". Math expressions basically have not changed for decades. I hesitate to say centuries, since they have changed over several centuries. Still, there is a basic structure, regardless of semantics, of placements of math expression parts. Just the placement, no reference to semantics, or even operator precedence.

Using a system like XML, where there is a strong tradition of "full bracketing", is fine when the (textual, in number of characters) size of the content is much greater than the bracketing (in the case of XML, the brackets are called tags).
But for math expressions, full bracketing for the structure notation, with clumsy brackets (tags) to boot, is not a great idea. I don't think there is anything in XML that *forces* the use of full bracketing. Hence, the proposal I make (XML variant) uses (invisible) structure *operators* (with priorities expressed in the grammar); in the XML variant these operators are "empty tags", which allows bracketing to be omitted when it can be derived by parsing according to the grammar.

But not everything "is XML". There needs to be a fully equivalent representation suitable for contexts that are not XML. And I presented one such, based on C1 control codes. Much as the UTC loves to hate (C0 and) C1 control codes, the UTC has also declared that there "will be no more control codes". Fortunately, ECMA-48 control codes allow for several extension mechanisms (both ones "available/reserved for future standardisation" and explicitly "private use" ones), voiding the need to allocate new control codes in Unicode; the extension mechanisms are still available. For the math expressions, using SCI, Single Character Introducer, seems to be the handiest. That, together with using a few more hitherto unused C1 control codes, enables a representation of math expressions that can reasonably be called plain text (or a plain text protocol). XML, on the other hand, uses pure printable character substrings as controls, and is a higher-level protocol.

But what about typeability (direct from the keyboard)? Well, the XML version is technically typable, especially as the verbosity is kept down quite a bit in my proposal (but not when it is verbose; typing by hand is then simply too much). But it is still XML, even if less verbose, so not so handy. And the C1 version? Well, it is very compact, and truly plain text. But most text editors are not friendly when it comes to C1 control characters, and those characters are also essentially impossible to type directly from the keyboard.

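The idea mentioned above, structure operators with priorities so that bracketing can be derived by parsing instead of being written out, can be sketched with a small precedence-climbing parser in Python. The operators (/ ^ _), their priorities, and the use of {} for explicit grouping are assumptions chosen for this illustration; they are not taken from the grammar in the specification.

```python
import re

# Hypothetical binary structure operators; a higher number binds tighter.
PRIORITY = {"/": 1, "_": 2, "^": 2}


def tokenize(source: str) -> list:
    """Split the input into alphanumeric runs, operators, and braces."""
    return re.findall(r"[A-Za-z0-9]+|[/^_{}]", source)


def parse_atom(tokens: list):
    """Parse a single token or an explicitly braced group."""
    token = tokens.pop(0)
    if token == "{":
        node = parse(tokens)
        tokens.pop(0)  # consume the closing "}"
        return node
    return token


def parse(tokens: list, min_priority: int = 1):
    """Build a fully bracketed tree (nested tuples) from a flat token list."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRIORITY and PRIORITY[tokens[0]] >= min_priority:
        operator = tokens.pop(0)
        # Operands to the right must bind tighter, which makes operators left-associative.
        right = parse(tokens, PRIORITY[operator] + 1)
        left = (operator, left, right)
    return left
```

With these assumed priorities, the input 1/n^2 needs no braces to group n^2 under the division, while {1/n}^2 uses explicit grouping to override the default.
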
So a short, non-verbose markup (in "markdown style")? We already have UnicodeMath (which is NOT a plain text format; claiming that it is, is misleading). Two reasons why that is not quite right: 1) It is not exactly equivalent to any other format. 2) As a markup language (which it is), it is not particularly well designed. It's a bit haphazard, and frankly a hack. Furthermore, 3) it does not seem to cater for bidi.

My sketchy proposal in appendix C is based on the C1 version of the proposed math expression formats. Admittedly a bit quirky, since C1 controls, in particular SCI, are a bit quirky. I don't know the original intent for SCI, but it was left unspecified in use, except that it takes one, and only one, character after the SCI to create a whole lot of control sequences. But asking for new control characters is a non-starter, so this one seemed fine to use. Some of the quirkiness is avoided in the "mark-down" version hinted at in appendix C, and maybe more can be avoided.

And, since I'm sure someone will ask (or even protest): So-called "MATHEMATICAL" letters and digits are forbidden in the proposed formats. Why? The fundamental reason is that they are ill-conceived. Variables are often multi-letter, especially in computer science. And they need not be in English (even though that is commonly so), so other (Latin) letters than a-zA-Z may be used. Therefore, there is a more general math letters (and numerals, and some symbols) styling mechanism proposed. This mechanism can handle multiple-letter variables easily, as well as handle other letters than those that have been allocated as "MATHEMATICAL" ones.

Note the distinction that TeX makes between "default math variable style" (almost the same as \mathnormal) and \mathit. Very often one must apply \mathit to get a proper typesetting of multiletter variables. I come across such things as "coefficient" written (by others than me) in MS Word math expressions, and it looks horrible.
Just like when forgetting to apply \mathit in TeX (and I have written a few texts where I needed to apply \mathit quite generously; not counting TeX's ability to define macros/commands). Neither MS Word nor MathML seems to cater well for what in TeX is \mathit (it may be applicable, but not easily so).

Further, combining characters apply to a base character in Unicode, never to a math expression, despite what other math formats say (unfortunately, MathML is among the formats that mistreat combining characters).

So while I'm not keen on referring to math expression representation formats as "mouse traps", I do think I have a solution that avoids several of the problems of other formats: their flaws (verbosity, single context only), shortcomings (not handling RTL arrow-like characters right, inability or difficulty to handle multiletter variables, haphazard design of mark-down style mark-up), and outright errors (misinterpretation of combining characters, forcing very strict RTL on RTL math expressions). And, of course, the non-use of the ill-conceived "MATHEMATICAL" letters.

Of course, there is no claim of perfection, so comments are welcome.

Kind regards
/Kent K

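To illustrate the point in the message above about the limited repertoire of "MATHEMATICAL" letters, here is a small Python check using the standard unicodedata module. It lists the characters in the Mathematical Alphanumeric Symbols block (U+1D400 to U+1D7FF) whose compatibility normalization (NFKC) is a given letter. The helper name math_styled_forms is made up for this example, and this is only a sketch of one way to inspect the repertoire.

```python
import unicodedata

# Code point range of the Mathematical Alphanumeric Symbols block.
MATH_BLOCK = range(0x1D400, 0x1D800)


def math_styled_forms(letter: str) -> list:
    """Return the characters in the block that normalize (NFKC) to the given letter."""
    return [
        chr(code_point)
        for code_point in MATH_BLOCK
        if unicodedata.normalize("NFKC", chr(code_point)) == letter
    ]


# "a" has pre-styled counterparts such as MATHEMATICAL ITALIC SMALL A (U+1D44E) ...
forms_for_a = math_styled_forms("a")

# ... while a letter outside a-z, A-Z, Greek and the digits, such as "å", has none.
forms_for_a_ring = math_styled_forms("\u00E5")
```

A multi-letter variable name containing such a letter therefore cannot be written with these characters alone, which is the gap a general styling mechanism would cover.
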
From jameskass at code2001.com Mon Dec 26 09:45:29 2022
From: jameskass at code2001.com (James Kass)
Date: Mon, 26 Dec 2022 15:45:29 +0000
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <1159F2F5-55B4-4A82-A369-F9180CB8B10F@bahnhof.se>
References: <8B168E84-2EA2-403A-AE80-232C7ED58C08@bahnhof.se> <0c3dd1f7-7a1e-3ccd-771e-3512d75dd73b@ix.netcom.com> <1159F2F5-55B4-4A82-A369-F9180CB8B10F@bahnhof.se>
Message-ID: <87bd2ec7-2ef4-73cf-6ec4-620bb0c72a58@code2001.com>

On 2022-12-26 3:10 PM, Kent Karlsson via Unicode wrote:
> So while I'm not keen on referring to math expression representation formats as "mouse traps",

For anyone unfamiliar with this idiomatic usage, it derives from the saying "build a better mousetrap, and the world will beat a path to your door".

https://en.wikipedia.org/wiki/Build_a_better_mousetrap,_and_the_world_will_beat_a_path_to_your_door

From kent.b.karlsson at bahnhof.se Mon Dec 26 11:10:25 2022
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Mon, 26 Dec 2022 18:10:25 +0100
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <87bd2ec7-2ef4-73cf-6ec4-620bb0c72a58@code2001.com>
References: <87bd2ec7-2ef4-73cf-6ec4-620bb0c72a58@code2001.com>
Message-ID: <542A171C-5A5B-4275-B39E-75B8A406A856@bahnhof.se>

I understand the metaphor. That does not mean that I have to like that formulation...

/K

Sent from my iPhone

> On 26 Dec 2022, at 16:47, James Kass via Unicode wrote:
>
>> On 2022-12-26 3:10 PM, Kent Karlsson via Unicode wrote:
>> So while I'm not keen on referring to math expression representation formats as "mouse traps",
>
> For anyone unfamiliar with this idiomatic usage, it derives from the saying "build a better mousetrap, and the world will beat a path to your door".
>
> https://en.wikipedia.org/wiki/Build_a_better_mousetrap,_and_the_world_will_beat_a_path_to_your_door

From asmusf at ix.netcom.com Mon Dec 26 14:13:35 2022
From: asmusf at ix.netcom.com (Asmus Freytag)
Date: Mon, 26 Dec 2022 12:13:35 -0800
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <542A171C-5A5B-4275-B39E-75B8A406A856@bahnhof.se>
References: <87bd2ec7-2ef4-73cf-6ec4-620bb0c72a58@code2001.com> <542A171C-5A5B-4275-B39E-75B8A406A856@bahnhof.se>
Message-ID: <7741b8c3-f55b-ca00-61ff-a48dc2890f30@ix.netcom.com>

On 12/26/2022 9:10 AM, Kent Karlsson via Unicode wrote:
> I understand the metaphor. That does not mean that I have to like that formulation...
>
> /K
>
> Sent from my iPhone
>
>> On 26 Dec 2022, at 16:47, James Kass via Unicode wrote:
>>
>>> On 2022-12-26 3:10 PM, Kent Karlsson via Unicode wrote:
>>> So while I'm not keen on referring to math expression representation formats as "mouse traps",
>> For anyone unfamiliar with this idiomatic usage, it derives from the saying "build a better mousetrap, and the world will beat a path to your door".
>>
>> https://en.wikipedia.org/wiki/Build_a_better_mousetrap,_and_the_world_will_beat_a_path_to_your_door
>

No matter the metaphor used, the underlying issue stands:

Whatever the perceived shortcomings of existing solutions, they represent investment by implementers and users alike.

Anything new will by definition be incompatible with existing practice (which is initially on the minus side of the ledger).

To get people to accept a new solution requires that a significant enough number will see the promised benefits (positive side) as large enough to warrant investment into adoption and implementation.

That was no different from the situation Unicode was in when it was invented as a new technology to overcome the shortcomings of existing solutions.

You may be correct that your solution has advantages.
I'm not in a good position to evaluate that, because I'm not a user of any of the existing solutions (although I have used LaTeX in the past, and have a working knowledge of mathematical notation).

The same is true, if more so, for most people currently active in the Unicode space. They are not the community that would be using a new mathematical notation, or that even uses existing ones.

A./

From wjgo_10009 at btinternet.com Tue Dec 27 03:11:01 2022
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 27 Dec 2022 09:11:01 +0000 (GMT)
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <7741b8c3-f55b-ca00-61ff-a48dc2890f30@ix.netcom.com>
References: <87bd2ec7-2ef4-73cf-6ec4-620bb0c72a58@code2001.com> <542A171C-5A5B-4275-B39E-75B8A406A856@bahnhof.se> <7741b8c3-f55b-ca00-61ff-a48dc2890f30@ix.netcom.com>
Message-ID: <5788e986.21e19.18552d976b0.Webtop.96@btinternet.com>

Hi

I have never used the various existing packages that have been mentioned.

May I make three observations please?

1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.

2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.

3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of mainframe visual display unit terminals at the time.

Some of my suggestions might be relevant here.

I suggested using :pom: to express a 'plus or minus' sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.

I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.

I
I
I

that then allows upper and lower limits to be expressed for definite integrals.

For example

I t=1
I exp(-t).dt
I t=0

Then summation could be expressed as follows.

S n=5
S n^2
S n=1

and product similarly using three P characters.

P
P
P

This system could be used to some extent immediately without any additional software being needed. An OpenType font could be used to substitute a 'plus or minus' sign for :pom: and for other symbols. Hopefully software could be written to substitute the three capital I letters with a single integral sign.

I hope this helps.

Best regards,

William Overington

Tuesday 27 December 2022

From kent.b.karlsson at bahnhof.se Tue Dec 27 17:58:28 2022
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Wed, 28 Dec 2022 00:58:28 +0100
Subject: My suggestions for Unicode based math expression format(s)
In-Reply-To: <5788e986.21e19.18552d976b0.Webtop.96@btinternet.com>
References: <87bd2ec7-2ef4-73cf-6ec4-620bb0c72a58@code2001.com> <542A171C-5A5B-4275-B39E-75B8A406A856@bahnhof.se> <7741b8c3-f55b-ca00-61ff-a48dc2890f30@ix.netcom.com> <5788e986.21e19.18552d976b0.Webtop.96@btinternet.com>
Message-ID:

Let me answer by forwarding my reply to William's message from a few days ago.
((NLFs are not treated ideally, but that seems to be a common bug; so there are some spurious extra line breaks below)):

> Forwarded message:
>
> From: Kent Karlsson
> Subject: Re: My suggestions for Unicode based math expression format(s)
> Date: 13 December 2022 16:20:42 CET
> To: William_J_G Overington
>
> Sent from my iPhone
>
>> On 13 Dec 2022, at 13:33, William_J_G Overington wrote:
>>
>> Hi
>>
>> I have never used the various existing packages that you mention.
>>
>> May I make three observations please?
>>
>> 1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.
>>
> There is. Two of them.
>>
>> 2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.
>>
> That is a question for someone else than me...
>>
>> 3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of mainframe visual display unit terminals at the time.
>>
>> Some of my suggestions might be relevant here.
>>
>> I suggested using :pom: to express a plus or minus sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.
>>
> I think that it would not be a good idea to request or expect that of any font. Any character escapes should be interpreted before any font is involved.
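The point made just above, that character escapes should be interpreted before any font is involved, can be sketched in Python as a plain text-level substitution. The function name expand_colon_names and the one-entry table are assumptions for illustration; :pom: is the only such name mentioned in this thread.

```python
import re

# Hypothetical symbol table; :pom: is the only name that appears in the thread.
SYMBOLS = {"pom": "\u00B1"}  # PLUS-MINUS SIGN


def expand_colon_names(text: str) -> str:
    """Replace :name: escapes by their characters, leaving unknown names untouched."""
    return re.sub(
        r":([a-z]+):",
        lambda match: SYMBOLS.get(match.group(1), match.group(0)),
        text,
    )


quadratic = expand_colon_names("x = (-b :pom: sqrt(b^2 - 4ac)) / 2a")
```

After this step the text contains an ordinary Unicode character (U+00B1), so any font that covers that character can display it without special substitution rules.
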
>>
>> I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.
>>
>> I
>> I
>> I
>>
>> that then allows upper and lower limits to be expressed for definite integrals.
>>
>> For example
>>
>> I t=1
>> I exp(-t).dt
>> I t=0
>>
>> Then summation could be expressed as follows.
>>
>> S n=5
>> S n^2
>> S n=1
>>
>> and product similarly using three P characters.
>>
>> P
>> P
>> P
>>
> nroff with eqn did output to a typewriter-like terminal (fixed width), needing multiple character daisy wheels, some with mathematical characters. This terminal was capable of partial line up/down. But changing daisy wheels had to be done by hand. The output was a bit crude, but did have a fair semblance of normal mathematical typesetting. (troff produced better output, for typesetting machines of the day.) This was before TeX came along.
>
> /Kent K
>
>> I hope this helps.
>>
>> Best regards,
>>
>> William Overington
>>
>> Tuesday 13 December 2022
>>
>> ------ Original Message ------
>> From: "Kent Karlsson via Unicode"
>> To: unicode at corp.unicode.org
>> Sent: Tuesday, 2022 Dec 13 At 11:36
>> Subject: My suggestions for Unicode based math expression format(s)
>>
>> (Hoping that this goes through ok; I did have some problems with the sum sign when copying this text...)
>>
>> I've devised a (or rather, several) new format(s) for representing math expressions.
>> Why, you may wonder... Isn't MathML the answer to everything math? Well, not quite.
>>
>> After more than 20 years since the first version of MathML, it is still not a great success. I think there are several reasons for that. One is the obvious: it is too verbose. Another is that (much due to the verbosity) one really needs authoring tools to be able to write any math expression in the MathML representation.
The >> >> advantage of TeX math (or even old eqn) expressions is that users can with relative >> >> ease type the expression they want on the keyboard. Ordinary cut-paste-modifyViaKeyboard >> >> works. Authoring tools are less straight-forward to use. Furhter, not everything is >> >> HTML (or XML). One may even want to have math expressions in what is otherwise plain >> >> text; for instance for cut and paste, loosing styling per se (colour, bold/..., size) >> >> but not the math expressions. >> >> >> >> But what about typability, directly from the keyboard, without using a special authoring >> >> tool? Are eqn or TeX the only options? Well, there is AsciiMath and UnicodeMath... >> >> However, those do common parenthesis parsing that is undesirable, among other things. >> >> And, apart from UnicodeMath, they were created long before Unicode, so they are not well >> >> adapted to using Unicode characters. >> >> >> >> OMML (Office Math ML, also XML based) is just as verbose as MathML, if not worse. >> >> >> >> Using {} (a convention borrored from TeX; and using \{ and \} for literal {}) and some >> >> other special "mark-down" and character escape inspired notations, we can make a surface >> >> form of a math expression representation (encoding if you like) that is typable on a Latin based >> >> keyboard; except that ? and ? in the example here may need some further escape notation, >> >> like \sum, \pi, to be fully keyboard typable (similarly to TeX, eqn, UnicodeMath, etc.). Not-so-common >> >> symbols will still need to be picked from some kind of menu, or use Unicode charater escapes, >> >> \uxxxx, \Uxxxxxx. 
>> >> Here is an example, using the same expression as is used as the lead example
>> >> in the MathML Core specification; it looks a little bit like TeX, intentionally, due to the selection
>> >> of {}^_ as meta-characters for certain math expression controls, but isn't TeX:
>> >>
>> >> ${∑$/{n=1}$\{+∞}{1\/n^2}={π^2\/6}}
>> >>
>> >> There is also an HTML/XML compatible form proposed, that is fully equivalent in expressivity
>> >> with the other forms/variants proposed. It is not MathML, but it does use XML tags,
>> >> so it is a bit longer than the above (read "me" as "math expression"):
>> >>
>> >> ∑n=1+∞1n2=π26
>> >>
>> >> Or with some more whitespace/linebreaks:
>> >>
>> >> ∑n=1+∞ 1n2
>> >> =
>> >> π26
>> >>
>> >>
>> >> This shows that having math expressions in an XML compatible format does not need to have
>> >> clay feet. There are several key reasons for this relative light-footedness. The reasons
>> >> include using: default styles, short tag/attribute names (for the XML variant) and short
>> >> controls/markup for the other variants, and the use of a level of structural parsing,
>> >> uncommon for XML (but otherwise common, also for math, in e.g. eqn and TeX). Details in
>> >> the spec referenced below.
>> >>
>> >> It also shows that equivalent representations can be even more light-footed than the XML/HTML
>> >> variant, as well as the possibility of having variant surface representations that fit with
>> >> at least some other contexts (than XML/HTML).
>> >>
>> >> In addition, the representations (all variants) can still be general enough to allow RTL math
>> >> expressions in a reliable way (in particular, reliable direction of arrows, which in math expressions
>> >> almost always refer to the left and right side "arguments", not an external physical direction),
>> >> as well as chemical reaction formulas (math-like, not graphical) and the like. Re. arrows: see
>> >> https://www.unicode.org/L2/L2022/22026r-non-bidi-mirroring.pdf .
>> >>
>> >> You can find the proposed format(s) specification at
>> >> https://github.com/kent-karlsson/control/blob/main/math-layout-controls-2022-C.pdf .
>> >>
>> >> There is absolutely no claim that this covers everything w.r.t. math expressions;
>> >> very likely it does not. But it does cover more than I set out to cover. There is no attempt
>> >> to be compatible with MathML (sorry, but that would have killed the idea).
>> >>
>> >> Comments are welcome.
>> >>
>> >> Happy Lucia!*
>> >> /Kent Karlsson
>> >>
>> >> * https://en.wikipedia.org/w/index.php?title=Saint_lucia%27s_day
> On 27 Dec 2022, at 10:11, William_J_G Overington via Unicode wrote:
>
>
> Hi
>
> I have never used the various existing packages that have been mentioned.
>
> May I make three observations please?
>
> 1. I consider that using control codes to specify layout is a problem. A way to express things without using control codes is needed.
>
> 2. Would a test be that what one wants to typeset can be typeset in Microsoft WordPad? One might need to copy and paste characters from a WordPad file that has one of each character in it, as if it were a typecase. For the avoidance of doubt I am not suggesting that all typesetting should be done in WordPad, not at all, but I am saying that if it cannot be typeset in WordPad then a format may be too complicated or too expensive or too inaccessible for widespread use.
>
> 3. Back in the early 1990s I was involved in a discussion of how to express mathematical equations using just 7-bit ASCII characters in a monospaced display typical of mainframe visual display unit terminals at the time.
>
> Some of my suggestions might be relevant here.
>
> I suggested using :pom: to express a 'plus or minus' sign as used in the general solution formula for a quadratic equation. That format could be used for special symbols. These days, an OpenType font could cause a correct glyph to be displayed, even if the glyph is not a regular Unicode character.
>
> I suggested that an integral be expressed using three capital I letters, one above the other in three lines of text.
>
> I
> I
> I
>
> that then allows upper and lower limits to be expressed for definite integrals.
>
> For example
>
> I t=1
> I exp(-t).dt
> I t=0
>
> Then summation could be expressed as follows.
>
> S n=5
> S n^2
> S n=1
>
> and product similarly using three P characters.
>
> P
> P
> P
>
> This system could be used to some extent immediately without any additional software being needed. An OpenType font could be used to substitute a 'plus or minus' sign for :pom: and for other symbols. Hopefully software could be written to substitute the three capital I letters with a single integral sign.
>
> I hope this helps.
>
> Best regards,
>
> William Overington
>
> Tuesday 27 December 2022