Plain text custom fraction input

Richard Wordingham richard.wordingham at
Wed Jul 22 17:54:02 CDT 2015

On Wed, 22 Jul 2015 12:21:32 +0200 (CEST)
Marcel Schneider <charupdate at> wrote:

> On 22 Jul 2015, at 09:52, Richard Wordingham  wrote:

> We never thought of common hieroglyphs otherwise as running LTR,
> while on monuments the great liberty of the script allows it to run in
> almost all directions. IMO monumental transcription is always
> difficult to deal with, whenever exact rendering is expected.
> However, since Unicode's purpose is plain text encoding, we must
> stick with what I consider as a convention in egyptology...

Which means that Ancient Egyptian hieroglyphs are unencoded!  Their
default direction is right-to-left, but that's only the start of the
trouble.  The encoded hieroglyphs aren't Bidi-mirrored, so if I embed
them in a right-to-left override, I should get retrograde characters.
Now these aren't totally useless, but at present we seem to need a
duplicate set of right-to-left hieroglyphs for unstacked text.  There
is work in progress to allow normal Egyptological hieroglyphic text.
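
The property values behind the mirroring point can be checked with a
minimal sketch using Python's standard unicodedata module.  The helper
name bidi_summary is only for illustration, and the values reported
depend on the Unicode version bundled with the Python build, so treat
this as a quick check rather than a statement about any renderer.

```python
import unicodedata

def bidi_summary(ch):
    """Return (Bidi_Class, Bidi_Mirrored flag) for one character."""
    return unicodedata.bidirectional(ch), unicodedata.mirrored(ch)

# U+13000 is EGYPTIAN HIEROGLYPH A001; "(" is a character that is mirrored.
for ch in ("\U00013000", "("):
    bidi_class, mirrored = bidi_summary(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"class={bidi_class} mirrored={mirrored}")
```

A mirrored flag of 0 for the hieroglyph means that a right-to-left
override reorders the characters without swapping in mirrored glyphs.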

There seems to have been a change in the notion of what the Egyptian
scripts are.  Hieratic texts are normally printed in hieroglyphs for
general study, so it had seemed that it would be legitimate to use a
font that rendered a hieratic style rather than a hieroglyphic style.
(Some 'hieroglyphs' only occurred in the hieratic style.)  The
hieratic style is strictly right-to-left, so rendering the text in a
hieratic style would not be compliant with Unicode.  However, it seems
that the hieratic style is now a separate script, so any such
rendering would now be doubly non-compliant. 

> ...which brings us back to plain text fractions, which by an apparent
> but tacit convention we can input as an *unlimited* string of
> superscript digits, followed by U+2044, followed by an *unlimited*
> string of subscript digits. What are you referring to when talking
> about implementing the fraction slash?

If you are happy with that style, I was wrong; I wasn't being clever
enough.  In a left-to-right context, the conversion of digits to the
numerator and denominator forms can progress from right to left for the
numerator, by conditioning on the following character being a fraction
slash or an already converted digit, and similarly from left to right
for the denominator, by conditioning on the preceding character.  I'm
not sure what should happen in right-to-left contexts.  I have a
feeling the numerator should come before the denominator, but the bidi
algorithm doesn't swap them - it keeps the first number on the left.
Note that subscript and superscript digits are only available for those
of us who use the Western Arabic digits.
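
The two conditioned passes can be sketched at the character level in
Python.  This is an illustration only: a font would do the equivalent
with contextual glyph substitution, not by rewriting the text, and the
function name to_plain_text_fraction and the two lookup tables are my
own.  It assumes a left-to-right context and handles only the Western
Arabic digits.

```python
FRACTION_SLASH = "\u2044"
DIGITS = "0123456789"
# Superscript and subscript forms of 0-9, in digit order.
SUPERSCRIPT = dict(zip(
    DIGITS, "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"))
SUBSCRIPT = dict(zip(
    DIGITS, "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"))

def to_plain_text_fraction(text):
    """Convert ASCII digits around U+2044 to super/subscript digits."""
    chars = list(text)
    numerator = [False] * len(chars)    # positions converted in pass 1
    denominator = [False] * len(chars)  # positions converted in pass 2

    # Pass 1, right to left: a digit becomes a numerator digit when the
    # following character is a fraction slash or a converted digit.
    for i in range(len(chars) - 2, -1, -1):
        if chars[i] in SUPERSCRIPT and (
                chars[i + 1] == FRACTION_SLASH or numerator[i + 1]):
            chars[i] = SUPERSCRIPT[chars[i]]
            numerator[i] = True

    # Pass 2, left to right: a digit becomes a denominator digit when
    # the preceding character is a fraction slash or a converted digit.
    for i in range(1, len(chars)):
        if chars[i] in SUBSCRIPT and (
                chars[i - 1] == FRACTION_SLASH or denominator[i - 1]):
            chars[i] = SUBSCRIPT[chars[i]]
            denominator[i] = True

    return "".join(chars)

print(to_plain_text_fraction("2 3\u20444"))
```

Digits that are not adjacent to a fraction slash, directly or through a
run of converted digits, are left alone, so the integer part of a mixed
number such as "2 3/4" (with U+2044 as the slash) stays as it is.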

However, I believe there is a real problem for the 'nut' style, where
the numerator and denominator are separated by a horizontal line, as
used from Western Asia westwards.  I'm having trouble finding examples
of fractions using Indic scripts - apparently they originally stacked
the numerator above the denominator, but I don't know what happens
nowadays.

> If this input method is not encouraged, what's the use of U+215F

It's for temporarily storing a character defined in some other coding

