Bidi paragraph direction in terminal emulators (was: Proposal for BiDi in terminal emulators)

Richard Wordingham via Unicode unicode at unicode.org
Mon Feb 4 19:44:23 CST 2019


On Mon, 4 Feb 2019 22:27:39 +0100
Egmont Koblinger via Unicode <unicode at unicode.org> wrote:

> Hi Richard,
> 
> > The concept appears to exist in the form of the fields of the
> > fifth edition of ECMA-48.  Have you digested this ambitious
> > standard?  
> 
> To be honest: No, I haven't. And I have no idea what those "fields"
> are.

(Taken out of order)

> That being said, I'd really, honestly love to see if someone evaluated
> ECMA's "fields" and created a feasibility study for current terminal
> emulators, similarly to how I did it with TR/53.

They mostly seem to be security, protection and checking features.
They seem to make sense for a captive system used as a till or for stock
look-up by customers.  For example, fields can be restricted as to how
they are overwritten, e.g. not at all, or only with numbers, and some
fields cannot be copied from the terminal.  HTML forms seem to provide
most of this functionality nowadays.

Fields are persistent attributes.

On reading further, the pane boundary functionality seems to be
provided by the 'line home position' and 'line limit position'.  These
would have to be re-established whenever a pane became the active pane,
but they seem to support the notion of writing a paragraph into a
pane, with the terminal sorting out the splitting into lines.  I'm not
sure that this would be portable between ECMA-48 terminals; I get
the impression that there would be a reliance on unstandardised
behaviour being appropriate.  I could be wrong; the specification may
be there.
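
To make the idea concrete, here is a minimal sketch of a terminal splitting a paragraph into lines between a line home position and a line limit position. It is only an illustration under stated assumptions: the function name write_paragraph and its parameters are hypothetical, columns are 1-based and inclusive, every character is assumed to occupy exactly one cell, and the word-wrapping policy is my own choice, not something ECMA-48 specifies.

```python
import textwrap


def write_paragraph(text, line_home, line_limit):
    """Split text into rows confined to columns line_home..line_limit.

    Columns are 1-based and inclusive (an assumption of this sketch).
    Each returned row is padded on the left so that its first character
    lands in column line_home.
    """
    width = line_limit - line_home + 1
    if line_home < 1 or width < 1:
        raise ValueError("line limit must not precede line home")
    # Hypothetical policy: break at spaces, hard-split over-long words.
    wrapped = textwrap.wrap(text, width=width)
    return [" " * (line_home - 1) + row for row in wrapped]


if __name__ == "__main__":
    for row in write_paragraph("a paragraph written into a narrow pane", 5, 16):
        print(row)
```

The application only supplies the paragraph and the two positions; where the line breaks fall is decided by the terminal side, which is exactly the part that may not be portable.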

> I spent (read: wasted) way too much time studying ECMA TR/53 to get to
> understand what it's talking about, to realize that the good parts
> were already obvious to me, and to be able to argue why I firmly
> believe that the bad parts are bad. Remember: These documents were
> created in 1991, that is, 28 years ago. (I'm emphasizing it because I
> did the math wrong for a long time, I thought it was 18 years ago :-D.)
> Things have changed a lot since then.

It took me a while to work out that the recommendations of ECMA TR/53
had been implemented in Issue 5 of ECMA-48.

> As for the BiDi docs, I found that the current state of the art,
> current best practices, existing BiDi algorithm differ so much from
> ECMA's approach (which no one I'm aware of cared to implement for 28
> years) that the standard is of pretty little use. Only a few good
> parts could be kept (but needed tiny corrections), and plenty of other
> things needed to be built up anew. This is the only reasonable way to
> move forward.

The relationship between the data store and the presentation store
doesn't seem to be very well defined.  There may be room for the BiDi
algorithm there.

> If you designed a house 2 or 3 years ago, and finally have the money
> to get it built, you can reasonably start building it. If you designed
> a house 28 years ago and finally have the chance to build it
> (including the exact same heating technologies, electrical system
> etc.), you wouldn't, would you? I'm sure you looked at those plans,
> and started at the very least heavily updating them, or started to
> design a brand new one, perhaps somewhat based on your old ideas.

But a scheme may be more persuasive if it can be said to conform to
ECMA-48.

One thing that is very unclear in ECMA-48 is how characters are
allocated to cells in 'implicit' mode.  As the Arabic encoding under
consideration contained harakat, it looks as though the allocation is
defined by 'unspecified protocols'. I note that in the scheme
apparently given most consideration, forced Arabic presentation forms
are selected by a combination of escape sequences and Arabic letters.
The 'unspecified protocols' could be interpreted as one grapheme
cluster* per group of cells.  The typical groups would be one cell and
the two cells for a CJK character.

*Grapheme cluster is a customisable concept.
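
As a rough sketch of that interpretation, the following allocates one cluster per group of cells. It is an approximation under loud assumptions: the function name allocate_cells is hypothetical, a "cluster" here is just a base character plus any following characters with a non-zero canonical combining class (much cruder than a properly tailored grapheme cluster), and width is taken from the East Asian Width property, with Wide and Fullwidth getting two cells.

```python
import unicodedata


def allocate_cells(text):
    """Return a list of (cluster, cell_count) pairs for text.

    Approximation only: marks with a non-zero combining class are
    attached to the preceding base character; everything else starts
    a new group of cells.
    """
    groups = []
    for ch in text:
        if unicodedata.combining(ch) and groups:
            # Combining mark (e.g. an Arabic haraka): no cell of its own.
            cluster, width = groups[-1]
            groups[-1] = (cluster + ch, width)
        else:
            # Wide or Fullwidth characters take two cells, others one.
            wide = unicodedata.east_asian_width(ch) in ("W", "F")
            groups.append((ch, 2 if wide else 1))
    return groups


if __name__ == "__main__":
    # BEH + FATHA, then a CJK ideograph, then a Latin letter.
    for cluster, width in allocate_cells("\u0628\u064e\u4e2dA"):
        print([hex(ord(c)) for c in cluster], width)
```

A real scheme would have to say which customisation of grapheme clusters applies, which is the point of the footnote.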
 
> I don't expect it to be any different with "fields" of ECMA-48. I'm
> not aware of any terminal emulator implementing anything like them,
> whatever they are. Probably there's a good reason for that. Whatever
> purpose they aimed to serve apparently wasn't important enough for
> such a long time. By now, if they're found important, they should
> probably be solved by some new design (or at the very least, just like
> I did with TR/53, the work should begin by evaluating that standard to
> see if it's still feasible).

> Instead of spending a huge amount of work on my BiDi proposal, I could
> have just said: "guys, let's go with ECMA for BiDi handling". The
> thing is, I'm pretty sure it wouldn't have taken us anywhere. I don't
> expect it to be different with "fields" either.

Your interpretation document would have explored the issues.

> The starting point for my work was the current state of terminal
> emulators and the surrounding ecosystem, plus the current BiDi
> algorithm; not some ancient plan that was buried deep in some drawer
> for almost three decades. I hope this makes sense.

You're assuming that the committee process didn't add much value to the
standard.

Richard.

