Limits in UBA

Eli Zaretskii eliz at gnu.org
Wed Oct 22 21:41:06 CDT 2014


> From: "Whistler, Ken" <ken.whistler at sap.com>
> CC: "unicode at unicode.org" <unicode at unicode.org>, "Whistler, Ken"
> 	<ken.whistler at sap.com>
> Date: Wed, 22 Oct 2014 19:18:38 +0000
> 
> > I'd appreciate some pointers to such texts, if they are publicly
> > accessible.  I'd be very interested to see why such deep embeddings
> > are necessary.
> 
> They aren't necessary for human-generated text. There is no normal human text
> reading case for them.

But if humans aren't going to read that text, the embeddings aren't
necessary at all, because programs read and process text in logical
order anyway.  Bidi reordering is a display-time feature, meant for
human consumption.

> An example I could think of off the top of my head might involve some
> complicated database application working with Arabic data.

Again, if the query is to be submitted to a program, there should not
be a need for embeddings at all.

> And while the database itself doesn't care about UBA or display
> order when parsing and compiling such queries, the SQL text can be
> and *is* routinely logged. And the worry by the UTC is that when
> such logged generated text might include encapsulated embedded
> chunks, you don't want UBA per se to be introducing limits that
> cause failures when there might be a use case to display such text
> for diagnostics, for example. I don't happen to *know* of a
> particular example of such text to point you to, but that kind of
> thing is the relevant use scenario.

Still, a limit of 63 or 127 sounds arbitrary to me, and unnecessarily
large.
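
To see how deep the explicit nesting in a given piece of logged text
actually goes, something along these lines could be used.  It is only a
rough sketch in Python: it counts explicit embedding, override and
isolate initiators against their terminators within a single paragraph,
and it does not implement the UBA rules themselves (the nesting count
is not the same thing as the resolved embedding level, which also
depends on direction parity).  The function name is made up for
illustration.

```python
# Explicit directional formatting characters defined by UAX #9.
EMBEDDING_INITIATORS = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
ISOLATE_INITIATORS = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI
PDF = "\u202C"  # terminates an embedding or override
PDI = "\u2069"  # terminates an isolate


def max_explicit_nesting(text):
    """Return the deepest nesting of explicit bidi initiators in text.

    Simplification: text is treated as one paragraph, and each
    initiator adds one to the depth regardless of its direction.
    """
    stack = []      # "E" for embedding/override, "I" for isolate
    deepest = 0
    for ch in text:
        if ch in EMBEDDING_INITIATORS:
            stack.append("E")
        elif ch in ISOLATE_INITIATORS:
            stack.append("I")
        elif ch == PDF:
            # A PDF only closes an embedding that is innermost;
            # it cannot reach across an open isolate.
            if stack and stack[-1] == "E":
                stack.pop()
        elif ch == PDI:
            # A PDI closes the nearest open isolate, together with
            # any embeddings left open inside it.
            if "I" in stack:
                while stack.pop() != "I":
                    pass
        deepest = max(deepest, len(stack))
    return deepest
```

Running this over real logs would show whether nesting anywhere near
the limit ever turns up in practice.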

