Bidi edge cases in Hangul and Indic

David Corbett via Unicode unicode at unicode.org
Thu Feb 22 21:21:26 CST 2018


On Thu, Feb 22, 2018 at 6:32 PM, Ken Whistler wrote:

>
> If you override the normal left-to-right ordering with bidi override
> controls, then the layout order is reversed, but what is actually laid out
> is those two glyphs. So you just reverse the order of the two syllables for
> display, in either case.
>

My confusion stems from Unicode’s online bidi utility. Compare the NFC
input
https://unicode.org/cldr/utility/bidi.jsp?a=%E2%80%AE%EB%B3%B4%EA%B8%B0
to the NFD input
https://unicode.org/cldr/utility/bidi.jsp?a=%E2%80%AE%E1%84%87%E1%85%A9%E1%84%80%E1%85%B5
Concatenating each one’s characters in reordered display position order
produces canonically different results.
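
To make the difference concrete, here is a minimal Python sketch of that
comparison. It assumes the override simply reverses the string one code
point at a time, which is my reading of what the utility displays, not a
call into any bidi implementation. The helper name reverse_code_points is
made up for illustration.

```python
import unicodedata

def reverse_code_points(text):
    # Assumption: model the RLO reordering as a plain code point reversal.
    return text[::-1]

# The two syllables from the URLs above, without the U+202E override.
nfc = "\uBCF4\uAE30"                     # precomposed syllables (NFC)
nfd = unicodedata.normalize("NFD", nfc)  # conjoining jamo (NFD)

rev_nfc = reverse_code_points(nfc)
rev_nfd = reverse_code_points(nfd)

# Two strings are canonically equivalent iff their normal forms match.
canonically_equal = (
    unicodedata.normalize("NFC", rev_nfc)
    == unicodedata.normalize("NFC", rev_nfd)
)

print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", rev_nfc)])
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", rev_nfd)])
print(canonically_equal)
```

Under that assumption the reversed NFC string is just the two syllables
swapped (U+AE30 U+BCF4), while the reversed NFD string recomposes to
U+1175 U+ACE0 U+1107: a stray vowel, a different syllable, and a stray
leading consonant.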

Here is a more practical example. A sequence of an emoji modifier base and
an emoji modifier in an RTL run will be display-reordered such that the
modifier is to the left of the base. Clearly, the right thing is to not
reorder them, because they should ligate to form a single glyph. Contrast
this with “fl” in an RTL run, which will be display-reordered to “lf”: it
would be wrong to apply the previous rationale here just because “fl” may
have a single glyph.
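
As a rough illustration of the contrast, the following Python sketch
reverses a string while keeping each emoji modifier (U+1F3FB..U+1F3FF)
attached to the character before it. This is only a toy model under stated
assumptions: it does not check the Emoji_Modifier_Base property, it is not
the UBA, and the function name reverse_keeping_modifiers is hypothetical.

```python
# Fitzpatrick emoji modifiers occupy U+1F3FB..U+1F3FF.
EMOJI_MODIFIERS = range(0x1F3FB, 0x1F400)

def reverse_keeping_modifiers(text):
    """Reverse text, treating <character, emoji modifier> as one unit.

    Simplification: any character directly before a modifier is treated
    as its base; a real implementation would test Emoji_Modifier_Base.
    """
    units = []
    for ch in text:
        if units and ord(ch) in EMOJI_MODIFIERS:
            units[-1] += ch   # glue the modifier onto its base
        else:
            units.append(ch)  # every other character stands alone
    return "".join(reversed(units))

wave = "\U0001F44B\U0001F3FD"  # waving hand + medium skin tone modifier

naive = ("a" + wave)[::-1]                    # modifier ends up before base
kept = reverse_keeping_modifiers("a" + wave)  # base + modifier stay in order
ligature_case = reverse_keeping_modifiers("fl")
```

Here naive puts the modifier ahead of its base, kept leaves the pair
intact ahead of "a", and "fl" still comes out as "lf" because nothing
ties those two letters together at the character level.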

It sounds like the UBA doesn’t specify how to reorder the glyphs of the
characters within a level run. That’s about what I expected. I was just
worried it might require an easily implemented but wrong order, so thanks
for the reassurance.