IdnaTest.txt and RFC 5893

Markus Scherer markus.icu at gmail.com
Wed Jan 4 17:40:15 CST 2017


On Wed, Jan 4, 2017 at 2:28 AM, Alastair Houghton <alastair at alastairs-place.net> wrote:

> RFC 5893 seems pretty clear to me, and the problem really is that the test
> vectors (which come from unicode.org) seem (to me) to be incorrect.


https://tools.ietf.org/html/rfc5893#section-2 says "*The following rule*,
consisting of six conditions, *applies to labels* in Bidi domain names."

That's what the ICU code does -- applying the rule to each label -- and I
assume that's the basis for the test data.

The latter part of this RFC section says that *if* certain conditions are
met *for all labels, then* the domain name as a whole displays well.

ICU does not currently check for multi-label bidi combinations.
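
To make the per-label reading concrete, here is a minimal sketch of the six conditions of RFC 5893 section 2 applied to one label at a time. It is not the ICU code: the function names label_satisfies_bidi_rule() and labels_failing_bidi_rule() are made up for illustration, the Bidi classes come from Python's unicodedata module (whose Unicode version may differ from the one behind the test data), and it assumes the labels are already in Unicode form and that the caller has decided the name is a Bidi domain name.

```python
import unicodedata

# Bidi classes permitted anywhere in a label (RFC 5893 section 2).
RTL_ALLOWED = {"R", "AL", "AN", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}  # condition 2
LTR_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}              # condition 5


def label_satisfies_bidi_rule(label):
    """Return True if a single label meets all six conditions (hypothetical helper)."""
    if not label:
        return False
    classes = [unicodedata.bidirectional(ch) for ch in label]

    # Condition 1: the first character must be L, R, or AL.
    first = classes[0]
    if first in ("R", "AL"):
        rtl = True
    elif first == "L":
        rtl = False
    else:
        return False

    # Conditions 3 and 6 look at the last character before any trailing NSMs.
    end = len(classes)
    while classes[end - 1] == "NSM":  # stops at index 0 at the latest: first is not NSM
        end -= 1
    last = classes[end - 1]

    if rtl:
        if not set(classes) <= RTL_ALLOWED:         # condition 2
            return False
        if last not in ("R", "AL", "EN", "AN"):     # condition 3
            return False
        if "EN" in classes and "AN" in classes:     # condition 4
            return False
    else:
        if not set(classes) <= LTR_ALLOWED:         # condition 5
            return False
        if last not in ("L", "EN"):                 # condition 6
            return False
    return True


def labels_failing_bidi_rule(domain):
    """Apply the rule to each label separately; return the labels that fail."""
    return [lab for lab in domain.split(".") if not label_satisfies_bidi_rule(lab)]
```

For example, labels_failing_bidi_rule("a\u05d0.example") reports only the first label, because it starts with an L character but contains a Hebrew letter. Nothing in this sketch looks at how neighboring labels interact, which matches the per-label check described above.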

FYI the ICU checkLabelBiDi() code (Java version) is currently here:
<http://bugs.icu-project.org/trac/browser/trunk/icu4j/main/classes/core/src/com/ibm/icu/impl/UTS46.java#L541>

markus