Transforming BidiTest.txt to the format of BidiCharacterTest.txt

Markus Scherer markus.icu at gmail.com
Wed Feb 12 13:46:03 CST 2014


On Wed, Feb 12, 2014 at 11:09 AM, Whistler, Ken <ken.whistler at sap.com> wrote:

> Eric,
>
> The C version of the bidiref code does that, in part.
>
> See the function br_ParseFileFormatB in brinput.c.
>
> http://www.unicode.org/Public/PROGRAMS/BidiReferenceC/6.3.0/
>
> It doesn't actually *transform* the BidiTest.txt file to output the other
> format, but it parses the input and then constructs calls into the bidi
> testing API in the same format used when it parses BidiCharacterTest.txt.
> So you could adapt that code, if you want, to write out lines in the
> format of BidiCharacterTest.txt. The main addition you would have to make
> would be a table of characters exemplifying each of the bidi classes,
> so you could map the bidi class values from BidiTest.txt back to actual
> code points to store in BidiCharacterTest.txt format.
>

ICU also has test code that parses both files, but it does not transform
either one into the format of the other. We have both C++ and Java versions,
and I can send you URLs if you are interested. The ICU test code also
contains sample characters for each Bidi_Class value.
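
For illustration only, here is a rough Python sketch of the transformation
Ken describes: a table of one sample code point per bidi class, plus a loop
that rewrites each BidiTest.txt data line as BidiCharacterTest.txt lines.
It is not taken from bidiref or ICU. The names SAMPLE, DIRECTIONS,
auto_paragraph_level and transform are made up for this example, and the
sample characters are just one possible choice. It assumes that every data
line is treated as a single paragraph, that the bitset uses 1 = auto-LTR,
2 = LTR, 4 = RTL, and that the target fields are code points; paragraph
direction (0 = LTR, 1 = RTL, 2 = auto-LTR); paragraph level; levels;
reorder. Check both file headers before relying on it.

```python
# One representative code point per Bidi_Class value (illustrative choices).
SAMPLE = {
    "L": 0x0061, "R": 0x05D0, "AL": 0x0627, "EN": 0x0030, "ES": 0x002B,
    "ET": 0x0023, "AN": 0x0660, "CS": 0x002C, "NSM": 0x0300, "BN": 0x00AD,
    "B": 0x2029, "S": 0x0009, "WS": 0x0020, "ON": 0x0021,
    "LRE": 0x202A, "RLE": 0x202B, "PDF": 0x202C, "LRO": 0x202D, "RLO": 0x202E,
    "LRI": 0x2066, "RLI": 0x2067, "FSI": 0x2068, "PDI": 0x2069,
}

# (BidiTest.txt bitset bit, BidiCharacterTest.txt paragraph direction code),
# assumed to be: 1 -> auto-LTR (2), 2 -> LTR (0), 4 -> RTL (1).
DIRECTIONS = [(1, 2), (2, 0), (4, 1)]


def auto_paragraph_level(classes):
    """Rules P2/P3: the first strong class outside any isolate decides."""
    depth = 0  # number of currently open isolate initiators
    for c in classes:
        if c in ("LRI", "RLI", "FSI"):
            depth += 1
        elif c == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0 and c in ("L", "R", "AL"):
            return 1 if c in ("R", "AL") else 0
    return 0  # no strong character found: default to LTR


def transform(lines):
    """Yield BidiCharacterTest.txt-style lines for BidiTest.txt input lines."""
    levels = reorder = ""
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        if line.startswith("@Levels:"):
            levels = " ".join(line[len("@Levels:"):].split())
        elif line.startswith("@Reorder:"):
            reorder = " ".join(line[len("@Reorder:"):].split())
        elif line.startswith("@"):
            continue  # any other directive is ignored in this sketch
        else:
            class_field, bitset_field = line.split(";")
            classes = class_field.split()
            bitset = int(bitset_field.strip(), 16)  # single digit 1..7 in practice
            code_points = " ".join(f"{SAMPLE[c]:04X}" for c in classes)
            for bit, direction in DIRECTIONS:
                if bitset & bit:
                    if direction == 2:
                        para_level = auto_paragraph_level(classes)
                    else:
                        para_level = direction
                    yield f"{code_points};{direction};{para_level};{levels};{reorder}"
```

Calling transform(open("BidiTest.txt", encoding="utf-8")) and printing each
result would give one output line per paragraph direction set in the bitset.
The sample for ON is deliberately not a bracket, since with real characters
brackets get paired-bracket treatment that plain class values do not.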

markus