Missing UAX#31 tests?

Mon Jul 9 22:24:06 CDT 2018

Thanks, Karl.

Mark

On Mon, Jul 9, 2018 at 10:11 PM, Karl Williamson <public at khwilliamson.com>
wrote:

> On 07/08/2018 03:21 AM, Mark Davis ☕️ wrote:
>
>> I'm surprised that the tests for 11.0 passed for a 10.0 implementation,
>> because the following should have triggered a difference for WB. Can you
>> check on this particular case?
>>
>> ÷ 0020 × 0020 ÷ # ÷ [0.2] SPACE (WSegSpace) × [3.4] SPACE (WSegSpace) ÷ [0.3]
>>
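
For readers of the archive: the test line quoted above follows the layout of the WordBreakTest.txt data file, where ÷ marks a permitted break, × marks a position with no break, code points are written in hexadecimal, and everything after # is commentary. A minimal Python sketch of a parser, assuming that layout. The name parse_break_test_line is made up for illustration and is not part of the UCD or of any library.

```python
def parse_break_test_line(line):
    """Parse one line of a *BreakTest.txt file.

    Returns (text, boundaries), where boundaries[i] is True when a break
    is permitted before code point i, and the last entry describes the
    end of the text. Returns None for blank or comment-only lines.
    """
    # Drop the trailing commentary, which itself contains ÷ and × marks.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    tokens = data.split()
    # Tokens alternate: mark, code point, mark, code point, ..., mark.
    boundaries = [tok == "÷" for tok in tokens[0::2]]
    text = "".join(chr(int(tok, 16)) for tok in tokens[1::2])
    return text, boundaries


sample = "÷ 0020 × 0020 ÷\t#  ÷ [0.2] SPACE (WSegSpace) × [3.4] SPACE (WSegSpace) ÷ [0.3]"
print(parse_break_test_line(sample))  # ('  ', [True, False, True])
```

An implementation that still breaks between the two spaces would report [True, True, True] for this text, which is the difference the 11.0 file was expected to expose.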
>
> I'm one of the people who advocated for this change, and I had already
> tailored our implementation of 10.0 to not break between horizontal white
> space, so it's actually not surprising that this rule didn't cause a failure.
>
>>
>>
>> About the testing:
>>
>> The tests are generated so that they go through all the combinations of
>> pairs, and some combinations of triples. The generated test cases use a
>> sample from each partition of characters, to cut the file size down to a
>> reasonable level. That also means that some changes in the rules don't
>> cause changes in the test results. Because it is not possible to test every
>> combination, there is also provision for additional test cases, such as
>> those at the end of the files, e.g.:
>>
>> https://unicode.org/Public/11.0.0/ucd/auxiliary/WordBreakTest.html
>> https://unicode.org/Public/10.0.0/ucd/auxiliary/WordBreakTest.html
>>
>> We should extend those each time to make sure we cover combinations that
>> aren't covered by pairs. There were some additions to that end; if they
>> didn't cover enough cases, then we can look at your experience to add more.
>>
>> I can suggest a few strategies for further testing:
>>
>> 1. To do a full test, for each row check every combination obtained by
>> replacing each sample character by every other character in its
>> partition. Eg for the above line that would mean testing every <WSegSpace,
>> WSegSpace> sequence.
>>
>> 2. Use a monkey test against ICU. That is, generate random combinations
>> of characters from different partitions and check that ICU and your
>> implementation are in sync.
>>
>> 3. During the beta period, test your previous-version implementation with the new test
>> files. If there are no failures, yet there are changes in the rules, then
>> raise that issue during the beta period so we can add tests.
>>
>
> I actually did this, and as I recall, did find some test failures.  In
> retrospect, I must have screwed up somehow back then.  I was under tight
> deadline pressure, and as a result, did more cursory beta testing than
> normal.
>
>>
>> 4. If possible, during the beta period upgrade your implementation and
>> test against the new and old test files.
>>
>
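
Strategies 1 and 2 above can be sketched in Python as follows. This is only an illustration under stated assumptions: `members` is a hypothetical dict from property value to the characters in that partition (it would be built from the UCD property file, which is not shown), `value_of` is a hypothetical lookup from a character to its property value, and `impl_a` and `impl_b` are any two callables that return a list of break decisions for a string. A wrapper around ICU would be passed as one of them; no ICU call is shown here.

```python
import itertools
import random


def expand_row(text, value_of, members):
    """Strategy 1: yield every string obtained by replacing each character
    of `text` with each member of the same partition."""
    pools = [members[value_of(ch)] for ch in text]
    for combo in itertools.product(*pools):
        yield "".join(combo)


def monkey_test(members, impl_a, impl_b, rounds=1000, max_len=8, seed=0):
    """Strategy 2: compare two implementations on random strings.

    Returns (text, result_a, result_b) for the first mismatch found,
    or None when every round agrees.
    """
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    values = sorted(members)
    for _ in range(rounds):
        length = rng.randint(2, max_len)
        # Pick a partition first, then a character from it, so that
        # small partitions are exercised as often as large ones.
        text = "".join(
            rng.choice(members[rng.choice(values)]) for _ in range(length)
        )
        result_a, result_b = impl_a(text), impl_b(text)
        if result_a != result_b:
            return text, result_a, result_b
    return None
```

The full expansion grows as the product of the partition sizes, so for large partitions it is only practical on selected rows. Setting `max_len` above 3 in the monkey test also exercises runs longer than the pairs and triples in the generated files.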
>
>> Anyone else have other suggestions for testing?
>>
>> Mark
>>
>>
> As an aside, a release or two ago, I implemented SB, and someone
> immediately found a bug, and accused me of releasing software that had not
> been tested at all.  He had looked through the test suite and not found
> anything that looked like it was testing that.  But he failed to find the
> test file which bundled up all your tests, in a manner he was not
> accustomed to, so it was easy for him to overlook.  The bug only manifested
> itself in longer runs of characters than your pairs and triples tested.  I
> looked at it, and your SB tests still seemed reasonable, and I should not
> expect a more complete series than you furnished.
>
>>
>>
>> Mark
>> //////
>>
>> On Sun, Jul 8, 2018 at 6:52 AM, Karl Williamson via Unicode
>> <unicode at unicode.org> wrote:
>>
>>     I am working on upgrading from Unicode 10 to Unicode 11.
>>
>>     I used all the new files.
>>
>>     The algorithms for some of the boundaries, like GCB and WB, have
>>     changed so that some of the property values no longer have code
>>     points associated with them.
>>
>>     I ran the tests furnished in 11.0 for these boundaries, without
>>     having changed the algorithms from earlier releases.  All passed 100%.
>>
>>     Unless I'm missing something, that indicates that the tests
>>     furnished in 11.0 do not contain instances that exercise these
>>     changes.  My guess is that the 10.0 tests were also deficient.
>>
>>     I have been relying on the UCD to furnish tests that have enough
>>     coverage to sufficiently exercise the algorithms that are specified
>>     in UAX 31, but that appears to have been naive on my part.
>>
>>
>>
>