Meroitic cursive fractions numerical values

Ken Whistler kenwhistler at
Sat Mar 28 19:22:33 CDT 2015

On 3/28/2015 1:05 PM, Karl Williamson wrote:
> In the 8.0 Beta files, some numerical values are not reduced to their 
> lowest forms.  Is there a compelling reason that
> is not written as

Well, obviously you might not consider it a "compelling" reason, but the
numeric values were written that way in the original proposal (L2/12-206,
June 6, 2012). Nobody said anything about rational numbers
expressed as fractions being required to be in lowest form,
and the entries were just carried forward into the drafts of Unicode 8.0
UnicodeData.txt for beta review.

> given that there is also a
> Aren't the numeric values of U+109FB and U+109BD the same?

Of course.

> Existing software that looks at the numeric values of characters is 
> written expecting that rational numbers will have been reduced to 
> their lowest form.

Well, not all existing software, obviously, as the tools used to generate
the derived data files didn't complain, and produced the correct results
for these Meroitic fractions:

And there is nothing in the documentation of the Numeric_Value property
(see UAX #44) that currently *requires* only an irreducible fraction (or an
integer) in the field. (See also DerivedNumericValues.txt, which is silent
on this.)
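
To illustrate how a consumer can avoid depending on lowest form, here is a
minimal Python sketch. The helper names parse_numeric_value and
is_lowest_form are hypothetical, and the input strings are illustrative
rather than copied from the beta files. It relies only on the standard
fractions module, which normalizes a fraction when it is constructed:

```python
from fractions import Fraction
from math import gcd


def parse_numeric_value(field: str) -> Fraction:
    """Parse a Numeric_Value-style string such as "6/12", "1/2", or "7"."""
    # Fraction accepts "numerator/denominator" strings and normalizes them.
    return Fraction(field.strip())


def is_lowest_form(field: str) -> bool:
    """Report whether the string as written is already irreducible."""
    numerator, slash, denominator = field.strip().partition("/")
    if not slash:
        return True  # plain integers have nothing to reduce
    return gcd(int(numerator), int(denominator)) == 1


# Illustrative inputs, not copied from the beta files.
unreduced = parse_numeric_value("6/12")
reduced = parse_numeric_value("1/2")
print(unreduced == reduced)  # the two spellings compare equal
print(is_lowest_form("6/12"), is_lowest_form("1/2"))
```

Software that parses the field this way gets the same value whichever
spelling appears in the data; only code that compares the raw strings
would notice the difference.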

You can always provide beta feedback requesting that the relevant fractions
be changed to their lowest forms, for review by the May UTC meeting.

Personally, I wouldn't object to a change like that, as I don't see any 
didactic value to expressing the fractional values with precisely the same
numerator and denominator as the character form implies, if it isn't 

On the other hand, I would be loath to make this a mandatory *requirement*
of the Numeric_Value field, as that would then add yet another baroque
invariant on the UCD data, and would imply yet more elaborate testing to
verify for each release that a new invariant we imposed on ourselves was
not somehow violated in the new data for the UCD. The set of invariants
maintained is already bordering on impossible for any one participant in the
data maintenance to understand.
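
For a sense of what such a per-release test might involve, the following
Python sketch scans records in the semicolon-separated UnicodeData.txt
layout and reports any fraction not in lowest form. It is only an
assumption about how a check could look, not a description of the tools
actually used to maintain the UCD; the function name unreduced_fractions
and the sample records (code points and names alike) are made up.

```python
from math import gcd

NUMERIC_FIELD = 8  # position of the numeric value field in UnicodeData.txt


def unreduced_fractions(lines):
    """Yield (code point, numeric value) for fractions not in lowest form."""
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) <= NUMERIC_FIELD:
            continue  # blank or malformed line
        value = fields[NUMERIC_FIELD]
        numerator, slash, denominator = value.partition("/")
        if slash and gcd(int(numerator), int(denominator)) != 1:
            yield fields[0], value


# Synthetic records in the UnicodeData.txt layout; code points and names
# are made up for illustration.
SAMPLE = [
    "E000;SAMPLE FRACTION SIX TWELFTHS;No;0;L;;;;6/12;N;;;;;",
    "E001;SAMPLE FRACTION ONE HALF;No;0;L;;;;1/2;N;;;;;",
    "E002;SAMPLE DIGIT SEVEN;Nd;0;L;;7;7;7;N;;;;;",
]

violations = list(unreduced_fractions(SAMPLE))
print(violations)
```

The check itself is short; the cost described above lies in running and
maintaining one more such rule for every release.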

The other drawback of piling on invariants is that the UTC has been
bitten by them in the past when something new comes up that wasn't
anticipated. This particular requirement might be innocuous and safe --
but why tempt the fates?

