Definition of Values of Property Vertical_Orientation

Asmus Freytag asmusf at ix.netcom.com
Tue Aug 23 07:51:26 CDT 2022


On 8/23/2022 3:13 AM, Wordingham Richard via Unicode wrote:
>
>> On 23/08/2022 00:31 Sławomir Osipiuk via Unicode 
>> <unicode at corp.unicode.org> wrote:
>>
>>
>> Why is Vertical_Orientation even listed in 4.2.9.1 if it doesn't need
>> special handling? How is it even a "complex" case in any meaningful way?
>> The default is "R". The "U" ranges are all explicitly listed, making them
>> *non-default* from a parsing standpoint, all handled by normally reading
>> the data file. Is this not correct?
> The Unicode term “default property value” has only a limited 
> connection with the natural English meaning of the phrase.  A “default 
> property value” of an encoded character property is one taken by 
> unassigned code points or encoded characters for which the property is 
> irrelevant (TUS Section 3.5 D26).  Its connection with parsing is 
> currently weak and confusing when there are multiple “default property 
> values”.
>
> Worse, only an encoded character can have an “explicit property value” 
> (D24)!
>
> Richard.

There's a dual use of "default". For a code point that has an assigned 
character, a "default" value is one that is omitted from the data file 
listing. That comes in handy for binary properties, where you only need 
to list the code points with a value of "True".

For unassigned code points, a "default" means the most likely future 
value. In a few cases, that's not a single value across the entire code 
space: there may be regions set aside for encoding characters that 
require a value different from the general default, and there it makes 
sense to "future proof" some algorithms by picking a different value as 
the most likely one.

Whether the actual value will later correspond to the default value is 
left open and there will be some exceptions, but generally these values 
are chosen to minimize disruptions.

This range-based concept of defaults is what is meant by "complex" 
defaults. The question then is how to document them. The current 
approach on record is to use multiple @missing directives, with each 
later one resetting the value for the range given. The first one would 
cover the range 0000..10FFFF to set the general default for the entire 
code space, and any following @missing directives would override selected 
subranges.

Finally, the explicit values would override any default values set in 
@missing directives.

For compatibility with older parsers, all @missing directives are 
wrapped in comments.
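
To make the layering concrete, here is a minimal sketch in Python of how a parser might apply it. This is only an illustration under stated assumptions, not the way any particular tool does it: the names ENTRY, parse, lookup and SAMPLE are made up for the example, the sample lines are invented rather than copied from a data file, and the two-field "range; value" form of the @missing line is assumed.

```python
import re

# Matches a code point or range followed by a value, e.g. "1F000..1F0FF; U".
ENTRY = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\S+)")

MISSING = "# @missing:"


def parse(lines):
    """Return (defaults, explicit), each a list of (start, end, value)."""
    defaults, explicit = [], []
    for line in lines:
        line = line.strip()
        if line.startswith(MISSING):
            # The directive lives inside a comment, so look at it first.
            body, target = line[len(MISSING):].strip(), defaults
        else:
            # Ordinary data line: drop any trailing comment.
            body, target = line.split("#", 1)[0].strip(), explicit
        m = ENTRY.match(body)
        if m:
            start = int(m.group(1), 16)
            end = int(m.group(2) or m.group(1), 16)
            target.append((start, end, m.group(3)))
    return defaults, explicit


def lookup(cp, defaults, explicit):
    """Explicit values win; otherwise the last matching @missing wins."""
    for start, end, value in explicit:
        if start <= cp <= end:
            return value
    result = None
    for start, end, value in defaults:  # later directives override earlier
        if start <= cp <= end:
            result = value
    return result


# Invented sample data, for illustration only.
SAMPLE = """\
# @missing: 0000..10FFFF; R
# @missing: 1F000..1F0FF; U
0041..005A ; R # explicit entry
1F004 ; Tu # explicit entry
""".splitlines()

defaults, explicit = parse(SAMPLE)
print(lookup(0x1F010, defaults, explicit))
```

With this sample, a code point inside the second @missing range that has no explicit entry gets that range's value, and everything outside it falls back to the first directive. A parser that does not know about @missing simply sees comments and reads only the explicit lines.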

For some properties, such as derived bidi class, the full scheme will 
be present in 15.0, but vertical orientation missed the cutoff, so that 
will be taken care of in the next version(s).

Where multiple @missing lines are used, you will no longer see an 
explicit listing of default values for reserved code points.

A./