Definition of Values of Property Vertical_Orientation
Asmus Freytag
asmusf at ix.netcom.com
Tue Aug 23 07:51:26 CDT 2022
On 8/23/2022 3:13 AM, Wordingham Richard via Unicode wrote:
>
>> On 23/08/2022 00:31 Sławomir Osipiuk via Unicode
>> <unicode at corp.unicode.org> wrote:
>>
>>
>> Why is Vertical_Orientation even listed in 4.2.9.1 if it doesn't need
>> special handling? How is it even a "complex" case in any meaningful way?
>> The default is "R". The "U" ranges are all explicitly listed, making them
>> *non-default* from a parsing standpoint, all handled by normally reading
>> the data file. Is this not correct?
> The Unicode term “default property value” has only a limited
> connection with the natural English meaning of the phrase. A “default
> property value” of an encoded character property is one taken by
> unassigned code points or encoded characters for which the property is
> irrelevant (TUS Section 3.5 D26). Its connection with parsing is
> currently weak and confusing when there are multiple “default property
> values”.
>
> Worse, only an encoded character can have an “explicit property value”
> (D24)!
>
> Richard.
There's a dual use of "default". For a code point that has an assigned
character, a "default" value is one that is omitted from the data file
listing. That comes in handy for binary properties, where you only need
to list the code points with a value of "True".
For unassigned code points, a "default" means the most likely future
value. In a few cases, that's not a single value across the entire code
space: there may be regions set aside for encoding characters that
require a different value than the general default, and there it makes
sense to "future proof" some algorithms by picking that different value
as the most likely one.
Whether the actual value will later correspond to the default value is
left open and there will be some exceptions, but generally these values
are chosen to minimize disruptions.
These range-based defaults are what's called "complex" defaults. Now,
the issue arises of how to document them. The current
approach on record is to use multiple @missing directives, with each
later one resetting the value for the range given. The first one would
cover the range 0000..10FFFF to set the general default for the entire
code space and any following @missing directives would override selected
subranges.
Finally, the explicit values would override any default values set in
@missing directives.
For compatibility with older parsers, all @missing directives are
wrapped in comments.
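
The layering described above can be sketched in a few lines of Python. This is only a minimal illustration, not the parser anyone actually ships: it assumes directive lines of the form "# @missing: START..END; VALUE" and data lines of the form "START..END ; VALUE # comment", the function names parse_property_file and lookup are hypothetical, and the sample lines are made up for the example rather than taken from any real data file.

```python
import re

# Assumed syntax for directives: "# @missing: START..END; VALUE"
MISSING_RE = re.compile(
    r"^#\s*@missing:\s*([0-9A-Fa-f]+)\.\.([0-9A-Fa-f]+)\s*;\s*(\S+)")
# Assumed syntax for data lines: "START[..END] ; VALUE"
DATA_RE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\S+)")


def parse_property_file(lines):
    """Return (defaults, explicit), each a list of (start, end, value)."""
    defaults = []   # from @missing directives, kept in file order
    explicit = []   # from ordinary data lines
    for raw in lines:
        line = raw.strip()
        m = MISSING_RE.match(line)
        if m:
            defaults.append((int(m.group(1), 16), int(m.group(2), 16),
                             m.group(3)))
            continue
        # Anything after '#' on other lines is an ordinary comment.
        body = line.split("#", 1)[0].strip()
        m = DATA_RE.match(body)
        if m:
            start = int(m.group(1), 16)
            end = int(m.group(2), 16) if m.group(2) else start
            explicit.append((start, end, m.group(3)))
    return defaults, explicit


def lookup(cp, defaults, explicit):
    """Explicit values win; otherwise the last matching @missing wins."""
    for start, end, value in explicit:
        if start <= cp <= end:
            return value
    for start, end, value in reversed(defaults):
        if start <= cp <= end:
            return value
    return None


# Made-up sample input, for illustration only.
SAMPLE = [
    "# @missing: 0000..10FFFF; R",
    "# @missing: 3000..30FF; U",
    "0041..005A ; R # explicit range",
    "3008 ; Tr # explicit single code point",
]

defaults, explicit = parse_property_file(SAMPLE)
for cp in (0x0041, 0x3008, 0x3041, 0x0378):
    print(f"{cp:04X}", lookup(cp, defaults, explicit))
```

A parser that does not know about @missing simply sees those lines as comments and reads only the explicit entries, which is the compatibility point made above.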
For some properties, such as derived bidi class, the full scheme will
be present in 15.0, but vertical orientation missed the cutoff, so that
will be taken care of in the next version(s).
Where multiple @missing lines are used, you will no longer see explicit
listings of default values for reserved code points.
A./