<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 8/23/2022 3:13 AM, Wordingham
Richard via Unicode wrote:<br>
</div>
<blockquote type="cite"
cite="mid:796693839.1949146.1661249582514@mail.virginmedia.com">
<div> <br>
</div>
<blockquote type="cite">
<div> On 23/08/2022 00:31 Sławomir Osipiuk via Unicode &lt;<a
href="mailto:unicode@corp.unicode.org"
moz-do-not-send="true" class="moz-txt-link-freetext">unicode@corp.unicode.org</a>&gt;
wrote: </div>
<div> <br>
</div>
<div> <br>
</div>
<div> Why is Vertical_Orientation even listed in 4.2.9.1 if it
doesn't need </div>
<div> special handling? How is it even a "complex" case in any
meaningful way? </div>
<div> The default is "R". The "U" ranges are all explicitly
listed, making them </div>
<div> *non-default* from a parsing standpoint, all handled by
normally reading </div>
<div> the data file. Is this not correct? </div>
</blockquote>
<div> The Unicode term “default property value” has only a limited
connection with the natural English meaning of the phrase. A
“default property value” of an encoded character property is one
taken by unassigned code points or encoded characters for which
the property is irrelevant (TUS Section 3.5 D26). Its
connection with parsing is currently weak and confusing when
there are multiple “default property values”. </div>
<div class="default-style"> <br>
</div>
<div class="default-style"> Worse, only an encoded character can
have an “explicit property value” (D24)! </div>
<div class="default-style"> <br>
</div>
<div class="default-style"> Richard. </div>
</blockquote>
<p><font face="Candara">There's a dual use of "default". For a code
point that has an assigned character, a "default" value is one
that is omitted from the data file listing. That comes in handy
for binary properties, where you only need to list the code
points with a value of "True".</font></p>
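<p><font face="Candara">As a minimal sketch of that first sense,
assuming a hypothetical listing of "code point or range ; property
name" lines (the sample data, the property name and the function
names are made up for illustration, not taken from a real data
file): any code point the listing omits gets the value False.</font></p>
<pre>
```python
# Hypothetical sample listing for a binary property: only "True" entries appear.
SAMPLE_LISTING = """\
0041..005A ; Example_Binary_Property
00C0       ; Example_Binary_Property
"""


def parse_listed(text):
    """Collect every code point that the listing names explicitly."""
    listed = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        span = line.split(";", 1)[0].strip()
        first, _, last = span.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        listed.update(range(start, end + 1))
    return listed


LISTED = parse_listed(SAMPLE_LISTING)


def has_property(code_point):
    """Omitted code points take the default value, False."""
    return code_point in LISTED
```
</pre>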
<p><font face="Candara">For unassigned code points, a "default"
means the most likely future value. In a few cases, that's not a
single value across the entire code space: there may be
regions set aside for encoding characters that require different
values than the general default, and there it makes sense to
"future proof" some algorithms by picking a different value as
the most likely one.</font></p>
<p><font face="Candara">Whether the actual value will later
correspond to the default value is left open and there will be
some exceptions, but generally these values are chosen to
minimize disruptions.</font></p>
<p><font face="Candara">These range-based defaults are what's
called "complex" defaults. The issue then is how to
document them. The current approach on record is to use multiple
@missing directives, with each later one resetting the value for
the range given. The first one would cover the range
0000..10FFFF to set the general default for the entire code
space and any following @missing directives would override
selected subranges.</font></p>
<p><font face="Candara">Finally, the explicit values would override
any default values set in @missing directives.</font></p>
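<p><font face="Candara">The order of precedence can be sketched in a
few lines of Python. The ranges and values below are assumptions
picked for illustration, and the names MISSING, EXPLICIT, lookup
and resolve are hypothetical; this is not a description of any
actual parser.</font></p>
<pre>
```python
# Illustrative data only; the ranges and values are assumptions for this sketch.
MISSING = [                    # in file order: later entries override earlier ones
    (0x0000, 0x10FFFF, "R"),   # general default for the entire code space
    (0x3400, 0x4DBF, "U"),     # override for a selected subrange
]
EXPLICIT = [                   # explicitly listed values override every default
    (0x3001, 0x3002, "Tu"),
]


def lookup(ranges, code_point):
    """Return the value of the last range containing code_point, or None."""
    found = None
    for start, end, value in ranges:
        if code_point in range(start, end + 1):
            found = value
    return found


def resolve(code_point):
    """Explicit value if one is listed, else the last matching default."""
    explicit = lookup(EXPLICIT, code_point)
    if explicit is not None:
        return explicit
    return lookup(MISSING, code_point)
```
</pre>
<p><font face="Candara">With this sample data, resolve(0x0041)
falls through to the general default, resolve(0x3400) picks up
the subrange override, and resolve(0x3001) returns the explicit
value.</font></p>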
<p><font face="Candara">For compatibility with older parsers, all
@missing directives are wrapped in comments.</font></p>
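<p><font face="Candara">To illustrate why the comment wrapping
helps, here is a small Python sketch. It assumes directive lines
of the shape "# @missing: 0000..10FFFF; R" with the value in the
last field; the function names are hypothetical. A parser that
knows nothing about @missing just discards the line as a comment,
while a newer one can recognize the prefix.</font></p>
<pre>
```python
MISSING_PREFIX = "# @missing:"


def parse_missing(line):
    """Return (start, end, value) for an @missing comment line, else None."""
    line = line.strip()
    if not line.startswith(MISSING_PREFIX):
        return None
    fields = [f.strip() for f in line[len(MISSING_PREFIX):].split(";")]
    first, _, last = fields[0].partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    return start, end, fields[-1]  # assumption: the value is the last field


def strip_comment(line):
    """What a parser unaware of @missing sees: the comment is simply dropped."""
    return line.split("#", 1)[0].strip()
```
</pre>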
<p><font face="Candara">For some properties, such as derived bidi
class, the full scheme will be present in 15.0, but vertical
orientation missed the cutoff, so that will be taken care of in
the next version(s).</font></p>
<p><font face="Candara">Where multiple @missing lines are used, you
will no longer see explicit listing of default values for
reserved code points.</font></p>
<p><font face="Candara">A./<br>
</font></p>
</body>
</html>