Definition of Values of Property Vertical_Orientation

Richard Wordingham richard.wordingham at ntlworld.com
Tue Aug 23 20:36:08 CDT 2022


On Tue, 23 Aug 2022 15:51:35 -0700
Markus Scherer via Unicode <unicode at corp.unicode.org> wrote:

> On Tue, Aug 23, 2022 at 11:32 AM Richard Wordingham via Unicode <
> unicode at corp.unicode.org> wrote:  
> 
> > > Where multiple @missing lines are used, you will no longer see
> > > explicit listing of default values for reserved code points.  
> >
> > Which will *silently* damage some parsers' output.  The damage
> > should show as Unicode 16.0 comes out.
> >  
> 
> Unicode *15*, in a few weeks.

Unicode 14 UCD files will have been parsed correctly.
Out-of-date parsers should still handle characters that are assigned in
Unicode 15.0.  However, the complex default values for characters
unassigned in Unicode 15.0 will not be loaded properly.  When
characters newly assigned in Unicode 16.0 start hitting applications
supposed to be using the UCD of Unicode 15.0, then the mitigations that
should be in place may not be there.  The effect is horribly subtle.

> Depending on the parser, you might see it getting confused about
> multiple @missing lines, or getting incorrect property values for
> unassigned code points.
> 
> We have had one prominent data file with multiple @missing lines since
> before the start of Unicode 15 beta, and we emphasized this on the
> beta review page <https://www.unicode.org/versions/beta-15.0.0.html>.
> 
> Starting with Version 15.0, some data files in the UCD may contain
> multiple @missing lines defined for the same property. This is
> currently the case for DerivedBidiClass.txt. UCD file parsers will
> need to be updated to treat the additional @missing lines like data
> lines. See UAX #44 Section 4.2.10, @missing Conventions
> <https://www.unicode.org/reports/tr44/tr44-29.html#Missing_Conventions>
> for details.

Remember that the current draft of UAX #44 for Unicode 15.0 says that
comment lines should not be parsed.  The fact that ostensible comment
lines must now be parsed needs to be publicised.
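
To make the point concrete, here is a minimal sketch of a parser that
treats every @missing line as a default-value data line instead of
skipping it as a comment.  The function names and the sample lines are
hypothetical.  It assumes the two-field form (range; value) and that a
later @missing line overrides an earlier one where ranges overlap; check
both assumptions against UAX #44 Section 4.2.10 before relying on them.

```python
import re

# Two-field form only: "# @missing: 0000..10FFFF; Left_To_Right".
# Files that put a property name between range and value need more work.
MISSING_RE = re.compile(
    r"^#\s*@missing:\s*([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?"
    r"\s*;\s*(\S+)"
)


def parse_defaults(lines):
    """Collect (start, end, value) from every @missing line, in file order."""
    defaults = []
    for line in lines:
        m = MISSING_RE.match(line)
        if m:
            start = int(m.group(1), 16)
            end = int(m.group(2) or m.group(1), 16)
            defaults.append((start, end, m.group(3)))
    return defaults


def default_value(defaults, cp):
    """Assumption: the last matching @missing line wins."""
    value = None
    for start, end, v in defaults:
        if start <= cp <= end:
            value = v
    return value


# Illustrative lines, not a copy of any real UCD file.
SAMPLE = [
    "# @missing: 0000..10FFFF; Left_To_Right",
    "# @missing: 0590..05FF; Right_To_Left",
    "# An ordinary comment, which is still ignored.",
]

defaults = parse_defaults(SAMPLE)
first_only = defaults[:1]  # what a parser honouring one @missing line keeps

for cp in (0x0041, 0x05FF):
    print(f"U+{cp:04X}", default_value(first_only, cp),
          default_value(defaults, cp))
```

With these sample lines, the parser that keeps only the first @missing
line reports Left_To_Right for U+05FF, while the one that reads both
reports Right_To_Left.  Nothing fails; the value is just wrong, which is
the silent damage I mean.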

Richard.

