Default case algorithms

Philippe Verdy verdy_p at wanadoo.fr
Tue Jun 24 11:03:48 CDT 2014


2014-06-24 17:07 GMT+02:00 Markus Scherer <markus.icu at gmail.com>:

> On Tue, Jun 24, 2014 at 4:56 PM, Daniel Bünzli <
> daniel.buenzli at erratique.ch> wrote:
>
>> Does an algorithm that simply applies R1 *regardless of context*
>> constitute a default case algorithm or not ? I.e. does simply mapping each
>> character C in a string using Uppercase_Mapping (C) (e.g. as exposed by the
>> XML UCD) constitute a default case conversion as mandated by the standard ?
>>
>
> It implements simple uppercasing but not full uppercasing.
> It misses simple, common things like ß -> SS (which is neither
> language-dependent nor context-sensitive).
>

Not so simple; it may be SS for modern German, but Czech would map it to
SZ, and historically that letter is a ligature of SZ (including in old
German texts where that ligature was used), along with many other ligatures
in medieval texts.
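
For reference, the simple versus full distinction Markus describes can be
sketched in Python. This is only an illustration under stated assumptions:
str.upper() applies the untailored full mapping, and the helper
simple_upper_approx is a hypothetical name that approximates the simple
mapping by keeping only one-to-one results. It is not the UCD
Simple_Uppercase_Mapping itself and differs from it for a few characters.
Neither function knows anything about an SZ convention; that would need a
tailoring on top.

```python
def simple_upper_approx(s: str) -> str:
    """Approximate simple uppercasing: one code point in, one code point out.

    Assumption: a character whose full uppercase mapping expands to several
    characters is left unchanged. This is not the exact UCD simple mapping.
    """
    out = []
    for ch in s:
        up = ch.upper()
        # Keep the mapping only when it stays a single character.
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


def full_upper(s: str) -> str:
    """Untailored full uppercasing, which may change the string length."""
    return s.upper()


if __name__ == "__main__":
    word = "straße"
    print(simple_upper_approx(word))  # ß is left as is
    print(full_upper(word))           # ß expands to SS
```

With the default mappings, the first call leaves ß untouched and the second
expands it to SS.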

If texts were printed in Fraktur style, there is always an ambiguity about
whether you should even use ß as a single letter or whether you should
rather encode separate letters (without even needing to encode any ligature
hint, because ligatures are everywhere in the text in its original form:
they are inherent to the script style; you would use hints only for variants
of these ligatures or for infrequent absences of a ligature).