Diaeresis vs. umlaut (was: Re: Standaridized variation sequences for the Desert alphabet?)

Martin J. Dürst duerst at it.aoyama.ac.jp
Sun Mar 26 04:37:49 CDT 2017


On 2017/03/25 03:33, Doug Ewell wrote:
> Philippe Verdy wrote:
>
>> But Unicode just preferred to keep the roundtrip compatibility with
>> earlier 8-bit encodings (including existing ISO 8859 and DIN
>> standards) so that "ü" in German and French also has the same
>> canonical decomposition even if the diacritic is a diaeresis in French
>> and an umlaut in German, with different semantics and origins.
>
> Was this only about compatibility, or perhaps also that the two signs
> look identical and that disunifying them would have caused endless
> confusion and misuse among users?

I'm not sure to what extent this was explicitly discussed when Unicode 
was created. The fact that the first 256 code points are identical to 
those in ISO-8859-1 was used as a big selling point when Unicode was 
first introduced. It may well have been that for Unicode, there was no 
discussion at all in this area, because ISO-8859-1 was already so well 
established.
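
As a small illustration of that identity (a Python sketch added for 
clarity, not part of the original discussion), one can check that every 
byte value decodes to the code point with the same number. This assumes 
Python's built-in "latin-1" codec, which also maps the C0 and C1 control 
ranges one-to-one:

```python
def latin1_matches_unicode():
    """Return True if each byte 0..255 decodes to the same-numbered code point."""
    for byte_value in range(256):
        ch = bytes([byte_value]).decode("latin-1")
        if ord(ch) != byte_value:
            return False
    return True


print(latin1_matches_unicode())

# U+00FC (u with two dots) is the single byte 0xFC in ISO-8859-1,
# regardless of whether the text is German or French.
print("\u00fc".encode("latin-1"))
```

There is only one such byte and one such code point, so the encoding 
itself cannot say whether an umlaut or a diaeresis was meant.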

And for ISO-8859-1, code space was an important concern. Ideally, both 
Icelandic and Turkish (and the letters missing for French) would have 
been covered, but that wasn't possible. Disunifying diaeresis and umlaut 
would have been an unaffordable luxury.

The above reasons mask any inherent reasons why diaeresis and umlaut 
would or would not have been unified if the decision had been argued 
purely "on the merits". But having used both German and French, and 
looking e.g. at the situation in Switzerland, where it was important to 
be able to write both French and German on the same typewriter, I would 
definitely argue that disunifying them would have caused endless 
confusion and errors among users.

Also, it was argued a few mails ago that diaeresis and umlaut don't look 
exactly the same. I remember well that when Apple introduced its first 
laser printers, there were widespread complaints that the fonts (was it 
Helvetica, Times Roman, and Palatino?) unified away the traditional 
differences in the cuts of these typefaces for different languages.

So to quite some extent, in the relevant period (i.e. the 1970s/80s), 
the differences between diaeresis and umlaut may be due to design 
differences in the cuts for different languages (e.g. French and 
German). Nobody would have disunified basic letters because they 
may have looked slightly different in cuts for different languages, and 
so people may also have been just fine with unifying diaeresis and 
umlaut. (German fonts, for example, may have contained an 'ë' for use 
with "Citroën", but the dots on that 'ë' will have had the same shape as 
those on the 'ä', 'ö', and 'ü' umlauts for design consistency, and the 
other way round for French.)
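
The unification is visible in the Unicode data itself. The following 
Python sketch (an illustration using the standard unicodedata module; 
the helper name decompose is made up for this example) shows that the 
"German" and the "French" letters all decompose canonically to a base 
letter plus the same combining mark:

```python
import unicodedata


def decompose(ch):
    """Return the canonical decomposition (NFD) of ch as a list of 'U+XXXX' strings."""
    return ["U+%04X" % ord(c) for c in unicodedata.normalize("NFD", ch)]


# a, o, u with umlaut (German) and e with diaeresis (French "Citroën")
for ch in "\u00e4\u00f6\u00fc\u00eb":
    print(ch, decompose(ch))

# The shared combining mark is named after the diaeresis, not the umlaut.
print(unicodedata.name("\u0308"))
```

Each of the four letters ends in U+0308, so nothing at the character 
level distinguishes the two uses; any difference in appearance is left 
to the font.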

Regards,   Martin.

