Diaeresis vs. umlaut (was: Re: Standaridized variation sequences for the Desert alphabet?)
Martin J. Dürst
duerst at it.aoyama.ac.jp
Sun Mar 26 04:37:49 CDT 2017
On 2017/03/25 03:33, Doug Ewell wrote:
> Philippe Verdy wrote:
>> But Unicode just prefered to keep the roundtrip compatiblity with
>> earlier 8-bit encodings (including existing ISO 8859 and DIN
>> standards) so that "ü" in German and French also have the same
>> canonical decomposition even if the diacritic is a diaeresis in French
>> and an umlaut in German, with different semantics and origins.
> Was this only about compatibility, or perhaps also that the two signs
> look identical and that disunifying them would have caused endless
> confusion and misuse among users?
I'm not sure to what extent this was explicitly discussed when Unicode
was created. The fact that the first 256 code points are identical to
those in ISO-8859-1 was used as a big selling point when Unicode was
first introduced. It may well have been that for Unicode, there was no
discussion at all in this area, because ISO-8859-1 was already so well
established.
And for ISO-8859-1, space was an important concern. Ideally, both
Icelandic and Turkish (and the letters missing for French) would have been
covered, but that wasn't possible. Disunifying diaeresis and umlaut
would have been an unaffordable luxury.
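
For readers who want to check the two points above on a current system, here is a minimal sketch in Python using only the standard library. It checks that the 256 ISO-8859-1 byte values decode to the first 256 Unicode code points, and that 'ü' has one canonical decomposition whatever language it is used for. The helper name decompose is my own, purely for illustration.

```python
import unicodedata

# Each of the 256 ISO-8859-1 byte values should decode to the
# Unicode code point with the same number.
latin1_matches_unicode = (
    bytes(range(256)).decode("latin-1") == "".join(chr(i) for i in range(256))
)

def decompose(ch):
    """Hypothetical helper: list (code point, name) for the NFD form of ch."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c))
        for c in unicodedata.normalize("NFD", ch)
    ]

if __name__ == "__main__":
    print(latin1_matches_unicode)
    # The same decomposition applies to German "für" and French "Saül":
    # Unicode names the combining mark a diaeresis in both cases.
    for pair in decompose("\u00fc"):
        print(pair)
```

The character names come straight from the Unicode Character Database shipped with Python, so the output shows no separate "umlaut" code point: both uses end up as U+0075 followed by U+0308 COMBINING DIAERESIS.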
The above reasons mask any inherent reasons for why diaeresis and umlaut
would have been unified or not if the decision had been argued purely
"on the merits". But having used both German and French, and e.g. looking
at the situation in Switzerland, where it was important to be able to
write both French and German on the same typewriter, I would definitely
argue that disunifying them would have caused endless
confusion and errors among users.
Also, it was argued a few mails ago that diaeresis and umlaut don't look
exactly the same. I remember well that when Apple introduced its first
laser printers, there were widespread complaints that the fonts (was it
Helvetica, Times Roman, and Palatino?) unified away the traditional
differences in the cuts of these typefaces for different languages.
So to quite some extent, in the relevant period (i.e. the 1970s/80s),
the differences between diaeresis and umlaut may be due to design
differences in the cuts for different languages (e.g. French and
German). Nobody would have disunified some basic letters because they
may have looked slightly different in cuts for different languages, and
so people may also have been just fine with unifying diaeresis and
umlaut. (German fonts e.g. may have contained a 'ë' for use e.g. with
"Citroën", but the dots on that 'ë' will have been the same shape as
'ä', 'ö', and 'ü' umlauts for design consistency, and the other way
round for French).