German sharp S uppercase mapping
Daniel Buncic
daniel.buncic at uni-koeln.de
Tue Nov 26 14:41:29 CST 2024
Dear Marius, dear Ivan, dear Peter, dear all,
Thanks to Marius for the compromise idea that the ß → SS mapping could
remain in the standard table while ß → ẞ is handled as special casing
for German. However, I wonder which language the standard table's
mapping would then serve, given that ß is used in no language other
than German. (Or if the ß → SS rule were then applied only to those few
older non-German texts that did use ß, it would be wrong in most cases,
as in
this Polish Bible from 1846:
https://books.google.de/books?id=W4xbAAAAMAAJ&hl=de. Google Books,
certainly on the basis of some ß → ss rule, gives one of the words in
the title as “Wssystko”, but that does not make sense; the word spelled
“Wßystko” on the title page has to be transcribed as “Wszystko”
(‘Whole’), in the same way as e.g. the first word in the heading of
Genesis is spelled “PIERWSZE” (‘First’), not, of course, “PIERWSSE”.)
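To make the difference between the mappings concrete, here is a minimal
Python sketch. It assumes that Python's built-in str.upper() applies the
untailored full case mapping (ß → SS); the three helper functions are
hypothetical names introduced only for illustration and are not part of
any library or of the Unicode data files.

```python
# Sketch only: the helper names below are illustrative assumptions.
CAPITAL_SHARP_S = "\u1E9E"  # ẞ, LATIN CAPITAL LETTER SHARP S


def upper_default(text: str) -> str:
    """Untailored uppercasing as performed by str.upper() (ß -> SS)."""
    return text.upper()


def upper_german_capital_sharp_s(text: str) -> str:
    """Hypothetical German tailoring: map ß to ẞ, then uppercase the rest."""
    return text.replace("ß", CAPITAL_SHARP_S).upper()


def transcribe_old_polish(text: str) -> str:
    """Hypothetical transcription for old Polish prints where ß stands for sz."""
    return text.replace("ß", "sz")


print(upper_default("Straße"))                 # STRASSE
print(upper_german_capital_sharp_s("Straße"))  # STRAẞE
print(transcribe_old_polish("Wßystko"))        # Wszystko
print(upper_default("Wßystko"))                # WSSYSTKO
```

The last line shows the problem with the Polish title: a language-blind
ß → SS rule yields a form that is not a Polish word.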
As to the interpretation of spelling rules, one has to know that “auch”
(‘also’) in normative dictionaries always separates a secondary form
from a preferred one. Equal options are separated by “oder” (‘or’) or
merely by a comma or a slash. In this light, see the change from the
previous version of the rule (§25 E3) to the current one:
“Bei Schreibung mit Großbuchstaben schreibt man SS. Daneben ist auch die
Verwendung des Großbuchstabens ẞ möglich. Beispiel: Straße – STRASSE –
STRAẞE.”
(‘When writing in capital letters, one writes SS. In addition to this,
the use of the capital letter ẞ is also possible. Example: Straße –
STRASSE – STRAẞE.’ –
https://www.rechtschreibrat.com/DOX/rfdr_Regeln_2016_redigiert_2018.pdf,
p. 29)
↓
“Bei Schreibung mit Großbuchstaben ist neben der Verwendung des
Großbuchstabens ẞ auch die Schreibung SS möglich: Straße – STRAẞE –
STRASSE.”
(‘When writing in capital letters, in addition to using the capital
letter ẞ, it is also possible to write SS: Straße – STRAẞE – STRASSE.’ –
https://www.rechtschreibrat.com/DOX/RfdR_Amtliches-Regelwerk_2024.pdf,
p. 48)
Before, capital ẞ was classified as ‘also possible’; now SS is ‘also
possible’, and the order of the examples was changed as well, from
“STRASSE – STRAẞE” to “STRAẞE – STRASSE”. If they had meant the
alternatives to be equal, they would have written something like “Bei
Schreibung mit Großbuchstaben kann man ẞ oder SS schreiben” (‘When
writing in capital letters, one can write ẞ or SS’). It is correct that
the order by itself does not indicate a preference, but the wording does.
Peter, can you give me an example of an implementation that would crash
if there were a new version of CaseFolding.txt or SpecialCasing.txt?
Wouldn’t a programmer normally copy the data from the file into their
application so that it still works if the unicode.org server is down?
In that case, changing the original would have no effect until the
programmer decides to adopt the change in their application, and then it
would be their responsibility to take care of the effects of that change
within their application. Or, in the worst case, the application would
download its data directly from, say,
https://www.unicode.org/Public/16.0.0/ucd/CaseFolding.txt, but then a
new version would simply be stored under …/17.0.0/… and would not
affect the application. How can a new version of a file like this
directly “break existing implementations”? Probably I am
misunderstanding something here.
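To illustrate what I have in mind, here is a rough Python sketch of an
application that bundles its own copy of the data and pins the version
in the URL. The constant and function names are hypothetical, and the
bundled text is only a four-line excerpt in the CaseFolding.txt layout
(code; status; mapping; # name), not the full file. Nothing is
downloaded when this runs.

```python
# Sketch only: names are illustrative assumptions, and BUNDLED_EXCERPT is
# a tiny excerpt in the CaseFolding.txt format, not the complete file.
UCD_VERSION = "16.0.0"  # pinned; changes only when the programmer edits it


def case_folding_url(version: str = UCD_VERSION) -> str:
    """Build the versioned URL; a later release lives under another path."""
    return f"https://www.unicode.org/Public/{version}/ucd/CaseFolding.txt"


BUNDLED_EXCERPT = """\
# Excerpt copied into the application at build time
0041; C; 0061; # LATIN CAPITAL LETTER A
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
"""


def parse_case_folding(text: str) -> dict:
    """Return {status: {source_char: folded_string}} from the file format."""
    table = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        code, status, mapping = [f.strip() for f in data.split(";")[:3]]
        folded = "".join(chr(int(cp, 16)) for cp in mapping.split())
        table.setdefault(status, {})[chr(int(code, 16))] = folded
    return table


FOLDS = parse_case_folding(BUNDLED_EXCERPT)
print(case_folding_url())  # the pinned 16.0.0 URL
print(FOLDS["F"]["ß"])     # ss
```

In such a setup, a file published under a new version directory would
not be seen by the application until UCD_VERSION or the bundled copy is
changed deliberately.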
Best wishes,
Daniel
--
Prof. Dr. Daniel Bunčić
===============================================================
Slavisches Institut der Universität zu Köln
Weyertal 137, D-50931 Köln
Telefon: +49 (0)221 470-90535
Sprechstunden: https://uni.koeln/ENZEB
E-Mail: daniel.buncic at uni-koeln.de = daniel at buncic.de
Threema: https://threema.id/8M375R5K
===============================================================
Homepage: http://daniel.buncic.de/
Academia: http://uni-koeln.academia.edu/buncic
ResearchGate: https://researchgate.net/profile/Daniel-Buncic-2
===============================================================