Standardized variation sequences for the Deseret alphabet?

Michael Everson everson at
Thu Mar 23 08:32:46 CDT 2017

> On 23 Mar 2017, at 05:54, Martin J. Dürst <duerst at> wrote:
> Hello Michael, others,
> [Fixed script name in subject.]
> On 2017/03/23 09:03, Michael Everson wrote:
>> On 22 Mar 2017, at 21:39, David Starner <prosfilaes at> wrote:
>>> There's the same characters here, written in different ways.
>> No, it’s not. It’s the same diphthong (a sound) written with different letters.
> I think this may well be the *historically* correct analysis. And that may have some influence on how to encode this, but it shouldn't be dominant.

Well, Martin, maybe you’re comfortable with shifting goalposts, but we have used historically correct analysis to identify characters in the past and to continue with this precedent is consistent with good practice. 

> What's most important is (past and) *current use*. If the distinction is an orthographic one (e.g. different words being written with different shapes), then that's definitely a good indication for splitting.

It *is* an orthographic one. For one thing, the 1859 glyphs look NOTHING LIKE the 1855 glyphs. 

> On the other hand, if fonts (before/outside Unicode) only include one variant at a time, if people read over the variant without much ado, if people would be surprised to find both corresponding variants in one and the same text (absent font variations), if there are examples where e.g. the variant is adjusted in quotes from texts that used the 'old' variant inside a text with the 'new' variants, and so on, then all these would be good indications that this is, for actual usage purposes, just a font difference, and should therefore best be handled as such.

Um, yeah. Why have Unicode at all? I mean people in Georgia were happy with ASCII-based font hacks. Lots of people are still using them. Sure, people put up with the unification of Coptic and Greek. 

Just font differences. Yeah. 

> The closest to the current case that I was able to find was the German ß. It has roots in both an ss and an sz (to be precise, an ſs and an ſz) ligature (see ß). And indeed in some fonts, its right part looks more like an s, and in other fonts more like a z (and in lower case, more often like an s, but in upper case, much more like a (cursive) Z). Nevertheless, there is only one character (or two if you count upper case) encoded, because anything else would be highly confusing to virtually all users.

The situation of the Deseret diphthong letters isn’t anything like German ß. Yes, you can analyse it as something like ſs and ſȥ, but THOSE LOOK VERY NEARLY ALIKE.

Ignoring the stroke of SHORT I which is the same for all the Deseret letters being discussed, we have EW represented by �� and �� (which look nothing alike) and OI represented by �� and �� (which look nothing alike).

A unification of these as “glyph variants” is perverse and not consistent with the way we have encoded things in the past.

> What is right for Deseret has to be decided by and for Deseret users, rather than by script historians.

Odd. That view doesn’t seem to be applicable to CJK unification.

