Standardized variation sequences for the Deseret alphabet?

Michael Everson everson at
Mon Mar 27 08:49:54 CDT 2017

On 27 Mar 2017, at 09:29, Martin J. Dürst <duerst at> wrote:

>> He is. He transcribes texts into Deseret. I’ve published three of them (Alice, Looking-Glass, and Snark).
> Great to know. Given that, I'd assume that you'd take his input a bit more seriously.

I’m discussing it now, offline, with him and Ken.

> Here's what he wrote:
> >>>>
> My own take on this is "absolutely not." This is a font issue, pure and simple. There is no dispute as to the identity of the characters in question, just their appearance.

That begs the whole question of character identity. He’s simply saying what you and Asmus also said. But when you dig into it further, there’s more to the story, as we have found out. 

> In any event, these two letters were never part of the "standard" Deseret Alphabet used in printed materials. To the extent they were used, it was in hand-written material only, where you're going to see a fair amount of variation anyway. There were also two recensions of the DA used in printed materials which are materially different, and those would best be handled via fonts.

There was indeed type cut for these. What’s not found is a full alphabet chart showing some of the ligated letters, but that’s a different question.

> It isn't unreasonable to suggest we change the glyphs we use in the Standard. Ken Beesley and I have discussed the possibility, and we both feel that it's very much on the table.
> >>>>

Now that further research has been done, I’ll be discussing this with John and Ken with regard to putting together a proposal which will support the two ligating letterform characters as well as some other historical Deseret characters, some used in an important English-Hopi lexicon which was recently published. (I await my copy of that.)

>> I am a designer and typographer, and I’ve worked rather extensively with a variety of Deseret fonts for my publications. They have been well-received.
> That's fine, and not disputed at all. That's exactly why I'm looking for input from other people.

Well, all right, but I didn’t use either �� or �� in my editions apart from the entry in the chart in the front matter. 

> As an analogy, assume we had a famous type designer coming to this list and request that we encode old-style digits separately from roman digits, e.g. arguing that this might simplify the production of fonts.

I don’t see how this analogy could possibly apply. Once again, the 1859 ligature-characters look nothing at all like the 1855 ones, which speaks to their unique identity as characters.

Moreover, encoded digits are used by billions of people daily.

> We would understand this request, but we would still deny it because based on our day-to-day use of digits, we would understand that at large (i.e. for the average user) the convenience of having only one code point for a given digit weighs more heavily than the convenience of separate code points for the type designer.

I’m not suggesting encoding characters for “convenience”. I’m suggesting that there is a character-identity issue here, based both on the origin of the characters and on their vastly different appearance from other characters encoded in the standard.

> We are looking for similar input from "average users" for Deseret.

The encoding of historic characters is for “expert users” working with historical material, not necessarily “average users” who might be composing blog entries. 

>> Actually neither of the ligature-letters is used in our Carrollian Deseret volumes.
> Ok. That means that these don't provide any information on the discussion at hand (whether to unify or disunify the ligature shapes).

I didn’t even know about the 1859 ligatures until this week. All this proves is that John didn’t use any ligatures when he transcribed the texts. 

>> You know, Martin, I *have* been doing this for the last two decades. I’m well aware of what a font is and can do.
> Great. So you know that present-day font technology would allow us to handle the different shapes in at least any of the following ways:
> 1) Separate characters for separate shapes, both shapes in same font

We shouldn’t do that for shapes so different and with clearly different origins.

> 2) Variant selectors, one or both shapes in same font

Pseudo-encoding, useful for subtle variation but not for something as big as this. I am not an enemy of variation selectors. In fact I’m preparing a nice proposal for some standardized sequences. It would not apply here, because the glyph identity of the letters is too distinct.

> 3) Font features (e.g. 1855 vs. 1859) to select shapes in the same font

Font trickery. Not portable. Not supported by most apps. 

> 4) Font selection, different fonts for different shapes

We really don’t do this just for one or two characters in a script. 

> Does that knowledge in any way suggest one particular solution?

None of this discussion has convinced me that these letters are variants of existing characters. 
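
To make the difference between options 1 and 2 concrete at the code-point level, here is a minimal Python sketch. It assumes U+10426 DESERET CAPITAL LETTER OI purely as an illustrative base letter (the message does not name code points), and the sequence with VARIATION SELECTOR-1 is hypothetical: no such Deseret sequence is claimed to be standardized. The helper `describe` is also just an illustration.

```python
import unicodedata

# Illustrative base letter (an assumption, not taken from the message).
OI = unicodedata.lookup("DESERET CAPITAL LETTER OI")  # U+10426
VS1 = "\uFE00"  # VARIATION SELECTOR-1

def describe(text):
    """List each code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

# Option 2 (hypothetical): base letter followed by a variation selector.
# The selector only requests a different glyph; the character stays the same.
hypothetical_sequence = OI + VS1

# A process that ignores or strips the selector falls back to the base letter,
# so the two forms are treated as one character with two appearances.
fallback = hypothetical_sequence.replace(VS1, "")

for line in describe(hypothetical_sequence):
    print(line)
print("falls back to base letter:", fallback == OI)
```

Under option 1, by contrast, the 1859 letterform would have its own code point and would never collapse to the existing letter in this way, which is the character-identity distinction being argued here.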

>> I’m also aware of what principles we have used for determining character identity.
> Which, as we have been working out in other mails, are indeed a collection of principles, one of which is history of shape derivation.

That and spelling. The only counterargument seems to be “they are diphthongs” but we don’t encode sounds, we encode the elements of writing systems. The 1859 ligated letterforms are not in any way glyph variants of the 1855 ligated letterforms. They’re completely different letterforms, having only the diagonal stroke of the �� in common.

>> I saw your note about CJK. Unification there typically has something to do with character origin and similarity. The Deseret diphthong letters are clearly based on ligatures of *different* characters.
> One of the principles of CJK unification is that minor differences are ignored if they are not semantically relevant. For CJK, 'minor' is important, because otherwise, many users wouldn't be able to recognize the shapes as having the same semantics/usage.

These would not be unified according to CJK principles: 

> The qualification 'minor' is less important for an alphabet. In general, the more established and well-known an alphabet is, the wider the variations of glyph shapes that may be tolerated. The question I'm trying to get an answer to for Deseret is whether current actual script users see the shape variation as just substitutable glyphs of the same letter, or inherently different letters.
> The answer to this question is not the *only* criterion for deciding whether to encode further Deseret letters, but I think it's an important criterion. And the answer that John has given seems to point in a very clear direction for this question.

John’s view was a first statement before many questions were asked and before research into the matter had commenced, really. 

I’ll get back to you after working with John and Ken some more. 

Michael Everson
