Standardized variation sequences for the Deseret alphabet?

Michael Everson everson at evertype.com
Mon Mar 27 07:07:19 CDT 2017


On 27 Mar 2017, at 06:42, Martin J. Dürst <duerst at it.aoyama.ac.jp> wrote:

>> The default position is NOT “everything is encoded unified until disunified”.
> 
> Neither is it “everything is encoded separately unless it's unified”.

These Deseret letters aren’t encoded. For my part I wasn’t made aware of them in 2004 when they were written about. My view is “Ah, here’s something. Is it encoded? No. Is it a glyph variant of something encoded? No.”

>> The characters in question have different and undisputed origins, undisputed.
> 
> If you change that to the somewhat more neutral "the shapes in question have different and undisputed origins", then I'm with you. I actually have said as much (in different words) in an earlier post.

And what would the value of this be? Why should I (who have been doing this for two decades) not be able to use the word “character” when I believe it correct? Sometimes you people who have been here for a long time behave as though we had no precedent, as though every time a character is proposed for encoding nothing had ever been encoded before.

>> We’ve encoded one pair; evidently this pair was deprecated and another pair was devised. The letters wynn and w are also used for the same thing. They too have different origins and are encoded separately. The letters yogh and ezh have different origins and are encoded separately. (These are not perfect analogies, but they are pertinent.)
> 
> Fine. I (and others) have also given quite a few analogies, none of them perfect, but most if not all of them pertinent.

The sharp s analogy wasn’t useful because, whether it is ſs or ſz, users can’t tell and don’t care. No Fraktur fonts, for instance, offer a shape for U+00DF that looks like an ſs. And what Antiqua fonts do, well, you get this:

https://en.wikipedia.org/wiki/%C3%9F#/media/File:Sz_modern.svg

And there’s nothing unrecognizable about the ſɜ (< ſꝫ (= ſz)) ligature there. The situation in Deseret is different.

Other analogies had to do with normal shape variation, not shapes derived from underlying ligatures. Analogies are never perfect but I don’t think the ones offered were pertinent.

Underlying ligature difference is indicative of character identity. Particularly when two resulting ligatures are SO different from one another as to be unrecognizable. And that is the case with EW on the left and OI on the right here: 
https://en.wikipedia.org/wiki/Deseret_alphabet#/media/File:Deseret_glyphs_ew_and_oi_transformation_from_1855_to_1859.svg

The lower two letterforms are in no way “glyph variants” of the upper two letterforms. Apart from the stroke of the SHORT I they share nothing in common, because they come from different sources and are therefore different characters.

>>> We haven't yet heard of any contrasting uses for the letter shapes we are discussing.
>> 
>> Contrasting use is NOT the only criterion we apply when establishing the characterhood of characters.
> 
> Sorry, but where did I say that it's the only criterion? I don't think it's the only criterion. On the other hand, I also don't think that historical origin is or should be the only criterion.

Neither do I, but it has been a very clear precedent for many character distinctions, and that precedent is useful.

> Unfortunately, much of what you wrote gave me the impression that you may think that historical origin is the only criterion, or a criterion that trumps all others. If you don't think so, it would be good if you could confirm this. If you think so, it would be good to know why.

Character origin is intimately related to character identity, even where superficial similarity is concerned. I had to prove character origin for the disunification of YOGH from EZH long, long ago and I’ve done the same over and over again for many characters and even full scripts. Sometimes characters are used and then become disused. MOST of the Bamum characters we have encoded aren’t in modern use today, but they were encoded for historical concerns.

>> Please try to remember that. (It’s a bit shocking to have to remind people of this.)
> 
> You don't have to remind me, at least. I have mentioned "usability for average users in average contexts" and "contrasting use" as criteria, and I have also in earlier mail acknowledged history as a (not the) criterion, and have mentioned legacy/roundtrip issues. I'm sure there are others.

I don’t think that ANY user of Deseret is all that “average”. Certainly some users of Deseret are experts interested in the script origin, dating, variation, and so on — just as we have medievalists who do the same kind of work. I’m about to publish a volume full of characters from Latin Extended-D. My work would have been impossible had we not encoded those characters. 

Michael Everson

