Standardized variation sequences for the Deseret alphabet?

Martin J. Dürst duerst at
Mon Mar 27 02:05:12 CDT 2017

On 2017/03/27 01:20, Michael Everson wrote:
> On 26 Mar 2017, at 16:45, Asmus Freytag <asmusf at> wrote:

> Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts for the former. (Yes, I am of course aware that there are other reasons for distinguishing these, but as far as glyphs go, even our standard distinguishes them artificially.)

"apparently", maybe. Let's for a moment leave aside the radicals 
themselves, which are to a large extent artificial constructs. Let's 
look at the actual characters with these radicals (e.g. U+6709,... for 
MOON and U+808A,... for MEAT), in the multi-column code charts of ISO 
10646. There are some exceptions, but in most cases, the G/J/K columns 
show no difference (i.e. always the ⺝ shape, with two horizontal bars), 
whereas the H/T/V columns show the ⺼ shape (two downwards slanted bars) 
for the MEAT radical and the ⺝ shape for the MOON radical. So whether
these radicals have identical glyphs depends on typographic 
tradition/font/... In Japan, many people may be rather unaware of the 
difference, whereas in Taiwan, it may be that school children get 
drilled on the difference.
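
As a small aside, the encoding-level side of this can be checked with a
few lines of Python. This is only a sketch using the standard
unicodedata module; the helper name describe is made up for
illustration. It lists code points and character names only and says
nothing about glyph shapes, which depend on the font in use.

```python
import unicodedata

def describe(ch):
    """Return 'U+XXXX NAME' for one character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# The two radicals and the two example characters mentioned above.
for ch in ("\u2EBC", "\u2E9D", "\u808A", "\u6709"):
    print(describe(ch))
```

The radicals have distinct code points and names regardless of whether
a given font draws them differently.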

> One practical consequence of changing the chart glyphs now, for instance, would be that it would invalidate every existing Deseret font. Adding new characters would not.

Independent of whether the chart glyphs get changed, couldn't we just
add a note "also # in some fonts" (where # is the other variant)? That
would make sure that nobody could claim "this font is wrong" based on
the charts. (This would help even though the general statement that
chart glyphs aren't normative applies to all charts anyway.)

>> In fact, it would seem that if a Deseret text was encoded in one of the two systems, changing to a different font would have the attractive property of preserving the content of the text (while not preserving the appearance).
> Changing to a different font in order to change one or two glyphs is a mechanism that we have actually rejected many times in the past. We have encoded variant and alternate characters for many scripts.

Well, yes, rejected many times in cases where that was appropriate. But
also accepted many times, in cases that we may not even remember,
because the decisions may never have been made explicitly. In such
cases, the focus is usually not on a change to one or a few letter
shapes, but on a change of the overall style, which induces a change of
shape in some letters. The roman/italic a/ɑ and g/ɡ distinctions (the
latter code point of each pair being used only to show the distinction
in plain text, which could as well be done descriptively), as well as a
large number of distinctions in Han fonts, come to mind. I'm quite sure
other scripts have similar phenomena.
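
To make the a/ɑ and g/ɡ case concrete, here is a minimal Python sketch
(standard unicodedata module only; the function name folds_together is
an invented helper). It shows that the second member of each pair is a
separately encoded character that compatibility normalization does not
fold back into the plain letter, so using it is an encoding choice
rather than a font choice.

```python
import unicodedata

# Plain letter and the separately encoded single-storey form.
PAIRS = [("a", "\u0251"), ("g", "\u0261")]

def folds_together(x, y):
    """True if NFKC normalization maps x and y to the same string."""
    return unicodedata.normalize("NFKC", x) == unicodedata.normalize("NFKC", y)

for plain, special in PAIRS:
    print(f"U+{ord(plain):04X} {unicodedata.name(plain)}")
    print(f"U+{ord(special):04X} {unicodedata.name(special)}")
    print("  same after NFKC:", folds_together(plain, special))
```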

>> This, in a nutshell, is the criterion for making something a font difference vs. an encoding distinction.
> Character identity is not defined by any single criterion. Moreover, in Deseret, it is not the case that all texts which contain the diphthong /juː/ or /ɔɪ/ write it using EW �� or OI ��. Many write them as Y + U ���� and O + I ����. So the choice is one of *spelling*, and spelling has always been a primary criterion for such decisions.

This is interesting information. You are saying that in actual practice, 
there is a choice between writing ���� (two letters for a diphthong) and 
writing ��. In the same location, is ���� (the base for the historically 
later shape variant of ��; please note that this may actually be written 
����; there's some inconsistency in order between the above cited 
sentence and the text below copied from an earlier mail) also used as a 
spelling variant? Overall, we may have up to four variants, of which 
three are currently explicitly supported in Unicode. Are all of these 
used as spelling variants? Is the choice of variant up to the author 
(for which variants), or is it the editor or printer who makes the 
choice (for which variants)? And what informs this choice? If we have 
any historic metal types, are there examples where a font contains both 
ligature variants?

(Please note that because ��, ��, and �� are available as individual 
letters, it's very difficult to think about the two-letter sequences as 
anything other than spellings, but that doesn't necessarily carry over to
the ligatures.)

And then the same questions, with parallel (or not parallel) answers, 
for ɒɪ/ɔɪ/��.

Regards,    Martin.

Text copied from earlier mail by Michael:

1. The 1855 glyph for 𐐧 EW is evidently a ligature of the diagonal
stroke of the glyph for 𐐆 SHORT I [ɪ] and the glyph for 𐐅 LONG OO [uː],
that is, [ɪ] + [uː] = [ɪuː], that is, [ju].

2. The 1855 glyph for 𐐦 OI is evidently a ligature of the glyph for 𐐉
SHORT AH [ɒ] and the diagonal stroke of the glyph for 𐐆 SHORT I [ɪ],
that is, [ɒ] + [ɪ] = [ɒɪ], that is, [ɔɪ].

That’s encoded. Now evidently, the glyphs for the 1859 substitutions are 
as follows:

1. The 1859 glyph for EW is evidently a ligature of the diagonal
stroke of the glyph for 𐐆 SHORT I [ɪ] and the glyph for 𐐋 SHORT OO [ʊ],
that is, [ɪ] + [ʊ] = [ɪʊ], that is, [ju].

2. The 1859 glyph for OI is evidently a ligature of the glyph for 𐐃
LONG AH [ɔː] and the diagonal stroke of the glyph for SHORT I [ɪ], that
is, [ɔː] + [ɪ] = [ɔːɪ], that is, [ɔɪ].
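
The letters named in the two lists above can be looked up by character
name, which may help when comparing the ligatures with their supposed
components. The following Python sketch makes two assumptions that are
not in the mails: it uses the capital forms throughout, and the names
deseret, code_points and COMPONENTS are invented for illustration.

```python
import unicodedata

def deseret(name):
    """Look up a Deseret capital letter by short name, e.g. 'SHORT I'."""
    return unicodedata.lookup(f"DESERET CAPITAL LETTER {name}")

def code_points(text):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in text)

# Component letters named in the text for each ligature shape
# (capital forms assumed; the keys are only labels for this sketch).
COMPONENTS = {
    ("EW", 1855): ("SHORT I", "LONG OO"),
    ("OI", 1855): ("SHORT AH", "SHORT I"),
    ("EW", 1859): ("SHORT I", "SHORT OO"),
    ("OI", 1859): ("LONG AH", "SHORT I"),
}

for (ligature, year), parts in COMPONENTS.items():
    sequence = "".join(deseret(p) for p in parts)
    print(year, ligature, code_points(deseret(ligature)),
          "from", code_points(sequence))
```

Note that the sketch maps both the 1855 and the 1859 shapes to the same
EW and OI characters, since only one character is looked up per
ligature here.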
