Standardized variation sequences for the Deseret alphabet?
Martin J. Dürst
duerst at it.aoyama.ac.jp
Tue Mar 28 07:10:58 CDT 2017
On 2017/03/27 21:59, Michael Everson wrote:
> On 27 Mar 2017, at 08:05, Martin J. Dürst <duerst at it.aoyama.ac.jp> wrote:
>>> Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts for the former. (Yes, I am of course aware that there are other reasons for distinguishing these, but as far as glyphs go, even our standard distinguishes them artificially.)
>> "apparently", maybe. Let's for a moment leave aside the radicals themselves, which are to a large extent artificial constructs.
> I do stipulate not being a CJK expert. But those are indeed different due to their origins, however similar their shapes are.
Except for the radicals themselves, I haven't found a contrasting pair.
What I think we would need to find to influence the current
argumentation (apart from the general "history is important", on which I
think we all agree) is a case of a character that originally existed both
with a MEAT radical and with a MOON radical, but has only a single usage.
Then whether there were one or two code points would provide an analog
for the situation we have at hand.
Also note that there is a difference in meaning. The characters with
MEAT radicals mostly refer to body parts and organs. The characters with
MOON radicals are mostly time-related.
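For readers who want to check the code points mentioned above, here is a minimal Python sketch. It only looks up the standard character names of U+2EBC and U+2E9D with the standard-library unicodedata module; the helper name describe is made up for this illustration, and nothing here says anything about how a given font draws the glyphs.

```python
import unicodedata

# The two radicals discussed in this thread, as separate code points.
RADICALS = ["\u2ebc", "\u2e9d"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for radical in RADICALS:
    print(describe(radical))
```

Whether the two print with different shapes depends entirely on the font in use, which is the point made further down about the G/J/K versus H/T/V columns.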
>> Let's look at the actual characters with these radicals (e.g. U+6709,... for MOON and U+808A,... for MEAT), in the multi-column code charts of ISO 10646. There are some exceptions, but in most cases, the G/J/K columns show no difference (i.e. always the ⺝ shape, with two horizontal bars), whereas the H/T/V columns show the ⺼ shape (two downwards slanted bars) for the "MEAT" radical and the ⺝ shape for the moon radical. So whether these radicals have identical glyphs depends on typographic tradition/font/…
> They are still always very similar, right?
Similarity is in the eye of the beholder (or the script).
Sometimes, a little dot or hook is irrelevant. Sometimes it's the single
difference that makes it a totally different character.
>> In Japan, many people may be rather unaware of the difference, whereas in Taiwan, it may be that school children get drilled on the difference.
> That’s interesting.
Not necessarily for the poor Taiwanese students, and not necessarily for
the Japanese who try to find a character in a dictionary ordered by
>>> Changing to a different font in order to change one or two glyphs is a mechanism that we have actually rejected many times in the past. We have encoded variant and alternate characters for many scripts.
>> Well, yes, rejected many times in cases where that was appropriate. But also accepted many times, in cases that we may not even remember, because they may not even have been made explicitly.
> Do come up with examples if you have any.
I had the following in mind:
>> The roman/italic a/ɑ and g/ɡ distinctions (the latter code points only used to show the distinction in plain text, which could as well be done descriptively),
> Aa and Ɑɑ are used contrastively for different sounds in some languages and in the IPA. Ɡɡ is not, to my knowledge, used contrastively with Gg (except that ɡ can only mean /ɡ/, while orthographic g can mean /ɡ/, /dʒ/, /x/ etc.). But g vs ɡ is reasonably analogous to and <lig></lig> being used for /juː/.
The contrastive use *in some languages or notations* (IPA) is the reason
these are separately encoded. The fact that these are not contrastively
used in most major languages is responsible for the fact that they don't
use different code points when used in these languages. It would be a
real hassle to have to change from g to ɡ when switching e.g. from Times
Roman to Times Italic.
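To make the "separately encoded" part concrete, here is a small Python sketch, offered as an illustration only. It uses the standard-library unicodedata module to show that ɑ (U+0251) and ɡ (U+0261) are code points distinct from a and g, and that NFKC normalization does not fold them back onto the ordinary letters. The names PAIRS, describe and survives_nfkc are made up for this example.

```python
import unicodedata

# Ordinary letter paired with its separately encoded counterpart.
PAIRS = [("a", "\u0251"), ("g", "\u0261")]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def survives_nfkc(ch):
    """True if compatibility normalization leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch


for plain, special in PAIRS:
    print(describe(plain), "|", describe(special),
          "| kept by NFKC:", survives_nfkc(special))
```

So text that uses plain a and g keeps those code points whatever the font does with the glyphs; ɑ and ɡ only appear where an author deliberately enters them.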
In Deseret, we are still missing any contrastive usage, which suggests
being careful with encoding.
>> as well as a large number of distinctions in Han fonts, come to my mind.
It's difficult to show these distinctions, because they are NOT
separately encoded, but the three-stroke versus four-stroke grass radical
is the best-known example.
> And the same goes for the /juː/ ligatures. The word tube /tjuːb/ can be written TYŪB or or <>. But the unligated sequences would be pronounced differently: /tjuːb/ and /tɪuːb/ and /tɪʊb/.
Ah, I see. So we seem to have five different ways (counting the two
ligature variants) of writing the same word, with three different
pronunciations. The important question is whether the two ligatures do
imply any difference in pronunciation (as opposed to time of writing or
author/printer preference), i.e. whether the ligated sequences or
<> are pronounced differently (not by a phonologist but by an
>> Is the choice of variant up to the author (for which variants), or is it the editor or printer who makes the choice (for which variants)?
> In a handwritten manuscript obviously the choice is the author’s. As to historical printing, printers may have
Did you want to write something more here?
>> And what informs this choice? If we have any historic metal types, are there examples where a font contains both ligature variants?
> Ken Beesley has samples of a metal font (the 1857 St Louis punches) which had both and ; I don’t know what other sorts were in that font.
As I explained in another post, that may just be an 1855/1859 hybrid.