L2/18-181

William_J_G Overington via Unicode unicode at unicode.org
Thu May 17 05:04:25 CDT 2018


Otto Stolz wrote:

> I wonder how English and French ever could be made to use a single script, let alone German (“ß”), Icelandic (“þ”), Swedish (“å”), Latvian (“ē”), Czech (“č”) or – you name it.

Years ago I used to hand set metal type - letterpress printing was a family hobby.

For a fount of type of a particular style, size and case, there was a typecase subdivided into areas of various sizes. There was a more or less standard lay of the typecase: for example, a lowercase e was in a larger area than a lowercase q, because there were more pieces of type of a lowercase e than of a lowercase q. Each letter was in a known place, so that a lowercase e was in the same place in any of the typecases. There were also a number of small areas near the edge of the typecase which were unspecified and could be used for extra sorts, as they were known.

I had become interested in Esperanto and bought some of each of twelve sorts, so as to augment a fount used for printing in English to be able to print in Esperanto as well.

These sorts were placed in some of the small areas near the edge of the typecase. Had I wanted to print in French I could have bought the accented sorts needed for French. Indeed the type catalogue from the typefounder had a list of which sorts were needed for each of various European languages. I learned most of that list.

This has proved useful at times. In the early 1970s two researchers were trying to translate a research paper from what they thought was Spanish into English and were having problems. I was able to point out that it was not Spanish but Portuguese, as there was an a tilde in the text, even though I do not know Portuguese.

There was a publication by the Monotype Corporation, published in 1963.

Languages of the world that can be set on 'Monotype' machines / compiled by R.A. Downie.

I have just looked it up in the British Library online catalogue.

I bought a copy of the publication in the 1960s. I do not have it immediately to hand. Does anyone have a copy readily available and can say what is said about Assamese in that book please?

Going back to look at what was done in relation to Assamese with metal type - not just the Monotype brand - could be an interesting insight.

I notice that Otto Stolz mentions the following.

> Icelandic (“þ”),

Yet the thorn character was part of English too.

Yet it was lost from English.

Was that because William Caxton got his founts of metal type from the European mainland and the necessary sort was not in the fount?

Is the same sort of thing happening now, over five hundred years later, in relation to Assamese?

Maybe people should be helping to get this resolved to the satisfaction of all, rather than criticising.

By the way, in relation to language identification, Unicode has a perfectly good plain text mechanism for language identification built into it, using the character

U+E0001 LANGUAGE TAG

and other tag characters.
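
As a rough illustration of the mechanism, here is a minimal Python sketch. It assumes the usual layout of the Tags block, in which each tag character is the corresponding printable ASCII character plus 0xE0000, so that a language tag is U+E0001 followed by the language code spelled in tag characters. The function names are hypothetical and chosen only for this example, and U+E0001 is deprecated, as discussed in this post.

```python
from typing import Optional

LANGUAGE_TAG = "\U000E0001"   # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000          # tag character = printable ASCII character + 0xE0000


def encode_language_tag(lang: str) -> str:
    """Return U+E0001 followed by the language code spelled in tag characters."""
    for ch in lang:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not a printable ASCII character: {ch!r}")
    return LANGUAGE_TAG + "".join(chr(TAG_OFFSET + ord(ch)) for ch in lang)


def decode_language_tag(text: str) -> Optional[str]:
    """Return the language code of the first language tag in text, or None."""
    start = text.find(LANGUAGE_TAG)
    if start == -1:
        return None
    chars = []
    for ch in text[start + 1:]:
        cp = ord(ch)
        if 0xE0020 <= cp <= 0xE007E:
            chars.append(chr(cp - TAG_OFFSET))
        else:
            break  # first non-tag character ends the tag
    return "".join(chars)


# Hypothetical usage: mark a run of plain text as Assamese ("as").
tagged = encode_language_tag("as") + "some plain text"
print(decode_language_tag(tagged))
```

The tag characters have no visible glyphs, so the tagged string displays the same as the untagged one in software that ignores them.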

All of the tag characters were deprecated years ago, against opposition by at least two of the contributors to this present thread; all except U+E0001 have been undeprecated more recently.

There is a note in the code chart.

>> This character is deprecated, and its use is strongly discouraged.

It does not say by whom it is discouraged, though, nor why.

www.unicode.org/charts/PDF/UE0000.pdf

I opine that it is time for a rethink on this and that U+E0001 should be undeprecated and its application be encouraged instead of all the stuff about using higher level protocols all the time - after all, higher level protocols are not encouraged instead when people want to send emoji.

William Overington

Thursday 17 May 2018



