Could U+E0001 LANGUAGE TAG become undeprecated please? There is a good reason why I ask

wjgo_10009@btinternet.com via Unicode unicode at unicode.org
Mon Feb 10 04:29:00 CST 2020


Hi

Could U+E0001 LANGUAGE TAG become undeprecated please? There is a good
reason why I ask.

There is a German song, Lorelei, and I searched to find an English 
translation.

I found the following video.

https://www.youtube.com/watch?v=lJ3JhxOUbw0

The video is an instrumental version. What is particularly interesting
is that lyrics are displayed in four languages, with two versions of
the translation in English.

Being a native speaker of English and living in England, I first watched
the video viewing just the version labelled "British:". Later I played
the video again and viewed just the version labelled "U.S.".

Remembering that I had some time ago heard a version in Esperanto, I
searched and found the two following videos.

https://www.youtube.com/watch?v=reUpdGgdBsA

https://www.youtube.com/watch?v=7dHhTXDmP0k

They may be of the same recording. The first has the text of the lyrics
in its notes.

The song in Esperanto has the rather expressive Esperanto word belega in
it. This single word, an adjective, is composed from the Esperanto word
bela, which means beautiful, augmented with the Esperanto word-building
component -eg-, which modifies the word to which it is attached so as to
indicate greatness. So the word belega expresses in one three-syllable
Esperanto word the concept that is in English "greatly beautiful".

http://esperanto.davidgsimpson.com/eo-affixes.html

Thinking of the first video to which I linked, it occurred to me that a
plain text message could be sent containing two or more versions of the
same text (for whatever text, probably a short message in practice),
each version in a different language from the other or others, and each
preceded by a tag sequence indicating its language. Software at the
receiving end could then be set to a chosen language and only text in
that language would be displayed.
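
As a rough illustration of the idea above, here is a minimal Python
sketch. It assumes each version is introduced by U+E0001 LANGUAGE TAG
followed by tag characters (U+E0020 to U+E007E, which mirror ASCII)
spelling a language code such as "en". The function names and the
French wording are only illustrative assumptions, not part of any
standard, and U+E0001 is at present deprecated.

```python
LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG (currently deprecated)
TAG_OFFSET = 0xE0000         # tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E


def encode_tag(lang):
    """Return U+E0001 followed by the tag-character form of an ASCII language code."""
    return LANGUAGE_TAG + "".join(chr(TAG_OFFSET + ord(c)) for c in lang)


def encode_versions(versions):
    """Concatenate (language, text) pairs, each text preceded by its tag sequence."""
    return "".join(encode_tag(lang) + text for lang, text in versions)


def decode_versions(message):
    """Split a tagged message back into (language, text) pairs, in order.

    Any text before the first U+E0001 is ignored in this sketch.
    """
    versions = []
    for chunk in message.split(LANGUAGE_TAG)[1:]:
        i = 0
        # Consume the run of tag characters that spells the language code.
        while i < len(chunk) and 0xE0020 <= ord(chunk[i]) <= 0xE007E:
            i += 1
        lang = "".join(chr(ord(c) - TAG_OFFSET) for c in chunk[:i])
        versions.append((lang, chunk[i:]))
    return versions


# Illustrative message with two versions of the same text.
message = encode_versions([
    ("en", "Please enter the password"),
    ("fr", "Veuillez saisir le mot de passe"),
])
wanted = [text for lang, text in decode_versions(message) if lang == "fr"]
```

The receiving software would display only the entries in `wanted`; the
tag characters themselves are never shown to the end user.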

Thinking around this idea, I thought that this could be very useful in
The Internet of Things for machine to human communication, whereby if,
say, an end user (human) wants to hold a dialogue with a device (thing),
then the technique could be used to send the message

Please enter the password

from the thing in a number of languages. The decoding software in the 
end user's computer could use the first message in the list as the 
default if the sequence sent by the thing does not have a version for 
the particular language set by the end user in his or her computer.
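
The fallback rule just described could be sketched as follows, assuming
the message has already been split into (language, text) pairs in the
order sent. The name choose_version and the German wording are
hypothetical, chosen only for this illustration.

```python
def choose_version(versions, preferred):
    """Return the text whose language matches `preferred`.

    If no version matches, fall back to the first version in the list,
    which the sender is assumed to treat as the default.
    """
    if not versions:
        return ""
    for lang, text in versions:
        if lang == preferred:
            return text
    return versions[0][1]


# Versions in the order the thing sent them; English comes first.
versions = [
    ("en", "Please enter the password"),
    ("de", "Bitte geben Sie das Passwort ein"),
]
shown_to_german_user = choose_version(versions, "de")
shown_to_esperanto_user = choose_version(versions, "eo")  # no "eo" version, so the first is used
```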

The list of languages supported by a particular thing would not be 
specified by a universal standard, but could perhaps have English, 
French, German and one or more others depending upon the location and
application of the thing. Any language expressible in Unicode could be 
included in the list.

Support for Unicode characters beyond plane 0 is much more widely
available in software these days.

I know that people have been urged to use a higher level protocol for
indicating language in documents, but please consider the case where one
wants to assemble a status report automatically by combining reports
from each of a number of mutually independent sensors on the Internet of
Things, each of relatively small size, located in a variety of physical
locations perhaps miles apart. In such a case the concatenation of such
plain text sequences would be straightforward.
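
To show what such concatenation might look like, here is a small Python
sketch. It assumes each sensor emits text in which every version is
preceded by U+E0001 and tag characters spelling a language code; the
sensor reports, the names tag and texts_for, and the French wording are
all invented for the illustration.

```python
import re

# U+E0001, then one or more tag characters, then text up to the next U+E0001.
SEGMENT = re.compile("\U000E0001([\U000E0020-\U000E007E]+)([^\U000E0001]*)")


def tag(lang):
    """Return U+E0001 followed by the tag-character form of an ASCII language code."""
    return "\U000E0001" + "".join(chr(0xE0000 + ord(c)) for c in lang)


def texts_for(stream, preferred):
    """Return, in order, every text segment in `stream` tagged with `preferred`."""
    out = []
    for tag_chars, text in SEGMENT.findall(stream):
        lang = "".join(chr(ord(c) - 0xE0000) for c in tag_chars)
        if lang == preferred:
            out.append(text)
    return out


# Two independent (hypothetical) sensors, each reporting in two languages.
sensor_a = tag("en") + "Tank level: low\n" + tag("fr") + "Niveau du reservoir : bas\n"
sensor_b = tag("en") + "Door: closed\n" + tag("fr") + "Porte : fermee\n"

# Combining the reports is plain string concatenation.
status_report = sensor_a + sensor_b
english_report = "".join(texts_for(status_report, "en"))
```

One point to bear in mind with this sketch is that plain concatenation
does not mark where one sensor's report ends and the next begins, so a
per-report default language would need some additional convention.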

Such an undeprecating of U+E0001 LANGUAGE TAG would, in my opinion, 
contribute to the development of The Internet of Things.

William Overington

Monday 10 February 2020


