The native name of Tai Viet script and language(s)

Peter Constable via Unicode unicode at
Mon Aug 26 23:56:35 CDT 2019

"As the proposal for TaiViet script to the Unicode is still on the progress, we use the Private Use Area for TaiViet characters (U+F000..U+F07E)."

Er... The script has been in Unicode for about 10 years, since Unicode 5.2.
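If it helps when auditing the old text, here is a minimal Python sketch (not anything from the Emacs sources) that flags characters in the legacy Private Use range quoted above and characters in the encoded Tai Viet block, U+AA80..U+AADF. The names `LEGACY_PUA`, `TAI_VIET_BLOCK`, `classify`, and `is_private_use` are made up for illustration. It makes no attempt to map the old PUA code points to their Unicode equivalents, since that mapping is not given here.

```python
import unicodedata

# Tai Viet block as encoded in Unicode since version 5.2.
TAI_VIET_BLOCK = range(0xAA80, 0xAAE0)
# Legacy range quoted from the old documentation (U+F000..U+F07E, inclusive).
LEGACY_PUA = range(0xF000, 0xF07F)


def is_private_use(ch):
    """True if the character has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def classify(text):
    """Return (legacy PUA characters, Tai Viet block characters) found in text."""
    pua = [ch for ch in text if ord(ch) in LEGACY_PUA]
    tai_viet = [ch for ch in text if ord(ch) in TAI_VIET_BLOCK]
    return pua, tai_viet


if __name__ == "__main__":
    # Hypothetical sample: one legacy PUA character, one ASCII letter,
    # and one character from the Tai Viet block.
    sample = "\uf000a\uaa80"
    pua, tai_viet = classify(sample)
    print("legacy PUA:", [f"U+{ord(c):04X}" for c in pua])
    print("Tai Viet block:", [f"U+{ord(c):04X}" for c in tai_viet])
```

Any character reported in the first list would need to be replaced by hand with the corresponding character from the Tai Viet block.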

The block description in 16.8 of Unicode 12 provides useful info:

What may be helpful to understand is that "Tai" refers, at one level, to an entire language family that encompasses languages spoken from southern Thailand in the south to central China in the north, and from Vietnam in the east to eastern India in the west. "Tai" can also be used at a different level as the name for individual languages in that family (with either an un-aspirated /t/ as in /tai/, or an aspirated /tʰ/ as in /tʰai/ — and in China /t/ is usually written with “d”), though usually a distinguishing qualifier is added to the name, as in “Tai Dam” or “Dehong Dai”. Thai, aka Siamese, is a particular exception.

So, Tai Viet is used for writing various Tai languages in Vietnam and Laos, and reportedly also in Central Thailand. These are all distinct languages. IIRC, the script name “Tai Viet” was coined because of predominant use in Vietnam, not because that’s what any user community historically would call the script.

The script _is_ related to Thai script, but I’m not sure I would say it has “the same origin as that of Thai language/script used in Thailand”, as that is too simplistic a view of the historic connections: it suggests that Thai script and Tai Viet developed directly from the same precursor, which isn’t really accurate.

And the mentions of language reflect misunderstanding.

“TaiViet refers to the Tai language used by Tai people in Vietnam…”

No, it does not refer to a language at all. And “_the_ Tai language… in Vietnam” misunderstands the language situation: of over 100 languages spoken in Vietnam, there are 32 languages from the Tai-Kadai language family, and 12 from the Southwestern Tai branch, which is the branch that includes Thai (Siamese). To say “the language [has] the same origin as that of Thai…” isn’t correct in that there isn’t _one_ language involved. It would be accurate to say that the languages written with the Tai Viet script are closely related to Thai (in the same sense that French, Spanish, Italian, etc. are closely related to one another).

For more on the Southwestern Tai languages, see

Hope that’s of some help.


-----Original Message-----
From: Unicode <unicode-bounces at> On Behalf Of Eli Zaretskii via Unicode
Sent: Thursday, August 22, 2019 6:46 AM
To: unicode at
Subject: The native name of Tai Viet script and language(s)

Could someone "in the know" please help me make the Tai Viet script documentation in Emacs accurate?

The current short description we have is in the file lisp/language/tai-viet.el in the Emacs source tree.  You can see it


My concern is with the text under "sample-text" (line 40) and in the documentation string following that (starting on line 48), which states the name of the script and the language expressed with Tai Viet characters.

However, that text is from long ago, before Unicode had a Tai Viet block, so it still uses at least one PUA character, which I think is incorrect.  In addition, I didn't find any place where I could copy/paste the current accurate name of the script and at least one of the languages that use that script.

Could someone please help me set this text straight?  Bonus points for also telling how to say "hello" (or any similar greeting) in one of the Tai Viet languages, so that we could add that to the etc/HELLO file.  (I think the sample-text attempts to include such a greeting, but again, I'm not sure it is correct.)

Thanks in advance.