Usage stats?

Richard Wordingham richard.wordingham at ntlworld.com
Sat Mar 28 18:29:21 CDT 2015


On Sat, 28 Mar 2015 00:59:56 +0000
Richard Wordingham <richard.wordingham at ntlworld.com> wrote:

> On Fri, 27 Mar 2015 16:27:26 -0400
> Michael Norton <michaelanortonster at gmail.com> wrote:
> 
> > Easy example: what's the code for [blank space] U+0020 across all
> > language sets of Unicode?  Is it the same, i.e. 100%?

I've seen a claim from a normally reliable source that U+0020 is
extremely rare in Thai or Japanese text.  It does occur in Japanese
text, though quite possibly as an error for U+3000 IDEOGRAPHIC SPACE.

In Thai, U+0020 is an extremely common and prescribed punctuation
mark. It is reliably used as a clause and sentence separator, and is
also used to delimit names and numbers composed of digits.  In
newspaper columns, it occurs in most lines, and in books there are
usually several to the line.  The other common punctuation marks in
serious material are the abbreviation mark U+002E FULL STOP (especially
for initialisms) and the list item separator U+002C COMMA.  Quotation
marks, exclamation marks and ellipses occur in fictional dialogue with
pretty much the same meaning as in English.
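
Claims like these can be checked against a sample of one's own text. What
follows is only a minimal sketch in Python: the function name
punctuation_stats and the choice of tracked characters are illustrative
assumptions, not anything from this thread, and "per line" here means per
non-empty line of whatever sample is supplied.

```python
import unicodedata
from collections import Counter

# Characters mentioned in the message; U+3000 is tracked for the Japanese case.
# This list is an illustrative assumption and can be extended freely.
TRACKED = ["\u0020", "\u002E", "\u002C", "\u3000"]


def punctuation_stats(text):
    """Return (counts, spaces_per_line) for a text sample.

    counts maps the Unicode name of each tracked character to the number
    of times it occurs; spaces_per_line is the mean number of U+0020
    characters per non-empty line.
    """
    counts = Counter(ch for ch in text if ch in TRACKED)
    named = {unicodedata.name(ch): counts[ch] for ch in TRACKED}
    # Lines consisting only of whitespace are ignored for the average.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    per_line = counts["\u0020"] / len(lines) if lines else 0.0
    return named, per_line
```

Calling punctuation_stats on the contents of a Thai newspaper column or a
book chapter would give the per-line figure for U+0020 alongside the counts
for FULL STOP and COMMA, so the "most lines" and "several to the line"
observations can be compared with real data.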

Richard.

