Concise term for non-ASCII Unicode characters

Daniel Bünzli daniel.buenzli at
Tue Sep 29 14:27:28 CDT 2015

On Tuesday, 29 September 2015 at 19:50, Ken Whistler wrote:
> I agree that "scalar values greater than U+007F" doesn't just trip off the tongue,
> and while technically accurate, it is bad terminology -- precisely because it
> begs the question "wtf are 'scalar values'?!" for the average engineer.

And an average engineer knows how to look up definitions. This one is precise and exceptionally well defined in the Unicode glossary, in stark contrast to the shady (and, for the newbie, deceptive) notion of "character" that you use subsequently in your message.

This is not "bad terminology"; it is *precise* terminology, and it is what I would like to see used in protocols and standards.

Many programmers I talk to are confused by Unicode because their notion of a Unicode "character" is a chaotic mix of scalar values, code points and their various *encodings* (i.e. byte-level considerations).

Introducing more terminology to talk about that confused idea of Unicode is not going to help. Educating about the difference between scalar values, code points and their various encodings will.
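
To make the three notions concrete, here is a minimal Python sketch. It is only an illustration: the function names are made up for this example, and the ranges are the ones given by the Unicode definitions of code point (0x0000 to 0x10FFFF) and scalar value (code points excluding the surrogates 0xD800 to 0xDFFF).

```python
# Hypothetical helper names, chosen for this illustration only.

def is_code_point(n: int) -> bool:
    # A code point is any integer in the Unicode codespace.
    return 0x0000 <= n <= 0x10FFFF

def is_scalar_value(n: int) -> bool:
    # A scalar value is any code point that is not a surrogate.
    return is_code_point(n) and not (0xD800 <= n <= 0xDFFF)

def is_non_ascii_scalar_value(n: int) -> bool:
    # "Scalar values greater than U+007F", stated as a predicate.
    return is_scalar_value(n) and n > 0x7F

# One scalar value, three different byte-level encodings.
u = 0x00FC                      # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
s = chr(u)
utf8 = s.encode("utf-8")        # 2 bytes
utf16 = s.encode("utf-16-be")   # 2 bytes
utf32 = s.encode("utf-32-be")   # 4 bytes

# A surrogate is a code point but not a scalar value, so it has no
# UTF-8 encoding; Python's strict encoder refuses it.
try:
    chr(0xD800).encode("utf-8")
    surrogate_encodable = True
except UnicodeEncodeError:
    surrogate_encodable = False
```

The scalar value is the same in all three cases; only the bytes differ. That is the distinction that gets lost when everything is lumped together under "character".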


