Private Use Area in Use (from Tag characters and in-line graphics (from Tag characters))

Nathan Sharfi mailinglists at ngalt.com
Tue Jun 9 18:46:06 CDT 2015


> On Jun 3, 2015, at 1:26 AM, William_J_G Overington <wjgo_10009 at btinternet.com> wrote:
> 
> Private Use Area in Use (from Tag characters and in-line graphics (from Tag characters))
> 
> 
>>> That's not agreed upon. I'd say that the general agreement is that the private ranges are of limited usefulness for some very limited use cases (such as designing encodings for new scripts).
> 
> 
>> They are of limited usefulness precisely because it is pathologically hard to make use of them in their current state of technological evolution. If they were easy to make use of, people would be using them all the time. I’d bet good money that if you surveyed a lot of applications where custom characters are being used, they are not using private use ranges. Now why would that be?
> 
> 
> Actually, I have used Private Use Area characters a lot, and, once I had got used to them, I found them incredibly straightforward to use.

That's nice; I've found some persistent annoyances when I use PUA codepoints.

A while back I learned Quikscript, an alternate English orthography. Since May 2013, my blog's been in Quikscript using PUA codepoints. I've also joined the Shavian mailing list, sent e-mails in Shavian, and written an "I'm switching my Quikscript blog to Shavian" blog post in Shavian for April Fools' Day. To do all this typing, I made both Quikscript and Shavian keyboard layouts for OS X, as well as a Quikscript font. All of my Quikscript stuff is linked to from https://www.frogorbits.com/qs/ if you're interested.

I'm something of a Johnny-come-lately to Shavian, so I've only used it in the SMP with fonts others have made.

So, how much nicer is dealing with Shavian?

- The Keyboard Viewer and input-source preview know what font to use for each key for Shavian; Quikscript keyboard layouts display boxes for the letters because there's no way for the system to guess which font to use for a particular codepoint.
- Double-tapping a Shavian word in my browser will select the word; double-tapping a Quikscript word will select just one letter.
- Internet Explorer will happily break Quikscript text in the middle of a word; Shavian gets broken at word boundaries just like English. While IE's behavior is unlike other browsers' and Not What I Want, I can't fault the IE team; I could be using PUA code points for a language that doesn't use spaces much, like Japanese.
- I can read and write Shavian posts on Twitter on the desktop in a reasonable font for both Shavian and other scripts; if I wanted to do the same in Quikscript, I'd have to have a custom user-supplied stylesheet to override Twitter's own font suggestions.
- Scripts already in Unicode attract the attention of talented completionist organizations that PUA communities generally can't attract beforehand. Everson Mono, Noto, and Segoe UI Historic (as of Windows 10) — all great typefaces — support Shavian and not Quikscript.

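These differences mostly come down to two things:

- With PUA code points, I could have multiple fonts installed that map wildly differing meanings and glyphs to the same code point; the OS can't guess which one I mean.
- For an encoded script like Shavian, all the information that the OS needs to detect word breaks is in the character properties data supplied by the Consortium and handled by the OS. PUA code points get only generic default properties, so the OS can't tell that they're letters.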
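To make the second point concrete, here's a minimal Python sketch using only the standard library. U+E000 is just an arbitrary PUA code point standing in for a Quikscript letter (an assumption for illustration, not the code point my font actually uses), and describe() and find_words() are hypothetical helper names. Python's \w is only a rough stand-in for the word-break rules an OS or browser applies, but it relies on the same kind of per-character property data.

```python
import re
import unicodedata

# SHAVIAN LETTER PEEP, a letter encoded in the SMP.
SHAVIAN_PEEP = "\U00010450"
# Arbitrary Private Use Area code point (assumption: stands in for a
# Quikscript letter; the real assignment depends on the font in use).
PUA_SAMPLE = "\uE000"


def describe(ch):
    """Return (code point, General_Category, name) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.name(ch, "<no name>"),  # PUA code points have no name
    )


def find_words(text):
    """Split text into 'words' using the regex engine's idea of a letter."""
    return re.findall(r"\w+", text)


# Two space-separated "words" in each script.
shavian_text = SHAVIAN_PEEP * 3 + " " + SHAVIAN_PEEP * 2
pua_text = PUA_SAMPLE * 3 + " " + PUA_SAMPLE * 2

print(describe(SHAVIAN_PEEP))            # category is Lo (a letter)
print(describe(PUA_SAMPLE))              # category is Co (private use)
print(len(find_words(shavian_text)))     # words found in the Shavian text
print(len(find_words(pua_text)))         # words found in the PUA text
```

The Shavian letter is reported as a letter and the two words are found; the PUA code point is reported as private use with no name, and no words are found at all. Software that consults the published character data has nothing to go on for the PUA text.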

~ ~ ~

Specialists like us might be able to put up with these things, but we can't control everything about the reading and writing experience online unless we're all resigned to taking pictures of handwritten text.

