Private Use areas

Janusz S. Bień via Unicode unicode at unicode.org
Fri Aug 24 11:40:07 CDT 2018


On Fri, Aug 24 2018 at 16:12 +0300, eliz at gnu.org writes:
>> From: jsbien at mimuw.edu.pl (Janusz S. Bień)
>> Cc: unicode at unicode.org,  richard.wordingham at ntlworld.com
>> Date: Thu, 23 Aug 2018 21:47:03 +0200
>> 
>> I'm very glad you joined the discussion.
>
> I'm sorry for not joining sooner.  In my defense, I missed the
> reference to Emacs, and the rest of the discussion is not really
> interesting for me, as using PUA for new characters is not something I
> have interest in or experience with.

I don't think you missed anything important.

>
>> My needs are very simple, for example C-x 8 Return LATIN CAPITAL LETTER
>> A WITH MACRON AND BREVE [MUFI] should yield the character with the code
>> E010. I can provide the list of names and codes.
>
> So you'd like to extend "C-x 8 RET" to recognize names of additional
> characters and associate them with codepoints in the PUA area?  That
> shouldn't be hard to add.

I would prefer extensibility over efficiency; I don't mind loading the
PUA information from a source declared somewhere in .emacs.d, so that I
can change or expand the list of characters from time to time.
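
To make the idea of a user-maintained list concrete, here is a minimal
sketch in Python (not Emacs Lisp, and not how Emacs does or would
implement it). The "HEX;NAME" file format and the names SAMPLE_TABLE,
parse_pua_table and lookup are assumptions made for illustration only;
the single entry is the MUFI example quoted above.

```python
import unicodedata

# Hypothetical table format: one "HEX;NAME" entry per line, '#' starts a comment.
SAMPLE_TABLE = """\
# code;name
E010;LATIN CAPITAL LETTER A WITH MACRON AND BREVE [MUFI]
"""

def parse_pua_table(text):
    """Return a dict mapping upper-cased character names to one-character strings."""
    names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, name = line.split(";", 1)
        names[name.strip().upper()] = chr(int(code, 16))
    return names

def lookup(name, pua_names):
    """Try the user's PUA names first, then fall back to the standard Unicode names."""
    key = name.strip().upper()
    if key in pua_names:
        return pua_names[key]
    return unicodedata.lookup(name)  # raises KeyError for unknown names
```

Because the table is plain text read at load time, changing or expanding
the list only means editing the file and reloading it.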

> But is that all?  Won't you also want to tell Emacs about the
> properties of those characters?

Personally, I would also like to be able to change the case of a letter
or string, and I am willing to prepare the necessary information for
MUFI characters.
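
The kind of information I have in mind is just a table of case pairs.
A minimal sketch in Python follows, again only as an illustration and
not as a proposal for the Emacs implementation. U+E010 is the code
mentioned above for the capital letter; its lowercase partner U+E011 is
a placeholder invented for this sketch and is NOT taken from MUFI. The
names PUA_LOWER, PUA_UPPER, pua_lower and pua_upper are likewise
hypothetical.

```python
# Case pairs for PUA characters, supplied by the user.
# U+E010 comes from the example in this message; U+E011 is an invented
# placeholder for its lowercase form, not a real MUFI assignment.
PUA_LOWER = {"\ue010": "\ue011"}
PUA_UPPER = {low: up for up, low in PUA_LOWER.items()}

def pua_lower(s, table=PUA_LOWER):
    """Lowercase s, consulting the PUA table before the standard mapping."""
    return "".join(table.get(ch, ch.lower()) for ch in s)

def pua_upper(s, table=PUA_UPPER):
    """Uppercase s, consulting the PUA table before the standard mapping."""
    return "".join(table.get(ch, ch.upper()) for ch in s)
```

Characters that are not in the table go through the ordinary Unicode
case mapping, so mixed strings are handled in one pass.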

Displaying other properties would be nice, but for me this is not
crucial. Moreover, somebody has to prepare the data...

> or be able to set up fonts for displaying them?

It would be nice. I haven't asked for it because I typeset my texts with
XeTeX or LuaTeX, and the input is more important to me than rendering.

> IOW, would it be okay to have these
> characters be "second-class citizens" in Emacs?

For me it would be acceptable.

BTW, I just got a perhaps crazy idea: what about treating a PUA
declaration (as you probably noticed, there may be conflicting ones) as
a separate coding system? Of course, some mechanism for escaping the
standard PUA interpretation would be needed.

>
>> > It is true that the Unicode related data is produced at build time,
>> > but only some of that is actually recorded in the Emacs binary, the
>> > rest is loaded upon demand.  But all the data is stored in data
>> > structures that are mutable, given some Lisp programming.
>> 
>> I was never fluent in Lisp programming and by now I have forgotten
>> almost everything I knew, so it's not a task for me. I was thinking
>> about submitting a feature request, but I have also forgotten the
>> proper procedure for doing so.
>
> The proper procedure is to type "M-x report-emacs-bug RET" and then
> describe the feature(s) you'd like to see added/improved.

I will definitely remember now :-)

>
>> Moreover I had the impression that I'm the only person who needs
>> it...
>
> That shouldn't stop you.  Many a feature in Emacs started as a request
> from a single individual.
>
>> > (It is not clear to me which part of the Unicode data you would like
>> > to change; are you talking about adding characters to the list of
>> > those defined by Unicode?  If you are using the PUA codepoints, it's
>> > possible that you will need to update Emacs's notion of PUA as well.)
>> 
>> Yes, I would like the PUA codepoints to be handled analogously to the
>> proper ones. What do you mean by Emacs's notion of PUA?
>
> Emacs knows about the PUA regions of the Unicode code-space, and
> treats those codepoints specially.  The features you request will
> probably need to affect the PUA region as well, because the codepoints
> you use should no longer be treated as PUA.
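
For reference, the PUA regions in question are U+E000..U+F8FF in the
BMP plus most of planes 15 and 16; their codepoints have the general
category Co. A small Python sketch of the membership test (the names
PUA_RANGES and is_private_use are mine, and this says nothing about how
Emacs itself represents these regions):

```python
import unicodedata

# The three Private Use ranges defined by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)

def is_private_use(ch):
    """Return True if the single character ch lies in a Private Use range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

# The standard library reports the same thing through the general category.
assert is_private_use("\ue010")
assert unicodedata.category("\ue010") == "Co"
```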

Best regards

Janusz

-- 
             ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


