Names for control characters

Per Starbäck starback at stp.lingfil.uu.se
Wed Mar 12 14:37:57 CDT 2014


Ken Whistler wrote:
> Please be very careful here. Having a non-empty value in field 1 of
> UnicodeData.txt is *not* the same as "having a Unicode name".
>
> See:
>
> http://www.unicode.org/versions/Unicode6.2.0/ch04.pdf#G135207

I know it's not a name. My question was *why* control characters don't
*have* names like

  CONTROL CHARACTER NULL
  CONTROL CHARACTER START OF HEADING
  CONTROL CHARACTER START OF TEXT
  etc.

It would be so obvious to have it like that that I assume there is some
specific reason not to, but I still can't figure it out. To me there is
no less reason for these characters to have names than for any others,
so it is as if Linear B characters didn't have names, and I got the
answer "no problem, they have aliases, so that's OK!" This is just
strange to me. If names aren't needed, why do almost all characters have
them?

This is not about Emacs. Emacs was an example of a program that has a
use for character names, and has a harder job because of this strangeness.
Too bad that (Emacs developer) Eli Zaretskii sees it as a rant against
Emacs when I mention that this property of Unicode has led to
longstanding (small) bugs there, but I think real examples are better
than made-up ones.

> If Emacs were to use "ALERT" or the abbreviation "BEL" for U+0007, ...

Yes, programs could have their own lists of preferred aliases to use,
or have a rule such as always use the first alias, but why? Why not have
a name, so programs don't have to choose which alias to use?
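
As a rough illustration of what such a per-program rule ends up looking
like, here is a minimal sketch in Python (not what Emacs actually does).
The table PREFERRED_ALIASES and the function display_name are made up
for this example, and which alias goes in the table is exactly the
arbitrary choice I mean:

```python
import unicodedata

# Hypothetical per-program table of preferred aliases for a few control
# characters. The choices are this sketch's own, nothing normative.
PREFERRED_ALIASES = {
    0x0000: "NULL",
    0x0001: "START OF HEADING",
    0x0002: "START OF TEXT",
    0x0007: "ALERT",  # could just as well have been "BEL"
}


def display_name(ch):
    """Return a printable name for the single character `ch`."""
    # unicodedata.name() returns the given default when the character
    # has no name, which is the case for control characters.
    name = unicodedata.name(ch, None)
    if name is not None:
        return name
    cp = ord(ch)
    # No name: fall back to the program's own choice of alias.
    if cp in PREFERRED_ALIASES:
        return PREFERRED_ALIASES[cp]
    # No alias chosen either: use a code point label for controls,
    # and a plain U+XXXX notation for anything else.
    if unicodedata.category(ch) == "Cc":
        return "<control-%04X>" % cp
    return "U+%04X" % cp


print(display_name("A"))
print(display_name("\x07"))
print(display_name("\x1b"))
```

With real names for the control characters, everything after the first
lookup would be unnecessary.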

(I may be coming off as having a mission about this; "it should be done
like this!!", but mostly this is just a question: "it seems obvious it
should be done like this, so what am I missing?")


