NamesList.txt as data source

Ken Whistler kenwhistler at att.net
Fri Mar 11 12:24:29 CST 2016



On 3/11/2016 9:37 AM, Oren Watson wrote:
> Ok, so let me see if I understand this correctly. Suppose I'm writing 
> an editor for math equations, and I want the user to be able to press a 
> "Doublestruck" button and then type a C or D to get a ℂ or 𝔻 
> respectively. There is apparently no official source containing a 
> machine-readable table of the doublestruck equivalents of each 
> character that has such an equivalent. Such a table might also include 
> { -> ⦃ and such.
>
> This seems like something that would be very convenient to have 
> centralized and standardized.
>

O.k., it is taking more time to talk about this than to just make the lists.
See attached list, which took about 5 minutes to cull.

That lists the 24 "unifications" mentioned on page 7 of UTR #25, Unicode
Support for Mathematics:

http://www.unicode.org/reports/tr25/

It matches the 24 explicit cross-references listed in the Unicode names
list.
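
For the "Doublestruck" button in the question, those mappings might be
applied roughly as follows. This is only a sketch in Python: the helper
name double_struck is made up for illustration, the seven exceptions are
copied from the attached list, and it assumes the remaining double-struck
capitals run contiguously from U+1D538 (MATHEMATICAL DOUBLE-STRUCK
CAPITAL A).

```python
# Reserved double-struck code points and their 2100-block replacements,
# copied from the attached list.
DOUBLE_STRUCK_HOLES = {
    0x1D53A: 0x2102,  # C
    0x1D53F: 0x210D,  # H
    0x1D545: 0x2115,  # N
    0x1D547: 0x2119,  # P
    0x1D548: 0x211A,  # Q
    0x1D549: 0x211D,  # R
    0x1D551: 0x2124,  # Z
}

DOUBLE_STRUCK_CAPITAL_A = 0x1D538  # MATHEMATICAL DOUBLE-STRUCK CAPITAL A


def double_struck(ch):
    """Hypothetical helper: double-struck form of an ASCII capital.

    Any other input is returned unchanged.
    """
    if len(ch) == 1 and "A" <= ch <= "Z":
        cp = DOUBLE_STRUCK_CAPITAL_A + (ord(ch) - ord("A"))
        # Redirect reserved positions to the letters in the 2100 block.
        return chr(DOUBLE_STRUCK_HOLES.get(cp, cp))
    return ch
```

With this, double_struck("C") gives U+2102 and double_struck("D") gives
U+1D53B. Characters such as "{" pass through untouched, since no mapping
for them is part of the 24 unifications.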

If the ability to pull out such a list and make it "machine-readable" in
a few minutes doesn't suffice, and you need something which counts as a
more "official source", then the best way forward would be to engage with
the UTC during the next update cycle for UTR #25, when its associated data
table needs to be checked for the 9.0 repertoire additions, and advocate
that some further documentation be made explicitly for those 24 mappings.

BTW, all 24 *are* already present in MathClassEx-14.txt:

http://www.unicode.org/Public/math/revision-14/MathClassEx-14.txt

as commented-out entry lines. So an even faster way to get a centralized
(if not "official") list is to take MathClassEx-14.txt and

% grep '#1D' MathClassEx-14.txt | grep reserved > maplistout.txt

See also attached.
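
If the grep output needs to end up as an actual lookup table, something
like the following Python sketch would do it. The function name
parse_reserved and the regular expression are my own assumptions, based
only on the line format visible in the attached output (a leading "#",
the 1Dxxx code point, "=", the 2100-block code point, and a trailing
"<reserved>").

```python
import re

# Matches commented-out lines such as
#   #1D455=210E;N;;;;;ITALIC SMALL H <reserved>
# The pattern is an assumption derived from the attached sample lines.
RESERVED_LINE = re.compile(r"^#(1D[0-9A-F]{3})=([0-9A-F]{4});.*<reserved>\s*$")


def parse_reserved(lines):
    """Hypothetical helper: map reserved code point -> 2100-block code point."""
    mapping = {}
    for line in lines:
        m = RESERVED_LINE.match(line)
        if m:
            mapping[int(m.group(1), 16)] = int(m.group(2), 16)
    return mapping


# Two lines taken from the attached output, plus one line that is ignored.
sample = [
    "#1D455=210E;N;;;;;ITALIC SMALL H <reserved>",
    "#1D53A=2102;A;;Copf;ISOMOPF;;DOUBLE-STRUCK CAPITAL C <reserved>",
    "# an ordinary comment line",
]
mapping = parse_reserved(sample)
```

The same function can be fed the lines of MathClassEx-14.txt or of
maplistout.txt directly, since lines that do not match are skipped.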

As for starting down the road of suggesting additional equivalences, e.g.
for double-struck parentheses, that is certainly something somebody could
do, and might be interesting content to add to UTR #25 -- but it goes
beyond the formal unification issue for the 24 mathematical alphabet
letters already encoded in the 2100 block.

--Ken




-------------- next part --------------
1D455 ; 210E # planck constant
1D49D ; 212C # script capital b
1D4A0 ; 2130 # script capital e
1D4A1 ; 2131 # script capital f
1D4A3 ; 210B # script capital h
1D4A4 ; 2110 # script capital i
1D4A7 ; 2112 # script capital l
1D4A8 ; 2133 # script capital m
1D4AD ; 211B # script capital r
1D4BA ; 212F # script small e
1D4BC ; 210A # script small g
1D4C4 ; 2134 # script small o
1D506 ; 212D # black-letter capital c
1D50B ; 210C # black-letter capital h
1D50C ; 2111 # black-letter capital i
1D515 ; 211C # black-letter capital r
1D51D ; 2128 # black-letter capital z
1D53A ; 2102 # double-struck capital c
1D53F ; 210D # double-struck capital h
1D545 ; 2115 # double-struck capital n
1D547 ; 2119 # double-struck capital p
1D548 ; 211A # double-struck capital q
1D549 ; 211D # double-struck capital r
1D551 ; 2124 # double-struck capital z
-------------- next part --------------
#1D455=210E;N;;;;;ITALIC SMALL H <reserved>
#1D49D=212C;A;;Bscr;ISOMSCR;;SCRIPT CAPITAL B <reserved>
#1D4A0=2130;A;;Escr;ISOMSCR;;SCRIPT CAPITAL E <reserved>
#1D4A1=2131;A;;Fscr;ISOMSCR;;SCRIPT CAPITAL F <reserved>
#1D4A3=210B;A;;Hscr;ISOMSCR;;SCRIPT CAPITAL H <reserved>
#1D4A4=2110;A;;Iscr;ISOMSCR;;SCRIPT CAPITAL I <reserved>
#1D4A7=2112;A;;Lscr;ISOMSCR;;SCRIPT CAPITAL L <reserved>
#1D4A8=2133;A;;Mscr;ISOMSCR;;SCRIPT CAPITAL M <reserved>
#1D4AD=211B;A;;Rscr;ISOMSCR;;SCRIPT CAPITAL R <reserved>
#1D4BA=212F;A;;escr;ISOMSCR;;SCRIPT SMALL E <reserved>
#1D4BC=210A;A;;gscr;ISOMSCR;;SCRIPT SMALL G <reserved>
#1D4C4=2134;A;;oscr;ISOMSCR;;SCRIPT SMALL O <reserved>
#1D506=212D;A;;Cfr;ISOMFRK;;FRAKTUR CAPITAL C <reserved>
#1D50B=210C;A;;Hfr;ISOMFRK;;FRAKTUR CAPITAL H <reserved>
#1D50C=2111;A;;Ifr;ISOMFRK;;FRAKTUR CAPITAL I <reserved>
#1D515=211C;A;;Rfr;ISOMFRK;;FRAKTUR CAPITAL R <reserved>
#1D51D=2128;A;;Zfr;ISOMFRK;;FRAKTUR CAPITAL Z <reserved>
#1D53A=2102;A;;Copf;ISOMOPF;;DOUBLE-STRUCK CAPITAL C <reserved>
#1D53F=210D;A;;Hopf;ISOMOPF;;DOUBLE-STRUCK CAPITAL H <reserved>
#1D545=2115;A;;Nopf;ISOMOPF;;DOUBLE-STRUCK CAPITAL N <reserved>
#1D547=2119;A;;Popf;ISOMOPF;;DOUBLE-STRUCK CAPITAL P <reserved>
#1D548=211A;A;;Qopf;ISOMOPF;;DOUBLE-STRUCK CAPITAL Q <reserved>
#1D549=211D;A;;Ropf;ISOMOPF;;DOUBLE-STRUCK CAPITAL R <reserved>
#1D551=2124;A;;Zopf;ISOMOPF;;DOUBLE-STRUCK CAPITAL Z <reserved>