UTN-11 Normalisation Algorithm

Richard Wordingham richard.wordingham at ntlworld.com
Tue Nov 15 18:36:10 CST 2022


What is the licensing status of the code in UTN #11 to 'normalise'
Myanmar codepoint sequences?  The description of the ordering rules
uses combining classes that work similarly to those of Unicode canonical
normalisation, and the algorithm reorders the characters between those
with combining class 0 to find a sequence that complies with the
ordering requirements.  The solution is reported to be unique, but need
not exist.

The algorithm is:

1) Stably sort the characters by combining class.
2) Fix up the positions of ASAT, SIGN AI and ANUSVARA.
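
For illustration only, step 1 on its own can be sketched as below.  This
is a generic stable sort of each run of characters lying between
characters of combining class 0, written from the description above and
not taken from the UTN #11 code.  The function name stable_sort_runs and
its combining_class parameter are my own hypothetical names; the actual
UTN #11 class assignments are not reproduced here, and step 2 is not
attempted.

```python
import unicodedata
from typing import Callable, List


def stable_sort_runs(text: str, combining_class: Callable[[str], int]) -> str:
    """Stably sort each run of non-zero-class characters by class.

    Characters for which combining_class returns 0 act as barriers:
    they stay where they are and nothing is moved across them.
    """
    out: List[str] = []
    run: List[str] = []
    for ch in text:
        if combining_class(ch) == 0:
            # sorted() is stable, so characters of equal class keep
            # their original relative order.
            out.extend(sorted(run, key=combining_class))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=combining_class))
    return "".join(out)


# Stand-in demonstration using Unicode canonical combining classes
# (NOT the UTN #11 classes): acute (230) and dot below (220) after 'a'.
reordered = stable_sort_runs("a\u0301\u0323", unicodedata.combining)
```

With Unicode canonical combining classes as the stand-in, the dot below
(class 220) is moved in front of the acute (class 230), just as
canonical reordering would do.  A table of the UTN #11 classes would
have to be supplied by the caller in place of unicodedata.combining.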

The code presented for Part 2 looks general, but misleadingly so: it is
really handling a set of special cases.  (Art that conceals labour.)  I
therefore fear that even a translation to a different language would be
a derivative work under copyright law.  So, is a licence available for
using or changing this code?

I am asking because I would like to convert some Mon text from a
Unicode-like encoding to a reasonable Unicode encoding. One part of my
code will obviously be dependent on the probably font-specific encoding
used.  I will realise no income from this activity.

Incidentally, as there need not be a reordering compliant with the UTN
#11 ordering requirements, I presume there is no definitive
reorganisation in such cases, and that the term 'normalisation' is
actually a misnomer.

Richard.

