UTN-11 Normalisation Algorithm
Richard Wordingham
richard.wordingham at ntlworld.com
Tue Nov 15 18:36:10 CST 2022
What is the licensing status of the code in UTN #11 to 'normalise'
Myanmar codepoint sequences? The description of the ordering rules
has combining classes that work similarly to Unicode canonical
normalisation, and the algorithm reorganises characters between those
with combining class 0 to find a sequence that complies with the
ordering requirements. The solution is reported to be unique, but need
not exist.
The algorithm is:
1) Stably sort the characters by combining class.
2) Fix up the positions of ASAT, SIGN AI and ANUSVARA.
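To make step 1 concrete, here is a minimal sketch of my own in Python; it is not the UTN #11 code and is not derived from it. The function name stable_reorder and its cls parameter are hypothetical. As a stand-in for the UTN #11 ordering classes, which I do not reproduce here, it defaults to the canonical combining class from unicodedata.combining; a real use would pass a function giving the Myanmar ordering classes instead. Step 2 is deliberately left out.

```python
import unicodedata
from typing import Callable


def stable_reorder(text: str,
                   cls: Callable[[str], int] = unicodedata.combining) -> str:
    """Stably sort each run of characters with non-zero class.

    Characters of class 0 act as barriers and never move. The default
    `cls` is the Unicode canonical combining class, used here only as a
    stand-in (assumption) for the UTN #11 ordering classes.
    """
    out: list[str] = []
    run: list[str] = []  # pending characters with non-zero class
    for ch in text:
        if cls(ch) == 0:
            # sorted() is guaranteed stable, so characters of equal
            # class keep their relative order.
            out.extend(sorted(run, key=cls))
            run.clear()
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=cls))  # flush the final run
    return "".join(out)


# Hypothetical toy class table, purely to show the mechanism.
toy_classes = {"x": 2, "y": 1}
print(stable_reorder("AxyB", lambda ch: toy_classes.get(ch, 0)))
```

With the default class function this is just the canonical reordering step of NFD applied to already decomposed text, which is the sense in which the UTN #11 classes "work similarly".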
The code presented for Part 2 looks general, misleadingly so, but is
really a way of handling a set of special cases. (Art that
conceals labour.) I therefore fear that even a translation to a
different language would be a derived work under copyright law. So, is
a licence available for using or changing this code?
I am asking because I would like to convert some Mon text from a
Unicode-like encoding to a reasonable Unicode encoding. One part of my
code will obviously be dependent on the probably font-specific encoding
used. I will realise no income from this activity.
Incidentally, as there need not be a reordering compliant with the UTN
#11 ordering requirements, I presume there is no definitive
reorganisation in such cases, and that the term 'normalisation' is
actually a misnomer.
Richard.