Call for feedback on UTS #18: Unicode Regular Expressions

Karl Williamson via Unicode unicode at unicode.org
Thu Jan 2 11:36:46 CST 2020


One thing I noticed in reviewing this is that the text about loose 
matching of the Name property has been removed, but I didn't see an 
explanation for the removal.  Please point me to it, or tell me what 
it is.

Specifically, these lines were removed:

As with other property values, names should use a loose match, 
disregarding case, spaces and hyphen (the underbar character "_" cannot 
occur in Unicode character names). An implementation may also choose to 
allow namespaces, where some prefix like "LATIN LETTER" is set globally 
and used if there is no match otherwise.

There are, however, three instances that require special-casing with 
loose matching, where an extra test shall be made for the presence or 
absence of a hyphen.

     U+0F68 TIBETAN LETTER A and
     U+0F60 TIBETAN LETTER -A
     U+0FB8 TIBETAN SUBJOINED LETTER A and
     U+0FB0 TIBETAN SUBJOINED LETTER -A
     U+116C HANGUL JUNGSEONG OE and
     U+1180 HANGUL JUNGSEONG O-E
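
To make concrete what the removed text asked of implementations, here is 
a rough sketch of such a lookup in Python.  It is my own illustration, 
not anything from UTS #18: the NAMES table is a toy subset, and fold(), 
has_marked_hyphen() and lookup() are hypothetical helper names.  The 
"extra test" assumes the significant hyphen always sits directly before 
the final letter, which holds for the three pairs listed above.

```python
import re

# Toy name table (an assumption for illustration): the six names quoted
# above plus one ordinary name. A real implementation would use the full
# Unicode Character Database.
NAMES = {
    "TIBETAN LETTER A": 0x0F68,
    "TIBETAN LETTER -A": 0x0F60,
    "TIBETAN SUBJOINED LETTER A": 0x0FB8,
    "TIBETAN SUBJOINED LETTER -A": 0x0FB0,
    "HANGUL JUNGSEONG OE": 0x116C,
    "HANGUL JUNGSEONG O-E": 0x1180,
    "LATIN SMALL LETTER A": 0x0061,
}


def fold(name):
    """Loose-match key: disregard case, whitespace and hyphens."""
    return re.sub(r"[\s-]", "", name).upper()


def has_marked_hyphen(name):
    """Extra test: is there a hyphen directly before the final letter?

    Whitespace is dropped first, so "LETTER -A" and "LETTER - A" agree.
    """
    compact = "".join(name.split())
    return len(compact) >= 2 and compact[-2] == "-"


# Index every name by its folded key; colliding names share one bucket.
INDEX = {}
for _name, _cp in NAMES.items():
    INDEX.setdefault(fold(_name), []).append((_name, _cp))


def lookup(query):
    """Return the code point for a loosely matched name, or None."""
    candidates = INDEX.get(fold(query), [])
    if len(candidates) == 1:
        return candidates[0][1]
    # Two names folded to the same key: apply the hyphen test to both
    # the query and each candidate, and pick the one that agrees.
    wanted = has_marked_hyphen(query)
    for name, cp in candidates:
        if has_marked_hyphen(name) == wanted:
            return cp
    return None
```

With this sketch, lookup("tibetan-letter a") gives U+0F68 because the 
hyphen there is not in the significant position, while 
lookup("Tibetan Letter -A") gives U+0F60.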



