Compatibility normalization

Piotr Karocki pkar at ieee.org
Thu Oct 12 13:29:30 CDT 2023


> if one must map some text with diacritics onto text in ISO Basic Latin
> letters (ASCII letters) for purposes beyond just fuzzy matching, it is
> usually better to use (with awareness of the language in use) an
> appropriate transcription scheme rather than just removing all
> diacritics; see German DIN 91379 for European languages[3], Vietnamese
> Telex[4], Gwoyeu Romatzyh for Mandarin tones[5], Revised Romanisation
> for Korean vowels[6], etc.)
Such a mapping is useful for creating filenames that stay within the POSIX
portable filename character set. That character set is intended to produce
filenames that can be used on any existing file system.
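
A minimal sketch of such a mapping in Python, assuming the simple
"remove the diacritics" approach rather than a language-aware transcription
scheme. The function name to_portable_filename and the underscore fallback
are hypothetical choices made for illustration. The target set is the POSIX
portable filename character set (A-Z, a-z, 0-9, period, underscore, hyphen),
and POSIX also says a portable filename should not start with a hyphen.

```python
import re
import unicodedata

# Anything outside the POSIX portable filename character set:
# A-Z a-z 0-9 period underscore hyphen
_NOT_PORTABLE = re.compile(r"[^A-Za-z0-9._-]+")


def to_portable_filename(text: str, fallback: str = "_") -> str:
    """Hypothetical helper: squeeze text into the portable character set."""
    # NFKD splits precomposed letters into base letter + combining marks
    # and also applies compatibility mappings.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (non-zero canonical combining class).
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Replace each run of remaining non-portable characters (assumed policy).
    name = _NOT_PORTABLE.sub(fallback, stripped)
    # A portable filename should not begin with a hyphen.
    if name.startswith("-"):
        name = fallback + name
    return name or fallback


if __name__ == "__main__":
    for sample in ("Crème brûlée.pdf", "Żółć.txt"):
        print(sample, "->", to_portable_filename(sample))
```

Note that a letter such as Polish "ł" has no Unicode decomposition, so this
sketch replaces it with the fallback character instead of "l". That is the
kind of loss a proper transcription table would avoid.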

