Removing accents and diacritics from a word
Tex via Unicode
unicode at unicode.org
Wed Jul 17 13:37:38 CDT 2019
Asmus, are you including the case where an accented character maps to two unaccented characters?
e.g. Å to AA or Ä to AE
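To make the distinction concrete, here is a minimal Python sketch, not a definitive implementation. It assumes the "base form" is obtained by canonical decomposition (NFD) followed by dropping combining marks, which is only one possible reading of the question. The function name `strip_diacritics` and the `EXPANSIONS` table are hypothetical, introduced purely for illustration; which letters expand to two characters depends on the language and the convention in use.

```python
import unicodedata

# Hypothetical expansion table for illustration only; real conventions
# differ by language and are not defined by this thread.
EXPANSIONS = {"Å": "AA", "å": "aa", "Ä": "AE", "ä": "ae"}


def strip_diacritics(text, expansions=None):
    """Return text with combining marks removed.

    If an expansions mapping is given, matching precomposed letters are
    replaced by their multi-character form before any marks are stripped.
    """
    # Compose first so precomposed keys such as "Å" match the table.
    text = unicodedata.normalize("NFC", text)
    if expansions:
        text = "".join(expansions.get(ch, ch) for ch in text)
    # Decompose, then drop characters with a nonzero combining class.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.combining(ch) == 0
    )
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("São Tomé"))
print(strip_diacritics("Å"))
print(strip_diacritics("Å", EXPANSIONS))
```

Without the table, an accented letter always maps to a single base letter; the one-to-two mappings have to be supplied separately, since they do not follow from Unicode decomposition.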
From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Asmus Freytag (c) via Unicode
Sent: Wednesday, July 17, 2019 11:07 AM
To: Norbert Lindenberg
Cc: Unicode Mailing List
Subject: Re: Removing accents and diacritics from a word
On 7/17/2019 11:02 AM, Norbert Lindenberg wrote:
“Misspelling”?
Not helpful. Anybody have a serious suggestion?
A./
On Jul 17, 2019, at 10:37, Asmus Freytag via Unicode <unicode at unicode.org> wrote:
A question has come up in another context:
Is there any linguistic term for describing the process of removing accents and diacritics from a word to create its “base form”, e.g. São Tomé to Sao Tome?
The term "string normalization" does not appear to be preferable in a computing context.
Any ideas?
A./