Removing accents and diacritics from a word

Asmus Freytag (c) via Unicode unicode at unicode.org
Wed Jul 17 18:55:03 CDT 2019


On 7/17/2019 11:37 AM, Tex wrote:
>
> Asmus, are you including the case where an accented character maps to 
> two unaccented characters?
>
> e.g. Å to AA or Ä to AE
>
If that's covered by the same term, yes; but it's not a simple 
"typewriter/telegraph" fallback.


>
> From: Unicode [mailto:unicode-bounces at unicode.org] On Behalf Of Asmus Freytag (c) via Unicode
> Sent: Wednesday, July 17, 2019 11:07 AM
> To: Norbert Lindenberg
> Cc: Unicode Mailing List
> Subject: Re: Removing accents and diacritics from a word
>
> On 7/17/2019 11:02 AM, Norbert Lindenberg wrote:
>
>     “Misspelling”?
>
> Not helpful. Anybody have a serious suggestion?
>
> A./
>
>         On Jul 17, 2019, at 10:37, Asmus Freytag via Unicode <unicode at unicode.org> wrote:
>
>         A question has come up in another context:
>
>         Is there any linguistic term for describing the process of removing accents and diacritics from a word to create its “base form”, e.g. São Tomé to Sao Tome?
>
>         The linguistic term "string normalization" does not seem well suited to a computing context.
>
>         Any ideas?
>
>         A./
>
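
For readers who want to see the two cases in this thread side by side, here is a minimal Python sketch. It is an illustration only, not anything proposed by the posters: the function names `strip_marks` and `fold_with_expansions` and the `EXPANSIONS` table are hypothetical, and the table holds just the two mappings Tex mentions. The first function does the plain "São Tomé to Sao Tome" removal by decomposing to NFD and dropping nonspacing marks. The second shows why "Å to AA" is a different kind of operation: it needs a hand-written table, because Unicode decomposition data does not supply it.

```python
import unicodedata

# Hypothetical, language-specific expansions (only the examples from the thread).
# These cannot be derived from Unicode decomposition data.
EXPANSIONS = {"Å": "AA", "å": "aa", "Ä": "AE", "ä": "ae"}


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def fold_with_expansions(text: str) -> str:
    """Apply the expansion table to precomposed letters, then strip remaining marks."""
    composed = unicodedata.normalize("NFC", text)
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in composed)
    return strip_marks(expanded)


print(strip_marks("São Tomé"))        # Sao Tome
print(strip_marks("Å"))               # A
print(fold_with_expansions("Å"))      # AA
print(fold_with_expansions("Ärger"))  # AErger
```

Note that `strip_marks` only affects letters that have a canonical decomposition; a letter such as ø does not decompose and passes through unchanged, so a real fallback scheme would need further table entries.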
