Unicode String Models
Hans Åberg via Unicode
unicode at unicode.org
Tue Sep 11 14:10:03 CDT 2018
> On 11 Sep 2018, at 20:40, Eli Zaretskii <eliz at gnu.org> wrote:
>> From: Hans Åberg <haberg-1 at telia.com>
>> Date: Tue, 11 Sep 2018 20:14:30 +0200
>> Cc: hsivonen at hsivonen.fi,
>> unicode at unicode.org
>> If one encounters a file with mixed encodings, it is good to be able to view its contents and then convert it, as I see one can do in Emacs.
> Yes. And mixed encodings are not the only use case: it may well happen
> that the initial attempt to decode the file uses an incorrect assumption
> about the encoding, for some reason.
> In addition, it is important that changing some portion of the file
> and then saving the modified text never changes any part that the user
> didn't touch, as would happen if invalid sequences were rejected at
> input time and replaced with something else.
Indeed. In the 1990s, before UTF-8 came into wide use, I recall some Russians writing LaTeX files with sections in different Cyrillic and Latin encodings, switching the editor encoding as they typed.
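
The round-trip property described in the quoted text (untouched bytes survive an edit-and-save cycle) can be sketched in Python. This is only an analogy, not how Emacs does it: it assumes Python's standard "surrogateescape" error handler as the mechanism for keeping invalid bytes, and the sample bytes and the edit() helper are made up for illustration.

```python
# Sample input (hypothetical): UTF-8 text containing one stray Latin-1
# byte, 0xE5 ("å" in ISO 8859-1), which is not valid UTF-8 here.
raw = b"Hans \xe5berg, " + "привет".encode("utf-8")

# Lossy decoding: the invalid byte is replaced with U+FFFD at input time.
lossy = raw.decode("utf-8", errors="replace")

# Preserving decoding: the invalid byte is carried along as a lone
# surrogate (U+DCE5) that can be turned back into 0xE5 when saving.
kept = raw.decode("utf-8", errors="surrogateescape")


def edit(text):
    """Hypothetical user edit that touches only the valid Cyrillic part."""
    return text.replace("привет", "hello")


# Save both versions back to bytes.
lossy_saved = edit(lossy).encode("utf-8")
kept_saved = edit(kept).encode("utf-8", errors="surrogateescape")

print(lossy_saved)
print(kept_saved)
```

In the lossy case the saved file contains the three-byte encoding of U+FFFD where the original 0xE5 was, so a part the user never touched has changed. In the preserving case the 0xE5 byte is written back as it was, and the file can still be re-decoded later under a different assumption about its encoding.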