Unicode String Models
Hans Åberg via Unicode
unicode at unicode.org
Tue Sep 11 12:13:28 CDT 2018
> On 11 Sep 2018, at 13:13, Eli Zaretskii via Unicode <unicode at unicode.org> wrote:
> In Emacs, each raw byte belonging
> to a byte sequence which is invalid under UTF-8 is represented as a
> special multibyte sequence. IOW, Emacs's internal representation
> extends UTF-8 with multibyte sequences it uses to represent raw bytes.
> This allows mixing stray bytes and valid text in the same buffer,
> without risking lossy conversions (such as those one gets under model
> 2 above).
Can you give a reference detailing this format?
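
For comparison, the general idea in the quoted paragraph (stray bytes kept alongside valid text, with no loss on the way back out) can be sketched in Python. This is only an analogue and not Emacs's internal format: it assumes Python's "surrogateescape" error handler, which maps each undecodable byte to a lone surrogate code point rather than to a special multibyte sequence. The sample bytes and variable names are made up for illustration.

```python
# Analogue only: Python's surrogateescape handler, not Emacs's representation.
raw = b"caf\xc3\xa9 \xff\xfe end"  # valid UTF-8 text plus two stray bytes

# Decode: each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")
stray = [hex(ord(ch)) for ch in text if 0xDC80 <= ord(ch) <= 0xDCFF]

# Encode with the same handler: the original bytes come back unchanged.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# By contrast, substituting U+FFFD for bad bytes loses the original bytes.
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")

print(stray)
print(round_tripped == raw, lossy == raw)
```

The point of the contrast is that a substitution-based conversion cannot recover the original bytes, while an escape-based one can, at the cost of strings that are no longer strictly valid Unicode text.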