Encoding italic

Richard Wordingham via Unicode unicode at unicode.org
Tue Jan 22 18:16:40 CST 2019


On Mon, 21 Jan 2019 00:29:42 -0800
David Starner via Unicode <unicode at unicode.org> wrote:

> The superscripts show a problem with multiple encoding; even if you
> think they should be Unicode superscripts, and they look like Unicode
> superscripts, they might be HTML superscripts. Same thing would happen
> with italics if they were encoded in Unicode.

But if one strips the mark-up out, and searching is then based on
the collation elements of the text, then this is not a problem.
Mathematical and ASCII capitals differ only at the identity level.
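
As a rough illustration of that point, here is a minimal Python sketch. It does not implement collation-element matching; it uses NFKC normalization from the standard library as a stand-in, since NFKC also folds the mathematical capitals onto the ASCII ones. The helper names search_key and contains are hypothetical.

```python
import unicodedata


def search_key(text: str) -> str:
    """Fold compatibility variants (e.g. mathematical italic letters)
    onto their plain counterparts.

    Assumption: NFKC is used here only as a stand-in for comparing
    collation elements below the identical level; it is not the
    Unicode Collation Algorithm.
    """
    return unicodedata.normalize("NFKC", text)


def contains(haystack: str, needle: str) -> bool:
    """Substring search on the folded form of both strings."""
    return search_key(needle) in search_key(haystack)


# MATHEMATICAL ITALIC CAPITAL A, B, C (U+1D434..U+1D436)
math_italic = "\U0001D434\U0001D435\U0001D436"

# A raw codepoint search misses the match; the folded search finds it.
raw_hit = "ABC" in math_italic
folded_hit = contains(math_italic, "ABC")
```

A real implementation would more likely use a collation library and match at a chosen strength, which also handles case and accent differences that NFKC leaves alone.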

Searching on the basis of codepoint sequences would come unstuck with
scriptio continua scripts: WJ (U+2060 WORD JOINER) and ZWSP (U+200B ZERO
WIDTH SPACE) can optionally be inserted to improve line-breaking, and
even to overcome spell-checkers.
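
A small Python sketch of the failure, assuming a Thai example word and a hypothetical helper strip_ignorable. It simply deletes the two characters before comparing, which only approximates what a collation-based search does when it treats them as ignorable.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE
WJ = "\u2060"    # WORD JOINER

# Translation table that deletes both characters.
_IGNORABLE = {ord(ZWSP): None, ord(WJ): None}


def strip_ignorable(text: str) -> str:
    """Remove ZWSP and WJ so they cannot break a substring match."""
    return text.translate(_IGNORABLE)


# Illustrative Thai text: two words written without a visible space.
word1 = "\u0E20\u0E32\u0E29\u0E32"  # first word
word2 = "\u0E44\u0E17\u0E22"        # second word
plain = word1 + word2
with_break = word1 + ZWSP + word2   # ZWSP added as a line-break hint

# Codepoint search fails on the text carrying the invisible hint.
raw_hit = plain in with_break
# After stripping the ignorable characters the match is found.
stripped_hit = plain in strip_ignorable(with_break)
```

The two strings render identically, so a reader would expect the search to succeed in both cases.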

Richard.


