Encoding italic

Richard Wordingham via Unicode unicode at unicode.org
Sat Jan 19 23:30:39 CST 2019


On Fri, 18 Jan 2019 10:51:18 -0500
"Mark E. Shoulson via Unicode" <unicode at unicode.org> wrote:

> On 1/16/19 6:23 AM, Victor Gaultney via Unicode wrote:
> >
> > Encoding 'begin italic' and 'end italic' would introduce
> > difficulties when partial strings are moved, etc. But that's no
> > different than with current punctuation. If you select the second
> > half of a string that includes an end quote character you end up
> > with a mismatched pair, with the same problems of interpretation as
> > selecting the second half of a string including an 'end italic'
> > character. Apps have to deal with it, and do, as in code editors.
> >  
> It kinda IS different.  If you paste in half a string, you get a 
> mismatched or unmatched paren or quote or something.  A typo, but a 
> transient one.  It looks bad where it is, but everything else is 
> unaffected.  It's no worse than hitting an extra key by mistake. If
> you paste in a "begin italic" and miss the "end italic", though, then
> *all* your text from that point on is affected!  (Or maybe "all until
> a newline" or some other stopgap ending, but that's just
> damage-control, not damage-prevention.)  Suddenly, letters and
> symbols five words/lines/paragraphs/pages away look different, the
> pagination is all altered (by far more than merely a single extra
> punctuation mark, since italic fonts generally are narrower than
> roman).  It's a disaster.

The problem is worst when you have a small amount of italicisable text
scattered within unitalicisable text.  Unlike the case with bidi
controls, the text usually remains intelligible with some work, and
one can generally see where the missing italic should go.  However,
damage-limitation is desirable - I would suggest cancelling italic
effects at the end of the paragraph, as with bidi controls.  On the
other hand, the corresponding stateful ISCII character settings (for
font effects and script) end at the end of the line, which might be a
finer concept.
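
To make the damage-limitation idea concrete, here is a minimal sketch
of how a renderer might resolve italic runs when the state is cancelled
at each paragraph end.  Everything in it is an assumption for
illustration: no such controls are encoded, so BEGIN_ITALIC and
END_ITALIC are private-use stand-ins, and treating "\n" and U+2029 as
paragraph ends is likewise only a guess at what a real rule would say.

```python
# Hypothetical stand-ins: Unicode has no italic controls, so two
# private-use code points are borrowed purely for illustration.
BEGIN_ITALIC = "\uE000"
END_ITALIC = "\uE001"
PARAGRAPH_ENDS = frozenset({"\n", "\u2029"})  # assumed boundaries


def italic_runs(text, resets=PARAGRAPH_ENDS):
    """Split text into (substring, is_italic) runs.

    Italic state is flat (not nested) and is cancelled whenever a
    character in `resets` is seen, so a missing END_ITALIC cannot
    affect anything beyond the current paragraph.
    """
    runs, buf, italic = [], [], False

    def switch(new_state):
        nonlocal italic
        if new_state != italic:
            if buf:  # close the run that used the old state
                runs.append(("".join(buf), italic))
                buf.clear()
            italic = new_state

    for ch in text:
        if ch == BEGIN_ITALIC:
            switch(True)
        elif ch == END_ITALIC:
            switch(False)
        else:
            if ch in resets:
                switch(False)  # damage limitation at the boundary
            buf.append(ch)
    if buf:
        runs.append(("".join(buf), italic))
    return runs


# An unmatched BEGIN_ITALIC only affects the rest of its paragraph.
damaged = "one " + BEGIN_ITALIC + "two\nthree"
print(italic_runs(damaged))
```

Passing a different `resets` set would give line-scoped behaviour in
the ISCII manner instead of paragraph-scoped behaviour.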

There are several stateful control characters for Arabic, mostly
affecting numbers.  However, as far as I can see, their effect is
limited to one word (typically a string of digits).  That seems too
limited for italics, though it would be reasonable for switching
between Antiqua and black letter.

One minor problem with the stateful encoding, which seems to be in the
original spirit of ISO 10646, is that redundant instances of the
italic controls would build up in heavily edited text.  I see that
effect with ZWSP when I don't have a display mode that shows it.  One
solution would be a trick such as giving "start italic" a visible
glyph when it occurs in italic mode, at least when the contrast
between italic and non-italic mode is being displayed.  I don't
believe italicity should be nested.
However, such a build-up is a very minor problem.
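
For what it is worth, an editor could also tidy such a build-up
mechanically.  The following is only a sketch under stated assumptions:
the two controls are hypothetical (private-use stand-ins below),
italicity is flat rather than nested, and the state is assumed to be
cancelled at "\n" or U+2029.  A control counts as redundant here when it
does not change the current state; empty begin/end pairs are not
handled.

```python
# Hypothetical stand-ins for controls that are not actually encoded.
BEGIN_ITALIC = "\uE000"
END_ITALIC = "\uE001"


def strip_redundant_controls(text, resets=("\n", "\u2029")):
    """Drop italic controls that do not change the italic state."""
    out = []
    italic = False
    for ch in text:
        if ch == BEGIN_ITALIC:
            if italic:
                continue  # already italic: redundant, drop it
            italic = True
        elif ch == END_ITALIC:
            if not italic:
                continue  # already roman: redundant, drop it
            italic = False
        elif ch in resets:
            italic = False  # assumed paragraph-end cancellation
        out.append(ch)
    return "".join(out)


edited = BEGIN_ITALIC * 2 + "x" + END_ITALIC * 2 + "y"
cleaned = strip_redundant_controls(edited)
print(len(edited), len(cleaned))
```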

Richard.


