Encoding italic (was: A last missing link)

James Kass via Unicode unicode at unicode.org
Tue Jan 15 22:40:16 CST 2019

Victor Gaultney wrote,

 > Use of variation selectors, a single character modifier, or combining
 > characters also seem to be less useful options, as they act at the 
 > character level and are highly impractical. They also violate the key 
 > that italics are a way of marking a span of text as 'special' - not 
 > letters. Matched punctuation works the same way and is a good fit for 

The VS approach would double the character count of any string that 
uses it, since every italicized letter would be followed by a variation 
selector.  That may make it undesirable for services like Twitter that 
impose length limits.  Math alphanumeric (mis)use, by contrast, doesn't 
affect the character count.  If the VS method were adopted, the math 
alphanumerics might continue to be used where possible, at least by 
Twitter users who already employ them, adding to their corpus of legacy 
data.
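
To make the counting argument concrete, here is a rough Python sketch.
It assumes that "character count" means Unicode code points, which may
not be how any given service measures its limit.  No variation sequence
for italics is defined in the standard; VARIATION SELECTOR-1 (U+FE00) is
used below purely as a hypothetical stand-in, and both function names
are made up for illustration.

```python
# Hypothetical stand-in: no italic variation sequence is defined in Unicode.
VS_HYPOTHETICAL = "\uFE00"  # VARIATION SELECTOR-1


def to_math_italic(text):
    """Map ASCII letters to Mathematical Italic alphanumerics."""
    out = []
    for ch in text:
        if ch == "h":
            # U+1D455 is reserved; italic small h is U+210E PLANCK CONSTANT.
            out.append("\u210E")
        elif "A" <= ch <= "Z":
            out.append(chr(0x1D434 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(0x1D44E + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def with_variation_selectors(text):
    """Append the stand-in selector after every letter."""
    return "".join(ch + VS_HYPOTHETICAL if ch.isalpha() else ch for ch in text)


word = "italic"
print(len(word))                            # code points in the plain word
print(len(to_math_italic(word)))            # same number of code points
print(len(with_variation_selectors(word)))  # twice as many code points
```

For "italic" this gives 6, 6 and 12 code points respectively.  A limit
measured in UTF-16 code units or in bytes would come out differently,
since the math alphanumerics lie outside the BMP.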

Using VS arose in the parent thread as a way of avoiding the necessity 
of adding additional characters to the standard.  (But we don't seem to 
be running out of available code space.)  The purpose of VS is to 
preserve variant letter form distinctions in plain-text, which seems to 
apply to italics.  Further, VS is an existing mechanism which wouldn't 
be expected to impact searching and so forth on savvy systems.  (An 
opening/closing pair of control characters also shouldn't impact 
searching.)  Finally, VS already works in existing technology, so there 
wouldn't be a long wait for updates to the standard and their 
implementation.  (Not that we should rush to judgment or 
"solutions" here, just that an ad-hoc "solution" is possible and could 
be implemented by third-parties.)
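
As a sketch of what "wouldn't impact searching" could look like in
practice: a search routine can drop variation selectors from both the
text and the query before comparing.  The function names are
illustrative only, and the code covers just the two main variation
selector blocks (U+FE00..U+FE0F and U+E0100..U+E01EF).

```python
def strip_variation_selectors(text):
    """Remove VS1..VS16 and VS17..VS256 from a string."""
    return "".join(
        ch for ch in text
        if not (0xFE00 <= ord(ch) <= 0xFE0F or 0xE0100 <= ord(ch) <= 0xE01EF)
    )


def vs_insensitive_find(haystack, needle):
    """Find needle in haystack, ignoring variation selectors in both.

    The returned index refers to the stripped haystack, not the original.
    """
    return strip_variation_selectors(haystack).find(
        strip_variation_selectors(needle)
    )


# "italic" with a selector after each letter, embedded in a sentence.
marked = "an " + "".join(c + "\uFE00" for c in "italic") + " word"
print(vs_insensitive_find(marked, "italic"))
```

A system that doesn't do this kind of folding would fail to match the
plain query against the marked text, which is what "savvy" is getting at.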

Concerns about statefulness in plain text exist.  Treating "italic" as 
opening/closing "punctuation" may help get around such concerns.  
IIRC, it was proposed that the Egyptian cartouche be handled that way.
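
A minimal sketch of the paired-delimiter idea, with heavy assumptions:
no such characters exist, so two private-use code points (U+E000 and
U+E001) stand in for a hypothetical opening and closing italic mark, and
the function name is invented.  The only state carried is whether we are
between a matched pair, much as with brackets or quotation marks.

```python
# Hypothetical delimiters: private-use stand-ins, not real characters.
ITALIC_OPEN = "\uE000"
ITALIC_CLOSE = "\uE001"


def split_italic_spans(text):
    """Split text into (is_italic, segment) pairs.

    Stray delimiters are kept as literal text; an unclosed span runs to
    the end of the string.
    """
    spans = []
    italic = False
    buf = []
    for ch in text:
        if ch == ITALIC_OPEN and not italic:
            if buf:
                spans.append((False, "".join(buf)))
                buf = []
            italic = True
        elif ch == ITALIC_CLOSE and italic:
            if buf:
                spans.append((True, "".join(buf)))
                buf = []
            italic = False
        else:
            buf.append(ch)
    if buf:
        spans.append((italic, "".join(buf)))
    return spans


sample = "a " + ITALIC_OPEN + "very" + ITALIC_CLOSE + " good fit"
spans = split_italic_spans(sample)
print(spans)
# Text with the delimiters removed, e.g. for searching.
print("".join(segment for _, segment in spans))
```

Dropping the delimiters recovers the underlying text unchanged, which is
why such a pair shouldn't get in the way of searching.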

As with emoji, people who don't like italics in plain text don't have 
to use them.
