One encoding per shape (was Re: Long standing problem with Vedic tone markers and post-base visarga/anusvara)

James Kass via Unicode unicode at unicode.org
Wed Jan 1 14:11:04 CST 2020


On 2020-01-01 11:17 AM, Richard Wordingham via Unicode wrote:

 > That's exactly the sort of mess that jack-booted renderers are trying
 > to minimise.  Their principle is that there should be only one encoding
 > per shape, though to be fair:
 >
 > 1) some renderers accept canonical equivalents.
 > 2) tolerance may be allowed for ligating (ZWJ, ZWNJ, CGJ), collating
 > (CGJ, SHY) and line-breaking controls (SHY, ZWSP, WJ).
 > 3) Superseded chillu encodings are still supported.

There was never any need for atomic chillu form characters.  The 
principle of only one encoding per shape is best achieved when every 
shape gets an atomic encoding.  Glyph-based encoding is incompatible 
with Unicode character encoding principles.
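
As a rough illustration of the difference between point 1 and point 3 above, here is a minimal sketch using Python's standard unicodedata module. The helper name canonically_equivalent is made up for this example, and the specific characters are my own choice, not something taken from the thread. It compares two spellings by their NFD forms: a precomposed Devanagari letter matches its base-plus-nukta spelling, whereas an atomic Malayalam chillu has no decomposition and so never matches the older consonant + virama + ZWJ sequence, even though both are meant to display the same shape.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: True if a and b have the same NFD form."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Devanagari NNNA: precomposed letter vs. NA + NUKTA.
# U+0929 has a canonical decomposition to U+0928 U+093C.
precomposed = "\u0929"
decomposed = "\u0928\u093C"

# Malayalam chillu N: atomic character vs. NA + VIRAMA + ZWJ.
# U+0D7B has no decomposition, so normalization leaves both spellings alone.
atomic_chillu = "\u0D7B"
sequence_chillu = "\u0D28\u0D4D\u200D"

print(canonically_equivalent(precomposed, decomposed))         # True
print(canonically_equivalent(atomic_chillu, sequence_chillu))  # False
```

So a renderer that accepts canonical equivalents gets the first pair for free, but has to special-case the second pair if it wants to support both chillu spellings.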

It’s too bad that ISCII didn’t accommodate the needs of Vedic Sanskrit,
but here we are.


