Displaying Lines of Text as Line-Broken by a Human

Richard Wordingham via Unicode unicode at unicode.org
Sun Jul 21 18:03:00 CDT 2019


I've been transcribing some Pali text written on palm leaf in the
Tai Tham script.  I'm looking for a way of reflecting the manuscript's
line boundaries in the transcription.  The problem is that
lines sometimes start or end with an isolated spacing mark.  I want
my text to be searchable and therefore encoded in Unicode.  (I
appreciate that there is a trade-off between searchability and showing
line boundaries.  The unorthodox spelling is also a problem.)

How unreasonable is it for a font to render

<NBSP, ZWJ, U+25CC DOTTED CIRCLE, spacing_mark>

as just the spacing mark?  Some rendering systems give the font no way
of distinguishing dotted circles in the backing store from dotted
circles added by the renderer, so this technique is not Unicode
compliant.
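
To make the encoding side of this concrete, here is a minimal Python
sketch of the sequence above and of how a search index might undo it.
The names CARRIER, isolate_mark and strip_carriers are hypothetical,
chosen only for illustration, and the sketch assumes the search tool is
free to preprocess the text.  It says nothing about rendering, which
would still depend on the font.

```python
import unicodedata

NBSP = "\u00A0"           # NO-BREAK SPACE
ZWJ = "\u200D"            # ZERO WIDTH JOINER
DOTTED_CIRCLE = "\u25CC"  # DOTTED CIRCLE

# Hypothetical name for the three-character prefix proposed above.
CARRIER = NBSP + ZWJ + DOTTED_CIRCLE


def isolate_mark(mark: str) -> str:
    """Return <NBSP, ZWJ, U+25CC, mark> for one spacing mark (gc=Mc)."""
    if len(mark) != 1 or unicodedata.category(mark) != "Mc":
        raise ValueError("expected a single spacing combining mark (Mc)")
    return CARRIER + mark


def strip_carriers(text: str) -> str:
    """Drop every carrier prefix so the mark rejoins the preceding text."""
    return text.replace(CARRIER, "")


# A syllable split across two manuscript lines: the consonant ends one
# line and its spacing vowel sign starts the next.
line_end = "\u1A20"                  # TAI THAM LETTER HIGH KA
line_start = isolate_mark("\u1A63")  # TAI THAM VOWEL SIGN AA

# Assumption: the line break itself is dropped before indexing.
searchable = strip_carriers(line_end + line_start)
```

With this, the displayed transcription keeps the isolated mark at the
start of its line, while the indexed text is the ordinary sequence
<U+1A20, U+1A63>.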

An alternative solution is to have a parallel font (or, more neatly, a
feature) that renders some base character (or sequence) as a zero-width
non-inking character.  This, however, would violate that character's
identity.  I suspect there is no Unicode-compliant solution.

Richard.

