Request for Information

fantasai fantasai.lists at
Wed Jul 23 14:45:48 CDT 2014

I would like to request that Unicode include, for each writing system it
encodes, some information on how its text is typically justified. Possible
options include

   a) Text justification typically expands at word-separating characters,
      but may also expand between letters.
   b) Since this writing system does not use spaces, justification typically
      expands between letters.
   c) Text justification can elongate glyphs and/or expand spaces, but because
      the script is cursive, cannot introduce inter-letter spacing.
   d) We do not have information on text justifying practices for this script.

Anything, really, would be helpful. Even saying you have no clue is helpful.
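
To make the difference between options (a) and (b) concrete, here is a
minimal Python sketch of how a fixed amount of extra line width might be
shared out under each. The function name, the "word"/"letter" mode names,
and the one-character-per-letter model are assumptions made for
illustration; a real layout engine works on shaped glyph clusters, and
option (c), elongation in cursive scripts, is not modeled at all.

```python
def distribute_extra(text, extra, mode):
    """Return the extra width added after each character except the last.

    mode "word":   only space characters expand (option a, simplified).
    mode "letter": every gap between characters expands (option b).
    Both names are hypothetical, chosen for this sketch.
    """
    gaps = len(text) - 1
    if gaps <= 0:
        return []
    if mode == "word":
        # A gap is eligible only when the character before it is a space.
        eligible = [text[i] == " " for i in range(gaps)]
    elif mode == "letter":
        eligible = [True] * gaps
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    count = sum(eligible)
    if count == 0:
        # No expansion opportunity at all: the line stays ragged.
        return [0.0] * gaps
    share = extra / count
    return [share if ok else 0.0 for ok in eligible]
```

With this sketch, a line without spaces gets no expansion in "word" mode,
which is exactly why knowing the script's convention matters.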

I would also like to request that the prose chapters include, for each
writing system encoded, some information on line-breaking conventions.
For example

   a) Latin typically breaks only at spaces and other punctuation.
      However, it also admits hyphenation within words.
      In some contexts (such as Japanese), it may, as a stylistic option,
      break anywhere (without hyphens).
   b) Arabic breaks between words. Some languages (such as Uyghur)
      allow hyphenation, but most do not.
   c) Japanese can break anywhere, except restrictions can be introduced
      by symbols and punctuation, and sometimes breaks before small kana
      are suppressed.
   d) Javanese only breaks between clauses, where punctuation is used,
      resulting in horrendously ragged lines. (Did I get that right?)
   e) We have no idea how this script would break across lines. There
      is only one inscription extant, and it is undeciphered.
   f) We believe that this script can break across lines, but the
      encoding proposal neglected to tell us how. We suggest pretending
      it's Latin for now.

This information is of course encoded into UAX#14 and can be extracted
from there (as I have done for Javanese above). However, it's helpful
to have an overview of what to expect from the data tables, and also a
clue about possible tailoring requirements.
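
For readers who have not tried it, extracting this from the UAX#14 data
looks roughly like the Python sketch below. It parses lines written in the
semicolon-separated style of the LineBreak.txt data file and counts code
points per line-breaking class. The SAMPLE text is a hand-written
illustration, not the real file, and the function names are hypothetical;
check the class assignments against the published data before relying on
them.

```python
from collections import Counter

# Illustrative lines in the style of LineBreak.txt (NOT the real file).
SAMPLE = """\
# code point or range ; line break class # comment
0020;SP # SPACE
002D;HY # HYPHEN-MINUS
0041..005A;AL # LATIN CAPITAL LETTER A..Z
3041;CJ # HIRAGANA LETTER SMALL A
3042;ID # HIRAGANA LETTER A
"""

def parse_line_break(text):
    """Yield (first, last, line_break_class) for each data line."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        code_range, lb_class = (field.strip() for field in line.split(";"))
        first, _, last = code_range.partition("..")
        yield int(first, 16), int(last or first, 16), lb_class

def class_counts(text):
    """Count how many code points fall into each line break class."""
    counts = Counter()
    for first, last, lb_class in parse_line_break(text):
        counts[lb_class] += last - first + 1
    return counts
```

A table like this says which classes a script's characters use, but not
what a reader should expect the resulting lines to look like, which is the
gap a prose overview would fill.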

