Word_Break for Hieroglyphs
Richard Wordingham via Unicode
unicode at unicode.org
Sat Dec 16 16:06:03 CST 2017
On Thu, 14 Dec 2017 15:53:13 +0100
Mark Davis ☕️ via Unicode <unicode at unicode.org> wrote:
> On Thu, Dec 14, 2017 at 3:22 PM, Michael Everson
> <everson at evertype.com> wrote:
> > NO. Clusters cannot be broken up just anywhere.
> Does that mean that ancient inscriptions would leave gaps at the end
> of lines in order to not break a cluster, or that modern users would
> expect software to leave gaps at the end of lines in order to not
> break a cluster? And what constitutes a cluster? Is that semantically
> determined (eg like Thai), or is it based on algorithmic features of
> the hieroglyphs?
The absence of gaps in ancient inscriptions would not tell us much. One
justification trick available to the engravers was variable spelling:
phonetic complements were optional, so they could be added or dropped
to fill the available space. Original letters would offer the best
evidence in this respect.
We're going to have some algorithmic clusters - it will make no sense
to break quadrats between lines. Also, it would be perverse to
line-break a graphic transposition. Phonetic elements normally occur
in phonetic order, but bird plus tall thin character is usually
replaced by tall thin character plus bird. Thus splitting /wḏ/
'order' <wD-w-Y1A> i.e. <U+13397 EGYPTIAN HIEROGLYPH V024, U+13171
EGYPTIAN HIEROGLYPH G043, U+133DC EGYPTIAN HIEROGLYPH Y001A> into wD on
one line and w-Y1A on the next would be perverse. Unfortunately, I
don't know whether it happens in practice. Preventing this particular
break ought in principle to require semantic analysis, but I couldn't
find an example of word-final V024 in the free 2006 edition of Paul
Dickson's "Dictionary of Middle Egyptian in Gardiner Classification
Order", so perhaps a sequence wD-w will always be word-internal.
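
To make the wD-w case concrete, here is a minimal sketch of a purely
algorithmic pair rule: a break is suppressed between two adjacent signs
when the pair appears in a lookup table. The table NO_BREAK_PAIRS and the
function break_opportunities are hypothetical names invented for this
illustration; nothing like them is defined in UAX #14 or UAX #29, and the
single entry rests on the assumption above that wD-w is always
word-internal.

```python
# Hypothetical rule table (an assumption for illustration, not part of any
# Unicode specification): pairs of adjacent signs that must stay together.
V024 = "\U00013397"   # EGYPTIAN HIEROGLYPH V024 (wD)
G043 = "\U00013171"   # EGYPTIAN HIEROGLYPH G043 (w)
Y001A = "\U000133DC"  # EGYPTIAN HIEROGLYPH Y001A

NO_BREAK_PAIRS = {(V024, G043)}


def break_opportunities(text, no_break_pairs=NO_BREAK_PAIRS):
    """Return each index i where a break between text[i-1] and text[i]
    is not forbidden by the pair table."""
    return [
        i
        for i in range(1, len(text))
        if (text[i - 1], text[i]) not in no_break_pairs
    ]


word = V024 + G043 + Y001A      # <wD-w-Y1A>
print(break_opportunities(word))  # [2]: only between w and Y1A
```

This only shows that the wD-w pair can be kept together without any
dictionary lookup; it says nothing about whether the remaining position
is a sensible place to break, which is the semantic question.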