Word_Break for Hieroglyphs

Richard Wordingham via Unicode unicode at unicode.org
Sat Dec 16 16:06:03 CST 2017


On Thu, 14 Dec 2017 15:53:13 +0100
Mark Davis ☕️ via Unicode <unicode at unicode.org> wrote:

> On Thu, Dec 14, 2017 at 3:22 PM, Michael Everson
> <everson at evertype.com> wrote:

> > NO. Clusters cannot be broken up just anywhere.
 
> Does that mean that ancient inscriptions would leave gaps at the end
> of lines in order to not break a cluster, or that modern users would
> expect software to leave gaps at the end of lines in order to not
> break a cluster? And what constitutes a cluster? Is that semantically
> determined (eg like Thai), or is it based on algorithmic features of
> the hieroglyphs?

An absence of gaps in ancient inscriptions would not be revealing.  One
justification trick available to the engravers was variable spelling -
spacing phonetic complements were optional.  Original letters would
offer the best evidence in this respect.

We're going to have some algorithmic clusters - it will make no sense
to break quadrats between lines. Also, it would be perverse to
line-break a graphic transposition.  Phonetic elements normally occur
in phonetic order, but bird plus tall thin character is usually
replaced by tall thin character plus bird.  Thus splitting 𓎗𓅱𓏜 /wḏ/
'order' <wD-w-Y1A> i.e. <U+13397 EGYPTIAN HIEROGLYPH V024, U+13171
EGYPTIAN HIEROGLYPH G043, U+133DC EGYPTIAN HIEROGLYPH Y001A> into wD on
one line and w-Y1A on the next would be perverse.  Unfortunately, I
don't know whether it happens or not.  Preventing this particular
example ought to require a semantic analysis, but I couldn't find an
example of word-final V024 in the free 2006 edition of Paul Dickson's
"Dictionary of Middle Egyptian in Gardiner Classification Order", so
perhaps a sequence wD-w will always be word-internal.
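
As a rough illustration of the rule suggested above, here is a minimal
Python sketch that suppresses a break opportunity inside the wD-w pair.
The names NO_BREAK_PAIRS and break_opportunities are hypothetical, and
the sketch assumes, unrealistically, that every other position between
signs is a break opportunity; it is not a proposal for the actual
Word_Break or Line_Break rules.

```python
# Signs from the example in the text, given as code points.
V024 = "\U00013397"   # EGYPTIAN HIEROGLYPH V024 (wD)
G043 = "\U00013171"   # EGYPTIAN HIEROGLYPH G043 (w)
Y001A = "\U000133DC"  # EGYPTIAN HIEROGLYPH Y001A

# Hypothetical table of adjacent sign pairs that must stay on one line.
# The only entry is the graphic transposition wD-w discussed above.
NO_BREAK_PAIRS = {(V024, G043)}


def break_opportunities(signs, no_break_pairs=NO_BREAK_PAIRS):
    """Return each index i where a break is allowed before signs[i].

    Assumption for illustration only: a break is allowed between any two
    signs unless the pair is listed in no_break_pairs.
    """
    return [
        i
        for i in range(1, len(signs))
        if (signs[i - 1], signs[i]) not in no_break_pairs
    ]


if __name__ == "__main__":
    word = V024 + G043 + Y001A
    print(break_opportunities(word))
```

A pair table like this only covers the purely algorithmic case.  Keeping
w and Y1A together as part of the same word would still need the
semantic analysis mentioned above.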

Richard.


