Why do the Hebrew Alphabetic Presentation Forms Exist

Mark E. Shoulson mark at kli.org
Thu Jun 4 15:30:20 CDT 2020


{Sent this morning, but it bounced due to size.  Re-sending, with 
attachments, using jpg for smaller file-sizes.}

On 6/4/20 3:31 AM, Marius Spix via Unicode wrote:
> Unicode also has German s (U+0073) and ſ (U+017F) which are
> equivalent, but were used in typesetting for a long time. If you want
> to precisely reproduce a historic text, it is required to have
> separate ways to encode different glyphs. In plaintext documents you
> have no influence on OpenType presentation.
Long-s also existed in earlier standards, and so had to be preserved.
> But you can use variant selectors, which can be registered
> in the IVD database. This would be probably the best way. Technically,
> using variant selectors has the same effect as different code points,
> as <U+05DC> and <U+05DC U+FE0E> would encode different shapes of the
> same character.
I don't think this rises even to the level of variation selectors.  This 
is a scribal alternation, like deciding to put some extra swash into a 
letter in this word but not that one. It's the whole purpose of OpenType 
tables.
> It also appears that there are more variants of lamed with special
> meanings in the bible:
> https://www.hebrew4christians.com/Grammar/Unit_One/Aleph-Bet/Lamed/lamed.html
>
> Can someone confirm that all variants of lamed have the same numeric
> value of 30? If it is different between the variants, that would
> qualify for different characters.

They are all 30, and more importantly they are all LAMEDs.  In every one
of those examples, the spelling of the word includes simply LAMED.  That's
what's in the text.  What's on the paper (or parchment) can't be 
considered "plain text" since written or printed text is by definition 
formatted somehow, to fit on the page.
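
To make the "they are all LAMEDs" point concrete, here is a minimal
Python sketch.  It assumes the standard-library unicodedata module
reflects the Unicode Character Database; the names LAMED, WIDE_LAMED and
fold_presentation_forms are mine, purely for illustration.  It looks at
the one lamed variant that does have its own code point (U+FB25, in the
Alphabetic Presentation Forms block) and shows that compatibility
normalization folds it back to plain LAMED.

```python
import unicodedata

LAMED = "\u05DC"       # HEBREW LETTER LAMED
WIDE_LAMED = "\uFB25"  # HEBREW LETTER WIDE LAMED (presentation form)


def fold_presentation_forms(text):
    """Illustrative helper: apply compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", text)


for ch in (LAMED, WIDE_LAMED):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch),
          repr(unicodedata.decomposition(ch)))

# The wide form is only a font variant of LAMED as far as the data goes.
print(fold_presentation_forms(WIDE_LAMED) == LAMED)

# The character database records no numeric value for the letter itself,
# so the 30 is a convention applied to LAMED, however it is drawn.
print(unicodedata.numeric(LAMED, None))
```

This says nothing about the scribal shapes themselves; it only shows
that where a variant shape was given a code point, the standard still
treats it as a LAMED.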

You don't want to go down the rabbit hole of letters written in certain 
old Torahs with anomalous tags, extra tags, curled and looped heads, etc.
(these exist; I have sources if you want).  Those are specialized cases
and not even accepted (halachically) as significant in writing a Torah. 
(You'd have better luck with the broken VAV in Numbers 25:12, which is 
at least still done in modern Torahs.)  I think these are too 
specialized a case to be considered actual variant letters.  Attached 
are some pictures from an old Torah I saw on display.  The first shows a 
"looped" or "wrapped" PEH.  In the second one, note extra tags on the 
SAMECHs in the second line and on the FINAL KAF in the last.  The medial 
closed MEM in לםרבה in Isaiah 9:6 is at least codified in the Mesorah as 
well.

Unlike (I think) Arabic positional variants, the Hebrew final forms have 
had more of an independent life as letters, considered as symbols of 
their own, so even if it weren't for the legacy encodings, they probably 
would have been rightly encoded separately.  After all, you can adjust 
what kind of joining an Arabic letter shows with proper use of ZWJ and 
ZWNJ, so the use of non-final PEH at the end of a word, *from a purely 
typographic perspective*, would not have been a barrier to encoding only 
a single PEH and choosing the form only by context.

But there are other considerations in the case of Hebrew as it actually 
is.  The fact is that a straight (final) PEH and a bent (non-final) PEH 
are *distinct* and different letters in Modern Hebrew, at least in the 
context of the end of a word.  As was mentioned already, if you spell 
the word סקופ as סקוף, you have spelled it *wrong*, and it would be 
pronounced differently.  And that usage has been in place for a long 
time; I think it's in Yiddish as well (but not Biblical Hebrew, witness 
Proverbs 30:6, with the word תּוֹסְףְּ, a final -P spelled with 
straight-PEH-dagesh).  There are some forms of gematria (numerology) 
which consider the final letters to have different numerical values than 
the non-final letters.  So there's some reasonable history to consider 
them distinct, and encoding them separately would have been the right 
move even without the legacy considerations.  I think Arabic traditions 
don't have such distinctions.
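
A rough Python sketch of both points, under stated assumptions: the
table below uses the common convention in which final letters count
500 through 900 (often called mispar gadol), which is my choice of
example and not the only scheme; the names PAIRS, STANDARD, LARGE_FINALS
and value are hypothetical.  It also checks that the two spellings above
differ in exactly one code point, and that normalization does not merge
them.

```python
import unicodedata

# (non-final, final) code points for the five letters with final forms.
PAIRS = {
    "KAF": ("\u05DB", "\u05DA"),
    "MEM": ("\u05DE", "\u05DD"),
    "NUN": ("\u05E0", "\u05DF"),
    "PE": ("\u05E4", "\u05E3"),
    "TSADI": ("\u05E6", "\u05E5"),
}
STANDARD = {"KAF": 20, "MEM": 40, "NUN": 50, "PE": 80, "TSADI": 90}
# Assumed convention: finals continue the hundreds after TAV (400).
LARGE_FINALS = {"KAF": 500, "MEM": 600, "NUN": 700, "PE": 800, "TSADI": 900}


def value(ch, finals_distinct=False):
    """Numeric value of one of the ten letters in PAIRS."""
    for name, (medial, final) in PAIRS.items():
        if ch == medial:
            return STANDARD[name]
        if ch == final:
            return LARGE_FINALS[name] if finals_distinct else STANDARD[name]
    raise ValueError(f"not a letter with a final form: U+{ord(ch):04X}")


# The two spellings discussed above: samekh, qof, vav, then PE or FINAL PE.
BENT = "\u05E1\u05E7\u05D5\u05E4"
STRAIGHT = "\u05E1\u05E7\u05D5\u05E3"

print(unicodedata.name(BENT[-1]), value(BENT[-1]))
print(unicodedata.name(STRAIGHT[-1]), value(STRAIGHT[-1]),
      value(STRAIGHT[-1], finals_distinct=True))

# Same first three letters, different last letter, and NFKC keeps
# FINAL PE as it is: the distinction is part of the text.
print(BENT[:-1] == STRAIGHT[:-1], BENT == STRAIGHT)
print(unicodedata.normalize("NFKC", STRAIGHT) == STRAIGHT)
```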

> We also have special glyph variants of the same character for special
> purposes, like an open tail g for IPA (ɡ, U+0261) or an alternative phi
> for math (ϕ, U+03D5),  but these are completely optional and have no
> different meaning from the closed tail g and the curled phi. As far as
> I know linguists and mathematicians accept both glyph variants mutually
> interchangeable. I guess, they are only in Unicode for historic reasons.
Not so!  Contrariwise, in fact, at least for the IPA ɡ.  It is encoded
because the IPA stipulates that the symbol for the voiced
velar stop be written ɡ with an open loop, and it is incorrect to write 
it with a binocular g.  Linguists do not consider these to be mutually 
interchangeable.  Same with the IPA ɑ, which is wrong if written 
two-storey.  I'm not sure about mathematics usage, but I think that 
there may be situations in math wherein φ and ϕ are used with distinct
meanings (and not just by an isolated author).
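
For what it's worth, the character data draws the same line.  A small
Python sketch, assuming the standard-library unicodedata module matches
the published Unicode data (the variable names are mine): the IPA ɡ and
ɑ have no decomposition to the ordinary letters, while the long s and
the phi symbol do carry compatibility mappings.

```python
import unicodedata

CHARS = {
    "script g": "\u0261",    # IPA voiced velar stop
    "latin alpha": "\u0251", # IPA open back unrounded vowel
    "long s": "\u017F",
    "phi symbol": "\u03D5",
}

for label, ch in CHARS.items():
    folded = unicodedata.normalize("NFKC", ch)
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}"
          f" -> U+{ord(folded):04X} {unicodedata.name(folded)}")

# The IPA letters survive compatibility folding unchanged ...
print(unicodedata.normalize("NFKC", "\u0261") == "g")
print(unicodedata.normalize("NFKC", "\u0251") == "a")
# ... whereas long s and the phi symbol fold to the ordinary letters.
print(unicodedata.normalize("NFKC", "\u017F") == "s")
print(unicodedata.normalize("NFKC", "\u03D5") == "\u03C6")
```

The compatibility mapping on the phi symbol only records how the
standard relates the two code points; it doesn't settle whether any
given mathematical text uses them with distinct meanings.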


~mark

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedpic.jpg
Type: image/jpeg
Size: 40116 bytes
Desc: not available
URL: <https://corp.unicode.org/mailman/private/unicode/attachments/20200604/688911df/attachment-0002.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedpic2.jpg
Type: image/jpeg
Size: 7026 bytes
Desc: not available
URL: <https://corp.unicode.org/mailman/private/unicode/attachments/20200604/688911df/attachment-0003.jpg>

