Why do the Hebrew Alphabetic Presentation Forms Exist

Mark E. Shoulson mark at kli.org
Mon Jun 8 16:02:37 CDT 2020

Look, think of it this way: what exactly is the content of, say, Exodus 
6:10, for a nice short and common verse?  What are the letters, vowels, 
and cantillations that make up that verse?  The answer is pretty 
well-agreed-upon by most sources.  Tell me: is the LAMED in that verse 
bent or straight?  Can you find a list of LAMEDs in the Torah that are 
bent?  Not "which ones are bent in this particular book."  That's like 
finding me a list of YODs that are at the end of a line: it has nothing 
to do with the actual TEXT.  Which LAMEDs in the Torah are bent?  None 
of them.  Nor are any of them straight.  Nor are any of them written in 
Frank-Ruehl, or Hadassah, or David.  Those are not properties of the 
text.  The consonantal text of the Torah uses exactly 22 letters plus 
final forms, plus the NUN HAFUKHA and a few instances of UPPER DOT.
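
For anyone who wants to check that inventory against the Unicode character database, here is a minimal sketch using Python's standard unicodedata module. The code point range and the two extra code points (U+05C6, U+05C4) are my reading of the Hebrew block, not something quoted from any Torah edition, and the variable names are only illustrative.

```python
import unicodedata

# Basic Hebrew letters occupy U+05D0 (ALEF) through U+05EA (TAV).
letters = [chr(cp) for cp in range(0x05D0, 0x05EA + 1)]

# Final forms are the ones whose character name contains "FINAL".
finals = [c for c in letters if "FINAL" in unicodedata.name(c)]

print(len(letters) - len(finals), "letters plus", len(finals), "final forms")

# The two extra marks mentioned above, looked up by code point.
for cp in (0x05C6, 0x05C4):
    print(f"U+{cp:04X}", unicodedata.name(chr(cp)))
```

Nothing in that inventory records a letter's shape, size, or typeface; those live in the font, not in the text.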

Now, there *are* some letters in the Torah which are written unusually 
large or small, like the BET at the very beginning, or the small ALEPH 
in Leviticus 1:1.  But Unicode rightly considers those to be glyphic 
variants, to be handled at a higher level.  There's actually a better 
case for encoding these, because there IS a list of large BETs or small 
ALEPHs in the Torah, which "everyone" (who accepts Masoretic tradition) 
agrees are in these and those places in the text.  (But don't try to 
encode these, either.)

Down to one sentence: until you can talk about which LAMEDs in the Torah 
are bent and which are straight, I would expect this to be a non-starter.
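
As a rough illustration of the character-versus-glyph distinction, the sketch below compares two cases under compatibility normalization (NFKC). U+FB25 HEBREW LETTER WIDE LAMED is an existing presentation form that I am using only as a stand-in (it is a wide LAMED, not the bent one under discussion), and folds_to is a hypothetical helper name, not part of any library.

```python
import unicodedata

HOLAM, HOLAM_HASER = "\u05B9", "\u05BA"   # two distinct vowel points
LAMED, WIDE_LAMED = "\u05DC", "\uFB25"    # a letter and a presentation form of it


def folds_to(ch: str, target: str) -> bool:
    """Return True if NFKC normalization maps ch to target."""
    return unicodedata.normalize("NFKC", ch) == target


# HOLAM HASER FOR VAV is its own character: normalization leaves it alone.
print(folds_to(HOLAM_HASER, HOLAM))

# WIDE LAMED is only a glyph variant: normalization folds it to plain LAMED.
print(folds_to(WIDE_LAMED, LAMED))

# The database tags the mapping as a <font> (presentation) difference.
print(unicodedata.decomposition(WIDE_LAMED))
```

The first check prints False and the second True, which is the distinction in miniature: a difference in meaning survives normalization, a difference in appearance does not.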


On 6/8/20 1:45 PM, Abraham Gross via Unicode wrote:
> Unicode encodes characters that other character sets have even though it normally wouldn't. So if I find a character set with a folded lamed they'd add it?
> Here are 2 character sets with a folded lamed:
> https://i.imgur.com/iq8awBe.jpg – an אלף בינה with the standing and folded lameds as separate letters.
> https://www.tug.org/TUGboat/tb15-3/tb44haralambous-hebrew.pdf#page=12 – A TeX typesetting module with the standing and folded lameds as separate characters for fine-grain control when the automatic system doesn't work.
> On June 7, 2020, 10:27, "Mark E. Shoulson via Unicode" <unicode at unicode.org> wrote:
>> On 6/7/20 7:46 AM, Richard Wordingham via Unicode wrote:
>> I agree.  Sorry, pretty typography is nice and everything, but if bent LAMED is anything, it's at
>> best a presentation form (and even that is a hard sell).  You show ANYONE a word spelled with any
>> combination of bent and straight LAMEDs and ask how it's spelled, they'll just say "LAMED" for each
>> one.  Unicode encodes different *characters*, symbols that have a different *meaning* in text, not
>> things that happen to look different.  A U+05BA HOLAM HASER FOR VAV means not just "a dot like
>> U+05B9 only shifted over a little," it means that there is something *different* going on: VAV plus
>> HOLAM usually means one thing (a VAV as mater lectionis for an /o/ vowel); this is a consonantal
>> VAV followed by a vowel.  In spelling it out, you could call one a holam malé, but not the other.
>> A QAMATS QATAN is not just a qamats that looks a little different, it is a grammatically distinct
>> character, and moreover one that cannot be deduced algorithmically by looking at the letters around
>> it.  What you're talking about is a LAMED and a LAMED.  They are two *glyphs* for the same
>> character, and Unicode doesn't encode glyphs (anymore?).
>> ~mark
