Why do the Hebrew Alphabetic Presentation Forms Exist

Mark E. Shoulson mark at kli.org
Sun Jun 7 09:27:17 CDT 2020


On 6/7/20 7:46 AM, Richard Wordingham via Unicode wrote:
> On Sat, 6 Jun 2020 23:58:42 -0600
> Anshuman Pandey via Unicode <unicode at unicode.org> wrote:
>
>> Hi Abraham,
>>
>> If you’re seriously thinking of submitting a proposal for a new
>> Hebrew character, please consider getting in touch with Deborah
>> Anderson, Michael Everson, or me. We’d be happy to help you figure
>> out the suitability of encoding the character in question or figuring
>> out ways to represent it in plain text, if need be.
> I[t] doesn't belong in plain text.  It only becomes useful once line
> breaks and character spacing are known.
>
> Richard.

I agree.  Sorry, pretty typography is nice and everything, but if bent 
LAMED is anything, it's at best a presentation form (and even that is a 
hard sell).  You show ANYONE a word spelled with any combination of bent 
and straight LAMEDs and ask how it's spelled, they'll just say "LAMED" 
for each one.  Unicode encodes different *characters*, symbols that have 
a different *meaning* in text, not things that happen to look 
different.  A U+05BA HOLAM HASER FOR VAV means not just "a dot like 
U+05B9 only shifted over a little," it means that there is something 
*different* going on: VAV plus HOLAM usually means one thing (a VAV as 
mater lectionis for an /o/ vowel), this is a consonantal VAV followed by 
a vowel.  In spelling it out, you could call one a holam malé, but not 
the other.  A QAMATS QATAN is not just a qamats that looks a little 
different, it is a grammatically distinct character, and moreover one 
that cannot be deduced algorithmically by looking at the letters around 
it.  What you're talking about is a LAMED and a LAMED.  They are two 
*glyphs* for the same character, and Unicode doesn't encode glyphs 
(anymore?)
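
To make the character/glyph distinction concrete, here is a rough sketch 
using Python's standard unicodedata module.  Bent LAMED isn't encoded, so 
as a stand-in I'm assuming U+FB25 HEBREW LETTER WIDE LAMED, an existing 
presentation form; the helper name same_after_nfkc is made up for 
illustration.  Compatibility normalization folds the presentation form 
back into plain LAMED, while the points that carry a different meaning 
stay distinct:

```python
import unicodedata

HOLAM = "\u05B9"
HOLAM_HASER_FOR_VAV = "\u05BA"
QAMATS = "\u05B8"
QAMATS_QATAN = "\u05C7"
LAMED = "\u05DC"
WIDE_LAMED = "\uFB25"  # stand-in presentation form; bent LAMED is not encoded


def same_after_nfkc(a: str, b: str) -> bool:
    """Hypothetical helper: do two strings match after NFKC normalization?"""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


# Print the code point and formal name of each character discussed above.
for ch in (HOLAM, HOLAM_HASER_FOR_VAV, QAMATS, QAMATS_QATAN, LAMED, WIDE_LAMED):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A presentation form is just another glyph for the same letter.
print(same_after_nfkc(WIDE_LAMED, LAMED))

# Semantically distinct points are separate characters and stay separate.
print(same_after_nfkc(HOLAM, HOLAM_HASER_FOR_VAV))
print(same_after_nfkc(QAMATS, QAMATS_QATAN))
```

The first comparison should come out True and the other two False, which 
is the whole point: the wide form folds into LAMED, the others don't 
fold into anything.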

~mark


