Why do the Hebrew Alphabetic Presentation Forms Exist

abrahamgross at disroot.org
Sun Jun 7 13:45:05 CDT 2020

If this is the case, then why do the CJK blocks have tons of alternatives for the same character? (I'm not counting the compatibility ideographs, which were added only for compatibility with other encodings.) If you look at old dictionaries, these alternatives are listed as variants of the same character that some fonts might use. The meaning is exactly the same.

Some examples (there are tons and tons more):

June 7, 2020 10:27, "Mark E. Shoulson via Unicode" <unicode at unicode.org> wrote:

> On 6/7/20 7:46 AM, Richard Wordingham via Unicode wrote:
>> On Sat, 6 Jun 2020 23:58:42 -0600
>> Anshuman Pandey via Unicode <unicode at unicode.org> wrote:
>>> Hi Abraham,
>>> If you’re seriously thinking of submitting a proposal for a new
>>> Hebrew character, please consider getting in touch with Deborah
>>> Anderson, Michael Everson, or me. We’d be happy to help you figure
>>> out the suitability of encoding the character in question or figuring
>>> out ways to represent it in plain text, if need be.
>> I[t] doesn't belong in plain text. It only becomes useful once line
>> breaks and character spacing are known.
>> Richard.
> I agree.  Sorry, pretty typography is nice and everything, but if bent LAMED is anything, it's at
> best a presentation form (and even that is a hard sell.)  You show ANYONE a word spelled with any
> combination of bent and straight LAMEDs and ask how it's spelled, they'll just say "LAMED" for each
> one.  Unicode encodes different *characters*, symbols that have a different *meaning* in text, not
> things that happen to look different.  A U+05BA HOLAM HASER FOR VAV means not just "a dot like
> U+05B9 only shifted over a little," it means that there is something *different* going on: VAV plus
> HOLAM usually means one thing (a VAV as mater lectionis for an /o/ vowel), this is a consonantal
> VAV followed by a vowel.  In spelling it out, you could call one a holam malé, but not the other. 
> A QAMATS QATAN is not just a qamats that looks a little different, it is a grammatically distinct
> character, and moreover one that cannot be deduced algorithmically by looking at the letters around
> it.  What you're talking about is a LAMED and a LAMED.  They are two *glyphs* for the same
> character, and Unicode doesn't encode glyphs (anymore?)
> ~mark
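
For reference, here is a minimal sketch of the character-versus-glyph distinction drawn in the quoted message, using Python's standard unicodedata module. The code points are the ones named above, plus U+FB25 HEBREW LETTER WIDE LAMED from the Alphabetic Presentation Forms block as an example of a form kept only for compatibility. The helper names describe and folds_to are made up for illustration.

```python
import unicodedata

# Code points named in the quoted message, plus one presentation form.
HOLAM = "\u05B9"         # ordinary holam
HOLAM_HASER = "\u05BA"   # holam haser for vav
QAMATS = "\u05B8"        # ordinary qamats
QAMATS_QATAN = "\u05C7"  # qamats qatan
LAMED = "\u05DC"         # the letter lamed
WIDE_LAMED = "\uFB25"    # wide lamed, Alphabetic Presentation Forms block


def describe(ch: str) -> str:
    """Return the code point and formal Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def folds_to(ch: str, target: str) -> bool:
    """True if compatibility normalization (NFKC) maps ch onto target."""
    return unicodedata.normalize("NFKC", ch) == target


for ch in (HOLAM, HOLAM_HASER, QAMATS, QAMATS_QATAN, LAMED, WIDE_LAMED):
    print(describe(ch))

# A presentation form is treated as a glyph variant of the base letter.
print(folds_to(WIDE_LAMED, LAMED))
# The two points with their own meaning are not folded together.
print(folds_to(HOLAM_HASER, HOLAM))
print(folds_to(QAMATS_QATAN, QAMATS))
```

The wide LAMED carries a compatibility decomposition back to U+05DC, so normalization treats it as the same letter, while the two points have no such mapping and remain separate characters.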
