Moving The Hebrew Extended Block Into The SMP

Philippe Verdy verdy_p at
Wed May 11 07:46:10 CDT 2016

Effectively, if you need Arabic diacritics on top of Hebrew letters, just
use them. This causes no problem with script-run breaking, except in strict
security checks for identifiers, where such usage is very unlikely or only

You could as well use Latin/generic diacritics if needed, such as a
circumflex or cedilla. You could also use Latin letter-like diacritics, but
not the spacing ones, such as superscript o.
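
As a rough illustration of the combining vs. spacing distinction, the
following Python sketch looks up the General_Category of a few characters
with the standard unicodedata module. The choice of sample characters and
the helper name is_nonspacing_mark are assumptions made for this example
only; they are not taken from any proposal.

```python
import unicodedata

def is_nonspacing_mark(ch: str) -> bool:
    """Return True if ch has General_Category Mn (nonspacing mark)."""
    return unicodedata.category(ch) == "Mn"

# Sample characters picked for illustration (an assumption of this sketch).
samples = [
    "\u0302",  # COMBINING CIRCUMFLEX ACCENT
    "\u0327",  # COMBINING CEDILLA
    "\u0366",  # COMBINING LATIN SMALL LETTER O (letter-like diacritic)
    "\u00BA",  # MASCULINE ORDINAL INDICATOR (spacing look-alike)
    "\u1D52",  # MODIFIER LETTER SMALL O (spacing look-alike)
]

for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{unicodedata.category(ch)}, nonspacing={is_nonspacing_mark(ch)}")

# A Hebrew base letter followed by a generic combining mark is simply
# base + mark in the encoded text; no script-specific copy is involved.
alef_with_circumflex = "\u05D0" + "\u0302"
```

Only the first three samples are combining marks that attach to a preceding
base letter; the last two are spacing characters that occupy their own cell.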

Combining characters should not be disunified even if they are used in
several scripts, and even if those scripts have different directions, unless
they behave differently, i.e. when they don't stack properly.

Hebrew diacritics written above or below normally don't stack vertically
but are ordered horizontally. Even in this case the layout can be inferred
from the base letter, which determines the effective layout and even the
effective glyph to use for the diacritic (e.g. the cedilla, which sometimes
attaches above left instead of below with some Latin letters that have
descenders like "g", or accents added to Greek letters, which are placed to
the left of capital letters instead of above).

Disunification of these diacritics is needed, however, when layouts are
distinguished both visually and semantically (such as the sin vs. shin
dots), and when their normalisation would cause major problems requiring
systematic use of CGJ to block their reordering.
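
The reordering problem and the CGJ workaround can be sketched in Python with
the standard unicodedata module. The sample sequence (lamed with patah and
hiriq) and the variable names are assumptions chosen for illustration; the
point is only that canonical ordering sorts adjacent marks by combining
class unless U+034F COMBINING GRAPHEME JOINER (class 0) sits between them.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED
PATAH = "\u05B7"  # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, canonical combining class 14
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, canonical combining class 0

def code_points(s: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Typed order: patah first, then hiriq.
plain = LAMED + PATAH + HIRIQ
# Same order, with CGJ between the two points.
guarded = LAMED + PATAH + CGJ + HIRIQ

nfd_plain = unicodedata.normalize("NFD", plain)
nfd_guarded = unicodedata.normalize("NFD", guarded)

print("plain   :", code_points(plain), "->", code_points(nfd_plain))
print("guarded :", code_points(guarded), "->", code_points(nfd_guarded))
```

Without CGJ the two points are swapped by normalisation (class 14 sorts
before class 17); with CGJ the typed order survives.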

So don't fear using Arabic points or Latin accents on top of Hebrew
letters: they will be interpreted correctly in their Hebrew context, and by
themselves those combining diacritics have no direction (for the Bidi
algorithm, which preserves the combining clusters).
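
A quick way to see this is to look at the Bidi_Class property, here through
Python's standard unicodedata module. The sample marks below are an
assumption of this sketch; each of them reports NSM (nonspacing mark), which
the Bidi algorithm resolves to the direction of the preceding base character.

```python
import unicodedata

# Sample marks from three scripts (chosen for illustration only).
marks = {
    "Hebrew hiriq": "\u05B4",      # HEBREW POINT HIRIQ
    "Arabic shadda": "\u0651",     # ARABIC SHADDA
    "Latin circumflex": "\u0302",  # COMBINING CIRCUMFLEX ACCENT
}

bidi_classes = {label: unicodedata.bidirectional(ch)
                for label, ch in marks.items()}

for label, bidi in bidi_classes.items():
    print(f"{label}: Bidi_Class = {bidi}")

# The base letter carries the direction: Hebrew alef is strongly
# right-to-left (R), whatever marks follow it.
alef_bidi = unicodedata.bidirectional("\u05D0")
print("Hebrew alef: Bidi_Class =", alef_bidi)
```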

On 11 May 2016 03:28, "Robert Wheelock" <rwhlk142 at> wrote:

> Hello again!  Shalom!
> After reading through the V. 9β code charts PDF document, I DID find a new
> area to relocate our new Hebrew Extended block (a very important area to
> add into Unicode):
> THE AREA FROM U+30000 TO U+3014F (336 codepoints)
> ·U+30000—U+30014 (21 codepoints):  Additional characters for typesetting
> Biblical/Classical Hebrew
> ·U+30015—U+3001F (11 codepoints):  Palestinian vowel and pronunciation
> points for Hebrew and Galilean Aramaic
> ·U+30020—U+30021 (2 codepoints):  Small superscript top-left signs for the
> letter *shin*—superscript śin and superscript shin
> ·U+30022—U+30041 (32 codepoints):  Palestinian cantillation signs for
> Hebrew and Galilean Aramaic
> ·U+30042 is reserved
> ·U+30043—U+3005C (26 codepoints):  Babylonian vowel and pronunciation
> points for Hebrew
> ·U+3005D—U+3005F are reserved
> ·U+30060—U+30071 (18 codepoints):  Babylonian cantillation signs for Hebrew
> ·U+30072—U+3007D are reserved
> ·U+3007E—U+3008F (18 codepoints):  Samaritan vowel points, pronunciation
> points, and cantillation signs for Hebrew (copies of those also being used
> for Samaritan script in BMP)
> ·U+30090—U+3010F (128 codepoints):  Additional characters in Hebrew script
> for other Jewish languages (these are pointed like the corresponding Arabic
> characters in the BMP)
> ·U+30110—U+3012F (32 codepoints):  Basic Hebrew superscript characters
> (regular letters+5 final forms+top-left pointed *śin*+top-right pointed
> *shin*+*maqqef*)
> ·U+30130—U+3014F (32 codepoints):  Basic Hebrew subscript characters
> (regular letters+5 final forms+top-left pointed *śin*+top-right pointed
> *shin*+*maqqef*)
> Please STAY TUNED for updates.  Thank You!
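
For reference, the range arithmetic in the quoted list can be checked with a
short Python sketch. The tuples below are transcribed from the quoted
message; None marks the reserved ranges, for which no count was stated. The
names ranges and size are assumptions of this sketch.

```python
# (first, last, stated count) as given in the quoted proposal;
# None means the range was listed as reserved without a count.
ranges = [
    (0x30000, 0x30014, 21),
    (0x30015, 0x3001F, 11),
    (0x30020, 0x30021, 2),
    (0x30022, 0x30041, 32),
    (0x30042, 0x30042, None),
    (0x30043, 0x3005C, 26),
    (0x3005D, 0x3005F, None),
    (0x30060, 0x30071, 18),
    (0x30072, 0x3007D, None),
    (0x3007E, 0x3008F, 18),
    (0x30090, 0x3010F, 128),
    (0x30110, 0x3012F, 32),
    (0x30130, 0x3014F, 32),
]

def size(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    return last - first + 1

# Every stated count should match the inclusive size of its range.
mismatches = [(hex(a), hex(b), n) for a, b, n in ranges
              if n is not None and size(a, b) != n]

# Consecutive ranges should neither overlap nor leave gaps.
contiguous = all(ranges[i + 1][0] == ranges[i][1] + 1
                 for i in range(len(ranges) - 1))

total = sum(size(a, b) for a, b, _ in ranges)
print("mismatches:", mismatches)
print("contiguous:", contiguous)
print("total code points:", total)
```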