how to make custom combining diacritical marks for arabic letters?

dinar qurbanov via Unicode unicode at unicode.org
Wed Jan 15 04:30:54 CST 2020


"What are the combining marks supposed to look like?"

as you can see in http://tmf.org.ru/arabic.html , i have tested a
reversed fatha. i also have ideas for a reversed kasra, various
reversed dammas, vertical variants of all of them, and maybe entirely
different diacritics, like a caron or a circumflex.

some of those ideas are already available in unicode. see the U+065x
row in
https://en.wikipedia.org/wiki/Arabic_script_in_Unicode#Compact_table .
i see there are reversed and inverted dammas, small v and inverted
small v, and others, but they are probably not enough for me.
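
a minimal python sketch for checking what that row already offers,
using only the standard unicodedata module. it lists each code point in
U+0650..U+065F with its name and canonical combining class; the helper
name arabic_mark_row is made up for this example.

```python
import unicodedata


def arabic_mark_row(start=0x0650, end=0x065F):
    """Return (code point, name, combining class) for each character in the row."""
    rows = []
    for cp in range(start, end + 1):
        ch = chr(cp)
        # name() takes a default so an unassigned code point would not raise.
        rows.append((cp, unicodedata.name(ch, "<unassigned>"), unicodedata.combining(ch)))
    return rows


if __name__ == "__main__":
    for cp, name, ccc in arabic_mark_row():
        print(f"U+{cp:04X}  ccc={ccc:<3}  {name}")
```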

i have just read https://en.wikipedia.org/wiki/Arabic_diacritics and
was reminded that arabic diacritics sit at different levels. consonant
modifiers, "ijam", are closest to the main line, and the diacritics for
short vowels, haraka, are further away. tashdid usually sits between
them... i would like to be able to make several more symbols to extend
the short vowels.


"Are they your creation or do you have samples of usage?"

i have an idea to write the tatar language (a turkic language) in
arabic script, using harakas instead of full/long/"main line" vowel
letters; this would also be usable for other languages. it would make
writing shorter, with the possibility of omitting some of the
vowels...

arabic has only 3 short vowels and 3 long vowels. long vowels are
written on the main line, like consonants; short vowels are written
with diacritics.

i have checked https://en.wikipedia.org/wiki/Uyghur_language and then
https://en.wikipedia.org/wiki/Arabic_script#Special_letters , and as
far as i know and can see, languages written in arabic script use
"whole" letters to represent their additional vowels, for example
ۆ , ې in uyghur. these are made with a diacritic, but it is the "ijam"
kind, a consonant modifier. logically, short vowel diacritics can still
be put above or below them, though those languages do not use that, and
it probably works in unicode (i.e. the consonant modifiers and the 3
short vowels probably do not collide if put together).
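
a small python check of that guess, as far as character properties go
(it says nothing about how a particular font draws the result). it
assumes U+06C6 and U+06D0 are the code points meant by the two uyghur
letters above, and uses only the standard unicodedata module; the
helper name describe is made up for this example.

```python
import unicodedata

OE = "\u06C6"     # ARABIC LETTER OE, assumed to be the first letter meant above
E = "\u06D0"      # ARABIC LETTER E, assumed to be the second
FATHA = "\u064E"
KASRA = "\u0650"


def describe(ch):
    """Return (name, general category, combining class, decomposition) for ch."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
        unicodedata.decomposition(ch),
    )


if __name__ == "__main__":
    for ch in (OE, E, FATHA, KASRA):
        print(describe(ch))
    # the letters are atomic: their "ijam"-like parts are not separate
    # characters, so a haraka is simply the next code point after the letter.
    text = OE + FATHA + E + KASRA
    print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)])
```

an empty decomposition for the letters means normalization leaves the
letter and the following haraka as two separate code points.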

how many vowels i need for tatar: аоуыи, their "thin" pairs әөүэи,
their "russian" pairs аоуы, and 2 "russian" vowels, "е" and "э". so i
need 16 diacritics to put above or below consonant letters.

this idea of mine is not used anywhere, except in a short handwriting
example, here: http://qdb.narod.ru/tattyazmagif/qaradaft07.gif .
so, yes, this is my creation, a constructed script, and it is not
fully developed, just a sketch. so i would like to use the private use
area for it.
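
to see why the private use area is awkward for this, here is a python
sketch that lays the 16 vowels counted above onto private use code
points and prints their default properties. the base U+E000, the group
labels and the function name are arbitrary choices for illustration,
not a proposal. the point is only that a private use code point is by
default neither a combining mark nor right-to-left, so software will
not treat it like a haraka unless the font or renderer is persuaded to.

```python
import unicodedata

# the vowel groups as counted above: 5 + 5 + 4 + 2 = 16.
VOWEL_GROUPS = {
    "tatar": "аоуыи",
    "tatar thin": "әөүэи",
    "russian": "аоуы",
    "russian extra": "еэ",
}

PUA_BASE = 0xE000  # hypothetical starting point inside U+E000..U+F8FF


def assign_pua(groups, base=PUA_BASE):
    """Map (group, vowel) to consecutive private use code points."""
    mapping = {}
    cp = base
    for group, vowels in groups.items():
        for vowel in vowels:
            mapping[(group, vowel)] = cp
            cp += 1
    return mapping


if __name__ == "__main__":
    for (group, vowel), cp in assign_pua(VOWEL_GROUPS).items():
        ch = chr(cp)
        print(f"U+{cp:04X} {group:<14}{vowel}  gc={unicodedata.category(ch)} "
              f"bidi={unicodedata.bidirectional(ch)} ccc={unicodedata.combining(ch)}")
    # for comparison, a real haraka:
    fatha = "\u064E"
    print("fatha", unicodedata.category(fatha), unicodedata.bidirectional(fatha))
```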


2020-01-14 20:02 GMT+03:00, Lorna Evans <lorna_evans at sil.org>:
> What are the combining marks supposed to look like? Are they your
> creation or do you have samples of usage? It is true that you will not
> likely get combining marks to work if either they or the base character
> are PUA. Adding the complexity of RTL makes the issue worse.
>
> Lorna
>
> On 1/10/2020 12:30 PM, dinar qurbanov via Unicode wrote:
>> hello.
>>
>> you can browse to replies that are not quoted below from
>> https://unicode.org/mail-arch/unicode-ml/y2018-m05/0039.html .
>>
>> where can i file bug reports or feature requests to get custom
>> diacritic marks positioned automatically at the right place above
>> and below arabic letters, without having to insert the beginning /
>> middle / end forms of arabic letters manually, using just the
>> "simple" arabic letter code points? and which component is
>> responsible for what, so that i know where to submit each report?
>>
>> it seems users of unicode should be able to use the private use area
>> like this, to develop their own diacritics for arabic and other
>> scripts, not only for latin / greek / cyrillic... though i have not
>> even tried to make custom latin/cyrillic/greek diacritics yet... i
>> have used custom latin and cyrillic scripts, but i did not need to
>> develop custom diacritics, because there are plenty of ready-made
>> diacritics to use with them.
>>
>>
>> 2018-05-19 13:22 GMT+03:00, dinar qurbanov <qdinar at gmail.com>:
>>> this is a test i made at that time: http://tmf.org.ru/arabic.html .
>>> look at the second line. my custom mark is too far left on the
>>> leftmost "B", and too far right on the middle "B" (the medial form
>>> of B) and on the rightmost "B" (the initial form of B). it should
>>> be located directly above the dot that sits below the letter.
>>>
>>> - this was the problem that i could not solve.
>>>
>>> there were also problems that i could solve, by 1) using the rtl
>>> override mark, and 2) using the start, middle, end and separate B
>>> characters instead of the simple arabic B, which would have been
>>> easier (you can see in the example that those characters are used).
>>> (choosing the different forms of a letter can also be done with php
>>> or javascript, etc.)
>>>
>>>
>>>
>>>
>>> 2018-05-17 22:12 GMT+03:00 Richard Wordingham via Unicode
>>> <unicode at unicode.org>:
>>>> On Thu, 17 May 2018 09:49:55 +0300
>>>> dinar qurbanov via Unicode <unicode at unicode.org> wrote:
>>>>
>>>>> how to make custom combining diacritical marks for arabic letters?
>>>>> should only font drivers and programs support it, or should also
>>>>> unicode support it, for example, have special area for them?
>>>>>
>>>>> as i know, private use area can be used to make combining diacritical
>>>>> marks for latin script without problems.
>>>>>
>>>>> but when i tried, several years ago, to make that for arabic script,
>>>>> with fontforge, i had to use right to left override mark, and manually
>>>>> insert beginning, middle, ending forms of arabic letters, and even
>>>>> then, my custom marks were not located very properly above letters.
>>>>
>>>> I'm offering suggestions, but I don't know that they will work.
>>>>
>>>> The one thing that may help you is that these marks cannot appear in
>>>> plain text.  There are a number of things you need to do:
>>>>
>>>> 1) Persuade the renderer to treat your characters as a run in a
>>>> single script.  You might be able to do this by:
>>>>
>>>> a) Not having any lookups for the Arabic script.
>>>>
>>>> b) Using RLM to persuade the renderer that you have a right-to-left
>>>> run.
>>>>
>>>> It is just possible that this may fail with OpenType fonts but work
>>>> with Graphite or AAT fonts.  If it works, you will then have to
>>>> implement all the Arabic shaping yourself.
>>>>
>>>> 2) If OpenType fonts will treat the data as a single script run, you
>>>> will need to ensure that there is an OpenType substitution feature that
>>>> the renderer will support.  Fortunately, many modern text applications
>>>> will allow you to force the ccmp feature to be enabled - I have used
>>>> such feature forcing with OpenType in LibreOffice and also in HTML,
>>>> which renders accordingly in all the modern browsers I have tested - MS
>>>> Edge on Windows 10, Firefox and, on iPhones, Safari.  While the ccmp
>>>> feature is enabled for the PUA in Firefox, it is disabled in MS Edge on
>>>> Windows 10.
>>>>
>>>> 3) I believe AAT will soon be available for products using the HarfBuzz
>>>> layout engine, so it is likely to become available on Firefox and
>>>> LibreOffice.  If AAT looks like a solution, you may need to research
>>>> the attitudes of Chrome and OpenOffice, for I believe they have
>>>> chosen not to support Graphite.
>>>>
>>>> A totally different solution would be to recompile your application so
>>>> that it believes that your diacritics are in the Arabic script.
>>>>
>>>> Richard.
>
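
for reference, the manual workaround described in the quoted 2018
message (explicit positional forms plus a right-to-left override) can
be sketched in python. this is only an illustration for the letter beh:
the four Arabic Presentation Forms-B code points are real, but the
function name and the private use mark U+E000 are assumptions made for
this example, and the sketch does nothing about positioning the mark,
which was the unsolved part.

```python
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

# Arabic Presentation Forms-B for BEH (U+0628).
BEH_ISOLATED = "\uFE8F"
BEH_FINAL = "\uFE90"
BEH_INITIAL = "\uFE91"
BEH_MEDIAL = "\uFE92"

CUSTOM_MARK = "\uE000"  # hypothetical private use mark


def shape_beh_run(count, mark=CUSTOM_MARK):
    """Build `count` joined BEHs, each followed by `mark`, in logical order.

    The first letter takes the initial form, the last the final form, and
    the ones in between the medial form; a single letter stays isolated.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        forms = [BEH_ISOLATED]
    else:
        forms = [BEH_INITIAL] + [BEH_MEDIAL] * (count - 2) + [BEH_FINAL]
    body = "".join(form + mark for form in forms)
    # the override forces right-to-left display even though the private
    # use mark has a left-to-right bidi class by default.
    return RLO + body + PDF
```

with plain U+0628 and a font that does the arabic shaping itself, none
of this selection would be needed; the sketch only mirrors what had to
be done by hand in the test page.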


