Terminology (was: Latin glottal stop in ID in NWT, Canada)

Eli Zaretskii eliz at gnu.org
Sat Oct 24 00:40:32 CDT 2015

> Date: Fri, 23 Oct 2015 23:16:32 +0100
> From: Richard Wordingham <richard.wordingham at ntlworld.com>
> "C6: A process shall not assume that the interpretations of two
> canonical-equivalent character sequences are distinct."
> Firstly, I have grave difficulties assigning mental activities to
> processes.
> Secondly, it may be possible to interpret "A process shall not assume X"
> as "A process shall function correctly regardless of whether X holds."
> However, let image(Y) be the bitmap depicting the string Y.  Then the
> following logic would be non-compliant:
> if A and B are canonically equivalent and image(A) and image(B) are
> different, then
>     write(A, " and ", B, " are canonically equivalent but have different
>     images ", image(A), " and ", image(B));
> end if
> The logic is non-compliant, for if it is invoked then the write
> statement will only work correctly if image(A) and image(B) are
> different, i.e. if A and B are interpreted differently. Apparently it
> is permissible to render canonically equivalent sequences differently, so
> image(A) and image(B) might be different even though A and B are
> canonically equivalent.
> I therefore conclude that C6 is in some language that I do not
> adequately understand.

AFAIU, Unicode is about processing text, and only mentions display
rarely, where it's directly related to the processing part.  So the
above is about _processing_ canonically-equivalent sequences, not
about their display.  When looked at in this way, I see no
difficulties in understanding the text.
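
To make the processing reading concrete, here is a minimal Python sketch
of what "not assuming the interpretations are distinct" could look like
in practice: compare strings after normalizing them rather than code
point by code point.  The helper name canonically_equivalent is made up
for this illustration; unicodedata.normalize is the standard-library
call, and using NFD here is just one reasonable choice (NFC would serve
equally well for the comparison).

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b are canonically equivalent.

    Hypothetical helper for illustration: two strings are canonically
    equivalent exactly when their canonical decompositions (NFD) match.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# U+00E9 (precomposed e-acute) vs. "e" followed by U+0301 (combining acute).
composed = "\u00e9"
decomposed = "e\u0301"

# A raw code-point comparison treats the two spellings as different ...
print(composed == decomposed)
# ... while a comparison that respects canonical equivalence does not.
print(canonically_equivalent(composed, decomposed))
```

Nothing in this says anything about how either sequence is drawn; it
only concerns how a process compares the two.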

> > Again, I know nothing about Thai, but if in TUS an abugida can be
> > referred to as an alphabet when it is used as one, it seems to
> > me that the word 'alphabet' has a pretty extended meaning in TUS.
> TUS tries to make accurate use of the distinction between 'alphabet',
> 'abugida' and 'abjad', 20th century jargon promoted if not invented by
> Peter Daniels.  The distinction lies in the way vowels are indicated -
> always / with a default / not at all.  The distinction may be useful
> for a writing system, i.e. a way of using the 'script', but it rapidly
> encounters the problem that a script may have several different writing
> systems.  For example, the presence or absence of vowel marks switches
> the Arabic and Hebrew scripts, as used for those languages, between
> being an abjad and being an alphabet.

The Hebrew script is never an alphabet, AFAIU; it's more likely an
abugida when the vowel marks are used.  The so-called "full spelling", where
some vowels are indicated by consonants, does not replace all the
vowels with consonants, so it isn't, strictly speaking, an alphabet in
the above sense.
