From plug.gulp at gmail.com  Tue Dec  8 21:24:39 2015
From: plug.gulp at gmail.com (Plug Gulp)
Date: Wed, 9 Dec 2015 03:24:39 +0000
Subject: Devanagari and Subscript and Superscript
Message-ID: <CAL01L+1Bdws+41JRnb-RDzGOv58O8bLutyf7UidM2z_Cd-danA@mail.gmail.com>

Hi,

I am trying to understand if there is a way to use Devanagari
characters (and grapheme clusters) as subscript and/or superscript in
Unicode text. It would help if someone could direct me to any
document that explains how to achieve that. Is there a Unicode marker
that will treat the next grapheme cluster in the Unicode text as
super/subscript? For example, if one wants to represent "? raised to
???", how does one achieve that? Is there a marker to represent it as
follows: ? + SUP + ? + ? + ?
where SUP acts as a marker for superscripting the next grapheme
cluster? Similarly for subscripting.

Sorry if this is not the right place to ask this question; in that
case please could you direct me to the right forum?

Thanks and kind regards

~Plug


From maxwell at umiacs.umd.edu  Wed Dec  9 09:42:13 2015
From: maxwell at umiacs.umd.edu (maxwell)
Date: Wed, 09 Dec 2015 10:42:13 -0500
Subject: Devanagari and Subscript and Superscript
In-Reply-To: <CAL01L+1Bdws+41JRnb-RDzGOv58O8bLutyf7UidM2z_Cd-danA@mail.gmail.com>
References: <CAL01L+1Bdws+41JRnb-RDzGOv58O8bLutyf7UidM2z_Cd-danA@mail.gmail.com>
Message-ID: <c3827d4bf7e00c682d676d205b897039@umiacs.umd.edu>

On 2015-12-08 22:24, Plug Gulp wrote:
> I am trying to understand if there is a way to use Devanagari
> characters (and grapheme clusters) as subscript and/or superscript in
> Unicode text. It would help if someone could direct me to any
> document that explains how to achieve that. Is there a Unicode marker
> that will treat the next grapheme cluster in the Unicode text as
> super/subscript? For example, if one wants to represent "? raised to
> ???", how does one achieve that? Is there a marker to represent it as
> follows: ? + SUP + ? + ? + ?
> where SUP acts as a marker for superscripting the next grapheme
> cluster? Similarly for subscripting.

I may be wrong (it's been known to happen), but I don't think there's 
anything in Unicode that will sub-/super-script an arbitrary character.  
There are some pre-sub-/super-scripted Latin characters (see 
https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts), but 
that won't help you.

So the next thing is, what are you using for displaying text?  HTML, 
Word, LibreOffice, (Xe)LaTeX,...?  Because it will probably have to be 
done in that tool.
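
To illustrate the point about pre-composed characters, here is a
minimal Python sketch. It assumes that the "<super>" compatibility
decompositions exposed by Python's unicodedata module are a fair list
of the ready-made superscripts; the function names are made up for this
example. It shows that a Latin digit has a superscript twin while a
Devanagari digit (U+0969, written as an escape below) does not.

```python
import unicodedata


def build_superscript_map():
    """Map each base character to a pre-composed superscript character.

    Uses the <super> compatibility decompositions from the Unicode
    Character Database as shipped with Python. The lowest code point
    wins when several superscript forms share one base character.
    """
    mapping = {}
    for cp in range(0x110000):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith("<super> "):
            parts = decomp.split()[1:]
            if len(parts) == 1:  # ignore multi-character decompositions
                base = chr(int(parts[0], 16))
                mapping.setdefault(base, ch)
    return mapping


def to_superscript(text, mapping):
    """Return text in pre-composed superscripts, or None if any
    character has no pre-composed superscript form."""
    out = []
    for ch in text:
        if ch not in mapping:
            return None
        out.append(mapping[ch])
    return "".join(out)


# Usage (hypothetical):
#   sup = build_superscript_map()
#   to_superscript("23", sup)      # Latin digits: a string is returned
#   to_superscript("\u0969", sup)  # DEVANAGARI DIGIT THREE: None
```

So for Devanagari the superscripting has to come from markup or from
the display tool, not from the character codes.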

    Mike Maxwell


From richard.wordingham at ntlworld.com  Fri Dec 11 06:28:48 2015
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Fri, 11 Dec 2015 12:28:48 +0000
Subject: Devanagari and Subscript and Superscript
In-Reply-To: <CAL01L+1Bdws+41JRnb-RDzGOv58O8bLutyf7UidM2z_Cd-danA@mail.gmail.com>
References: <CAL01L+1Bdws+41JRnb-RDzGOv58O8bLutyf7UidM2z_Cd-danA@mail.gmail.com>
Message-ID: <20151211122848.03ad0d7b@JRWUBU2>

On Wed, 9 Dec 2015 03:24:39 +0000
Plug Gulp <plug.gulp at gmail.com> wrote:

> I am trying to understand if there is a way to use Devanagari
> characters (and grapheme clusters) as subscript and/or superscript in
> Unicode text.

Why do you want to do this?  Are you asking about writing Devanagari
vertically rather than horizontally?  If that is what you want, you
should be looking at mark-up such as is found in cascading style sheets
(CSS).  It is an important issue for CJK and Mongolian, and there have
been questions as to what is needed for Indian scripts.  (There's also
an antiquarian interest for historical scripts, such as Phags-pa and
even Egyptian - moves are afoot to support the hieroglyphic script as
plain text.)

Richard.

From plug.gulp at gmail.com  Tue Dec 15 05:55:02 2015
From: plug.gulp at gmail.com (Plug Gulp)
Date: Tue, 15 Dec 2015 11:55:02 +0000
Subject: Devanagari and Subscript and Superscript
In-Reply-To: <5667B9B5.3010208@it.aoyama.ac.jp>
References: <CAL01L+1Bdws+41JRnb-RDzGOv58O8bLutyf7UidM2z_Cd-danA@mail.gmail.com>
 <5667B9B5.3010208@it.aoyama.ac.jp>
Message-ID: <CAL01L+27vdVn4rcsXzAgqDPyg2Tm4S4mcmgCao7Si6bpZ4hdXg@mail.gmail.com>

On Wed, Dec 9, 2015 at 5:18 AM, Martin J. Dürst <duerst at it.aoyama.ac.jp> wrote:
>
> I suggest using HTML:
>
> ?<sup>? ??</sup>
>

This will work only if the end-users are always going to use a web
browser to view the text content.

It would help if the Unicode Standard itself supported generalised
subscript/superscript text. I think the meaning of the text should be
contained within the text itself rather than relying on external
markup and viewers. That way the content creator does not have to
depend on which Unicode-compliant text viewer or editor the end user
is using; the text should retain its meaning in any of them.
Similarly, if the text has to be saved in a database without losing
its meaning, then either it has to be saved with the markup of every
available editor, or some special processing is needed to convert a
saved marker into the markup of each text viewer and editor. Having
generalised Unicode support for superscript and subscript would solve
all these problems.

The following is one use case where general Unicode support for
superscript/subscript would help tremendously:

A math teacher (??????? ??????) in a Marathi (?????) language school is
writing notes, in her Unicode-compliant plain text editor, to explain
mathematical terms to her students. The following is an excerpt from
the notes that explains terms such as exponent (??????) and base (????).
(An English translation is given below.)

"?????? ??????? ?????? ???????? ???? ???? ??????? ???? ?????? ????
?????????? ???????? ???????????? ???????? ?????? ??? ???????.
??????????, ? ?? ?????? ?? ??????? ? ???? ????? ??? ????, ?????? ? x ?
x ?, ?? ?????? ?????? ??????? ?^? ??? ???????. ???? ???????? ?????? "?
?? ? ?? ???" ??? ???????. ??? ???? ?? ?????? ?????, "? ?? ?? ?? ??
???", ?????? ? ?? ?????? ??????? ?? ???? ????? ???? ???. ?????? ???
?^?? ??? ??????. ?? ?????????, ??????? ?????? ? ?????? ??????? ???
???? ????????? ?????? ?????? ?????? ??????? ?^??? ??? ???????, ???
???? ?????? "? ?? ??? ?? ???" ??? ???????. ??? ? ???? ?????? ????
??????? ??? ??? ???? ?????? ??? ??? ???????. ?? ????????, ????????
?????? ????^??? ??? ???????."

English translation:
"An exponent is a shorthand notation that denotes the multiplication of
a number by itself a number of times. For example, if the number 5 is
multiplied by itself 3 times, i.e. 5 x 5 x 5, then it is represented in
exponential form as 5^3. This exponential term is read as "5 raised to
the power of 3". Let us consider another example, "2 raised to the
power of 10", i.e. 2 multiplied by itself 10 times. This is written in
exponential form as 2^10. So, in general, any number b that is
multiplied by itself k times is written as b^k, and the term is read
as "b raised to the power of k". The number b is called the base, and
the number k is called the exponent. In short, an exponential term is
written as base^exponent."

Please note that the teacher had to use a circumflex accent (caret) to
indicate superscript, which is an unwritten convention, in the absence
of proper superscript support within Unicode. To make the text
available to a wider audience and still retain its meaning, the
teacher will have to rely partly on Unicode support, partly on the
markup available in the various text viewers of her students, partly
on the markup available in the text editors of the peer reviewers of
her text, and partly on unwritten conventions (such as the caret).
This conundrum can be resolved only if there is generalised support
for superscript and subscript within the Unicode Standard.

The standard already has a section for superscripts and subscripts.
Generalising and extending this support would help other languages and
scripts. General support for all characters, words and sentences could
be achieved by just three new formatting characters, e.g. SCR, SUP and
SUB, similar to the way other formatting characters such as ZWSP, ZWJ
and ZWNJ are defined. The new formatting characters could be defined
as:

SCR: In a character stream, all the characters following this
formatting character shall be treated as normal text until either the
end of the character stream or the next SUP or SUB character is
reached. This shall be the default marker i.e. if no marker is
specified then the text shall be treated as normal text until either
the end of the character stream or the next SUP or SUB character is
reached.

SUP: In a character stream, all the characters following this
formatting character shall be treated as superscript text until either
the end of the character stream or the next SCR or SUB character is
reached.

SUB: In a character stream, all the characters following this
formatting character shall be treated as subscript text until either
the end of the character stream or the next SCR or SUP character is
reached.
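
The three definitions above amount to a small state machine, which can
be sketched in Python as follows. This is only an illustration under
loud assumptions: SCR, SUP and SUB do not exist in Unicode, so the
private-use code points U+E000, U+E001 and U+E002 stand in for them
here, and the function names split_runs and to_html are invented for
the example.

```python
import html

# Hypothetical stand-ins: these private-use code points are NOT real
# Unicode formatting characters; they only play the role of the
# proposed SCR, SUP and SUB markers.
SCR, SUP, SUB = "\ue000", "\ue001", "\ue002"
_MODES = {SCR: "normal", SUP: "sup", SUB: "sub"}


def split_runs(stream):
    """Split a character stream into (mode, text) runs.

    The mode starts as "normal" (the default when no marker is given)
    and changes whenever a marker is met; a run ends at the next marker
    or at the end of the stream. Markers themselves are dropped.
    """
    runs = []
    mode = "normal"
    buf = []
    for ch in stream:
        if ch in _MODES:
            if buf:
                runs.append((mode, "".join(buf)))
                buf = []
            mode = _MODES[ch]
        else:
            buf.append(ch)
    if buf:
        runs.append((mode, "".join(buf)))
    return runs


def to_html(stream):
    """Render the runs as HTML, one possible target among many."""
    out = []
    for mode, text in split_runs(stream):
        escaped = html.escape(text)
        if mode == "normal":
            out.append(escaped)
        else:
            out.append(f"<{mode}>{escaped}</{mode}>")
    return "".join(out)


# Usage (hypothetical): "2 raised to 10" with Devanagari digits,
# written with escapes: U+0968 is digit two, U+0967 one, U+0966 zero.
#   to_html("\u0968" + SUP + "\u0967\u0966")
```

A viewer that does not know the markers would need a fallback of this
kind, e.g. converting the runs to its own markup or to the caret
convention.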

General support within Unicode for subscripting and superscripting
text (characters and words) would tremendously help languages and
scripts other than English/Latin.

Thanks and kind regards,

~Plug

