From indic at unicode.org Wed Dec 6 07:17:36 2017
From: indic at unicode.org (हरिराम via Indic)
Date: Wed, 6 Dec 2017 18:47:36 +0530
Subject: How to disable Indic syllable form editing in MS word
Message-ID:

While working in MS Word 2007, 2010 or higher:

When one tries to Find & Replace a particular Unicode character, for example to replace all
'?' (dependent vowel AA)
with
'?' (dependent vowel i),
it does not work.

Only full syllables containing '?', i.e. ??, ??, ??, etc., can be searched for and replaced, one by one, with many repeats.

This takes too much time and needless repetition.

----
2.

When one tries to delete an Indic character with the Delete key, with the cursor placed before a syllable, the entire syllable to the right is deleted.

How can one delete a particular character instead of the entire syllable?

How can one disable the Indic layout feature in MS Word?

Would anybody please guide?

हरिराम
????? ????

From indic at unicode.org Wed Dec 6 17:19:16 2017
From: indic at unicode.org (Richard Wordingham via Indic)
Date: Wed, 6 Dec 2017 23:19:16 +0000
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References:
Message-ID: <20171206231916.57d80bc3@JRWUBU2>

On Wed, 6 Dec 2017 18:47:36 +0530
हरिराम via Indic wrote:

> How to delete a particular character instead of entire syllable?
>
> How to disable the Indic layout feature in MS word?
>
> Would anybody guide please?

Are you a real Indian? UTS#29 (https://www.unicode.org/reports/tr29/tr29-31.html) Section 3 Paragraph 1 strongly suggests that what you are trying to do is not natural.

These particular behaviours you are complaining of annoy me intensely, but I'm a Westerner and so have no rights in these matters.

Indic layout is not particularly guilty, though it makes editing clusters difficult. SIL has a split cursor which attempts to address the issue, but I've only seen it in their Worldpad text editor. Another technique, which has been available in emacs (I'm unsure of the current status), enables one to move the cursor into a cluster Unicode character by character, and disables shaping across the cluster. Even this will have shortcomings when working with two-part vowels canonically equivalent to a single character: one won't know whether one has one character or two until one steps into the cluster.

Emacs does, by default, provide what I consider the civilised behaviour, whereby pressing the delete key deletes the next character. That makes my life much easier, as I deal with Indic scripts in which it is not at all unusual to have 3 or more marks attached to a single base character.

Richard.

From indic at unicode.org Wed Dec 6 21:16:24 2017
From: indic at unicode.org (Kan!skA via Indic)
Date: Thu, 7 Dec 2017 08:46:24 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To: <20171206231916.57d80bc3@JRWUBU2>
References: <20171206231916.57d80bc3@JRWUBU2>
Message-ID: <29d6f426-f459-b588-aa5b-d6a4aaa00c2e@cdac.in>

Strongly agree with Richard: it's not a layout-related issue.

You may use BabelPad for the issue you describe.
http://www.babelstone.co.uk/Software/BabelPad.html

Best Regards,
Kaniska PSS Nagraj
https://kanis.hk

-------------------------------------------------------------------------------------------------------------------------------
[ C-DAC is on Social-Media too. Kindly follow us at:
Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]

This e-mail is for the sole use of the intended recipient(s) and may contain confidential and privileged information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies and the original message. Any unauthorized review, use, disclosure, dissemination, forwarding, printing or copying of this email is strictly prohibited and appropriate legal action will be taken.
-------------------------------------------------------------------------------------------------------------------------------

From indic at unicode.org Thu Dec 7 01:26:18 2017
From: indic at unicode.org (हरिराम via Indic)
Date: Thu, 7 Dec 2017 12:56:18 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To: <29d6f426-f459-b588-aa5b-d6a4aaa00c2e@cdac.in>
References: <20171206231916.57d80bc3@JRWUBU2> <29d6f426-f459-b588-aa5b-d6a4aaa00c2e@cdac.in>
Message-ID:

Sir,

More rendering problems were found in BabelPad.

For example, I copied the following line from MS Word 2007:

*???? ??????*, *2017 ?? ??????? ?? ?????? ??????? ??????????? ????? ?? ?????? ?? ??????? *
*( ??????? ???? ??? 4 ?????? )*

and pasted it into BabelPad.

But it is displayed wrongly, as in this screenshot:

*[image: Inline image 1]*

So BabelPad is not as good an Indic editor as MS Word.
If you have any trick or tool to temporarily disable the Indic shaping engine in MS Word, so as to enable character-level editing, kindly inform.

With regards.

हरिराम
????? ????

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 35333 bytes
Desc: not available
URL:

From indic at unicode.org Thu Dec 7 02:08:44 2017
From: indic at unicode.org (हरिराम via Indic)
Date: Thu, 7 Dec 2017 13:38:44 +0530
Subject: How to disable Indic syllable form editing in MS word
Message-ID:

Please check again the original text copied from MS Word, in which ZWNJ is used at 3 places; these were converted wrongly when pasted into BabelPad:

(*???? ??????*, *2017 ?? ??????? ?? ?????? ??????? ??????????? ????? ?? ?????? ?? ??????? *
*( ??????? ???? ??? 4 ?????? )*

हरिराम

On Thu, Dec 7, 2017 at 1:14 PM, Anand Kumar Sharma wrote:

> Dear Sir
>
> There is an option in BabelPad to enable complex rendering, in the Font menu.
> Please enable that.
>
> Screenshot taken from BabelPad version 6.3 on Windows 8.1, 64 bit.
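The message above reports that ZWNJ occurs at three places in the pasted text. Because these characters are invisible, one way to confirm where they sit is to list them outside the editor. The following is a minimal Python sketch, not anything posted in the thread: the function names `find_joiners` and `make_visible` are hypothetical, and the sample string is an invented Devanagari sequence, since the text from the original post did not survive archiving.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def find_joiners(text):
    """Return (index, name) pairs for every ZWNJ or ZWJ code point in text."""
    names = {ZWNJ: "ZWNJ", ZWJ: "ZWJ"}
    return [(i, names[ch]) for i, ch in enumerate(text) if ch in names]


def make_visible(text):
    """Replace the invisible joiners with bracketed tags so they show when printed."""
    return text.replace(ZWNJ, "<ZWNJ>").replace(ZWJ, "<ZWJ>")


# Hypothetical sample (an assumption, not the text from the post):
# KA + VIRAMA + ZWNJ + SSA
sample = "\u0915\u094D\u200C\u0937"

print(find_joiners(sample))
print(make_visible(sample))
```

Indices here count code points, not grapheme clusters, so they are independent of how any editor moves its cursor.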
From indic at unicode.org Thu Dec 7 02:11:18 2017
From: indic at unicode.org (Shriramana Sharma via Indic)
Date: Thu, 7 Dec 2017 13:41:18 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References: <20171206231916.57d80bc3@JRWUBU2> <29d6f426-f459-b588-aa5b-d6a4aaa00c2e@cdac.in>
Message-ID:

Normally each editor has its own behaviour decisions hard-coded, based on what the developers think is useful for end users, and these differ. In such matters, I find copy-pasting between editors the only way to get the best of both worlds.

On 07-Dec-2017 12:58 PM, "हरिराम via Indic" wrote:

> so Babelpad is not a good Indic Editor like MS word.
>
> If you have any Trick/tool to temporarily disable the Indic Shaping Engine
> in MS word
> to enable character-level editing, Kindly inform.
From indic at unicode.org Thu Dec 7 03:05:24 2017
From: indic at unicode.org (Shriramana Sharma via Indic)
Date: Thu, 7 Dec 2017 14:35:24 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References: <20171206231916.57d80bc3@JRWUBU2>
Message-ID:

On 07-Dec-2017 4:51 AM, "Richard Wordingham via Indic" wrote:

> On Wed, 6 Dec 2017 18:47:36 +0530 हरिराम via Indic wrote:
>
>> When one tries to Find & Replace a particular Unicode character,
>> for example to replace all
>> '?' (dependent vowel AA)
>> with
>> '?' (dependent vowel i),
>> it does not work.
>>
>> Only full syllables containing '?', i.e. ??, ??, ??, etc., can be searched for
>> and replaced, one by one, with many repeats.
>>
>> This takes too much time and needless repetition.

@Richard he has a valid point, so no need to ask him whether he is really an Indian! Where else would you want to search-replace a vowel sign except as part of a grapheme cluster? Surely well-formed normal text won't have one of those standing alone?

>> When one tries to delete an Indic character with the Delete key, with the
>> cursor placed before a syllable, the entire syllable to the right is
>> deleted.
>>
>> How to delete a particular character instead of entire syllable?

Press the right arrow and use Backspace. You can't do it from the left.

>> How to disable the Indic layout feature in MS word?

There is no option for this to my knowledge, nor is there likely to be (though it is not impossible).

>> Would anybody guide please?
>
> Are you a real Indian? UTS#29
> (https://www.unicode.org/reports/tr29/tr29-31.html) Section 3 Paragraph
> 1 strongly suggests that what you are trying to do is not natural.

OK, let's quote:

"Grapheme clusters commonly behave as units in terms of mouse selection, arrow key movement, backspacing, and so on. For example, when a grapheme cluster is represented internally by a character sequence consisting of base character + accents, then using the right arrow key would skip from the start of the base character to the end of the last accent.

However, in some cases editing a grapheme cluster element by element may be preferable. For example, on a given system the backspace key might delete by code point, while the delete key may delete an entire cluster."

This doesn't say anything about search and replace.

There is unlikely to be a universally acceptable or even applicable solution for intra-grapheme cursor placement. For example, how would you indicate a cursor position in front of or after a virama which has caused two consonants to ligate?

So IMO the issue is just with intra-grapheme cursors, and the current common behaviour of cursors jumping cluster boundaries (and the resultant Del/BkSp status) is perfectly fine, but search-replace operations need not be limited by cursor positions and should work per character instead.

That way they are more useful, because use cases such as Hariram's are satisfied. Escape characters or search dialogue options can always be used to request matches only of complete grapheme clusters, just as is currently available for full words.

If OTOH per-character search-replace is not provided for at all, that is indeed a serious limitation IMO, as in the old MS Word versions.

From indic at unicode.org Thu Dec 7 05:27:33 2017
From: indic at unicode.org (हरिराम via Indic)
Date: Thu, 7 Dec 2017 16:57:33 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References: <20171206231916.57d80bc3@JRWUBU2>
Message-ID:

Thanks for your technical details.

Many times it is necessary to do cluster-level editing of Indic text/data. We have tried to do it in Notepad++ and some hex editors, but none was found perfect or user friendly.
The old Yudit also does not give the desired functionality.

Is there any hex/text editor which supports Unicode (2-byte/3-byte) character groups? One which has a split window, with one pane showing the rendered text and the other showing only the code points of the characters?

When editing big text files that have common errors (with global search and replace using a word list), only a Python script provides a small degree of user friendliness, i.e. direct input of Indic Unicode characters, but it is very limited.

I am looking for some other editor.

हरिराम

On Thu, Dec 7, 2017 at 2:35 PM, Shriramana Sharma via Indic <indic at unicode.org> wrote:

> .....
>
> So IMO the issue is just with intra grapheme cursors and the current
> common behaviour of cursors jumping cluster boundaries (and the resultant
> Del/BkSp status) is perfectly fine, but search-replace operations need not
> be limited by cursor positions and should work per character instead.
>
> That way they are more useful ... ...
>
> If OTOH, per character search replace is not provided for at all, that is
> indeed a serious limitation IMO as in the old MS Word versions.

From indic at unicode.org Thu Dec 7 05:51:03 2017
From: indic at unicode.org (हरिराम via Indic)
Date: Thu, 7 Dec 2017 17:21:03 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References: <20171206231916.57d80bc3@JRWUBU2>
Message-ID:

Yes, this really is a serious limitation for Indic editing, as some wildcard features and search & replace of special characters like ZWJ/ZWNJ also do not work for Indic data.

Do LibreOffice or other editors allow cluster-level Indic editing?

हरिराम

> ... ... that is indeed a serious limitation IMO as in the old MS Word
> versions.
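The per-character search and replace discussed in the messages above, and the wish for a view that shows code points next to the rendered text, can both be approximated with a short script run outside the editor. What follows is only a minimal Python sketch under stated assumptions: the vowel signs are taken to be the Devanagari ones (U+093E and U+093F), because the characters in the original post were lost in archiving, and the helper names `replace_code_point` and `dump_code_points` are hypothetical.

```python
import unicodedata

# Assumption: Devanagari vowel signs; the original characters did not survive.
VOWEL_SIGN_AA = "\u093E"  # DEVANAGARI VOWEL SIGN AA
VOWEL_SIGN_I = "\u093F"   # DEVANAGARI VOWEL SIGN I


def replace_code_point(text, old, new):
    """Replace every occurrence of one code point, wherever it sits in a cluster."""
    return text.replace(old, new)


def dump_code_points(text):
    """Return one 'U+XXXX NAME' entry per code point, for inspecting a cluster."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# Hypothetical sample word: KA + VOWEL SIGN AA + MA
sample = "\u0915\u093E\u092E"
result = replace_code_point(sample, VOWEL_SIGN_AA, VOWEL_SIGN_I)

for entry in dump_code_points(result):
    print(entry)
```

Python strings are sequences of code points, so `str.replace` is not affected by grapheme cluster boundaries. The same call works for ZWJ (U+200D) and ZWNJ (U+200C). Note that it does not normalise the text first, so a two-part vowel stored as two code points will not match its canonically equivalent single code point.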
From indic at unicode.org Thu Dec 7 07:13:15 2017
From: indic at unicode.org (Shriramana Sharma via Indic)
Date: Thu, 7 Dec 2017 18:43:15 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References: <20171206231916.57d80bc3@JRWUBU2>
Message-ID:

Kate on KDE is able to do intra-grapheme replacement. I haven't used recent versions of Windows, so I don't know about software available on that platform.

From indic at unicode.org Thu Dec 7 01:44:58 2017
From: indic at unicode.org (Anand Kumar Sharma via Indic)
Date: Thu, 7 Dec 2017 13:14:58 +0530
Subject: How to disable Indic syllable form editing in MS word
In-Reply-To:
References: <20171206231916.57d80bc3@JRWUBU2> <29d6f426-f459-b588-aa5b-d6a4aaa00c2e@cdac.in>
Message-ID: <519a36a7-bb5e-1818-ae35-b70a49fda1e1@cdac.in>

Dear Sir,

There is an option in BabelPad to enable complex rendering, in the Font menu. Please enable that.

Screenshot taken from BabelPad version 6.3 on Windows 8.1, 64 bit.

On 07-12-2017 12:56, हरिराम via Indic wrote:

> But it shows wrongly as (Screenshot)
>
> so Babelpad is not a good Indic Editor like MS word.

--
Thanks and Regards

*This mail has come from the desk of Anand Kumar Sharma
GIST QA|CDAC-Pune|Ph:020-25503468|http://www.cdac.in*

-------------- next part --------------
A non-text attachment was scrubbed...
Name: amgmdkfkhfjadpem.png
Type: image/png
Size: 12413 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lhdfbheohdlekehj.png
Type: image/png
Size: 12754 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jbkkaddmpcjhfcmk.png
Type: image/png
Size: 12439 bytes
Desc: not available
URL:
Name: image.png Type: image/png Size: 35333 bytes Desc: not available URL:
From indic at unicode.org Thu Dec 7 02:11:54 2017 From: indic at unicode.org (Kan!skA via Indic) Date: Thu, 7 Dec 2017 13:41:54 +0530 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> <29d6f426-f459-b588-aa5b-d6a4aaa00c2e@cdac.in> Message-ID: <0c63caa6-5371-9bc3-047d-f7c80e778cb5@cdac.in> Dear Sir, *There are Zero Width Joiners, which are causing the issue in your PC:* *In My PC, it is showing perfect: [Windows 10 Pro (64 bit), BabelPad 10.0.0.5]* *Q:* If you have any Trick/tool to temporarily disable the Indic Shaping Engine in MS word to enable character-level editing, Kindly inform. *Ans:* Currently don't have any, I'll update you if found Best Regards, Kaniska PSS Nagraj https://kanis.hk On 07-Dec-17 12:56 PM, ?????? wrote: > Sir, > > More rending problems found in > Babelpad > > For example I copied following line from MS word 2007 > > *???? ??????*,***2017 ?? ??????? ?? ?????? ??????? ??????????? ????? > ?? ?????? ?? ??????? * > > *( ??????? ???? ??? 4 ?????? )* > * > * > and pasted it in Babelpad > > But it shows wrongly as (Screenshot) > * > * > *Inline image 1 > * > * > * > so Babelpad is not a good Indic Editor like MS word. > > If you have any Trick/tool to temporarily disable the Indic Shaping > Engine in MS word > to enable character-level editing, Kindly inform. > > With regards. > * > * > > ?????? > ????? ???? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ZWJ.jpg Type: image/jpeg Size: 150082 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Valid.jpg Type: image/jpeg Size: 128276 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 35333 bytes Desc: not available URL:
From indic at unicode.org Thu Dec 7 10:19:37 2017 From: indic at unicode.org (maxwell via Indic) Date: Thu, 07 Dec 2017 11:19:37 -0500 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> Message-ID: <39c1c9590e30301b5967cc57d1bfe45d@umiacs.umd.edu> On 2017-12-07 06:51, ?????? via Indic wrote: > Yes, Really this is serious limitation for Indic Editing, > > As some WildCards featurs, Search & replace of special characters like > ZWJ/ZWNJ also does not work for Indic Data. I haven't done any significant editing of Indic text, but I can sympathize. I use the programmer's editor jEdit (which works on Windows, Linux and Mac). Up until version 5.1, it treated combining diacritics as separate characters for purposes of editing--so you could edit them separately, search for them, etc. Then, to fix some other bug, they began to be treated as part of the preceding base character. For many people, that's probably ok; it was not helpful for me, though. My bug report is here: http://jedit.org/trackers/Bugs/3884.html The bug report focuses on determining what the Unicode code point of a character is, but alludes to the search problem. I suspect every editor treats things like this differently...
Some tools (like SIL's Fieldworks Language Explorer, FLEx) actually turn all NFC characters into NFD (or vice versa, I don't remember) under the hood. In summary, I think this is a general problem, not confined to Indic text. Not that this helps you any! Mike Maxwell University of Maryland From indic at unicode.org Thu Dec 7 10:33:57 2017 From: indic at unicode.org (maxwell via Indic) Date: Thu, 07 Dec 2017 11:33:57 -0500 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <20171206231916.57d80bc3@JRWUBU2> References: <20171206231916.57d80bc3@JRWUBU2> Message-ID: <226d6990e80d9bcc014e57eb77276383@umiacs.umd.edu> On 2017-12-06 18:19, Richard Wordingham via Indic wrote: > Another technique, which has been available in emacs (I'm unsure of the > current status), enables one to move the cursor into a cluster Unicode > character by character, and disables shaping across the cluster. Even > this will have shortcomings when working with two part vowels > canonically equivalent to a single character - one won't know whether > one has one character or two until one steps into the cluster. This brings up a related question that I've always wondered about. In Bangla, there are two code points, U+09CB and U+09CC, which represent two-part vowels; one part appears to the left of the preceding consonant, and one to its right. There are also three code points that individually represent the parts to the left and right: U+09C7 is the left-hand part of both U+09CB and U+09CC, U+09BE is the right-hand part of U+09CB, and U+09D7 is the right-hand part of U+09CC. The relationship of U+09CB and U+09CC to the stand-alone characters is documented in the Unicode standard for the Bengali block. Why are these not treated in the Unicode standard as analogous to base+diacritic pairs with respect to NFC and NFD? E.g. when you convert text to NFC, why isn't a sequence of U+09C7 + consonant + U+09BE converted to consonant + U+09CB, and vice versa when converting to NFD?
Instead, when we do things that require normalization (like searching for a word in text), we have to insert our own manual normalization step to handle this problem. Mike Maxwell University of Maryland From indic at unicode.org Thu Dec 7 13:44:05 2017 From: indic at unicode.org (Dr. U.B. Pavanaja via Indic) Date: Fri, 8 Dec 2017 01:14:05 +0530 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> Message-ID: <00bb01d36f93$bfbe6d00$3f3b4700$@vishvakannada.com> I tried in Windows Notepad and it worked -Pavanaja From: Indic [mailto:indic-bounces at unicode.org] On Behalf Of ?????? via Indic Sent: 07 December 2017 04:58 PM To: Shriramana Sharma; indic at unicode.org Subject: Re: How to disable Indic syllable form editing in MS word Thanks for your technical details. Many times it is needed to do cluster level Editing of Indic Text/Data. We have tried to do it in Notepad++ , Some Hex Editors, but non found perfect or user friendly. The old Yudit also not giving desired functionality. Is there any HEX/TEXT editor, which supports Unicode (2Byte/3Byte) Character groups? Which have a split window, one showing with rendering, another showing code points only in Characters? Doing editing of big/huge text files having common errors, (with global search and replace using wordlist), only PYTHON script provides small level of user friendliness, i.e. directly input of Indic Unicode chars, But very limited. Looking for some other editor. ?????? On Thu, Dec 7, 2017 at 2:35 PM, Shriramana Sharma via Indic wrote: ..... So IMO the issue is just with intra grapheme cursors and the current common behaviour of cursors jumping cluster boundaries (and the resultant Del/BkSp status) is perfectly fine, but search-replace operations need not be limited by cursor positions and should work per character instead. That way they are more useful ... ... 
If OTOH, per character search replace is not provided for at all, that is indeed a serious limitation IMO as in the old MS Word versions. http://unicode.org/mailman/listinfo/indic -------------- next part -------------- An HTML attachment was scrubbed... URL: From indic at unicode.org Thu Dec 7 14:38:17 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Thu, 7 Dec 2017 20:38:17 +0000 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> Message-ID: <20171207203817.7d5e318f@JRWUBU2> On Thu, 7 Dec 2017 14:35:24 +0530 Shriramana Sharma via Indic wrote: > On 07-Dec-2017 4:51 AM, "Richard Wordingham via Indic" > wrote: > > On Wed, 6 Dec 2017 18:47:36 +0530 > > > > When one try to Find & replace any particular Unicode Character > > For Example > > to replace all > > '?' depended vowel AA > > with > > '?' depended vowel i > > > > it does not works. > > > > Only full syllable with ' ?' i.e. ??, ??, ??, etc. has to be search > > and replaced one by one with many repeats. > > > > This takes too much time and unnecessary repeats. > > > @Richard he has a valid point, so no need to ask him whether he is > really an Indian! Where else would you want to search-replace a vowel > sign except as part of a grapheme cluster? Surely well formed normal > text won't have one of those standing alone? He's what I've been looking for - evidence that Indic grapheme clusters are not what Indian users think of as a character. (Admittedly, Tamil shows many signs of being an incipient syllabary.) > > When one try to delete a Indic Character with delete key putting the > > cursor before a syllable, the right side entire syllable is being > > deleted. > > > > How to delete a particular character instead of entire syllable? > > > Press right arrow and use back space. You can't do it from the left. > > > How to disable the Indic layout feature in MS word? 
> > > There is no option for this to my knowledge nor is there likely to be > (though not impossible). > > > > > Would anybody guide please? > > Are you a real Indian? UTS#29 > (https://www.unicode.org/reports/tr29/tr29-31.html) Section 3 > Paragraph 1 strongly suggests that what you are trying to do is not > natural. > > > Ok let's quote: > > "Grapheme clusters commonly behave as units in terms of mouse > selection, arrow key movement, backspacing, and so on. For example, > when a grapheme cluster is represented internally by a character > sequence consisting of base character + accents, then using the right > arrow key would skip from the start of the base character to the end > of the last accent. > > However, in some cases editing a grapheme cluster element by element > may be preferable. For example, on a given system the backspace key > might delete by code point, while the delete key may delete an entire > cluster." > > This doesn't say anything about search and replace. Paragraph 2 comes close: "Grapheme cluster boundaries are important for *collation*, regular expressions, UI interactions (such as mouse selection, arrow key movement, backspacing), segmentation for vertical text, identification of boundaries for first-letter styling, and counting “character” positions within text. Word boundaries, line boundaries, and sentence boundaries should not occur within a grapheme cluster: in other words, a grapheme cluster should be an atomic unit with respect to the process of determining these other boundaries." If one accepts that collation is important for search, then that is where search and replace comes in. I don't like the principle - I end up having to use the 'regular expression' option when searching for short Tai Tham strings in LibreOffice. > There is unlikely > to be a universally acceptable or even applicable solution for intra > grapheme cursor placement.
For example how would you indicate a > cursor position in front of or after a virama which has caused two > consonants to ligate? When characters have fused to a single glyph, the usual technique, used even by Windows for non-Indic text, is to choose a boundary position within the glyph. OpenType has a table of positions, but apparently Windows just places the divisions evenly. Obviously the simple scheme doesn't work well when a preposed vowel ligates with the base consonant. Richard. From indic at unicode.org Thu Dec 7 15:38:23 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Thu, 7 Dec 2017 21:38:23 +0000 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <226d6990e80d9bcc014e57eb77276383@umiacs.umd.edu> References: <20171206231916.57d80bc3@JRWUBU2> <226d6990e80d9bcc014e57eb77276383@umiacs.umd.edu> Message-ID: <20171207213823.2d67dd78@JRWUBU2> On Thu, 07 Dec 2017 11:33:57 -0500 maxwell via Indic wrote: > Why are these not treated in the Unicode standard as analogous to > base+diacritic pairs with respect to NFC and NFD? E.g. when you > convert text to NFC, why isn't a sequence of U+09C7 + consonant + > U+09BE converted to consonant + U+09CB, and vice versa when > converting to NFD? I'm puzzled by what you say. What *looks* like <U+09C7, consonant, U+09BE> should, if it is represented by three characters, be encoded as <consonant, U+09C7, U+09BE>, which is indeed canonically equivalent to <consonant, U+09CB>. Richard.
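The equivalence described above can be checked with a minimal sketch using Python's standard `unicodedata` module. The choice of KA (U+0995) as the consonant and the helper names `code_points` and `contains_normalized` are illustrative assumptions, not something taken from the thread.

```python
import unicodedata

KA = "\u0995"       # BENGALI LETTER KA (illustrative consonant, an assumption)
SIGN_E = "\u09C7"   # BENGALI VOWEL SIGN E, the left-hand part
SIGN_AA = "\u09BE"  # BENGALI VOWEL SIGN AA, the right-hand part of U+09CB
SIGN_O = "\u09CB"   # BENGALI VOWEL SIGN O, the two-part vowel


def code_points(text):
    """Render a string as a list of U+XXXX code points for inspection."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


logical = KA + SIGN_E + SIGN_AA  # consonant first, then both vowel parts
precomposed = KA + SIGN_O        # consonant plus the single two-part vowel
visual = SIGN_E + KA + SIGN_AA   # visual (left-to-right) order, not equivalent

nfc_of_logical = unicodedata.normalize("NFC", logical)
nfd_of_precomposed = unicodedata.normalize("NFD", precomposed)
nfc_of_visual = unicodedata.normalize("NFC", visual)

print(code_points(nfc_of_logical))
print(code_points(nfd_of_precomposed))
print(code_points(nfc_of_visual))


def contains_normalized(haystack, needle):
    """Substring search that ignores NFC/NFD differences (hypothetical helper)."""
    return (unicodedata.normalize("NFC", needle)
            in unicodedata.normalize("NFC", haystack))
```

Normalising both sides before comparing is one way to avoid a hand-written normalisation step when searching; it does nothing for text stored in visual order, which normalisation leaves untouched.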
From indic at unicode.org Thu Dec 7 16:18:23 2017 From: indic at unicode.org (maxwell via Indic) Date: Thu, 07 Dec 2017 17:18:23 -0500 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <20171207213823.2d67dd78@JRWUBU2> References: <20171206231916.57d80bc3@JRWUBU2> <226d6990e80d9bcc014e57eb77276383@umiacs.umd.edu> <20171207213823.2d67dd78@JRWUBU2> Message-ID: <8938b4eb5c162803649d964332c32573@umiacs.umd.edu> On 2017-12-07 16:38, Richard Wordingham wrote: > On Thu, 07 Dec 2017 11:33:57 -0500 > maxwell via Indic wrote: > >> Why are these not treated in the Unicode standard as analogous to >> base+diacritic pairs with respect to NFC and NFD? E.g. when you >> convert text to NFC, why isn't a sequence of U+09C7 + consonant + >> U+09BE converted to consonant + U+09CB, and vice versa when >> converting to NFD? > > I'm puzzled by what you say. What *looks* like <U+09C7, consonant, > U+09BE> should, if it is represented by three characters, be encoded as > <consonant, U+09C7, U+09BE>, You're of course right, I got the underlying order wrong. > which is indeed canonically equivalent to > <consonant, U+09CB>. It's canonically equivalent--that was what I was trying to say--but the last time I tested this using Python's Unicode conversion between NFC and NFD, I was sure it did not handle this case. But I tried it just now, and it did work; so either my test was wrong before, or this has now been fixed in Python. In either case, my earlier message was wrong. Mike Maxwell University of Maryland From indic at unicode.org Thu Dec 7 16:35:06 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Thu, 7 Dec 2017 22:35:06 +0000 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> Message-ID: <20171207223506.3b8494ea@JRWUBU2> On Thu, 7 Dec 2017 17:21:03 +0530 ?????? via Indic wrote: > Is Libre Office or other editors allows cluster-level Indic Editing?
LibreOffice allows global substitutions, at least for Tai Tham, which seems to be subject to almost the full range of Indic woes. For ZWJ and ZWNJ, it should work for controlling consonant clusters, though one may have to remember what the cursor is between as one steps through ??? K.SSA. In LibreOffice, one benefits from the fact that grapheme clusters only include one full consonant - K.SSA is the grapheme cluster <KA, VIRAMA> followed by a grapheme cluster starting with SSA. In Tai Tham, I occasionally need to type . I can convert to it because Unicode has a grapheme cluster boundary before SIGN AA. Unfortunately, the main rendering engines don't support the sequence - I rely on my font removing the uncalled-for dotted circle. Richard. From indic at unicode.org Thu Dec 7 18:20:51 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Fri, 8 Dec 2017 00:20:51 +0000 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <20171207223506.3b8494ea@JRWUBU2> References: <20171206231916.57d80bc3@JRWUBU2> <20171207223506.3b8494ea@JRWUBU2> Message-ID: <20171208002051.529c5987@JRWUBU2> On Thu, 7 Dec 2017 22:35:06 +0000 Richard Wordingham via Indic wrote: > In LibreOffice, one benefits from the fact that > grapheme clusters only include one full consonant - K.SSA is the > grapheme cluster <KA, VIRAMA> followed by a grapheme cluster starting > with SSA. However, if the current proposals for UAX#29 go through and there are then no longer any extended grapheme cluster breaks in <KA, VIRAMA, SSA>, I fear it will no longer be easy to insert ZWNJ between virama and a following consonant. Richard.
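For the global substitution that started this thread (dependent vowel AA to dependent vowel I), a script can work on code points directly and so is not constrained by where an editor will place the cursor. What follows is only a sketch under stated assumptions: the text is taken to be Devanagari, and the function name `replace_code_point` and the sample string are hypothetical.

```python
VOWEL_SIGN_AA = "\u093E"  # DEVANAGARI VOWEL SIGN AA
VOWEL_SIGN_I = "\u093F"   # DEVANAGARI VOWEL SIGN I


def replace_code_point(text, old, new):
    """Replace every occurrence of one code point with another.

    Python strings are indexed by code point, so this ignores grapheme
    cluster boundaries entirely.
    """
    if len(old) != 1 or len(new) != 1:
        raise ValueError("old and new must each be a single code point")
    return text.translate({ord(old): new})


# KA + AA, MA, space, NA + AA, MA (illustrative sample, an assumption)
sample = "\u0915\u093E\u092E \u0928\u093E\u092E"
result = replace_code_point(sample, VOWEL_SIGN_AA, VOWEL_SIGN_I)
print(result)
```

Because vowel sign I is stored after its consonant even though it is drawn to the left, the swap needs no reordering of the surrounding characters.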
From indic at unicode.org Thu Dec 7 18:36:28 2017 From: indic at unicode.org (Shriramana Sharma via Indic) Date: Fri, 8 Dec 2017 06:06:28 +0530 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <20171208002051.529c5987@JRWUBU2> References: <20171206231916.57d80bc3@JRWUBU2> <20171207223506.3b8494ea@JRWUBU2> <20171208002051.529c5987@JRWUBU2> Message-ID: The only sane way to insert joiners is at initial input time. Else BkSp BkSp BkSp then joiner is the only option. -------------- next part -------------- An HTML attachment was scrubbed... URL: From indic at unicode.org Fri Dec 8 08:46:04 2017 From: indic at unicode.org (Anish V via Indic) Date: Fri, 8 Dec 2017 20:16:04 +0530 Subject: Goykanadi script - script for Konkani language. In-Reply-To: References: Message-ID: Read that 'Goykanadi' was the original script for writing Konkani language. But I am not able to find a sample of this on Internet. Could anyone help with sample of this script? -------------- next part -------------- An HTML attachment was scrubbed... URL: From indic at unicode.org Fri Dec 8 12:12:15 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Fri, 8 Dec 2017 18:12:15 +0000 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> <20171207223506.3b8494ea@JRWUBU2> <20171208002051.529c5987@JRWUBU2> Message-ID: <20171208181215.439d1f92@JRWUBU2> On Fri, 8 Dec 2017 06:06:28 +0530 Shriramana Sharma via Indic wrote: > The only sane way to insert joiners is at initial input time. Else > BkSp BkSp BkSp then joiner is the only option. Are you seriously suggesting deleting most of the akshara? Or are you using a different key for the function of deleting the preceding character? From indic at unicode.org Fri Dec 8 19:04:00 2017 From: indic at unicode.org (Sandeep Subramanian via Indic) Date: Fri, 8 Dec 2017 17:04:00 -0800 Subject: Goykanadi script - script for Konkani language. 
Message-ID: Hi Anish, I am familiar with the Goykanadi script and have explored it in some depth. I'm currently designing a font for it, after which I will consult some Konkani language organizations and determine whether it is suitable for encoding (at which point, I will propose it for encoding). Please feel free to contact me if you have any questions! If you have any contacts in these areas as well, I would appreciate your help. Best regards, Sandeep -- Sandeep Subramanian ??????? ??????????? University of California, Berkeley Aug. 2013 - Dec. 2017 (expected) B.Sc. Chemistry & B.A. Computer Science -------------- next part -------------- An HTML attachment was scrubbed... URL: From indic at unicode.org Sat Dec 9 13:33:16 2017 From: indic at unicode.org (Vicknesh Sanmugam via Indic) Date: Sun, 10 Dec 2017 03:33:16 +0800 Subject: Query on Vedix Extensions Message-ID: Greetings I've been facing this problem for months. How do I combine two Unicode characters, double svarita ᳚ (U+1CDA) and anusvara ं (U+0902)? If I insert both code points independently, the anusvara won't sit within the limit of the shirorekha. It tends to appear after the intended consonant. [image: Inline image 2] Please do help me. Your help will be well appreciated. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2017-12-10 at 3.26.27 AM.png Type: image/png Size: 7185 bytes Desc: not available URL: From indic at unicode.org Sun Dec 10 00:03:07 2017 From: indic at unicode.org (Shriramana Sharma via Indic) Date: Sun, 10 Dec 2017 11:33:07 +0530 Subject: Query on Vedix Extensions In-Reply-To: References: Message-ID: Some older platforms do not provide proper support for Vedic accents and insert dotted circles like this thinking that these are not properly formed sequences. The only solution would be to use a recent OS. -- Shriramana Sharma ???????????? ????????????
???????????????????????? From indic at unicode.org Sun Dec 10 00:10:02 2017 From: indic at unicode.org (Shriramana Sharma via Indic) Date: Sun, 10 Dec 2017 11:40:02 +0530 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <20171208181215.439d1f92@JRWUBU2> References: <20171206231916.57d80bc3@JRWUBU2> <20171207223506.3b8494ea@JRWUBU2> <20171208002051.529c5987@JRWUBU2> <20171208181215.439d1f92@JRWUBU2> Message-ID: On 12/8/17, Richard Wordingham via Indic wrote: > On Fri, 8 Dec 2017 06:06:28 +0530 > Shriramana Sharma via Indic wrote: > >> The only sane way to insert joiners is at initial input time. Else >> BkSp BkSp BkSp then joiner is the only option. > > Are you seriously suggesting deleting most of the akshara? Or are you > using a different key for the function of deleting the preceding > character? Don't get you. I meant that because intra-cluster cursor placement isn't well representable visually, the only way to reliably place joiners at a particular place in the text when using a text editor that performs CTL is at time of initial input. So if I want to display ??????, what I do is I hit the keys for ?, ?, ZWJ, ?, ?, ? in that order. If I am faced with pre-input text ????? and I want to insert the ZWJ in the above position, and if my platform/application doesn't provide for intra-cluster cursor placement, I will have to do this to ?????: BkSp, BkSp, BkSp to get ?? and then I input ZWJ and re-type the rest. -- Shriramana Sharma ???????????? ???????????? ???????????????????????? 
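Where an editor offers no cursor position inside a cluster, a joiner can instead be inserted by script, working on code points. The following is a rough sketch only, assuming Devanagari text; the function name `insert_joiner_at_virama` and its parameters are hypothetical, and whether the resulting sequence renders as intended still depends on the script, font and shaping engine.

```python
ZWJ = "\u200D"     # ZERO WIDTH JOINER
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (assumed script)


def insert_joiner_at_virama(text, joiner=ZWJ, occurrence=1, before=False):
    """Insert `joiner` next to the n-th virama in `text` (1-based).

    By default the joiner goes right after the virama; with before=True it
    goes right before it, as some conjoining requests require.
    """
    pos = -1
    for _ in range(occurrence):
        pos = text.find(VIRAMA, pos + 1)
        if pos == -1:
            raise ValueError("text has fewer viramas than requested")
    cut = pos if before else pos + 1
    return text[:cut] + joiner + text[cut:]


k_ssa = "\u0915\u094D\u0937"  # KA, VIRAMA, SSA
with_zwj = insert_joiner_at_virama(k_ssa)
with_zwnj = insert_joiner_at_virama(k_ssa, joiner=ZWNJ)
zwj_before = insert_joiner_at_virama(k_ssa, before=True)
```

This avoids the backspace-and-retype routine, at the cost of having to know which virama in the string is meant.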
From indic at unicode.org Sun Dec 10 00:24:06 2017 From: indic at unicode.org (Shriramana Sharma via Indic) Date: Sun, 10 Dec 2017 11:54:06 +0530 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: <20171208002051.529c5987@JRWUBU2> References: <20171206231916.57d80bc3@JRWUBU2> <20171207223506.3b8494ea@JRWUBU2> <20171208002051.529c5987@JRWUBU2> Message-ID: On 12/8/17, Richard Wordingham via Indic wrote: >> In LibreOffice, one benefits from the fact that >> grapheme clusters only include one full consonant - K.SSA is the >> grapheme cluster followed by a grapheme cluster starting >> with SSA. > > However, if the current proposals for UAX#29 go through and there are > then no longer any extended grapheme cluster breaks in SSA>, I fear it will no longer be easy to insert ZWNJ between virama > and a following consonant. I noticed the other thread in which you are participating, but unable to read through and understand due to lack of time to grok the technicalities. I notice that in the case of some platforms/apps, for example the Firefox on Kubuntu 16.04 that I'm using right now, if I place the cursor before or after a cluster like ????? and use the cursor keys, the visual cursor doesn't jump the cluster but traverses it progressively in N steps where N is one more than the number of viramas inside it. At each step, the logical cursor seems to be placed *after* a virama. Are you saying the proposed update to UAX#29 is going to prohibit this behaviour? That may not be advisable. Why are they trying to do it? However, I should also note that while this behaviour seems quite sensible for C1 conjoining cases, it won't help to insert joiners to request C2-conjoining forms where the ZWJ needs to be put *before* the virama. For instance in Kannada to get RA + post-base YA as in ???? the sequence is ?, ZWJ, ?, ?. 
This can only be achieved in initial input as I said earlier, because post-input, the cursor will only be placed internally *after* the virama, and putting a ZWJ there just breaks the cluster like ZWNJ: ????. This is because there is no defined behaviour for Virama + ZWJ in Kannada. But I presume Kannadigas can live with that (though I am not one myself) because such sequences aren't frequently used at all. (In fact most common users probably aren't aware that they exist.) OTOH the requirement of inputting ZWJ in Devanagari to inhibit ligatures in some over-enthusiastic fonts (since such ligatures are sometimes not uniquely identifiable at 12 points) is a somewhat *more* often experienced one among those typesetting Devanagari documents, especially Sanskrit language ones with heavy cluster use. So it would be useful to retain the behaviour of placing cursors after intra-cluster virama-s. -- Shriramana Sharma ???????????? ???????????? ???????????????????????? From indic at unicode.org Sun Dec 10 05:56:55 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Sun, 10 Dec 2017 11:56:55 +0000 Subject: How to disable Indic syllable form editing in MS word In-Reply-To: References: <20171206231916.57d80bc3@JRWUBU2> <20171207223506.3b8494ea@JRWUBU2> <20171208002051.529c5987@JRWUBU2> Message-ID: <20171210115655.0f717b22@JRWUBU2> On Sun, 10 Dec 2017 11:54:06 +0530 Shriramana Sharma via Indic wrote: > I notice that in the case of some platforms/apps, for example the > Firefox on Kubuntu 16.04 that I'm using right now, if I place the > cursor before or after a cluster like ????? and use the cursor keys, > the visual cursor doesn't jump the cluster but traverses it > progressively in N steps where N is one more than the number of > viramas inside it. At each step, the logical cursor seems to be placed > *after* a virama. > > Are you saying the proposed update to UAX#29 is going to prohibit this > behaviour? That may not be advisable. Why are they trying to do it? 
The conspiracy view is that SE Asia has not been forgiven for the USA losing the Vietnamese War, coupled in the case of USE with an Abrahamist attack on the script of the Dharma. (The 'Tham' in 'Tai Tham' means 'Dharma'.) A tradition of wearing turbans just magnifies the offence. You can combine this with the declaration that U+2060 WORD JOINER does not indicate that text on either side is part of the same word. That declaration is a threat to spell checkers, which depend on a highly fallible word breaking algorithm to find word boundaries in the first place. Foreign names can be particularly awkward. Contact with the personalities involved suggest that is actually the cock-up theory that is true. In this particular case, there are two paragraphs in UAX#29 that do the damage: "The Unicode Standard provides default algorithms for determining grapheme cluster boundaries, with two variants: legacy grapheme clusters and extended grapheme clusters. The most appropriate variant depends on the language and operation involved. However, the extended grapheme cluster boundaries are recommended for general processing, while the legacy grapheme cluster boundaries are maintained primarily for backwards compatibility with earlier versions of this specification." "An extended grapheme cluster is the same as a legacy grapheme cluster, with the addition of some other characters. The continuing characters are extended to include all spacing combining marks, such as the spacing (but dependent) vowel signs in Indic scripts. For example, this includes U+093F ( ? ) DEVANAGARI VOWEL SIGN I. The extended grapheme clusters should be used in implementations in preference to legacy grapheme clusters, because they provide better results for Indic scripts such as Tamil or Devanagari in which editing by orthographic syllable is typically preferred. 
For scripts such as Thai, Lao, and certain other Southeast Asian scripts, editing by visual unit is typically preferred, so for those scripts the behavior of extended grapheme clusters is similar to (but not identical to) the behavior of legacy grapheme clusters." A case history is the addition of the 'prepend' class for the Tai vowels that are encoded in visual order rather than phonetic order. They had gc=Lo, and had been very accessible when editing. When the prepend class was added, editors started to treat preposed vowel plus consonant as indivisible units once they had been entered, and there were howls of protest from Thailand. The key effect of the change was withdrawn, with the preposed vowels reverting to having the grapheme cluster break value 'other'. For a while, there were no characters with gcb=prepend. > However, I should also note that while this behaviour seems quite > sensible for C1 conjoining cases, it won't help to insert joiners to > request C2-conjoining forms where the ZWJ needs to be put *before* the > virama. For instance in Kannada to get RA + post-base YA as in ???? > the sequence is ?, ZWJ, ?, ?. This can only be achieved in initial > input as I said earlier, because post-input, the cursor will only be > placed internally *after* the virama, and putting a ZWJ there just > breaks the cluster like ZWNJ: ????. This is because there is no > defined behaviour for Virama + ZWJ in Kannada. > > But I presume Kannadigas can live with that (though I am not one > myself) because such sequences aren't frequently used at all. (In fact > most common users probably aren't aware that they exist.) It's a case of a suboptimal system being better than nothing. One can position the cursor before the second consonant, delete *just* the virama, and then type . At no time do you lose the consonants. Scripts with consonant signs are not so lucky - the consonant signs tend to be lost if the first consonant of the cluster is mistyped. 
One of the joys of Emacs for Tai Tham is that it allows one to delete and replace the first consonant of cluster. Northern Thai Tai Tham has such glorious clusters as ????????? /n?ai/ 'to ache all over'. (The Siamese cognate is normally translated just as 'tired'.) At present that akshara is split into three grapheme clusters, composed of 2, 6 and 1 characters. (A user perception might split it into four logically contiguous groups of 3, 3, 1 and 2 characters for onset, vowel, tone and final consonant.) When the change goes through, this will be just one extended grapheme cluster, and even harder to edit. Richard. From indic at unicode.org Sun Dec 10 06:08:29 2017 From: indic at unicode.org (Richard Wordingham via Indic) Date: Sun, 10 Dec 2017 12:08:29 +0000 Subject: Query on Vedix Extensions In-Reply-To: References: Message-ID: <20171210120829.6b8c3062@JRWUBU2> On Sun, 10 Dec 2017 11:33:07 +0530 Shriramana Sharma via Indic wrote: > Some older platforms do not provide proper support for Vedic accents > and insert dotted circles like this thinking that these are not > properly formed sequences. The only solution would be to use a recent > OS. Alternatively, I expect you can use applications that use HarfBuzz regardless of the host. I believe that now includes LibreOffice and Firefox, though on Linux LibreOffice is liable to use the system's sharable object, which you might have to update yourself. I didn't have a suitable font to check them out on my system. Richard. From indic at unicode.org Mon Dec 11 19:05:18 2017 From: indic at unicode.org (Anish V via Indic) Date: Tue, 12 Dec 2017 06:35:18 +0530 Subject: Goykanadi script - script for Konkani language. In-Reply-To: References: Message-ID: Hello Sandeep, Thanks for the reply. Somehow I am not getting daily digest mails now, but saw your reply from archive. Unfortunately your attachment seems scrubbed. Could you please attach it and send? -------------- next part -------------- An HTML attachment was scrubbed... 
From indic at unicode.org Tue Dec 12 10:51:09 2017
From: indic at unicode.org (Anish V via Indic)
Date: Tue, 12 Dec 2017 22:21:09 +0530
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

Hello,

Is there any current proposal to encode alternate forms of Devanagari vowels? I don't think they are archaic forms, we can see some in post independence Indian coins also. The images of scripts and coins are attached for reference.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image007.jpg
Type: image/jpeg
Size: 40105 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image006.jpg
Type: image/jpeg
Size: 33828 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image005.png
Type: image/png
Size: 12469 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.jpg
Type: image/jpeg
Size: 37378 bytes
Desc: not available

From indic at unicode.org Tue Dec 12 12:29:42 2017
From: indic at unicode.org (Kiran Kumar Chava via Indic)
Date: Tue, 12 Dec 2017 10:29:42 -0800
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

Isn't it simply a different font?
Why can't a different font handle this?
Are both used simultaneously in a single sentence?

On Tue, Dec 12, 2017 at 8:51 AM, Anish V via Indic wrote:

> Hello,
>
> Is there any current proposal to encode alternate forms of Devanagari
> vowels? I don't think they are archaic forms, we can see some in post
> independence Indian coins also. The images of scripts and coins are
> attached for reference.
>
> _______________________________________________
> Indic mailing list
> Indic at unicode.org
> http://unicode.org/mailman/listinfo/indic
>

--
----
~Kiran Kumar Chava
http://kinige.com
http://suravara.com
http://chavakiran.com

From indic at unicode.org Tue Dec 12 12:45:09 2017
From: indic at unicode.org (Anshuman Pandey via Indic)
Date: Tue, 12 Dec 2017 12:45:09 -0600
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

Such variants generally do not appear concurrently as they are part of stylistic sets associated with scribal and, eventually, print traditions. Apart from ?, there are variants for letters such as ?, ?, etc., and conjuncts such as ???, ???, as well as digits, e.g. ?, ?. These should be handled typographically. But it is unfortunate that a user can't specifically choose which alternate set they'd like.

All my best,
Anshuman

From indic at unicode.org Tue Dec 12 12:49:34 2017
From: indic at unicode.org (Anish V via Indic)
Date: Wed, 13 Dec 2017 00:19:34 +0530
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

Thank you for the clarification.
Is there any font supporting this variant format?

From indic at unicode.org Tue Dec 12 13:50:18 2017
From: indic at unicode.org (Patrick Chew via Indic)
Date: Tue, 12 Dec 2017 11:50:18 -0800
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

Chandas and Uttara are full fonts that ought to show the differences... not a single font, however...

http://www.sanskritweb.net/cakram/

From indic at unicode.org Tue Dec 12 13:50:51 2017
From: indic at unicode.org (Anshuman Pandey via Indic)
Date: Tue, 12 Dec 2017 13:50:51 -0600
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID: <9FC9102D-B0EC-4275-972F-44BB1C42D1AC@umich.edu>

The Chandas and Uttara fonts support these "southern" and "northern" styles:

http://www.sanskritweb.net/cakram/

All my best,
Anshuman

From indic at unicode.org Tue Dec 12 13:57:51 2017
From: indic at unicode.org (Patrick Chew via Indic)
Date: Tue, 12 Dec 2017 11:57:51 -0800
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

As it turns out, SIL's Annapurna font shows both sets of variants in a single font: https://software.sil.org/annapurna/, cf. http://software.sil.org/downloads/r/annapurna/AnnapurnaSIL-features.pdf

These differences are accessible only via applications that are smart feature capable, e.g. LibreOffice.

From indic at unicode.org Tue Dec 12 18:55:50 2017
From: indic at unicode.org (Richard Wordingham via Indic)
Date: Wed, 13 Dec 2017 00:55:50 +0000
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID: <20171213005550.75c5faa4@JRWUBU2>

On Tue, 12 Dec 2017 11:57:51 -0800
Patrick Chew via Indic wrote:

> As it turns out, SIL's Annapurna font shows both sets of variants in a
> single font: https://software.sil.org/annapurna/, cf.
> http://software.sil.org/downloads/r/annapurna/AnnapurnaSIL-features.pdf
>
> These differences are accessible only via applications that are smart
> feature capable, e.g. LibreOffice.

And MS Edge and Firefox support OpenType features via CSS.

Richard.

From indic at unicode.org Tue Dec 12 19:03:40 2017
From: indic at unicode.org (Shriramana Sharma via Indic)
Date: Wed, 13 Dec 2017 06:33:40 +0530
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References: <20171213005550.75c5faa4@JRWUBU2>
Message-ID:

Do I understand correctly that OpenType features need to be specified as part of the standard? OTOH Graphite features don't and can just be provided by the font maker.

From indic at unicode.org Tue Dec 12 20:31:35 2017
From: indic at unicode.org (John Hudson via Indic)
Date: Tue, 12 Dec 2017 18:31:35 -0800
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References: <20171213005550.75c5faa4@JRWUBU2>
Message-ID: <7da28f5c-6fec-1100-b2cd-976b86b9dd65@tiro.ca>

On 12/12/17 17:03, Shriramana Sharma via Indic wrote:

> Do I understand correctly that OpenType features need to be specified
> as part of the standard? OTOH Graphite features don't and can just be
> provided by the font maker.

Interoperable OpenType Layout features need to be registered, so that they have agreed-upon behaviour and predictable results. In the case of variant glyph forms such as the regional historical forms of Devanagari letters, there are a few different ways in which these could be accessed using existing variant glyph substitution (GSUB) features.

1. If associated with specific orthographic use, as defined by an OTL Language System tag, then the Localised Forms feature can be used. This is the case for the subset of variant Devanagari forms preferred in Marathi typography. However, in terms of historical regional use for Sanskrit, this option is not available because there is only one Language System tag associated with Sanskrit.

2. If associated as a variant set, as I understand the case to be with the Sanskrit vowel letter forms, then these can be implemented via a Stylistic Set feature. A feature name can be associated with the feature via the font name table, and this may be exposed in some UIs.

3. Variants of individual characters can be implemented using the Character Variants features.

Note that variants can be accessed using more than one feature, so it is possible for variants to be associated with Language System tags, and as stylistic sets, and as individual character variants.

The other way to provide variant forms, especially regional- or language-specific sets, is as separate fonts. This is the most robust mechanism, and hence why for the Murty Library typefaces we provided separate Murty Hindi and Murty Sanskrit fonts (we expect to add a Murty Marathi font in future).
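
As a rough illustration of options 2 and 3 above, combined with the earlier remark in this thread that browsers expose OpenType features via CSS, here is a minimal Python sketch that builds a CSS `font-feature-settings` value from stylistic-set and character-variant numbers. The registered tags `ss01`-`ss20` and `cv01`-`cv99` are standard OpenType; which set or variant number a given font assigns to the regional Devanagari forms is font-specific, so the numbers below are purely hypothetical, and the helper names are invented for this example. Localised Forms (`locl`) is normally triggered by language tagging rather than switched on by hand, so it is left out.

```python
def stylistic_set_tag(n: int) -> str:
    """Return the OpenType tag for Stylistic Set n (ss01..ss20)."""
    if not 1 <= n <= 20:
        raise ValueError("stylistic sets are numbered 1 to 20")
    return f"ss{n:02d}"

def character_variant_tag(n: int) -> str:
    """Return the OpenType tag for Character Variant n (cv01..cv99)."""
    if not 1 <= n <= 99:
        raise ValueError("character variants are numbered 1 to 99")
    return f"cv{n:02d}"

def css_font_feature_settings(stylistic_sets=(), character_variants=None) -> str:
    """Build a CSS font-feature-settings value.

    stylistic_sets: iterable of set numbers to switch on.
    character_variants: mapping of variant number -> alternate index.
    """
    parts = [f'"{stylistic_set_tag(n)}" 1' for n in stylistic_sets]
    for n, alternate in sorted((character_variants or {}).items()):
        parts.append(f'"{character_variant_tag(n)}" {alternate}')
    # "normal" is the CSS keyword for "no explicit feature settings".
    return ", ".join(parts) if parts else "normal"

# Hypothetical numbers: a real font's documentation says which ones apply.
print(css_font_feature_settings([1], {5: 2}))
```

The resulting string can be used as the value of the CSS `font-feature-settings` property; whether it has any visible effect depends entirely on the features the chosen font actually implements.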

JH

--
John Hudson
Tiro Typeworks Ltd www.tiro.com
Salish Sea, BC tiro at tiro.com

NOTE: In the interests of productivity, I am currently dealing with email on only two days per week, usually Monday and Thursday unless this schedule is disrupted by travel. If you need to contact me urgently, please use some other method of communication. Thank you.

From indic at unicode.org Wed Dec 13 08:30:15 2017
From: indic at unicode.org (Bobby de Vos via Indic)
Date: Wed, 13 Dec 2017 07:30:15 -0700
Subject: Devanagari vowels - Alternative forms.
In-Reply-To:
References:
Message-ID:

Greetings,

In a later email in this thread, John Hudson mentioned a method with wider application support, which is to provide a separate font with the desired features. With Annapurna SIL, we can do both (have a separate font for applications that do not support smart font features, and have the flexibility of glyph variants in a single font).

You can make a custom version of Annapurna SIL with the desired glyph variants at

https://scripts.sil.org/ttw/fonts2go.cgi

The resulting font would work in applications that do not support smart font features.

Bobby

--
Bobby de Vos
/bobby_devos at sil.org/

From indic at unicode.org Thu Dec 14 04:36:53 2017
From: indic at unicode.org (Anish V via Indic)
Date: Thu, 14 Dec 2017 16:06:53 +0530
Subject: Tigalari / Tulu Unicode
In-Reply-To:
References:
Message-ID:

Hello,

Why is the link for the Tigalari proposal in the Unicode SMP password-protected from viewing? Other proposals in blue (like Vatteluttu) are not protected or restricted like this.

From indic at unicode.org Sun Dec 31 01:32:17 2017
From: indic at unicode.org (Manish Goregaokar via Indic)
Date: Sun, 31 Dec 2017 13:02:17 +0530
Subject: Eyelash Ra in Marathi
Message-ID:

Hi,

Marathi's Balbodh version of Devanagari has the Eyelash Reph, which is an alternate way of writing the reph half form, and can be seen in ???? (as opposed to ???, which uses the regular reph).

I recall seeing this same form used to make a full letter र (ra) in older texts at museums. These were written in Devanagari/Balbodh, but the use of the "eyelash ra" seems to be reminiscent of the Modi "ra". Basically, it looked something like ????, but with a longer and more curved eyelash.

I'm unable to find instances of this anymore (it's been a while since I saw this in museums, and I don't recall which ones). If it's a distinct letter form like the eyelash reph it might be worth encoding. Does anyone know of this letter form and where I can find it?

Thanks,
-Manish

From indic at unicode.org Sun Dec 31 11:31:05 2017
From: indic at unicode.org (Anshuman Pandey via Indic)
Date: Sun, 31 Dec 2017 09:31:05 -0800
Subject: Eyelash Ra in Marathi
In-Reply-To:
References:
Message-ID: <3B237EDF-2619-4457-A82F-41840F32A585@umich.edu>

Hi Manish,

The variant of RA you describe seems to be similar to the form used in Newa, i.e. U+1142C. I'm not sure about its derivation.

All my best,
Anshu

From indic at unicode.org Sun Dec 31 11:41:11 2017
From: indic at unicode.org (Manish Goregaokar via Indic)
Date: Sun, 31 Dec 2017 23:11:11 +0530
Subject: Eyelash Ra in Marathi
In-Reply-To: <3B237EDF-2619-4457-A82F-41840F32A585@umich.edu>
References: <3B237EDF-2619-4457-A82F-41840F32A585@umich.edu>
Message-ID:

Thanks. However, these texts were in Maharashtra and were clearly Devanagari (aside from the weird ra). Still, there might be a shared history with the Newa ra.