Proposed letters, 0C5B & 0C5C, in Telugu

Peter Constable pgcon6 at msn.com
Thu Jul 30 22:11:37 CDT 2020


+1

From: Unicore <unicore-bounces at unicode.org> On Behalf Of Tex via Unicore
Sent: Thursday, July 30, 2020 3:24 PM
To: 'Markus Scherer' <markus.icu at gmail.com>; 'John Hudson' <john at tiro.ca>; unicore at unicode.org
Subject: RE: Proposed letters, 0C5B & 0C5C, in Telugu

“Writing systems are shared by multiple languages and multiple traditions, past and present.”

That statement should be a key point made prominently (and repeatedly) in introductions to Unicode, scripts, etc.
It is an underlying foundation that needs to be understood and it mitigates the many objections based on personal or community experiences.
Well stated, Markus.

Tex



From: Unicore [mailto:unicore-bounces at unicode.org] On Behalf Of Markus Scherer via Unicore
Sent: Thursday, July 30, 2020 1:10 PM
To: John Hudson
Cc: unicore UnicoRe Discussion
Subject: Re: Proposed letters, 0C5B & 0C5C, in Telugu

On Thu, Jul 30, 2020 at 11:20 AM John Hudson via Unicore <unicore at unicode.org> wrote:
there is a reasonable, general question to ask about sufficiency of
attestation when it comes to very rare characters that might only occur
in one or two texts, perhaps the invention of a single author, not
embraced by any subsequent tradition of use. And one response to that
question is 'Any attestation is sufficient', which has the benefit of
removing the need to come up with applicable criteria of sufficiency
that would need to be considered on a case-by-case basis.

Right. As far as I understand, there are thousands of Chinese characters that have been used very rarely, or even just once in a dictionary or in a database of person names. They are real, they are encoded, but they are not common.

Implementers have to make choices, and sometimes it makes sense to support a subset.

If a font or keyboard vendor wants to support the entire Sinhala script, then they will have glyphs for all relevant code points -- whether inside or outside the Sinhala block -- and all relevant sequences, and punctuation, etc.

If someone cares to only support the subset needed for common, modern use of the Sinhala language, then they can define such subsets or look for organizations that have defined them. (E.g., Unicode CLDR has sets of "exemplar characters" for many languages.)
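As a rough illustration of the subset idea, here is a minimal Python sketch that checks a piece of text against a supported character set. The set SUPPORTED below is a hypothetical, deliberately tiny stand-in and not the actual CLDR exemplar data; the function names are likewise invented for this example.

```python
import unicodedata

# Hypothetical supported subset, for illustration only. This is NOT the CLDR
# exemplar set for Sinhala; a real implementation would load published data.
SUPPORTED = set("\u0d85\u0d86\u0d87")  # three Sinhala vowel letters


def unsupported_chars(text, supported):
    """Return the sorted, de-duplicated characters of text that fall outside supported.

    Whitespace is ignored so that ordinary spacing is not reported.
    """
    return sorted({ch for ch in text if not ch.isspace() and ch not in supported})


def describe(chars):
    """Format each character as 'U+XXXX NAME' for a readable report."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in chars]


if __name__ == "__main__":
    sample = "\u0d85\u0d86 \u0d9a"  # two supported letters, a space, one unsupported letter
    for line in describe(unsupported_chars(sample, SUPPORTED)):
        print(line)
```

A font or keyboard vendor could run a check like this over sample text to see which characters their chosen subset leaves uncovered, and then decide whether the gap matters for their users.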

I understand a visceral reaction of "this does not belong". I was originally not in favor of adding a capital sharp s <https://en.wikipedia.org/wiki/Capital_%E1%BA%9E> (Latin script, German language) because it was not part of the German orthography and wasn't taught in school, etc. However, it clearly existed and was used, and once evidence was presented showing that it was more than using a lowercase ß in all-caps words, it got added to the Unicode standard, and the official orthography now acknowledges it (as optional).

Writing systems are shared by multiple languages and multiple traditions, past and present.

Best regards,
markus