From manish at mozilla.com  Tue Jul  7 13:10:24 2020
From: manish at mozilla.com (Manish Goregaokar)
Date: Tue, 7 Jul 2020 11:10:24 -0700
Subject: When are the properties/character/unicodeset tools going to be back?
Message-ID: <CAFOnWkkhMCr1Cq3QWMSNavtXiLTC9NWofUGjAHh2Ja=hdaGHQg@mail.gmail.com>

I used to make heavy use of https://unicode.org/cldr/utility/character.jsp ,
https://unicode.org/cldr/utility/properties.jsp , and
https://unicode.org/cldr/utility/list-unicodeset.jsp . They've been down
for a couple of months now; do folks have any idea when they'll be back? Is
there a way to run them locally? Are there alternatives that people are
used to? I use UniView <https://r12a.github.io/uniview/> a lot, but it's not
as helpful when it comes to property queries.

Thanks,
-Manish

From c933103 at gmail.com  Sun Jul 12 01:45:04 2020
From: c933103 at gmail.com (Phake Nick)
Date: Sun, 12 Jul 2020 14:45:04 +0800
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
Message-ID: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>

Suzhou numerals, also known as Hangzhou numerals in Unicode,
have been encoded in Unicode for a long time.
However, as far as I can tell, the encoding of Suzhou numerals only
includes the numeric glyphs, and lacks the other support needed
to make the numerals actually usable.
The most important point is that, in most situations, Suzhou numerals are
supposed to be combined. Most of the time there are two lines,
with the top line giving a string of digits in Suzhou numerals,
while the bottom line gives their place value and unit. So if the top
line says 123456 in Suzhou numerals and the bottom line says "hundred dollar" in
Chinese characters, it is understood as meaning $123.456.
There are also various other symbols used in Suzhou numeral
expressions that are not currently part of the encoding. For
example, it is reported that a decidollar is sometimes represented by a
triangle in Suzhou numerals, and a Kan (a weight unit) can be
represented by a specific cursive form of the Chinese character for the
unit, joined with the digit above it to form a ligature.
There is also a rotation that is supposed to take place when multiple
line-pattern digits sit next to each other, and some other rules.
All of this would need to be supported for Suzhou numerals to be
properly supported in Unicode.

From duerst at it.aoyama.ac.jp  Sun Jul 12 03:38:49 2020
From: duerst at it.aoyama.ac.jp (=?UTF-8?Q?Martin_J=2e_D=c3=bcrst?=)
Date: Sun, 12 Jul 2020 17:38:49 +0900
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
Message-ID: <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>

Hello Nick,

On 12/07/2020 15:45, Phake Nick via Unicode wrote:
> Suzhou numerals, also known as Hangzhou numerals in Unicode,
> have been encoded in Unicode for a long time.
> However, as far as I can tell, the encoding of Suzhou numerals only
> includes the numeric glyphs, and lacks the other support needed
> to make the numerals actually usable.
> The most important point is that, in most situations, Suzhou numerals are
> supposed to be combined. Most of the time there are two lines,
> with the top line giving a string of digits in Suzhou numerals,
> while the bottom line gives their place value and unit. So if the top
> line says 123456 in Suzhou numerals and the bottom line says "hundred dollar" in
> Chinese characters, it is understood as meaning $123.456.
> There are also various other symbols used in Suzhou numeral
> expressions that are not currently part of the encoding. For
> example, it is reported that a decidollar is sometimes represented by a
> triangle in Suzhou numerals, and a Kan (a weight unit) can be
> represented by a specific cursive form of the Chinese character for the
> unit, joined with the digit above it to form a ligature.
> There is also a rotation that is supposed to take place when multiple
> line-pattern digits sit next to each other, and some other rules.
> All of this would need to be supported for Suzhou numerals to be
> properly supported in Unicode.
> 

Sounds interesting. Any (pointers to) examples?

Regards,   Martin.

From harjitmoe at outlook.com  Sun Jul 12 05:52:58 2020
From: harjitmoe at outlook.com (Harriet Riddle)
Date: Sun, 12 Jul 2020 10:52:58 +0000
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
Message-ID: <AM6PR0702MB3671AFC0CEDCA84F8488D618B7630@AM6PR0702MB3671.eurprd07.prod.outlook.com>

> From: Unicode <unicode-bounces at unicode.org> on behalf of Phake Nick via Unicode <unicode at unicode.org>
> Sent: 12 July 2020 08:45
> To: Unicode Mailing List <unicode at unicode.org>
> Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode 
> [...]
> [...] The most important point is that, in most situations, Suzhou numerals are supposed to be combined. Most of the time there are two lines, with the top line giving a string of digits in Suzhou numerals, while the bottom line gives their place value and unit. [...]

Interesting.

It is probably worth noting that Unicode's current coverage of Suzhou numerals is essentially limited to what it inherited from Big5 and CSIC / CNS 11643. Big5 and CNS 11643 also happen to be the main legacy charsets responsible for the weird and wonderful stylised underline characters in the CJK Compatibility Forms block, at U+FE34, and at U+FE49 through U+FE4F. These, of course, are not of much use without specialised layout support either, although unlike the Suzhou numerals, the stylised underlines are basically pure legacy by this point.

For reference, the Suzhou numeral ranges in Big5, in CNS 11643 (as EUC-TW) and in Unicode:

• Big5 0xA2C3 through 0xA2CE
• EUC-TW 0xA4B5 through 0xA4C0 (with or without a prefixed 0x8EA1)
• Unicode U+3021 through U+3029, followed by U+3038 through U+303A.

(Self-pedantic note: the last three (ten, twenty and thirty) were, in Unicode prior to version 3.0, unified with their corresponding and homoglyphic hanzi. Since Big5 mappings are nowadays mostly used for legacy compatibility, they are still more often than not implemented with their older mappings to U+5341, U+5344 and U+5345, rather than to U+3038 through U+303A. Although, U+5341 and U+5345 also have Big5 representations in the hanzi section, and therefore do not round trip.)
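
For anyone who wants to inspect the Unicode range listed above, here is a minimal Python sketch using the standard unicodedata module. It covers only the Unicode side; the Big5 and EUC-TW mappings are deliberately not exercised, since codec tables differ between implementations. The names SUZHOU_CODE_POINTS and describe are made up for the example.

```python
import unicodedata

# Suzhou (Hangzhou) numeral code points as listed above:
# U+3021..U+3029 (one to nine), then U+3038..U+303A (ten, twenty, thirty).
SUZHOU_CODE_POINTS = list(range(0x3021, 0x302A)) + list(range(0x3038, 0x303B))

def describe(cp):
    """Return (code point label, character name, numeric value)."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.numeric(ch))

for cp in SUZHOU_CODE_POINTS:
    print(*describe(cp))
```

The character names reported by the database use "HANGZHOU", which is the naming the thread refers to.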

-- Har.


From xfq.free at gmail.com  Sun Jul 12 20:16:35 2020
From: xfq.free at gmail.com (Fuqiao Xue)
Date: Mon, 13 Jul 2020 09:16:35 +0800
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
 <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>
Message-ID: <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>

Hello Martin,

See the examples below.

On Sun, 12 Jul 2020 at 16:40, Martin J. Dürst via Unicode <unicode at unicode.org> wrote:
>
> Hello Nick,
>
> On 12/07/2020 15:45, Phake Nick via Unicode wrote:

[...]

> > The most important point is that, in most situations, Suzhou numerals are
> > supposed to be combined. Most of the time there are two lines,
> > with the top line giving a string of digits in Suzhou numerals,
> > while the bottom line gives their place value and unit. So if the top
> > line says 123456 in Suzhou numerals and the bottom line says "hundred dollar" in
> > Chinese characters, it is understood as meaning $123.456.

Here's an example:[1]

〤〇〢二
十元

The first line contains the numerical values. "〤〇〢二" stands for
"4022". The second line consists of Chinese characters that represent
the order of magnitude and unit of measurement of the first digit in
the numerical representation. In this case it is "十元", which stands for "ten
yuan". When put together, it is then read as "40.22 yuan".
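
The arithmetic behind this two-line notation can be sketched as follows. This is only a minimal Python illustration: the function and parameter names are made up for the example, and the top line is assumed to have already been transcribed into ASCII digits.

```python
from decimal import Decimal

def suzhou_value(digits: str, first_digit_exponent: int) -> Decimal:
    """Combine the top line (digits) with the bottom line's place value.

    digits: the top line transcribed as ASCII digits, e.g. "4022".
    first_digit_exponent: power of ten of the FIRST digit, as stated by
    the bottom line (1 for "ten", 2 for "hundred", ...).
    """
    # The last digit sits (len - 1) places below the first one.
    shift = first_digit_exponent - (len(digits) - 1)
    return Decimal(digits).scaleb(shift)

print(suzhou_value("4022", 1))    # top line 4022, bottom line "ten yuan"
print(suzhou_value("123456", 2))  # top line 123456, bottom line "hundred dollar"
```

The two calls correspond to the example here and to the "hundred dollar" example from the original post.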

> > There are also various other symbols used in Suzhou numeral
> > expressions that are not currently part of the encoding. For
> > example, it is reported that a decidollar is sometimes represented by a
> > triangle in Suzhou numerals

I have also heard of this, but unfortunately did not find examples.
Suzhou numerals have mostly been replaced by Arabic numerals nowadays,
so examples are difficult to find.

> > , and a Kan (a weight unit) can be
> > represented by a specific cursive form of the Chinese character for the
> > unit, joined with the digit above it to form a ligature.

I have not heard of this, nor do I know the unit "Kan". It may be 貫
(guàn in pinyin, and kan in rōmaji) or 斤 (jīn in pinyin, and kin in
rōmaji), or something I have never heard of.

> > There is also a rotation that is supposed to take place when multiple
> > line-pattern digits sit next to each other

I have heard of this too. When the numbers "〡", "〢", "〣" are written
next to each other, in order to avoid confusion, the digits in even
positions are rotated. For example:

?????? should be written as ??????
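
The rotation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rotated (horizontal) forms are stood in for by the hanzi 一, 二 and 三, zero is written as U+3007, and the rule is taken to be "after a vertical 1, 2 or 3, write the next 1, 2 or 3 horizontally". It says nothing about how a font or an encoding should handle this, and the names are invented for the example.

```python
# Vertical Suzhou digits 0-9: U+3007 for zero, then U+3021..U+3029.
VERTICAL = "\u3007" + "".join(chr(0x3021 + i) for i in range(9))
# Assumption: horizontal forms of 1, 2, 3 are written with the hanzi.
HORIZONTAL = {"1": "\u4e00", "2": "\u4e8c", "3": "\u4e09"}

def to_suzhou(digits: str) -> str:
    """Transcribe ASCII digits, alternating orientation for runs of 1-3."""
    out = []
    previous_was_vertical_bar = False
    for d in digits:
        if d in HORIZONTAL and previous_was_vertical_bar:
            # Rotate this digit so two vertical-bar digits never touch.
            out.append(HORIZONTAL[d])
            previous_was_vertical_bar = False
        else:
            out.append(VERTICAL[int(d)])
            previous_was_vertical_bar = d in HORIZONTAL
    return "".join(out)
```

With these assumptions, to_suzhou("4022") yields U+3024 U+3007 U+3022 U+4E8C, where the second "2" is the rotated one.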

> > and some other rules.

> > All of this would need to be supported for Suzhou numerals to be
> > properly supported in Unicode.
> >
>
> Sounds interesting. Any (pointers to) examples?
>
> Regards,   Martin.

[1] From https://en.wikipedia.org/wiki/Suzhou_numerals#Notations


From marius.spix at web.de  Mon Jul 13 02:57:46 2020
From: marius.spix at web.de (Marius Spix)
Date: Mon, 13 Jul 2020 09:57:46 +0200
Subject: Aw: Re: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
 <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>
 <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>
Message-ID: <trinity-272cf421-4ba3-4972-b5f8-4eff76a06038-1594627066601@3c-app-webde-bap04>

An HTML attachment was scrubbed...
URL: <https://corp.unicode.org/mailman/private/unicode/attachments/20200713/dfff02ba/attachment.htm>

From c933103 at gmail.com  Mon Jul 13 07:34:57 2020
From: c933103 at gmail.com (Phake Nick)
Date: Mon, 13 Jul 2020 20:34:57 +0800
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
 <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>
 <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>
Message-ID: <CAGHjPPK7p6APzjAN0voY+7ap_1rtacjeijmR5=XXR6LEbDFT7g@mail.gmail.com>

More information and examples of what I have mentioned in my previous mail
about Suzhou numerals can be found at webpages like
http://www.ptwhw.com/?post=17164 and
https://www.facebook.com/tcsince1915/posts/2146011468871160


From kent.b.karlsson at bahnhof.se  Mon Jul 13 12:27:44 2020
From: kent.b.karlsson at bahnhof.se (Kent Karlsson)
Date: Mon, 13 Jul 2020 19:27:44 +0200
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <CAGHjPPK7p6APzjAN0voY+7ap_1rtacjeijmR5=XXR6LEbDFT7g@mail.gmail.com>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
 <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>
 <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>
 <CAGHjPPK7p6APzjAN0voY+7ap_1rtacjeijmR5=XXR6LEbDFT7g@mail.gmail.com>
Message-ID: <3E30CA1E-E212-4F16-9C9F-461A7C82CAC1@bahnhof.se>


> On 13 July 2020 at 14:34, Phake Nick via Unicode <unicode at unicode.org> wrote:
> 
> More information and examples of what I have mentioned in my previous mail about Suzhou numerals can be found at webpages like http://www.ptwhw.com/?post=17164 and https://www.facebook.com/tcsince1915/posts/2146011468871160

You may want to look at https://unicode-org.atlassian.net/browse/CLDR-4473 , where I (long ago, in 2013) proposed adding Suzhou numerals to CLDR's RBNF data.
(It switches between vertical and horizontal Suzhou digits.) While other Chinese and Japanese numerals have been incorporated into CLDR, Suzhou numerals
and counting rods have not yet been.

/Kent Karlsson


From xfq.free at gmail.com  Mon Jul 13 20:11:56 2020
From: xfq.free at gmail.com (Fuqiao Xue)
Date: Tue, 14 Jul 2020 09:11:56 +0800
Subject: Incompleteness of Suzhou Numeral/FaMa encoding in Unicode
In-Reply-To: <trinity-272cf421-4ba3-4972-b5f8-4eff76a06038-1594627066601@3c-app-webde-bap04>
References: <CAGHjPPKJGTeNWeqdm9FyERb1nKRDiShw5xvKYR3YXtfsDOYkiw@mail.gmail.com>
 <0287bf19-82db-65bf-9fca-a11c1a1a4550@it.aoyama.ac.jp>
 <CAAF+z6F1=9rkBXOXvLyEQ-DVH8hWHMLYDj5KmO4JQyOAsdnbEw@mail.gmail.com>
 <trinity-272cf421-4ba3-4972-b5f8-4eff76a06038-1594627066601@3c-app-webde-bap04>
Message-ID: <CAAF+z6HA72ByNewv0C051dghvFLV5vMXiOocAOuKmmCOX=qn6g@mail.gmail.com>

On Mon, 13 Jul 2020 at 15:57, Marius Spix <marius.spix at web.de> wrote:
>
> >?????? should be written as ??????
> Isn't that something that is possible with OpenType's GSUB feature? It would be hard to search, copy and paste numbers if you had two separate code points per orientation.

Yes, there would indeed be problems with searching, copying and pasting
numbers, even though many string search APIs and algorithms already
implement things like case-insensitive matching, kana folding, and
Unicode normalization.

Using the OpenType feature can also solve the input problem, since the
input method does not need to automatically change the code point for
some digits, and the user does not need to choose the forms manually
when entering the digits one by one.

Thanks,

Fuqiao

> Regards, Marius
>


From haberg-1 at telia.com  Wed Jul 15 16:01:00 2020
From: haberg-1 at telia.com (=?utf-8?Q?Hans_=C3=85berg?=)
Date: Wed, 15 Jul 2020 23:01:00 +0200
Subject: Mail archive link
Message-ID: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>

FYI, the link to the mail archive is dead, as on the URL below, and in the header of the email messages.

https://www.unicode.org/consortium/distlist-unicode.html


From kenwhistler at sonic.net  Wed Jul 15 16:55:03 2020
From: kenwhistler at sonic.net (Ken Whistler)
Date: Wed, 15 Jul 2020 14:55:03 -0700
Subject: Mail archive link
In-Reply-To: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
Message-ID: <887a11d7-aa64-f417-8d67-25fa33b8573a@sonic.net>

The pipermail archiving system was a victim of the April VM crash, and 
won't be returning.

The historic archives are still available on the site. We will fix the 
link on the distlist-unicode list information page to stop pointing to 
the missing archiving system.

Email currently sent to the Unicode list correctly carries a
List-Archive header pointing to the correct mailman archiving system,
as best as I can tell. That archive is not public, but is available
to any list member by logging in with their mailman account password.
There is nothing we can do about List-Archive links in email sent by
the mailing list prior to April 2020.
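
To check what a given message carries in that header, something along these lines should work. It is a minimal Python sketch using the standard email package; the sample message, its addresses, and the archive URL in it are illustrative assumptions, not a copy of a real list message.

```python
from email import message_from_string

# Hypothetical sample message; the List-Archive value is an assumption.
RAW = """\
From: someone at example.org
To: unicode at unicode.org
Subject: Example
List-Archive: <https://corp.unicode.org/mailman/private/unicode/>

Body text.
"""

def list_archive_url(raw_message):
    """Return the List-Archive URL without angle brackets, or None."""
    msg = message_from_string(raw_message)
    value = msg.get("List-Archive")
    if value is None:
        return None
    return value.strip().strip("<>")

print(list_archive_url(RAW))
```

Messages sent before April 2020 would simply show whatever link they were stamped with at the time.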

--Ken

On 7/15/2020 2:01 PM, Hans Åberg via Unicode wrote:
> FYI, the link to the mail archive is dead, as on the URL below, and in the header of the email messages.
>
> https://www.unicode.org/consortium/distlist-unicode.html
>
>
>

From lyratelle at gmx.de  Thu Jul 16 01:07:46 2020
From: lyratelle at gmx.de (Dominikus Dittes Scherkl)
Date: Thu, 16 Jul 2020 08:07:46 +0200
Subject: Mail archive link
In-Reply-To: <887a11d7-aa64-f417-8d67-25fa33b8573a@sonic.net>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
 <887a11d7-aa64-f417-8d67-25fa33b8573a@sonic.net>
Message-ID: <65376532-e2cd-fe99-b59c-08f159e1e38c@gmx.de>

On 15.07.20 at 23:55, Ken Whistler via Unicode wrote:
> The pipermail archiving system was a victim of the April VM crash, and
> won't be returning. [...] There is nothing we can do about
> List-Archive links in email sent by the mailing list prior to April, 2020.
>
If anybody is interested, I have a personal archive of this list that
is complete from 2006 onward.

--
                                          Dominikus Dittes Scherkl


From 4mm4adbfrm4 at tonton-pixel.com  Thu Jul 16 02:26:41 2020
From: 4mm4adbfrm4 at tonton-pixel.com (Michel Mariani)
Date: Thu, 16 Jul 2020 09:26:41 +0200
Subject: Mail archive link
In-Reply-To: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
Message-ID: <A5E7CBF7-466A-428F-BB43-773F5376866E@tonton-pixel.com>

FWIW, it seems it is still possible to freely access archives of the Unicode mailing list from March 2001 to April 2020:

https://www.unicode.org/mail-arch/unicode-ml/

And there is a link to these archives from this page:

https://www.unicode.org/mail-arch/


> On 15 Jul 2020 at 23:01, Hans Åberg via Unicode <unicode at unicode.org> wrote:
> 
> FYI, the link to the mail archive is dead, as on the URL below, and in the header of the email messages.
> 
> https://www.unicode.org/consortium/distlist-unicode.html
> 
> 
> 


From haberg-1 at telia.com  Thu Jul 16 03:41:32 2020
From: haberg-1 at telia.com (=?utf-8?Q?Hans_=C3=85berg?=)
Date: Thu, 16 Jul 2020 10:41:32 +0200
Subject: Mail archive link
In-Reply-To: <887a11d7-aa64-f417-8d67-25fa33b8573a@sonic.net>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
 <887a11d7-aa64-f417-8d67-25fa33b8573a@sonic.net>
Message-ID: <24BD2B54-FD40-4245-AF14-04189AF32D3B@telia.com>


> On 15 Jul 2020, at 23:55, Ken Whistler <kenwhistler at sonic.net> wrote:
> 
> Email currently sent to the Unicode list correctly carries a List-Archive header pointing to the correct mailman archiving system, as best as I can tell. That archive is not public, but is available to any list member by logging in with their mailman account password. There is nothing we can do about List-Archive links in email sent by the mailing list prior to April 2020.

Email headers are correct now; I was looking at a message from February, about the Egyptologist hieroglyph. The intent was to give it the link to BBC, they have an article about a hieroglyphics translator [1], but that would not have worked anyway, as it is private.

1. https://www.bbc.com/news/technology-53420320


From steffen at sdaoden.eu  Thu Jul 16 08:59:13 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Thu, 16 Jul 2020 15:59:13 +0200
Subject: Mail archive link
In-Reply-To: <A5E7CBF7-466A-428F-BB43-773F5376866E@tonton-pixel.com>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
 <A5E7CBF7-466A-428F-BB43-773F5376866E@tonton-pixel.com>
Message-ID: <20200716135913.kEfhk%steffen@sdaoden.eu>

Michel Mariani via Unicode wrote in
<A5E7CBF7-466A-428F-BB43-773F5376866E at tonton-pixel.com>:
 |> On 15 Jul 2020 at 23:01, Hans Åberg via Unicode <unicode at unicode.org> \
 |> wrote:
 |> 
 |> FYI, the link to the mail archive is dead, as on the URL below, and \
 |> in the header of the email messages.
 |> 
 |> https://www.unicode.org/consortium/distlist-unicode.html

 |FWIW, it seems it is still possible to freely access archives of the \
 |Unicode mailing list from March 2001 to April 2020:
 |
 |https://www.unicode.org/mail-arch/unicode-ml/
 |
 |And there is a link to these archives from this page:
 |
 |https://www.unicode.org/mail-arch/

The Mail Archive also mirrors/ed Unicode at

  https://www.mail-archive.com/unicode at unicode.org/

But it stops in April?  I surely got Unicode messages thereafter,
has it been actively unsubscribed?  These guys are very friendly
(even though it is now a private-only project), and would surely fill
in gaps if given an MBOX or so with the messages that are missing.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


From doug at ewellic.org  Sun Jul 19 16:20:01 2020
From: doug at ewellic.org (Doug Ewell)
Date: Sun, 19 Jul 2020 15:20:01 -0600
Subject: Mail archive link
Message-ID: <002a01d65e12$5d442e60$17cc8b20$@ewellic.org>

Steffen Nurpmeso wrote:

> The Mail Archive also mirrors/ed Unicode at
>
>   https://www.mail-archive.com/unicode at unicode.org/
>
> But it stops in April?  I surely got Unicode messages thereafter,
> has it been actively unsubscribed?

I suggest, in all seriousness, that Ken or Rick or somebody compose a detailed FAQ about the Great Server Crash of 2020, something to which we can point curious people instead of pointing them to the new archive to hunt for clues. (Yes, there is a new archive: https://corp.unicode.org/mailman/listinfo/unicode)

Currently the only item on the web site about the Great Crash, other than the mail archive, is this quick note, written before the full scope of loss was known:
https://home.unicode.org/technical-alert-unicode-technical-website-down/

I was pretty sure someone had posted a lengthy, detailed description of what happened, but if it was on the mailing list I can't find it now, which is kind of my point.

Especially with the rollout of the new Unicode home page and the relegation of most non-marketing material to a "Technical Site," which occurred not very long before the Great Crash, it may be reasonable for some to assume (incorrectly) that changes to the mail archives or the loss of previously available material via FTP might have been caused by those intentional changes instead of the Great Crash.

--
Doug Ewell | Thornton, CO, US | ewellic.org


From steffen at sdaoden.eu  Mon Jul 20 10:56:57 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Mon, 20 Jul 2020 17:56:57 +0200
Subject: Mail archive link
In-Reply-To: <20200720151407.ET24c%steffen@sdaoden.eu>
References: <002a01d65e12$5d442e60$17cc8b20$@ewellic.org>
 <20200720151407.ET24c%steffen@sdaoden.eu>
Message-ID: <20200720155657.6-gWB%steffen@sdaoden.eu>

Steffen Nurpmeso wrote in
<20200720151407.ET24c%steffen at sdaoden.eu>:
 |Doug Ewell via Unicode wrote in
 |<002a01d65e12$5d442e60$17cc8b20$@ewellic.org>:
 ||Steffen Nurpmeso wrote:
 ||> The Mail Archive also mirrors/ed Unicode at
 ||>
 ||>   https://www.mail-archive.com/unicode at unicode.org/
 ||>
 ||> But it stops in April?  I surely got Unicode messages thereafter,
 ||> has it been actively unsubscribed?
 ||
 ||I suggest, in all seriousness, that Ken or Rick or somebody compose \
 ||a detailed FAQ about the Great Server Crash of 2020, something to which \
 ||we can point curious people instead of pointing them to the new archive \
 ||to hunt for clues. (Yes, there is a new archive: https://corp.unicode.or\
 ||g/mailman/listinfo/unicode)
 |
 |I surely can understand if an archive requires a login, looking at
 |my own tiny mailing-lists and the (spam) traffic that hits them.
 |It actually makes me even more thankful to be able to use those
 |wonderful public services like Gmane / Gmene / mail-archive which
 |i have also used for many years.
 ...
 |Some subscribers seem to have been lost, or maybe they reject
 |.. but no, the list address is still the same and these services
 |subscribe and then just stay.  Maybe, nonetheless i mean, the
 |corp.unicode.org server address?  I will ask the mail-archive
 |people about that, but i am not in the position to fill in the
 |missing messages, i do not archive anything i receive.

I have also asked whether it is because of the corp.unicode.org server
address[1].

  [1] https://www.mail-archive.com/gossip at mail-archive.com/msg01599.html

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From steffen at sdaoden.eu  Mon Jul 20 10:14:07 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Mon, 20 Jul 2020 17:14:07 +0200
Subject: Mail archive link
In-Reply-To: <002a01d65e12$5d442e60$17cc8b20$@ewellic.org>
References: <002a01d65e12$5d442e60$17cc8b20$@ewellic.org>
Message-ID: <20200720151407.ET24c%steffen@sdaoden.eu>

Doug Ewell via Unicode wrote in
<002a01d65e12$5d442e60$17cc8b20$@ewellic.org>:
 |Steffen Nurpmeso wrote:
 |
 |> The Mail Archive also mirrors/ed Unicode at
 |>
 |>   https://www.mail-archive.com/unicode at unicode.org/
 |>
 |> But it stops in April?  I surely got Unicode messages thereafter,
 |> has it been actively unsubscribed?
 |
 |I suggest, in all seriousness, that Ken or Rick or somebody compose \
 |a detailed FAQ about the Great Server Crash of 2020, something to which \
 |we can point curious people instead of pointing them to the new archive \
 |to hunt for clues. (Yes, there is a new archive: https://corp.unicode.or\
 |g/mailman/listinfo/unicode)

I surely can understand if an archive requires a login, looking at
my own tiny mailing-lists and the (spam) traffic that hits them.
It actually makes me even more thankful to be able to use those
wonderful public services like Gmane / Gmene / mail-archive which
i have also used for many years.

 |Currently the only item on the web site about the Great Crash, other \
 |than the mail archive, is this quick note, written before the full \
 |scope of loss was known:
 |https://home.unicode.org/technical-alert-unicode-technical-website-down/
 |
 |I was pretty sure someone had posted a lengthy, detailed description \
 |of what happened, but if it was on the mailing list I can't find it \
 |now, which is kind of my point.
 |
 |Especially with the rollout of the new Unicode home page and the relegat\
 |ion of most non-marketing material to a "Technical Site," which occurred \
 |not very long before the Great Crash, it may be reasonable for some \
 |to assume (incorrectly) that changes to the mail archives or the loss \
 |of previously available material via FTP might have been caused by \
 |those intentional changes instead of the Great Crash.

Some subscribers seem to have been lost, or maybe they reject
.. but no, the list address is still the same and these services
subscribe and then just stay.  Maybe, nonetheless i mean, the
corp.unicode.org server address?  I will ask the mail-archive
people about that, but i am not in the position to fill in the
missing messages, i do not archive anything i receive.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From steffen at sdaoden.eu  Thu Jul 23 09:59:48 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Thu, 23 Jul 2020 16:59:48 +0200
Subject: Mail archive link
In-Reply-To: <20200720151407.ET24c%steffen@sdaoden.eu>
References: <002a01d65e12$5d442e60$17cc8b20$@ewellic.org>
 <20200720151407.ET24c%steffen@sdaoden.eu>
Message-ID: <20200723145948.hH1vz%steffen@sdaoden.eu>

Steffen Nurpmeso via Unicode wrote in
<20200720151407.ET24c%steffen at sdaoden.eu>:
 |Doug Ewell via Unicode wrote in
 |<002a01d65e12$5d442e60$17cc8b20$@ewellic.org>:
 ||Steffen Nurpmeso wrote:
 ||
 ||> The Mail Archive also mirrors/ed Unicode at
 ||>
 ||>   https://www.mail-archive.com/unicode at unicode.org/
 ||>
 ||> But it stops in April?  I surely got Unicode messages thereafter,
 ||> has it been actively unsubscribed?
 ||
 ||I suggest, in all seriousness, that Ken or Rick or somebody compose \
 ||a detailed FAQ about the Great Server Crash of 2020, something to which \
 ||we can point curious people instead of pointing them to the new archive \
 ||to hunt for clues. (Yes, there is a new archive: https://corp.unicode.or\
 ||g/mailman/listinfo/unicode)
 |
 |I surely can understand if an archive requires a login, looking at
 |my own tiny mailing-lists and the (spam) traffic that hits them.
 |It actually makes me even more thankful to be able to use those
 |wonderful public services like Gmane / Gmene / mail-archive which
 |i have also used for many years.
 |
 ||Currently the only item on the web site about the Great Crash, other \
 ||than the mail archive, is this quick note, written before the full \
 ||scope of loss was known:
 ||https://home.unicode.org/technical-alert-unicode-technical-website-down/
 ||
 ||I was pretty sure someone had posted a lengthy, detailed description \
 ||of what happened, but if it was on the mailing list I can't find it \
 ||now, which is kind of my point.
 ||
 ||Especially with the rollout of the new Unicode home page and the relegat\
 ||ion of most non-marketing material to a "Technical Site," which occurred \
 ||not very long before the Great Crash, it may be reasonable for some \
 ||to assume (incorrectly) that changes to the mail archives or the loss \
 ||of previously available material via FTP might have been caused by \
 ||those intentional changes instead of the Great Crash.
 |
 |Some subscribers seem to have been lost, or maybe they reject
 |.. but no, the list address is still the same and these services
 |subscribe and then just stay.  Maybe, nonetheless i mean, the
 |corp.unicode.org server address?  I will ask the mail-archive
 |people about that, but i am not in the position to fill in the
 |missing messages, i do not archive anything i receive.

I want to point out that i am not stuck in some moderation queue,
but that delivery has been refused for days:

  Jul 23 07:55:23 postfix/smtp[13127]: connect to corp.unicode.org[66.34.201.228]:25: Connection refused
  Jul 23 07:55:23 postfix/smtp[13127]: B2EAF16059: to=<unicode at unicode.org>, relay=none, delay=223106, delays=223106/0.02/0.14/0, dsn=4.4.1, status=deferred (connect to corp.unicode.org[66.34.201.228]:25: Connection refused)

But then, finally:

  Jul 23 16:05:24 postfix/tlsproxy[13904]: CONNECT to [66.34.201.228]:25
  Jul 23 16:05:24 postfix/smtp[13903]: Untrusted TLS connection established to corp.unicode.org[66.34.201.228]:25: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256
  Jul 23 16:05:26 postfix/smtp[13903]: B2EAF16059: to=<unicode at unicode.org>, relay=corp.unicode.org[66.34.201.228]:25, delay=252509, delays=252506/0.01/1.9/1.1, dsn=2.0.0, status=sent (250 2.0.0 06NE5O65027564 Message accepted for delivery)
  Jul 23 16:05:26 postfix/qmgr[9144]: B2EAF16059: removed
  Jul 23 16:05:26 postfix/tlsproxy[13904]: DISCONNECT [66.34.201.228]:25
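
For reading log lines like the ones above, a small Python sketch can pull out the status, the DSN code, and the queue delay as a duration. The regular expression is an assumption fitted to these lines, not a general Postfix log parser, and the names are made up for the example.

```python
import re
from datetime import timedelta

# The deferred delivery line from above, split only for readability.
LINE = (
    "Jul 23 07:55:23 postfix/smtp[13127]: B2EAF16059: "
    "to=<unicode at unicode.org>, relay=none, delay=223106, "
    "delays=223106/0.02/0.14/0, dsn=4.4.1, status=deferred "
    "(connect to corp.unicode.org[66.34.201.228]:25: Connection refused)"
)

# delay= is the total time in seconds since the message was queued.
FIELDS = re.compile(
    r"\bdelay=(?P<delay>[\d.]+),.*?\bdsn=(?P<dsn>[\d.]+), status=(?P<status>\w+)"
)

def summarize(line):
    """Return (status, dsn, delay as timedelta), or None if no match."""
    m = FIELDS.search(line)
    if m is None:
        return None
    return m["status"], m["dsn"], timedelta(seconds=float(m["delay"]))

print(summarize(LINE))
```

A delay of 223106 seconds is a little over two and a half days, which matches "refused for days".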

Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From steffen at sdaoden.eu  Sat Jul 25 12:54:37 2020
From: steffen at sdaoden.eu (Steffen Nurpmeso)
Date: Sat, 25 Jul 2020 19:54:37 +0200
Subject: Mail archive link
In-Reply-To: <20200720155657.6-gWB%steffen@sdaoden.eu>
References: <002a01d65e12$5d442e60$17cc8b20$@ewellic.org>
 <20200720151407.ET24c%steffen@sdaoden.eu>
 <20200720155657.6-gWB%steffen@sdaoden.eu>
Message-ID: <20200725175437.aFDhN%steffen@sdaoden.eu>

Hello.

Steffen Nurpmeso via Unicode wrote in
<20200720155657.6-gWB%steffen at sdaoden.eu>:
 |Steffen Nurpmeso wrote in
 |<20200720151407.ET24c%steffen at sdaoden.eu>:
 ||Doug Ewell via Unicode wrote in
 ||<002a01d65e12$5d442e60$17cc8b20$@ewellic.org>:
 |||Steffen Nurpmeso wrote:
 |||> The Mail Archive also mirrors/ed Unicode at
 |||>
 |||>   https://www.mail-archive.com/unicode at unicode.org/
 |||>
 |||> But it stops in April?  I surely got Unicode messages thereafter,
 |||> has it been actively unsubscribed?
 |||
 |||I suggest, in all seriousness, that Ken or Rick or somebody compose
 |||a detailed FAQ about the Great Server Crash of 2020, something to which
 |||we can point curious people instead of pointing them to the new archive
 |||to hunt for clues. (Yes, there is a new archive:
 |||https://corp.unicode.org/mailman/listinfo/unicode)
 ...
 ||It actually makes me even more thankful to be able to use those
 ||wonderful public services like Gmane / Gmene / mail-archive, which
 ||i have also used for many years.
 ...
 ||Some subscribers seem to have been lost, or maybe they reject
 ...
 |I have also asked whether it is because of the corp.unicode.org
 |server address[1].
 |
 |  [1] https://www.mail-archive.com/gossip at mail-archive.com/msg01599.html

Jeff Breidenbach has answered:

  Jeff Breidenbach wrote in
  <CAHjiUbqFmzAhdABSYx_P_W7YPMQ2XW4iST_5T04XzF5W_2VBDw at mail.gmail.com>:
   |Hi Steffen,
   |
   |I checked and for whatever reason, they simply aren't sending email to
   |archive at mail-archive.com. No idea why. If you can help make that happen,
   |archiving will work again.

So i hope the Unicode list administrators can make it happen that

  archive at mail-archive.com

will get copies of the messages again?!  It would be hard to
believe that the archive of an endeavour as impressive as
Unicode has been moved consciously to a closed circle.

   |While we are talking, I want folks to know that the Mail Archive itself is
   |running fine (knock on wood). But I can't keep up with the support requests,
   |so sorry to everyone affected by that.
 --End of <CAHjiUbqFmzAhdABSYx_P_W7YPMQ2XW4iST_5T04XzF5W_2VBDw at mail.gmail.com>

Ciao, and a nice (and virus etc. free) Weekend everybody!

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

From kenwhistler at sonic.net  Tue Jul 28 19:15:33 2020
From: kenwhistler at sonic.net (Ken Whistler)
Date: Tue, 28 Jul 2020 17:15:33 -0700
Subject: Mail archive link
In-Reply-To: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
Message-ID: <d762708c-b140-911d-b267-464e367d053c@sonic.net>

This has now been fixed. Email archives for this list from 2014 to 
current are now publicly accessible without login.

--Ken

On 7/15/2020 2:01 PM, Hans Åberg via Unicode wrote:
> FYI, the link to the mail archive is dead, as on the URL below, and in the header of the email messages.
>
> https://www.unicode.org/consortium/distlist-unicode.html
>
>

From haberg-1 at telia.com  Wed Jul 29 03:38:53 2020
From: haberg-1 at telia.com (Hans Åberg)
Date: Wed, 29 Jul 2020 10:38:53 +0200
Subject: Mail archive link
In-Reply-To: <d762708c-b140-911d-b267-464e367d053c@sonic.net>
References: <C23537AD-9E2A-4EAA-B8AD-641A0F3AD9E9@telia.com>
 <d762708c-b140-911d-b267-464e367d053c@sonic.net>
Message-ID: <9CE88C69-9292-4415-9B16-AD03DA7116BF@telia.com>

Great! You might extend it back to March 2001, as pointed out in this message:

https://corp.unicode.org/pipermail/unicode/2020-July/008936.html


> On 29 Jul 2020, at 02:15, Ken Whistler via Unicode <unicode at unicode.org> wrote:
> 
> This has now been fixed. Email archives for this list from 2014 to current are now publicly accessible without login.
> 
> --Ken
> 
> On 7/15/2020 2:01 PM, Hans Åberg via Unicode wrote:
>> FYI, the link to the mail archive is dead, as on the URL below, and in the header of the email messages.
>> 
>> https://www.unicode.org/consortium/distlist-unicode.html
>> 


From pgcon6 at msn.com  Thu Jul 30 22:11:37 2020
From: pgcon6 at msn.com (Peter Constable)
Date: Fri, 31 Jul 2020 03:11:37 +0000
Subject: Proposed letters, 0C5B & 0C5C, in Telugu
In-Reply-To: <004601d666c0$2f07c280$8d174780$@xencraft.com>
References: <6973549F-25BD-4DD5-9605-3F9C0D4932BB@umich.edu>
 <ae2bf785-e892-407e-4c56-cadd51238e31@tiro.ca>
 <CAN49p6r=TLHhHbL7sOY6cK0q+VbJhHT=SrLGTx4hHmsb+4OpQw@mail.gmail.com>
 <004601d666c0$2f07c280$8d174780$@xencraft.com>
Message-ID: <MWHPR1301MB211273B5BB14B5704CB50B87864E0@MWHPR1301MB2112.namprd13.prod.outlook.com>

+1

From: Unicore <unicore-bounces at unicode.org> On Behalf Of Tex via Unicore
Sent: Thursday, July 30, 2020 3:24 PM
To: 'Markus Scherer' <markus.icu at gmail.com>; 'John Hudson' <john at tiro.ca>; unicore at unicode.org
Subject: RE: Proposed letters, 0C5B & 0C5C, in Telugu

“Writing systems are shared by multiple languages and multiple traditions, past and present.”

That statement should be a key point made prominently (and repeatedly) in introductions to Unicode, scripts, etc.
It is an underlying foundation that needs to be understood and it mitigates the many objections based on personal or community experiences.
Well stated Markus.

Tex


From: Unicore [mailto:unicore-bounces at unicode.org] On Behalf Of Markus Scherer via Unicore
Sent: Thursday, July 30, 2020 1:10 PM
To: John Hudson
Cc: unicore UnicoRe Discussion
Subject: Re: Proposed letters, 0C5B & 0C5C, in Telugu

On Thu, Jul 30, 2020 at 11:20 AM John Hudson via Unicore <unicore at unicode.org> wrote:
> there is a reasonable, general question to ask about sufficiency of
> attestation when it comes to very rare characters that might only occur
> in one or two texts, perhaps the invention of a single author, not
> embraced by any subsequent tradition of use. And one response to that
> question is 'Any attestation is sufficient', which has the benefit of
> removing the need to come up with applicable criteria of sufficiency
> that would need to be considered on a case-by-case basis.

Right. As far as I understand, there are thousands of Chinese characters that have been used very rarely, or even just once in a dictionary or in a database of person names. They are real, they are encoded, but they are not common.

Implementers have to make choices, and sometimes it makes sense to support a subset.

If a font or keyboard vendor wants to support the entire Sinhala script, then they will have glyphs for all relevant code points -- whether inside or outside the Sinhala block -- and all relevant sequences, and punctuation, etc.

If someone cares to only support the subset needed for common, modern use of the Sinhala language, then they can define such subsets or look for organizations that have defined them. (E.g., Unicode CLDR has sets of "exemplar characters" for many languages.)

I understand a visceral reaction of "this does not belong". I was originally not in favor of adding a capital sharp s <https://en.wikipedia.org/wiki/Capital_%E1%BA%9E> (Latin script, German language) because it was not part of the German orthography and wasn't taught in school etc. However, it clearly existed and was used, and once evidence was presented showing that it was more than using a lowercase ß in all-caps words, it got added to the Unicode standard, and the official orthography now acknowledges it (as optional).

Writing systems are shared by multiple languages and multiple traditions, past and present.

Best regards,
markus

From lisa at unicode.org  Thu Jul 30 23:37:20 2020
From: lisa at unicode.org (Lisa Moore)
Date: Thu, 30 Jul 2020 21:37:20 -0700
Subject: Proposed letters, 0C5B & 0C5C, in Telugu
In-Reply-To: <MWHPR1301MB211273B5BB14B5704CB50B87864E0@MWHPR1301MB2112.namprd13.prod.outlook.com>
References: <6973549F-25BD-4DD5-9605-3F9C0D4932BB@umich.edu>
 <ae2bf785-e892-407e-4c56-cadd51238e31@tiro.ca>
 <CAN49p6r=TLHhHbL7sOY6cK0q+VbJhHT=SrLGTx4hHmsb+4OpQw@mail.gmail.com>
 <004601d666c0$2f07c280$8d174780$@xencraft.com>
 <MWHPR1301MB211273B5BB14B5704CB50B87864E0@MWHPR1301MB2112.namprd13.prod.outlook.com>
Message-ID: <a5f8cddf-c92b-c615-9783-18eb73d74fcd@unicode.org>

Yup, I agree...Lisa

On 7/30/2020 8:11 PM, Peter Constable via Unicode wrote:
>
> +1
>
> *From:*Unicore <unicore-bounces at unicode.org> *On Behalf Of *Tex via 
> Unicore
> *Sent:* Thursday, July 30, 2020 3:24 PM
> *To:* 'Markus Scherer' <markus.icu at gmail.com>; 'John Hudson' 
> <john at tiro.ca>; unicore at unicode.org
> *Subject:* RE: Proposed letters, 0C5B & 0C5C, in Telugu
>
> *“Writing systems are shared by multiple languages and multiple
> traditions, past and present.”*
>
> That statement should be a key point made prominently (and repeatedly) 
> in introductions to Unicode, scripts, etc.
>
> It is an underlying foundation that needs to be understood and it 
> mitigates the many objections based on personal or community experiences.
>
> Well stated Markus.
>
> Tex
>
> *From:*Unicore [mailto:unicore-bounces at unicode.org] *On Behalf Of 
> *Markus Scherer via Unicore
> *Sent:* Thursday, July 30, 2020 1:10 PM
> *To:* John Hudson
> *Cc:* unicore UnicoRe Discussion
> *Subject:* Re: Proposed letters, 0C5B & 0C5C, in Telugu
>
> On Thu, Jul 30, 2020 at 11:20 AM John Hudson via Unicore
> <unicore at unicode.org> wrote:
>
>     there is a reasonable, general question to ask about sufficiency of
>     attestation when it comes to very rare characters that might only
>     occur
>     in one or two texts, perhaps the invention of a single author, not
>     embraced by any subsequent tradition of use. And one response to that
>     question is 'Any attestation is sufficient', which has the benefit of
>     removing the need to come up with applicable criteria of sufficiency
>     that would need to be considered on a case-by-case basis.
>
> Right. As far as I understand, there are thousands of Chinese 
> characters that have been used very rarely, or even just once in a 
> dictionary or in a database of person names. They are real, they are 
> encoded, but they are not common.
>
> Implementers have to make choices, and sometimes it makes sense to 
> support a subset.
>
> If a font or keyboard vendor wants to support the entire Sinhala 
> /_script_/, then they will have glyphs for all relevant code points -- 
> whether inside or outside the Sinhala block -- and all relevant 
> /sequences/, and punctuation, etc.
>
> If someone cares to only support the subset needed for common, modern 
> use of the Sinhala /_language_/, then they can define such subsets or 
> look for organizations that have defined them. (E.g., Unicode CLDR has 
> sets of "exemplar characters" for many languages.)
>
> I understand a visceral reaction of "this does not belong". I was
> originally not in favor of adding a capital sharp s
> <https://en.wikipedia.org/wiki/Capital_%E1%BA%9E> (Latin script,
> German language) because it was not part of the German orthography and
> wasn't taught in school etc. However, it clearly existed and was used,
> and once evidence was presented showing that it was more than using a
> lowercase ß in all-caps words, it got added to the Unicode standard,
> and the official orthography now acknowledges it (as optional).
>
> Writing systems are shared by multiple languages and multiple 
> traditions, past and present.
>
> Best regards,
>
> markus
>
