From prospero at cyber-wizard.com  Wed Dec  6 17:44:17 2023
From: prospero at cyber-wizard.com (prospero)
Date: Thu, 7 Dec 2023 00:44:17 +0100
Subject: Question regarding TR-29
Message-ID: <trinity-e235e123-3aa3-4f5c-a257-e407e04bb27b-1701906257696@3c-app-mailcom-lxa06>


unicode.org/reports/tr29

The WB4 rule for word breaks:

> Ignore Format and Extend characters, except after sot, CR, LF, and Newline. (See Section 6.2, Replacing Ignore Rules[https://unicode.org/reports/tr29/#Grapheme_Cluster_and_Format_Rules].)
> This also has the effect of: Any × (Format | Extend | ZWJ)

seems incomplete and ambiguous. First, the "except after" part needs to apply to WSegSpace also, otherwise tests fail. And the handling of WB3c seems contradicted by the tests, e.g., the one on line 1158:

÷ 200D × 0308 ÷ 231A ÷	#  ÷ [0.2] ZERO WIDTH JOINER (ZWJ_FE) × [4.0] COMBINING DIAERESIS (Extend_FE) ÷ [999.0] WATCH (ExtPict) ÷ [0.3]

seems to contradict it, since ignoring the 0308 (Extend_FE) should yield a ZWJ_FE + ExtPict, which should not break, but the test requires a break. If the tests are dispositive, could TR-29 be better clarified to reflect them?
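
For readers who want to check such a test line mechanically, here is a minimal sketch in Python. It assumes the usual layout of the break test files: "÷" marks a permitted break, "×" marks no break, code points are hexadecimal, and everything after "#" is commentary. The function name parse_break_test_line is made up for this illustration.

```python
def parse_break_test_line(line):
    """Split one break-test line into code points and boundary flags.

    Returns (code_points, breaks), where breaks has one entry per boundary
    (including start and end of text) and True means "break allowed".
    """
    data = line.split("#", 1)[0].split()   # drop the trailing comment
    code_points = [int(tok, 16) for tok in data[1::2]]
    breaks = [tok == "\u00f7" for tok in data[0::2]]  # U+00F7 is the division sign
    return code_points, breaks


cps, brk = parse_break_test_line("÷ 200D × 0308 ÷ 231A ÷\t#  ÷ [0.2] ...")
print([f"{cp:04X}" for cp in cps])  # ['200D', '0308', '231A']
print(brk)                          # [True, False, True, True]
```

The third flag is the break between 0308 and 231A that the test requires.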


From pgcon6 at msn.com  Thu Dec  7 12:44:43 2023
From: pgcon6 at msn.com (Peter Constable)
Date: Thu, 7 Dec 2023 18:44:43 +0000
Subject: UTC public review issues to close January 2
In-Reply-To: <CAJ3fhbjFBxEU-6A0u3nG3D0Oz_2sArbnkaONbh_60nZXr9aAwQ@mail.gmail.com>
References: <CAJ3fhbgse73kmyxzpORxpx-rCATun-GP9TnKwGsGRn+L_gBzug@mail.gmail.com>
 <CAJ3fhbjFBxEU-6A0u3nG3D0Oz_2sArbnkaONbh_60nZXr9aAwQ@mail.gmail.com>
Message-ID: <DS0PR12MB7535F37EC93B6593C36535C5868BA@DS0PR12MB7535.namprd12.prod.outlook.com>

After the last Unicode Technical Committee meeting, some public review issues (PRIs) <https://www.unicode.org/review/> were posted. PRIs are the mechanism the UTC uses to solicit input and feedback on specific proposals or work in progress. The input period for these PRIs ends January 2, 2024. (Time is needed before the next UTC meeting to process the feedback.) That means we're almost halfway through the public review period.

Here's a summary of the five open PRIs:

PRI #483: Proposed Update UAX #38, Unicode Han Database (Unihan) <https://www.unicode.org/review/pri483/> - UAX #38 describes the many properties Unicode provides for CJK ideographs. This is a draft update of this spec for Unicode 16.0.

PRI #484: Proposed Update UAX #50, Unicode Vertical Text Layout <https://www.unicode.org/review/pri484/> - UAX #50 describes how characters should be adjusted between horizontal and vertical layout. This is a draft update of this spec for Unicode 16.0.

PRI #485: Draft UTR #56, Unicode Cuneiform Sign Lists <https://www.unicode.org/review/pri485/> - This is a draft for a new technical report that will provide additional data to aid in the use of the Unicode encoding for the Sumero-Akkadian Cuneiform script <https://www.unicode.org/charts/PDF/U12000.pdf>.

PRI #486: Stabilization of UAX #42, Unicode Character Database in XML (UCDXML) <https://www.unicode.org/review/pri486/> - UAX #42 provides the data for the Unicode Character Database in XML format. (UCD is character property data for use in processing algorithms that is provided with each version of Unicode.) This PRI is for feedback on a planned UTC action to freeze UAX #42 as of Unicode 15.1.

PRI #487: Proposed Update UAX #53, Unicode Arabic Mark Rendering <https://www.unicode.org/review/pri487/> - This specification was previously published as a Unicode Technical Report. This is a draft for changing the status of the spec, to make it formally part of The Unicode Standard as a Unicode Standard Annex (UAX) starting in Unicode 16.0.


UTC invites you to please take a look and provide feedback on these issues.


Peter Constable
UTC Chair



From manishsmail at gmail.com  Thu Dec  7 17:56:58 2023
From: manishsmail at gmail.com (Manish Goregaokar)
Date: Thu, 7 Dec 2023 15:56:58 -0800
Subject: Question regarding TR-29
In-Reply-To: <trinity-e235e123-3aa3-4f5c-a257-e407e04bb27b-1701906257696@3c-app-mailcom-lxa06>
References: <trinity-e235e123-3aa3-4f5c-a257-e407e04bb27b-1701906257696@3c-app-mailcom-lxa06>
Message-ID: <CACpkpxkHgW3-e=iBALzNrwc-0k8nSSJ-cMq45abzJMZR1NGdUA@mail.gmail.com>

Hi!

I think a crucial thing to note about interpreting these rules is that they
must be applied in order: WB4 can only be applied after all of the WB3
rules, and so on. In general the logical model is that each rule is applied
to the entire input string before moving on to the next rule. In practice,
implementations tend to do this in one or a handful of loops by retaining
some careful state.

The sequences `WSegSpace Format* WSegSpace` or `ZWJ Extend Ext_Pict` won't
have do-not-breaks generated by WB3d/WB3c, because those rules apply before
the "ignore Extend/Format" step of WB4, so the intervening Format or Extend
character is still visible to them.

Since no rules after WB4 mention Extended_Pictographic or WSegSpace, WB4
does not need to try to include them in the "except" clause.
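
To make the ordering concrete, here is a rough sketch in Python of how a single loop can respect it. This is only an illustration under stated assumptions, not a conforming implementation: the WB table is hand-written for a handful of code points (real code would load the Word_Break and Extended_Pictographic data), the names WB and word_breaks are invented for this example, and every rule between WB5 and WB999 is left out.

```python
# Hand-written classes for a few code points (an assumption for this sketch).
WB = {
    0x000D: "CR", 0x000A: "LF", 0x0085: "Newline",
    0x200D: "ZWJ", 0x0308: "Extend", 0x00AD: "Format",
    0x0020: "WSegSpace", 0x231A: "ExtPict", 0x0061: "ALetter",
}
NEWLINES = {"CR", "LF", "Newline"}
IGNORED = {"Extend", "Format", "ZWJ"}


def word_breaks(cps):
    """Return one bool per interior boundary; True means a break is allowed."""
    cls = [WB.get(cp, "Other") for cp in cps]
    out = []
    for i in range(1, len(cls)):
        prev, cur = cls[i - 1], cls[i]
        if prev == "CR" and cur == "LF":                  # WB3
            out.append(False)
        elif prev in NEWLINES or cur in NEWLINES:         # WB3a, WB3b
            out.append(True)
        elif prev == "ZWJ" and cur == "ExtPict":          # WB3c: raw neighbours
            out.append(False)
        elif prev == "WSegSpace" and cur == "WSegSpace":  # WB3d: raw neighbours
            out.append(False)
        elif cur in IGNORED:                              # WB4: Any x (Format | Extend | ZWJ)
            out.append(False)
        else:
            # Only from here on are Extend/Format/ZWJ skipped: walk back to the
            # character the ignored run is attached to (not past a newline).
            j = i - 1
            while j > 0 and cls[j] in IGNORED and cls[j - 1] not in NEWLINES:
                j -= 1
            # WB5 (ALetter x ALetter); everything else falls through to WB999.
            out.append(not (cls[j] == "ALetter" and cur == "ALetter"))
    return out


print(word_breaks([0x200D, 0x0308, 0x231A]))  # [False, True]
print(word_breaks([0x200D, 0x231A]))          # [False]
print(word_breaks([0x0020, 0x00AD, 0x0020]))  # [False, True]
print(word_breaks([0x0061, 0x0308, 0x0061]))  # [False, False]
```

The first line of output matches the test quoted in the original question: WB3c looks at the raw neighbours (Extend, ExtPict), finds no match, and no later rule in this sketch joins ZWJ to ExtPict, so the break stands.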

Hope this helps

Thanks,
-Manish


On Wed, Dec 6, 2023, 4:17 PM prospero via Unicode <unicode at corp.unicode.org>
wrote:

>
> unicode.org/reports/tr29
>
> The WB4 rule for word breaks:
>
> > Ignore Format and Extend characters, except after sot, CR, LF, and
> Newline. (See Section 6.2, Replacing Ignore Rules[
> https://unicode.org/reports/tr29/#Grapheme_Cluster_and_Format_Rules].)
> > This also has the effect of: Any × (Format | Extend | ZWJ)
>
> seems incomplete and ambiguous. First, the "except after" part needs to
> apply to WSegSpace also, otherwise tests fail. And the handling of WB3c
> seems contradicted by the tests, e.g., the one on line 1158:
>
> ÷ 200D × 0308 ÷ 231A ÷  #  ÷ [0.2] ZERO WIDTH JOINER (ZWJ_FE) × [4.0]
> COMBINING DIAERESIS (Extend_FE) ÷ [999.0] WATCH (ExtPict) ÷ [0.3]
>
> seems to contradict it, since ignoring the 0308 (Extend_FE) should yield a
> ZWJ_FE + ExtPict, which should not break, but the test requires a break. If
> the tests are dispositive, could TR-29 be better clarified to reflect them?
>
>

From prospero at cyber-wizard.com  Fri Dec  8 13:55:56 2023
From: prospero at cyber-wizard.com (prospero)
Date: Fri, 8 Dec 2023 20:55:56 +0100
Subject: Question regarding TR-29
In-Reply-To: <CACpkpxkHgW3-e=iBALzNrwc-0k8nSSJ-cMq45abzJMZR1NGdUA@mail.gmail.com>
References: <trinity-e235e123-3aa3-4f5c-a257-e407e04bb27b-1701906257696@3c-app-mailcom-lxa06>
 <CACpkpxkHgW3-e=iBALzNrwc-0k8nSSJ-cMq45abzJMZR1NGdUA@mail.gmail.com>
Message-ID: <trinity-f31d1e3e-ac80-4d0e-96a1-41bb37d7e9d1-1702065356593@3c-app-mailcom-lxa10>

An HTML attachment was scrubbed...
URL: <https://corp.unicode.org/pipermail/unicode/attachments/20231208/2b0f6f1f/attachment.htm>

From doug at ewellic.org  Mon Dec 18 10:31:23 2023
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 18 Dec 2023 16:31:23 +0000
Subject: UDHR in Unicode
Message-ID: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>

I noticed that the "UDHR in Unicode" link has been removed from the Technical Site web page. The actual site, <https://unicode.org/udhr/>, is still present.

I'm wondering whether this is part of a simple reorganization, or whether this long-running project is being dismantled - and if so, why.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From doug at ewellic.org  Mon Dec 18 23:31:40 2023
From: doug at ewellic.org (Doug Ewell)
Date: Tue, 19 Dec 2023 05:31:40 +0000
Subject: UTC public review issues to close January 2
In-Reply-To: <DS0PR12MB7535F37EC93B6593C36535C5868BA@DS0PR12MB7535.namprd12.prod.outlook.com>
References: <CAJ3fhbgse73kmyxzpORxpx-rCATun-GP9TnKwGsGRn+L_gBzug@mail.gmail.com>
 <CAJ3fhbjFBxEU-6A0u3nG3D0Oz_2sArbnkaONbh_60nZXr9aAwQ@mail.gmail.com>
 <DS0PR12MB7535F37EC93B6593C36535C5868BA@DS0PR12MB7535.namprd12.prod.outlook.com>
Message-ID: <SJ0PR03MB6598CA663A2886A603080B46CA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>

Peter Constable wrote:

> https://www.unicode.org/review/pri486/
> UAX #42 provides the data for the Unicode Character Database in XML
> format. (UCD is character property data for use in processing
> algorithms that is provided with each version of Unicode.) This PRI is
> for feedback on a planned UTC action to freeze UAX #42 as of Unicode
> 15.1.

This is a shame. I don't know how widely the XML files were adopted, but I certainly found them easier to process than the traditional Unicode data files.

I imagine creating these files was a matter of auto-generation with custom tools, combined with human fine-tuning and judgment (i.e. where to draw the line when grouping characters). It would be great if Eric and/or Laurențiu could donate any tools, but the human effort is probably what could not be replaced.
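
As a rough illustration of why the grouped XML files are convenient, here is a minimal Python sketch. It assumes the UAX #42 layout in which a group element carries attribute values shared by its char children. The SAMPLE fragment is hand-written for this example and is not taken from a real data file, and read_chars is a made-up helper name.

```python
import xml.etree.ElementTree as ET

UCD_NS = "{http://www.unicode.org/ns/2003/ucd/1.0}"

# Hand-written fragment modelled on the grouped UCDXML layout (an assumption).
SAMPLE = """\
<ucd xmlns="http://www.unicode.org/ns/2003/ucd/1.0">
  <repertoire>
    <group gc="Lu" sc="Latn">
      <char cp="0041" na="LATIN CAPITAL LETTER A"/>
      <char cp="0042" na="LATIN CAPITAL LETTER B"/>
    </group>
    <group gc="Nd" sc="Zyyy">
      <char cp="0030" na="DIGIT ZERO"/>
    </group>
  </repertoire>
</ucd>
"""


def read_chars(xml_text):
    """Map each code point to its properties, inheriting group attributes."""
    root = ET.fromstring(xml_text)
    result = {}
    for group in root.iter(UCD_NS + "group"):
        for char in group.findall(UCD_NS + "char"):
            props = dict(group.attrib)   # defaults shared by the group
            props.update(char.attrib)    # per-character values override them
            cp = int(props.pop("cp"), 16)
            result[cp] = props
    return result


chars = read_chars(SAMPLE)
print(chars[0x41]["gc"], chars[0x41]["na"])  # Lu LATIN CAPITAL LETTER A
print(chars[0x30]["sc"])                     # Zyyy
```

Every property arrives as a named attribute, so no per-file column conventions need to be learned, which is the convenience being described here.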

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From daniel.buenzli at erratique.ch  Tue Dec 19 08:37:59 2023
From: daniel.buenzli at erratique.ch (Daniel Bünzli)
Date: Tue, 19 Dec 2023 15:37:59 +0100
Subject: UTC public review issues to close January 2
In-Reply-To: <SJ0PR03MB6598CA663A2886A603080B46CA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <CAJ3fhbgse73kmyxzpORxpx-rCATun-GP9TnKwGsGRn+L_gBzug@mail.gmail.com>
 <CAJ3fhbjFBxEU-6A0u3nG3D0Oz_2sArbnkaONbh_60nZXr9aAwQ@mail.gmail.com>
 <DS0PR12MB7535F37EC93B6593C36535C5868BA@DS0PR12MB7535.namprd12.prod.outlook.com>
 <SJ0PR03MB6598CA663A2886A603080B46CA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <etPan.6581aacc.10de1ce9.546@erratique.ch>

On 19 December 2023 at 06:34:55, Doug Ewell via Unicode (unicode at corp.unicode.org) wrote:

> This is a shame. I don't know how widely the XML files were adopted, but I certainly found
> them easier to process than the traditional Unicode data files.

For me this is only half the story. As I wrote in my feedback on the PRI, UAX #42 is the only place where you can easily find out the type of a property and where the evolution of properties from version to version is carefully chronicled. This is a golden resource if you maintain APIs that expose or make use of these properties.

As for how much it is used, that's unclear, but if you search for the various compressed and uncompressed file names on code hosting platforms, usage is far from anecdotal.

Best,

Daniel


From pgcon6 at msn.com  Tue Dec 19 13:24:34 2023
From: pgcon6 at msn.com (Peter Constable)
Date: Tue, 19 Dec 2023 19:24:34 +0000
Subject: UTC public review issues to close January 2
In-Reply-To: <SJ0PR03MB6598CA663A2886A603080B46CA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <CAJ3fhbgse73kmyxzpORxpx-rCATun-GP9TnKwGsGRn+L_gBzug@mail.gmail.com>
 <CAJ3fhbjFBxEU-6A0u3nG3D0Oz_2sArbnkaONbh_60nZXr9aAwQ@mail.gmail.com>
 <DS0PR12MB7535F37EC93B6593C36535C5868BA@DS0PR12MB7535.namprd12.prod.outlook.com>
 <SJ0PR03MB6598CA663A2886A603080B46CA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <DS0PR12MB753523810531B7C56337074C8697A@DS0PR12MB7535.namprd12.prod.outlook.com>

Human effort - a committed volunteer - was, indeed, the missing factor that led to asking whether it was worth continuing to maintain UCDXML.

Peter

-----Original Message-----
From: Doug Ewell <doug at ewellic.org>
Sent: Monday, December 18, 2023 10:32 PM
To: Peter Constable <pgcon6 at msn.com>; unicode at unicode.org <unicode at corp.unicode.org>
Subject: RE: UTC public review issues to close January 2

Peter Constable wrote:

> https://www.unicode.org/review/pri486/
> UAX #42 provides the data for the Unicode Character Database in XML
> format. (UCD is character property data for use in processing
> algorithms that is provided with each version of Unicode.) This PRI is
> for feedback on a planned UTC action to freeze UAX #42 as of Unicode
> 15.1.

This is a shame. I don't know how widely the XML files were adopted, but I certainly found them easier to process than the traditional Unicode data files.

I imagine creating these files was a matter of auto-generation with custom tools, combined with human fine-tuning and judgment (i.e. where to draw the line when grouping characters). It would be great if Eric and/or Laurențiu could donate any tools, but the human effort is probably what could not be replaced.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From pgcon6 at msn.com  Tue Dec 19 13:29:12 2023
From: pgcon6 at msn.com (Peter Constable)
Date: Tue, 19 Dec 2023 19:29:12 +0000
Subject: UDHR in Unicode
In-Reply-To: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>

That project is being closed down. I'm not certain of all the exact reasons.

Peter

-----Original Message-----
From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of Doug Ewell via Unicode
Sent: Monday, December 18, 2023 9:31 AM
To: unicode at corp.unicode.org
Subject: UDHR in Unicode

I noticed that the "UDHR in Unicode" link has been removed from the Technical Site web page. The actual site, <https://unicode.org/udhr/>, is still present.

I'm wondering whether this is part of a simple reorganization, or whether this long-running project is being dismantled - and if so, why.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From ashpilkin at gmail.com  Tue Dec 19 15:10:01 2023
From: ashpilkin at gmail.com (Alexander Shpilkin)
Date: Tue, 19 Dec 2023 23:10:01 +0200
Subject: UDHR in Unicode
In-Reply-To: <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
Message-ID: <CAAiuFs-+06ckWuOTWuMQGvfQN5y8S-9Uau4UWhVRFpeOidU-vQ@mail.gmail.com>

So, um, does anybody have an up-to-date copy of the Git repository? Because
apparently outright deleting the data when a project is shut down is
something the Unicode Consortium considers a good and proper thing to do.
-Alex

On Tue, 19 Dec 2023, 21:33 Peter Constable via Unicode, <
unicode at corp.unicode.org> wrote:

> That project is being closed down. I'm not certain of all the exact
> reasons.
>
> Peter
>
> -----Original Message-----
> From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of Doug Ewell
> via Unicode
> Sent: Monday, December 18, 2023 9:31 AM
> To: unicode at corp.unicode.org
> Subject: UDHR in Unicode
>
> I noticed that the "UDHR in Unicode" link has been removed from the
> Technical Site web page. The actual site, <https://unicode.org/udhr/>, is
> still present.
>
> I'm wondering whether this is part of a simple reorganization, or whether
> this long-running project is being dismantled - and if so, why.
>
> --
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
>
>
>
>

From doug at ewellic.org  Tue Dec 19 15:16:34 2023
From: doug at ewellic.org (Doug Ewell)
Date: Tue, 19 Dec 2023 21:16:34 +0000
Subject: UDHR in Unicode
In-Reply-To: <CAAiuFs-+06ckWuOTWuMQGvfQN5y8S-9Uau4UWhVRFpeOidU-vQ@mail.gmail.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
 <CAAiuFs-+06ckWuOTWuMQGvfQN5y8S-9Uau4UWhVRFpeOidU-vQ@mail.gmail.com>
Message-ID: <SJ0PR03MB65982C1165AB1F05EC22DA9CCA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>

Alexander Shpilkin wrote:

> So, um, does anybody have an up-to-date copy of the Git repository?
> Because apparently outright deleting the data when a project is shut
> down is something the Unicode Consortium considers a good and proper
> thing to do.

As of a couple of minutes ago, the site at https://unicode.org/udhr/ is still up and the aggregate files and bulk downloads, at least, still appear to be available.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


From moyogo at gmail.com  Tue Dec 19 16:12:09 2023
From: moyogo at gmail.com (Denis Jacquerye)
Date: Tue, 19 Dec 2023 23:12:09 +0100
Subject: UDHR in Unicode
In-Reply-To: <SJ0PR03MB65982C1165AB1F05EC22DA9CCA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
 <CAAiuFs-+06ckWuOTWuMQGvfQN5y8S-9Uau4UWhVRFpeOidU-vQ@mail.gmail.com>
 <SJ0PR03MB65982C1165AB1F05EC22DA9CCA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
Message-ID: <CAJKta0wGEAg4=m8HvzBZsyb7NvDmN43oBk47Ckoh_Bef7TQxGA@mail.gmail.com>

On Tue, 19 Dec 2023 at 22:20, Doug Ewell via Unicode <
unicode at corp.unicode.org> wrote:

> Alexander Shpilkin wrote:
>
> > So, um, does anybody have an up-to-date copy of the Git repository?
> > Because apparently outright deleting the data when a project is shut
> > down is something the Unicode Consortium considers a good and proper
> > thing to do.
>
> As of a couple of minutes ago, the site at https://unicode.org/udhr/ is
> still up and the aggregate files and bulk downloads, at least, still appear
> to be available.
>
>
There are some forks as well, for example https://github.com/moyogo/udhr or
https://github.com/sffc/udhr

From ashpilkin at gmail.com  Tue Dec 19 16:40:44 2023
From: ashpilkin at gmail.com (Alexander Shpilkin)
Date: Wed, 20 Dec 2023 00:40:44 +0200
Subject: UDHR in Unicode
In-Reply-To: <CAJKta0wGEAg4=m8HvzBZsyb7NvDmN43oBk47Ckoh_Bef7TQxGA@mail.gmail.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
 <CAAiuFs-+06ckWuOTWuMQGvfQN5y8S-9Uau4UWhVRFpeOidU-vQ@mail.gmail.com>
 <SJ0PR03MB65982C1165AB1F05EC22DA9CCA97A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <CAJKta0wGEAg4=m8HvzBZsyb7NvDmN43oBk47Ckoh_Bef7TQxGA@mail.gmail.com>
Message-ID: <CAAiuFs-Q9bkTfYd84q4et==+LnVW=bP-JoFBswD1YQwLnJp3GQ@mail.gmail.com>

On Wed, 20 Dec 2023, 00:12 Denis Jacquerye, <moyogo at gmail.com> wrote:

> On Tue, 19 Dec 2023 at 22:20, Doug Ewell via Unicode <
> unicode at corp.unicode.org> wrote:
>
>> Alexander Shpilkin wrote:
>>
>> > So, um, does anybody have an up-to-date copy of the Git repository?
>> > Because apparently outright deleting the data when a project is shut
>> > down is something the Unicode Consortium considers a good and proper
>> > thing to do.
>>
>> As of a couple of minutes ago, the site at https://unicode.org/udhr/ is
>> still up and the aggregate files and bulk downloads, at least, still appear
>> to be available.
>>
>>
> There are some forks as well, for example https://github.com/moyogo/udhr
> or https://github.com/sffc/udhr
>

Yes, and apparently you updated yours (the first one) to the last released
version as I was scouring various archives and search engine caches for
commit SHAs; thank you for that! -Alex

From jameskass at code2001.com  Mon Dec 25 17:10:24 2023
From: jameskass at code2001.com (James Kass)
Date: Mon, 25 Dec 2023 23:10:24 +0000
Subject: UDHR in Unicode
In-Reply-To: <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
Message-ID: <8c9926c0-265a-4114-b930-de22ed21902b@code2001.com>


On 2023-12-19 7:29 PM, Peter Constable via Unicode wrote:
> That project is being closed down. I'm not certain of all the exact reasons.
>
> Peter
That's a shame. Was any effort made to ask the UN if they had any 
interest in hosting the project?

From wjgo_10009 at btinternet.com  Tue Dec 26 01:53:23 2023
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 26 Dec 2023 07:53:23 +0000 (GMT)
Subject: The Rescue Project (from Re: UDHR in Unicode)
In-Reply-To: <8c9926c0-265a-4114-b930-de22ed21902b@code2001.com>
References: <SJ0PR03MB65988A3534C3C0E29D468854CA90A@SJ0PR03MB6598.namprd03.prod.outlook.com>
 <DS0PR12MB753583694E17EB691BD47E038697A@DS0PR12MB7535.namprd12.prod.outlook.com>
 <8c9926c0-265a-4114-b930-de22ed21902b@code2001.com>
Message-ID: <38563ac1.814.18ca51d3641.Webtop.95@btinternet.com>


James Kass wrote:

> That's a shame.

Yes.

I wonder whether, even though the material appears to be headed for the 
scrapyard, it can be rescued and restored by enthusiasts, just as in 
England many steam locomotives were rescued and restored after being 
sent to what is known informally as Barry Scrapyard.

And, just as there are some new-build steam locomotive projects in 
England, can the number of languages for which there is a translation be 
increased, please?

William Overington

Tuesday 26 December 2023


------ Original Message ------
From: "James Kass via Unicode" <unicode at corp.unicode.org>
To: unicode at corp.unicode.org
Sent: Monday, 2023 Dec 25 At 23:10
Subject: Re: UDHR in Unicode

On 2023-12-19 7:29 PM, Peter Constable via Unicode wrote:
That project is being closed down. I'm not certain of all the exact 
reasons.
Peter
That's a shame.  Was any effort made to ask the UN if they had any 
interest in hosting the project?


From wjgo_10009 at btinternet.com  Tue Dec 26 02:27:36 2023
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 26 Dec 2023 08:27:36 +0000 (GMT)
Subject: Bing Chat AI Artificial Intelligence and Unicode
Message-ID: <7c2d5b54.81e.18ca53c8a51.Webtop.95@btinternet.com>


Recently I have been experimenting using Bing Chat AI, just as an end 
user using Bing Chat AI from within the Edge browser running on my home 
computer.

After some experiments produced amazing results I decided to try 
requesting content in a language other than English, namely 
Portuguese, and it worked well. Later I tried an experiment that 
produced results not only in several languages but also in several 
scripts.

Text in languages other than English, apart from a few words in Welsh, 
appears from the second post on page 4 of the thread onward.

Here is a link.

https://punster.me/serif/viewtopic.php?id=516&p=4

William Overington

Tuesday 26 December 2023

