From christoph.paeper at crissov.de  Mon Aug  7 09:58:37 2023
From: christoph.paeper at crissov.de (Christoph Päper)
Date: Mon, 7 Aug 2023 16:58:37 +0200
Subject: Squared T-shirt sizes
Message-ID: <F8232D7C-1D41-4FB8-AC46-CFC31FF0A369@crissov.de>

Dear Unicoders

I was almost sure that I had seen squared XL and XS as ideographic legacy characters in the code charts before, but I can’t find them (e.g. in U+1F1xy or U+33xy), so I’m probably slightly delusional. XL is not included as a Number Form for the roman numeral of 40 (~ U+216x).

1. Did I miss anything?
2. Are such Latin size labeling characters (also LL or SS from JIS L 4004/4005) written within a single ideographic square in East Asia?
3. Could they be added to the standard without any such prior use?

Cheers,

Christoph Päper


From wjgo_10009 at btinternet.com  Tue Aug  1 09:16:41 2023
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Tue, 1 Aug 2023 15:16:41 +0100 (BST)
Subject: Expressing any Unicode character using Morse code
Message-ID: <f580060.10ff2.189b1754d98.Webtop.94@btinternet.com>


https://punster.me/serif/viewtopic.php?id=455

William Overington

Tuesday 1 August 2023


From beckiergb at gmail.com  Mon Aug  7 15:49:30 2023
From: beckiergb at gmail.com (Rebecca Bettencourt)
Date: Mon, 7 Aug 2023 13:49:30 -0700
Subject: Squared T-shirt sizes
In-Reply-To: <F8232D7C-1D41-4FB8-AC46-CFC31FF0A369@crissov.de>
References: <F8232D7C-1D41-4FB8-AC46-CFC31FF0A369@crissov.de>
Message-ID: <CAH=y87ZN8oCg6os7a6JFempX844nAqQ=2gV_KA+zUphoMyqntg@mail.gmail.com>

There is a U+1F14D 🅍 SQUARED SS but none of the other characters you
mentioned currently exist.

-- Rebecca Bettencourt


On Mon, Aug 7, 2023 at 8:04 AM Christoph Päper via Unicode <
unicode at corp.unicode.org> wrote:

> Dear Unicoders
>
> I was almost sure that I had seen squared XL and XS as ideographic legacy
> characters in the code charts before, but I can’t find them (e.g. in
> U+1F1xy or U+33xy), so I’m probably slightly delusional. XL is not included
> as a Number Form for the roman numeral of 40 (~ U+216x).
>
> 1. Did I miss anything?
> 2. Are such Latin size labeling characters (also LL or SS from JIS L
> 4004/4005) written within a single ideographic square in East Asia?
> 3. Could they be added to the standard without any such prior use?
>
> Cheers,
>
> Christoph Päper
>
>

From sosipiuk at gmail.com  Mon Aug  7 16:51:32 2023
From: sosipiuk at gmail.com (Sławomir Osipiuk)
Date: Mon, 07 Aug 2023 21:51:32 +0000
Subject: Expressing any Unicode character using Morse code
In-Reply-To: <f580060.10ff2.189b1754d98.Webtop.94@btinternet.com>
References: <f580060.10ff2.189b1754d98.Webtop.94@btinternet.com>
Message-ID: <1691442167308.2261732728.4185916792@gmail.com>

Compactness is of great benefit in Morse Code. I would therefore recommend against any padding or necessitating any additional character to specify length, or indeed worrying about "metadata precision" generally. For the same reason I would also use some flavour of base32 (I prefer Crockford's over the RFC, though that detail doesn't matter so much). This allows all planes except 16 to be encoded using only 4 Morse letters in the sequence.


The fundamental idea of a "Unicode character introducer" sequence is solid. In the spirit of Morse shorthand, I recommend a simple concatenation of "U" and "+", that is the sequence "..-.-.-." treated as a single letter, without spaces. This would be followed by the base32 sequence, made as short as possible, and terminated with a word-space.


Thus we have:


羽 (U+7FBD):  ..-.-.-.   --..   -..-   -..-  (U?ZXX)
🫥 (U+1FAE5):  ..-.-.-.   ...--   -.--   --.-   .....   (U?3YQ5)


Hopefully I did not mess those examples up, but I think the point gets across regardless.
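
A minimal Python sketch of the scheme described above, assuming Crockford's base32 digit alphabet and the usual International Morse signs for letters and digits. The names CROCKFORD, MORSE, INTRODUCER, to_base32 and to_morse are illustrative only and not part of any standard:

```python
# Crockford base32 digit alphabet (omits I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# International Morse signs for the digits and letters that alphabet uses.
MORSE = {
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "J": ".---", "K": "-.-", "M": "--", "N": "-.",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-", "V": "...-",
    "W": ".--", "X": "-..-", "Y": "-.--", "Z": "--..",
}

# "U" (..-) and "+" (.-.-.) run together as a single sign.
INTRODUCER = "..-" + ".-.-."


def to_base32(code_point: int) -> str:
    """Return the shortest Crockford base32 form of a code point (no padding)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    digits = ""
    while True:
        code_point, remainder = divmod(code_point, 32)
        digits = CROCKFORD[remainder] + digits
        if code_point == 0:
            return digits


def to_morse(char: str) -> str:
    """Return the introducer followed by the Morse signs of the base32 digits."""
    letters = [MORSE[d] for d in to_base32(ord(char))]
    # Three spaces stand in for the letter-space between signs.
    return "   ".join([INTRODUCER, *letters])


print(to_morse("\u7fbd"))
print(to_morse("\U0001FAE5"))
```

Under these assumptions, to_base32(0x7FBD) gives ZXX and to_base32(0x1FAE5) gives 3YQ5, matching the two examples. Four base32 digits carry 20 bits, i.e. U+0000 through U+FFFFF, which is why only plane 16 needs a fifth letter.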


In most cases, the ambiguity of whether the terminating word-space should be read as a word-space or a letter-space (i.e. the current word continues after the Unicode character) can be resolved from context. However, if absolutely necessary, another plus sign can be added to the sequence to indicate word-continuation (i.e. the terminating space should be read as a letter-space).


Cheers,
Sławomir Osipiuk

On Tuesday, 01 August 2023, 10:16:41 (-04:00), William_J_G Overington via Unicode wrote:


https://punster.me/serif/viewtopic.php?id=455


William Overington


Tuesday 1 August 2023



From mark at kli.org  Mon Aug  7 17:12:14 2023
From: mark at kli.org (Mark E. Shoulson)
Date: Mon, 7 Aug 2023 18:12:14 -0400
Subject: Expressing any Unicode character using Morse code
In-Reply-To: <1691442167308.2261732728.4185916792@gmail.com>
References: <f580060.10ff2.189b1754d98.Webtop.94@btinternet.com>
 <1691442167308.2261732728.4185916792@gmail.com>
Message-ID: <fc724019-f91e-6d3f-150b-0ae846c932a6@shoulson.com>

Yes, compactness is of great benefit in Morse Code. To such an extent
that I find myself thinking that "expressing any Unicode character" and
"Morse Code" are somewhat at odds with one another.
https://en.wikipedia.org/wiki/Morse_code_for_non-Latin_alphabets speaks
of non-Latin alphabets using their own encodings of many of the same
dot-dash sequences that ASCII uses for non-ASCII characters. I guess in
a way it's rather like the old ISO-8859 code pages: you use the same bit
sequences but they mean different things depending on what alphabet
you're speaking. A big part of Unicode's purpose was precisely to
supplant ISO-8859 (right?) so that each character could stand on its own
and not have to have code-page metadata attached to it.


I can sort of see some logic to allowing the same for Morse Code, but
again, Morse Code needs its compactness and needs to be short enough for
humans to send and receive. The "code-page" approach sounds eminently
practical and usable for most purposes for Morse Code. Still, what
you're talking about is some kind of "Unicode escape sequence" that you
can use for one-off insertions of a character here and there (one
hopes), and I can see some utility to that. But who gets to decide how
that's done? Unicode doesn't control International Morse Code.
Probably you need to take this up with the International
Telecommunication Union to make it official, or else find a bunch of
Morse Code enthusiasts who'll use it unofficially until it becomes a de
facto standard.


Note that there are already Chinese and Japanese telegraph codes (about 
which I know nothing, but Wikipedia does), so there are already Morse 
Codes that have to represent largish character sets.


~mark


On 8/7/23 17:51, Sławomir Osipiuk via Unicode wrote:
> Compactness is of great benefit in Morse Code.
...
> On Tuesday, 01 August 2023, 10:16:41 (-04:00), William_J_G Overington 
> via Unicode wrote:
>
>     https://punster.me/serif/viewtopic.php?id=455
>
>
>     William Overington
>
>
>     Tuesday 1 August 2023
>
>

From doug at ewellic.org  Mon Aug  7 18:40:07 2023
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 7 Aug 2023 23:40:07 +0000
Subject: Squared T-shirt sizes
In-Reply-To: <F8232D7C-1D41-4FB8-AC46-CFC31FF0A369@crissov.de>
References: <F8232D7C-1D41-4FB8-AC46-CFC31FF0A369@crissov.de>
Message-ID: <SJ0PR03MB659877924F20D909AA64099FCA0CA@SJ0PR03MB6598.namprd03.prod.outlook.com>

JIS L 4004:2001 seems to refer to other two-letter size codes (PB, SA, SB, MY, MA, etc.) and does not enclose them in a square.

Encoding these as squared symbols would probably require substantial evidence that the symbols are already in use. The same would be true for the English S, M, L, XL... and any codes additionally specified in EN 13402.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


-----Original Message-----
From: Unicode <unicode-bounces at corp.unicode.org> On Behalf Of Christoph Päper via Unicode
Sent: Monday, August 7, 2023 8:59
To: via Unicode <unicode at corp.unicode.org>
Subject: Squared T-shirt sizes

Dear Unicoders

I was almost sure that I had seen squared XL and XS as ideographic legacy characters in the code charts before, but I can’t find them (e.g. in U+1F1xy or U+33xy), so I’m probably slightly delusional. XL is not included as a Number Form for the roman numeral of 40 (~ U+216x).

1. Did I miss anything?
2. Are such Latin size labeling characters (also LL or SS from JIS L 4004/4005) written within a single ideographic square in East Asia?
3. Could they be added to the standard without any such prior use?

Cheers,

Christoph Päper


From textexin at xencraft.com  Tue Aug  8 20:07:55 2023
From: textexin at xencraft.com (Tex)
Date: Tue, 8 Aug 2023 18:07:55 -0700
Subject: World’s Indigenous Peoples Day is tomorrow Aug 9
Message-ID: <002701d9ca5d$ee4e52e0$caeaf8a0$@xencraft.com>

Members of this group may be interested in this online event produced by TranslationCommons.org, tomorrow Aug 9, highlighting Indigenous languages of Asia, in celebration of World’s Indigenous Peoples Day.

 
Presentations will be from India, Indonesia, Malaysia, Nepal, Bangladesh and other parts of Asia. They will be providing cultural performances, case studies and current information on several communities including Tamang, Munda Tribe, Ho Tribe, Chakma, Koya, Sunawar, and others. The speakers will be addressing translation, preservation and revitalization of Asian Indigenous languages. 

 
Join us via YouTube starting at 7am PDT, 2pm UTC, 19:30 IST.

https://www.youtube.com/watch?v=uAGh5fRfOuM

 
See the program and other details at https://translationcommons.org/international-day-of-the-worlds-indigenous-peoples/ 

 
Tex

 

From jameskass at code2001.com  Tue Aug 22 20:01:19 2023
From: jameskass at code2001.com (James Kass)
Date: Wed, 23 Aug 2023 01:01:19 +0000
Subject: Unicode 15.1 repertoire
Message-ID: <aa7a569f-6ec6-5015-0a9a-6ce53a4be838@code2001.com>


Unicode 15.1 is scheduled to be released September twelfth.

According to this page,
https://www.unicode.org/charts/PDF/Unicode-15.1/

... CJK Extension I adds 622 new characters.

But according to the chart linked from that page,
https://www.unicode.org/charts/PDF/Unicode-15.1/U151-2EBF0.pdf

... there are only 603 additional characters.

Which one is correct? If the 622 figure is right, is there a chart
showing the additional 19 characters?

From markus.icu at gmail.com  Tue Aug 22 20:16:00 2023
From: markus.icu at gmail.com (Markus Scherer)
Date: Tue, 22 Aug 2023 18:16:00 -0700
Subject: Unicode 15.1 repertoire
In-Reply-To: <aa7a569f-6ec6-5015-0a9a-6ce53a4be838@code2001.com>
References: <aa7a569f-6ec6-5015-0a9a-6ce53a4be838@code2001.com>
Message-ID: <CAN49p6rseq8_aC7RispBF+DWzSt9igTzBzf2GWP_tEhiS1Nnhg@mail.gmail.com>

Hi James,

Extension I was expanded from 603 characters to 622 as a result of
decisions in UTC #176 <https://www.unicode.org/L2/L2023/23157.htm> based on
https://www.unicode.org/L2/L2023/23114r-unc-extension-i.pdf
https://www.unicode.org/L2/L2023/23163-cjk-unihan-group-utc176.pdf

Best regards,
markus

From jameskass at code2001.com  Wed Aug 23 07:31:04 2023
From: jameskass at code2001.com (James Kass)
Date: Wed, 23 Aug 2023 12:31:04 +0000
Subject: Unicode 15.1 repertoire
In-Reply-To: <CAN49p6rseq8_aC7RispBF+DWzSt9igTzBzf2GWP_tEhiS1Nnhg@mail.gmail.com>
References: <aa7a569f-6ec6-5015-0a9a-6ce53a4be838@code2001.com>
 <CAN49p6rseq8_aC7RispBF+DWzSt9igTzBzf2GWP_tEhiS1Nnhg@mail.gmail.com>
Message-ID: <6ac500d7-ef5e-ef1d-d279-c94d48dffb23@code2001.com>

Hi Markus,

Thank you for the updated information.

It's too bad that the "GIDC23" numbers were revised because renumbering
from the repertoire presented during the beta review period would have
been simpler. Although the metadata.txt file (found in the links you
sent) has a cross-reference to an apparently earlier Unicode proposal,
it is not the same repertoire as the beta review charts and data.

Best regards,

James

On 2023-08-23 1:16 AM, Markus Scherer via Unicode wrote:
> Hi James,
>
> Extension I was expanded from 603 characters to 622 as a result of 
> decisions in UTC #176 <https://www.unicode.org/L2/L2023/23157.htm> 
> based on
> https://www.unicode.org/L2/L2023/23114r-unc-extension-i.pdf
> https://www.unicode.org/L2/L2023/23163-cjk-unihan-group-utc176.pdf
>
> Best regards,
> markus


From jameskass at code2001.com  Wed Aug 23 21:46:15 2023
From: jameskass at code2001.com (James Kass)
Date: Thu, 24 Aug 2023 02:46:15 +0000
Subject: Unicode 15.1 repertoire
In-Reply-To: <6ac500d7-ef5e-ef1d-d279-c94d48dffb23@code2001.com>
References: <aa7a569f-6ec6-5015-0a9a-6ce53a4be838@code2001.com>
 <CAN49p6rseq8_aC7RispBF+DWzSt9igTzBzf2GWP_tEhiS1Nnhg@mail.gmail.com>
 <6ac500d7-ef5e-ef1d-d279-c94d48dffb23@code2001.com>
Message-ID: <b430a6bf-bfae-73d7-f5a1-18e724e71c97@code2001.com>

Only one minor anomaly spotted in the new data. A discrepancy between
an IDS and its corresponding chart glyph.

U+2EE3B
???? ?? IDS from metadata.txt
???? ?? chart glyph

Hope this is helpful, not sure where to report it.


From markus.icu at gmail.com  Wed Aug 23 23:46:56 2023
From: markus.icu at gmail.com (Markus Scherer)
Date: Wed, 23 Aug 2023 21:46:56 -0700
Subject: Unicode 15.1 repertoire
In-Reply-To: <b430a6bf-bfae-73d7-f5a1-18e724e71c97@code2001.com>
References: <aa7a569f-6ec6-5015-0a9a-6ce53a4be838@code2001.com>
 <CAN49p6rseq8_aC7RispBF+DWzSt9igTzBzf2GWP_tEhiS1Nnhg@mail.gmail.com>
 <6ac500d7-ef5e-ef1d-d279-c94d48dffb23@code2001.com>
 <b430a6bf-bfae-73d7-f5a1-18e724e71c97@code2001.com>
Message-ID: <CAN49p6qebTNdOpAvHCcjVUw-MXS2QQ1uuGGExQvj_f+Vyu51UA@mail.gmail.com>

On Wed, Aug 23, 2023, 19:49 James Kass via Unicode <unicode at corp.unicode.org>
wrote:

> Only one minor anomaly spotted in the new data.  A discrepancy between
> an IDS and its corresponding chart glyph.
>

I passed this along and was told that this is a known issue.

Thanks,
markus

>
>

From markus.icu at gmail.com  Tue Aug 29 18:29:49 2023
From: markus.icu at gmail.com (Markus Scherer)
Date: Tue, 29 Aug 2023 16:29:49 -0700
Subject: Happy 35th birthday, Unicode!
Message-ID: <CAN49p6qLPvaZ9-Dk+0M-LCi-BWquyTc3iLKLrgg_z_+0JjA0Rg@mail.gmail.com>

Happy 35 years after “Unicode 88”, and best wishes for many more years of a
successful standard and what has become a standards (plural) development
organization!

From the 2008 history books:
https://www.unicode.org/history/20thceleb/20thceleb.html

markus
ICU-TC
UTC-PAG