Re: ◌ in LB28a in UAX14 of Unicode 15.1.0

Andy Heninger andy.heninger at gmail.com
Mon Sep 4 17:55:05 CDT 2023


>
> is there a machine readable version of the rules for all the Unicode
> segmentation standards ?


It would be nice if the rules in the UAX source documents were tagged in
some way such that simple tooling could extract them in a useful form.

I used to have a script that would scrape the line break rules from UAX-14,
for the purpose of partially automating maintenance of the pair table, but
both the script and the pair table are long gone.
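
For illustration only, here is a minimal sketch of what such tooling might look like. It is not the lost script. It assumes the rule lines have already been pulled out of the UAX as plain text, one rule per line, using the operators "×", "÷" and "!". The names parse_rule and literal_chars are hypothetical, and the sample rule line is written in the style of the LB28a notation rather than quoted from the spec.

```python
import re
import unicodedata

# Operators assumed for the rule notation:
# "×" (no break), "÷" (break opportunity), "!" (mandatory break).
RULE_RE = re.compile(
    r"^\s*(?P<left>.*?)\s*(?P<op>[×÷!])\s*(?P<right>.*?)\s*$"
)


def parse_rule(line):
    """Split one rule line into (left context, operator, right context).

    Returns None when the line contains none of the operators,
    e.g. a line of descriptive prose.
    """
    m = RULE_RE.match(line)
    if m is None:
        return None
    return m.group("left"), m.group("op"), m.group("right")


def literal_chars(rule_side):
    """List non-ASCII characters on one side of a rule, with their names.

    Class names such as AK or AS are ASCII, so anything else (apart from
    the operators themselves) is likely a literal character, like U+25CC.
    """
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in rule_side
        if ord(ch) > 0x7F and ch not in "×÷"
    ]


if __name__ == "__main__":
    # Illustrative rule line in the style of LB28a.
    left, op, right = parse_rule("AP × (AK | ◌ | AS)")
    print(left, op, right)
    print(literal_chars(right))
```

A tool along these lines could at least flag literal characters such as the dotted circle, so that they are not mistaken for special syntax.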

  -- Andy

On Mon, Sep 4, 2023 at 11:47 AM Asmus Freytag via Unicode <
unicode at corp.unicode.org> wrote:

> Correct, we don't have a notation for "literal" and we need one.
> A./
>
>
> On 9/4/2023 11:11 AM, Sławomir Osipiuk via Unicode wrote:
>
> It's definitely confusing. At first glance it certainly appears to be some
> kind of special marker or syntax, not a simple literal character. It needs
> at least a note somewhere because this WILL cause confusion and this
> question will come up again elsewhere.
>
> On Monday, 04 September 2023, 06:27:08 (-04:00), Robin Leroy via Unicode
> wrote:
>
> On Mon, Sep 4, 2023 at 11:57, Daniel Bünzli via Unicode <
> unicode at corp.unicode.org> wrote:
>
>> Hello,
>>
>> I can’t figure out what the ◌ character classification represents in:
>>
>>   https://www.unicode.org/reports/tr14/proposed.html#LB28a
>
> Itself: U+25CC DOTTED CIRCLE.
>
>
>