<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">The pair table.<br>
<br>
Those were the days!</div>
<div class="moz-cite-prefix"><br>
A./<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 9/4/2023 3:55 PM, Andy Heninger via
Unicode wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAEtzAy6xicRKFJhLE1L9LGxTFw037Fg2REkAkbsmP39HiuLnUA@mail.gmail.com">
<div dir="ltr">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Is
there a machine-readable version of the rules for all the
Unicode segmentation standards?</blockquote>
<div><br>
</div>
<div>It would be nice if the rules in the UAX source documents
were tagged in some way such that simple tooling could extract
them in a useful form.</div>
<div><br>
</div>
<div>I used to have a script that would scrape the line break
rules from UAX-14 to partially automate maintenance of the
pair table, but both the script and the pair table are
long gone.</div>
<div><br>
</div>
<div> -- Andy</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Sep 4, 2023 at
11:47 AM Asmus Freytag via Unicode &lt;<a
href="mailto:unicode@corp.unicode.org"
moz-do-not-send="true" class="moz-txt-link-freetext">unicode@corp.unicode.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div>Correct, we don't have a notation for "literal" and we
need one.</div>
<div>A./<br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>On 9/4/2023 11:11 AM, Sławomir Osipiuk via Unicode
wrote:<br>
</div>
<blockquote type="cite">It's definitely
confusing. At first glance it certainly appears to be some
kind of special marker or syntax, not a simple literal
character. It needs at least a note somewhere because this
WILL cause confusion and this question will come up again
elsewhere.<br>
<br>
On Monday, 04 September 2023, 06:27:08 (-04:00), Robin
Leroy via Unicode wrote:<br>
<br>
<blockquote style="margin:0px 0px 0.8ex;border-left:2px
solid rgb(0,0,255);padding-left:1ex">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Sep 4,
2023 at 11:57, Daniel Bünzli via Unicode &lt;<a
href="mailto:unicode@corp.unicode.org"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">unicode@corp.unicode.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="padding-left:1ex;border-left:1px solid
rgb(204,204,204);margin:0px 0px 0px 0.8ex">Hello, <br>
<br>
I can’t figure out what the &#9676; character
classification represents in:<br>
<br>
<a
href="https://www.unicode.org/reports/tr14/proposed.html#LB28a"
rel="noreferrer" target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">https://www.unicode.org/reports/tr14/proposed.html#LB28a</a></blockquote>
</div>
</div>
<div dir="ltr">Itself: U+25CC DOTTED CIRCLE.</div>
</div>
</blockquote>
</blockquote>
<p><br>
</p>
</div>
</blockquote>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>