propose th-u-lb-nodict

Mark Davis ☕️ via CLDR-Users cldr-users at unicode.org
Thu May 25 08:09:53 CDT 2017


Interesting idea. You should file a ticket with your proposal.


On May 25, 2017 12:14, "Martin Hosken via CLDR-Users" <
cldr-users at unicode.org> wrote:

> Dear All,
>
> When line breaking minority-language text in, say, the Thai script, or any
> script that uses dictionary-based breaking, the dictionary used is the one
> for the dominant language. A while back, we addressed this for the Khmer
> script and I've had no complaints since. Now, we could try to do something
> similar for other dictionary-broken languages. But I would like to suggest
> a simpler approach that can address fixed texts very well, and that is to
> add a nodict line break locale property. This property would switch the
> line break iterator to one that uses a set of rules with no dictionary
> statement in it. In other words, a run of SA-class characters is treated as
> one great long string and it is up to the source text to have inserted
> appropriate ZWSP, or other kinds of spaces, to control the breaks.
>
> What do folks think? From my perspective, this would solve a bunch of bugs
> that are pointed my way with regard to line breaking and minority
> languages, even if it is not the best possible solution. It's pretty cheap
> to do and it doesn't change anything that is already out there.
>
> Yours,
> Martin
>
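
For illustration only, here is a minimal Python sketch of the behaviour the
quoted proposal describes: no dictionary lookup, so a run of text stays in
one piece unless the source has inserted ZWSP (U+200B) or a space. It is a
toy model, not ICU or CLDR code and not a full UAX #14 implementation. The
function name nodict_segments and the BREAK_AFTER set are hypothetical,
chosen for this example.

```python
# Toy model of the proposed "nodict" behaviour (an assumption-laden sketch,
# not an ICU API): breaks are allowed only after ZWSP or an ordinary space.
ZWSP = "\u200b"            # ZERO WIDTH SPACE
BREAK_AFTER = {ZWSP, " "}  # hypothetical set of break-enabling characters


def nodict_segments(text):
    """Split text into chunks that a line breaker may not break inside.

    Break characters stay attached to the end of the chunk before them,
    so joining the result reproduces the input exactly.
    """
    segments = []
    current = ""
    for ch in text:
        # A break opportunity exists once a run of break characters ends.
        if current and current[-1] in BREAK_AFTER and ch not in BREAK_AFTER:
            segments.append(current)
            current = ""
        current += ch
    if current:
        segments.append(current)
    return segments


# With ZWSP inserted by the author, there is one break opportunity.
marked = nodict_segments("ภาษา" + ZWSP + "ไทย")
# Without it, the whole run is a single unbreakable chunk.
unmarked = nodict_segments("ภาษาไทย")
```

Under these assumptions, marked holds two chunks and unmarked holds one,
which is the trade-off in the proposal: the text author, not a dictionary,
decides where lines may break.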