Unicode Regex Question

Cameron Dutro cameron at lumoslabs.com
Tue Dec 30 17:22:12 CST 2014


Thanks Mark. Is that documented anywhere?

-Cameron

On Tue, Dec 30, 2014 at 11:40 AM, Mark Davis ☕️ <mark at macchiato.com> wrote:

> $ has a special meaning in the transforms: it matches the boundary of the
> string (either the start or the end). Unlike in normal regex, however, it
> keeps that meaning inside character classes, e.g. [[a$b][:script=greek:]]
>
>
> Mark <https://google.com/+MarkDavis>
>
> *— Il meglio è l’inimico del bene —*
>
> On Tue, Dec 30, 2014 at 8:21 PM, Cameron Dutro <cameron at lumoslabs.com>
> wrote:
>
>> Hey cldr-users,
>>
>> I'm looking at this entry
>> <http://unicode.org/cldr/trac/browser/trunk/common/transforms/Any-Publishing.xml#L21>
>> in CLDR transforms. I'm curious why that "$" character is inside the
>> character class. Here's the line reproduced:
>>
>> <tRule>$makeRight = [[:Z:][:Ps:][:Pi:]$] ;</tRule>
>>
>> I see an outer character class that contains three internal Unicode
>> character sets and a literal dollar sign. In regular expressions, the
>> dollar sign usually matches the end of the string. When it's included in
>> a character class, however, it should be interpreted as a literal
>> character.
>>
>> Was including the dollar sign in the character class intentional? Should
>> it be treated as an end-of-string anchor or a literal character?
>>
>> -Cameron
>>
>> _______________________________________________
>> CLDR-Users mailing list
>> CLDR-Users at unicode.org
>> http://unicode.org/mailman/listinfo/cldr-users
>>
>>
>
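
To make the difference discussed in this thread concrete, here is a minimal Python sketch. The first function shows standard regex behavior, where "$" inside a character class is a literal dollar sign. The second is only a hypothetical emulation of the reading Mark describes for a transform set such as [a$b]: the set also matches the position just outside either end of the text. The function names and the index convention are assumptions made for illustration; this is not the CLDR or ICU implementation.

```python
import re


def regex_class_matches(text):
    """Ordinary regex: '$' inside [...] is a literal dollar sign."""
    return re.findall(r"[a$b]", text)


def transform_set_matches(text, index, chars="ab"):
    """Hypothetical emulation of a transform set like [a$b].

    Assumption: '$' in the set means "boundary of the text", so any
    position outside the text (before the start or after the end)
    counts as a match, in addition to the listed characters.
    """
    if index < 0 or index >= len(text):
        return True  # just before the start, or just after the end
    return text[index] in chars


if __name__ == "__main__":
    print(regex_class_matches("a$c"))
    print([transform_set_matches("xa", i) for i in range(-1, 3)])
```

Under this reading, a rule context such as $makeRight would succeed at the very start of the text as well as after a space or an opening punctuation mark, which a literal dollar sign would not achieve.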