Parsers for the UnicodeSet notation?

Roozbeh Pournader roozbeh at
Wed Jul 23 17:28:51 CDT 2014

On Wed, Jul 23, 2014 at 3:23 PM, Eric Muller <emuller at> wrote:

> I would like to work with the exemplarCharacters data in the CLDR. That
> uses the UnicodeSet notation. Is there somewhere a parser for that
> notation, that would return me just the list of characters in the set?

Note that it's a set of strings, not characters.
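
To illustrate the difference, here is a minimal sketch in Python. It is
not the parser referred to further down, and the function name
parse_simple_unicode_set is made up for this example. It assumes a
restricted form of the notation only: one pair of outer brackets,
literal characters, a-z style ranges, {...} multi-character strings,
and \uXXXX or single-character backslash escapes. Property values,
nested sets, set operations, and an escaped "}" inside braces are not
handled.

```python
def parse_simple_unicode_set(text):
    """Parse a restricted UnicodeSet pattern into a Python set of strings.

    Hypothetical helper for illustration; covers only a small subset
    of the full UnicodeSet syntax.
    """
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a pattern enclosed in [ ]")
    body = text[1:-1]

    def read_char(pos):
        """Return (character, next position), decoding backslash escapes."""
        ch = body[pos]
        if ch != "\\":
            return ch, pos + 1
        if pos + 1 >= len(body):
            raise ValueError("dangling backslash")
        if body[pos + 1] == "u":
            # Four hex digits follow the "u".
            return chr(int(body[pos + 2:pos + 6], 16)), pos + 6
        # Any other escaped character stands for itself.
        return body[pos + 1], pos + 2

    items = set()
    prev = None            # last single character, usable as a range start
    pending_range = False  # True right after "x-" has been read
    i = 0
    while i < len(body):
        ch = body[i]
        if ch.isspace():
            i += 1
            continue
        if ch == "{":
            # A braced item is ONE element of the set, however long it is.
            end = body.index("}", i)
            chars = []
            j = i + 1
            while j < end:
                if body[j].isspace():
                    j += 1
                    continue
                c, j = read_char(j)
                chars.append(c)
            items.add("".join(chars))
            prev = None
            i = end + 1
            continue
        if ch == "-" and prev is not None and not pending_range:
            pending_range = True
            i += 1
            continue
        c, i = read_char(i)
        if pending_range:
            # Expand the range prev..c, inclusive, by code point.
            for cp in range(ord(prev), ord(c) + 1):
                items.add(chr(cp))
            pending_range = False
            prev = None
        else:
            items.add(c)
            prev = c
    if pending_range:
        # A trailing "-" with no range end is taken as a literal hyphen.
        items.add("-")
    return items
```

With this sketch, parse_simple_unicode_set("[a-c {ch}]") gives a set
with four elements: "a", "b", "c", and the two-character string "ch".
A caller that expects only single characters would mishandle the last
one.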

> I suspect that the exemplarCharacters use a restricted form of the
> UnicodeSet notation (e.g. do not use property values). Is that correct, and
> if so, what's the subset?

I have an Apache-licensed parser in Python here:
