Canonical block names: spaces vs. underscores

Mathias Bynens mathias at
Thu May 26 12:05:05 CDT 2016

> On 26 May 2016, at 17:47, Mark Davis ☕️ <mark at> wrote:
> The canonical property and property value formats are in the *Alias* files.

Thanks for confirming!

Any chance the canonical names can be used in `Blocks.txt` as well, for consistency? This would simplify scripts that parse the Unicode database text files.
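
For illustration, here is a minimal sketch of the kind of parsing script meant here. It assumes the usual `Blocks.txt` layout of `Start..End; Block Name` with `#` comments. The name `parse_blocks` and the inline sample are hypothetical, not taken from any existing tool.

```python
from typing import Iterable, Iterator, Tuple


def parse_blocks(lines: Iterable[str]) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, name) for each data line in Blocks.txt-style input."""
    for raw in lines:
        # Drop comments and surrounding whitespace; skip empty lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        range_part, name = (field.strip() for field in line.split(";", 1))
        start, end = (int(cp, 16) for cp in range_part.split(".."))
        yield start, end, name


# Hypothetical sample in the layout assumed above.
SAMPLE = """\
# Start Code..End Code; Block Name
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
"""

blocks = list(parse_blocks(SAMPLE.splitlines()))
```

The names come back exactly as the file spells them, with spaces and hyphens, so a script that also reads the *Alias* files has to reconcile the two spellings itself.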

> On 26 May 2016, at 18:03, Ken Whistler <kenwhistler at> wrote:
> […] "canonical block name" is not a defined term in the standard.

I didn’t mean to imply it was — it’s just an English word. I meant “canonical” as in “without loose matching applied”.

> See the matching rules in UAX #44:
> and in particular, the matching rule for symbolic values, which applies in this case:

I know about loose matching, having recently implemented it.

> For enumerated properties, and especially for catalog properties such as Block and Script,
> the value of the property may be multi-word, and the best form to use in one context might
> not be exactly (as in binary string equality exact) the same as in another.

That makes sense, but shouldn’t it be consistent throughout the Unicode database text files?
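
To make the discrepancy concrete, here is a rough sketch of loose matching for symbolic values as I understand rule UAX44-LM3: ignore case, whitespace, underscores, hyphens, and any initial "is" prefix. The helper names `loose_key` and `loosely_equal` are hypothetical. The sample pairs assume the spellings used in `Blocks.txt` and `PropertyValueAliases.txt` respectively.

```python
import re


def loose_key(value: str) -> str:
    """Reduce a property value name to a comparison key (sketch of UAX44-LM3)."""
    # Remove whitespace, underscores, and hyphens, then fold case.
    key = re.sub(r"[\s_-]+", "", value).lower()
    # Drop an initial "is" prefix; both sides are reduced the same way,
    # so names that really begin with "is" still compare consistently.
    return key[2:] if key.startswith("is") else key


def loosely_equal(a: str, b: str) -> bool:
    """True if two spellings name the same value under loose matching."""
    return loose_key(a) == loose_key(b)


# (spelling assumed for Blocks.txt, spelling assumed for the alias file)
PAIRS = [
    ("Basic Latin", "Basic_Latin"),
    ("Latin-1 Supplement", "Latin_1_Supplement"),
    ("Greek and Coptic", "Greek_And_Coptic"),
]

for blocks_name, alias_name in PAIRS:
    print(blocks_name == alias_name, loosely_equal(blocks_name, alias_name))
```

Each pair fails binary string equality but matches loosely, which is why a script comparing names across the files needs this extra step.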

