Corrigendum #9

Asmus Freytag asmusf at
Sun Jun 1 01:20:16 CDT 2014

On 5/31/2014 10:06 PM, Philippe Verdy wrote:
> I've not proposed to move these characters elsewhere (or to re-encode 
> them), why do you think that?
> I just challenge your statement that a block cannot be discontinuous,

Well, go ahead and challenge that.

As implemented in the current names list and the file blocks.txt, a 
block would have this definition: "A block is a uniquely named, 
continuous, non-overlapping range of code points, containing a multiple 
of 16 code points, and starting at a location that is a multiple of 16."
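
To make that empirical definition concrete, here is a minimal sketch in 
Python that checks a list of ranges against each clause. It assumes the 
"START..END; Name" line format used by blocks.txt; the function names 
(parse_blocks, check_blocks) are made up for illustration and are not 
part of any UTC tooling.

```python
import re

# Assumed line format, as used by blocks.txt: "START..END; Block Name"
LINE_RE = re.compile(r"^([0-9A-F]{4,6})\.\.([0-9A-F]{4,6});\s*(.+?)\s*$")


def parse_blocks(text):
    """Return (start, end, name) tuples, skipping comments and blank lines."""
    blocks = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError(f"unrecognized line: {raw!r}")
        blocks.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return blocks


def check_blocks(blocks):
    """Return a list of violations of the empirical block definition."""
    problems = []
    seen = set()
    prev_end, prev_name = -1, None
    for start, end, name in sorted(blocks):
        # "uniquely named": one name maps to exactly one range, so a
        # discontinuous block (same name, two ranges) is reported here.
        if name in seen:
            problems.append(f"{name}: name is not unique")
        seen.add(name)
        if end < start:
            problems.append(f"{name}: end precedes start")
        # "starting at a location that is a multiple of 16"
        if start % 16 != 0:
            problems.append(f"{name}: start {start:04X} is not a multiple of 16")
        # "containing a multiple of 16 code points"
        if (end - start + 1) % 16 != 0:
            problems.append(f"{name}: size is not a multiple of 16")
        # "non-overlapping"
        if start <= prev_end:
            problems.append(f"{name}: overlaps {prev_name}")
        if end > prev_end:
            prev_end, prev_name = end, name
    return problems


SAMPLE = """\
# Excerpt in blocks.txt format
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0100..017F; Latin Extended-A
"""

if __name__ == "__main__":
    print(check_blocks(parse_blocks(SAMPLE)))
```

Note that under this format a "discontinuous block" can only be written 
as two lines carrying the same name, which the sketch reports as a 
uniqueness violation.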

Per chapter 3, the definition of the block property is given in Section 
17.1 (Code Charts), which contains no actual definition and only tells 
you how blocks are used in organizing the code charts. So, effectively, 
a block is what blocks.txt (and therefore the names list) says it is. In 
practice, blocks have been assigned according to the empirically derived 
definition I gave above, and at this point the production process for 
the code charts has some of these restrictions built in.

Chapter 3 calls blocks an enumerated property, meaning that the names 
must be unique, and blocks.txt associates a single range with each name, 
in agreement with the glossary, which says a block represents a range of 
characters (not a collection of ranges). Likewise, changing blocks so 
that they no longer start at a multiple of 16, or no longer contain a 
multiple of 16 code points (a run of 16 is sometimes called a "column"), 
is equally not in the cards: it would break the production process for 
the charts. The description of how blocks are used does not contemplate 
that they can be mutually overlapping, so that becomes part of their 
implicit definition as well.

There's a reason behind the madness of not providing an explicit 
definition of "block" in the standard. It has to do with discouraging 
people from relying on what is largely an editorial device (headers on 
charts). However, it does not mean that arbitrary redefinition of a 
block from a single to multiple ranges is something that can or should 
be contemplated.

So, the chance that UTC would agree to such changes is de facto nil, 
even though that is not formally guaranteed.

