Corrigendum #9

Karl Williamson public at
Thu Jul 3 22:30:00 CDT 2014

On 07/03/2014 02:48 PM, Asmus Freytag wrote:
> On 7/3/2014 11:02 AM, Richard COOK wrote:
>> On Jul 2, 2014, at 8:02 AM, Karl Williamson <public at>
>> wrote:
>>> Corrigendum #9 has changed this so much that people are coming to me
>>> and saying that inputs may very well have non-characters, and that
>>> the default should be to pass them through.  Since we have no
>>> published wording for how the TUS will absorb Corrigendum #9, I don't
>>> know how this will play out.  But this abrupt a change seems wrong to
>>> me, and it was done without public input or really adequate time to
>>> consider its effects.
>> Asmus,
>> I think you will recall that in late 2012 and early 2013, when the
>> subject of the proposed changes (or clarifications) to text relating
>> to noncharacters first arose, we (at Wenlin) expressed our concerns.
>> Some concerns were grave, and some of the discussion and comments were
>> captured in this web page:
>> <>
>> There was much back and forth on the editorial list. Discussion
>> clarified some of the issues for me, and mollified some of my concerns.
>> At that time we did implement support for noncharacters in Wenlin,
>> controlled by an Advanced Option to:
>>     Replace noncharacters with [U+FFFD]
>> This user preference is turned on by default.
>> Not sure if revisiting any of our prior discussion would help clarify
>> the evolution of thinking on this issue.
>> But I did want to mention that the comment “without public input” is
>> not quite correct.
> Richard,
> "public input" is best understood as PRI or similar process, not
> discussions by members or other people closely associated with the
> project.  Also, in particular, discussions on the editorial list are
> invisible to the public.
>> As is so often the case, and as the web page above shows, there was
>> input and discussion. Whether the amount of time given to this was
>> really adequate is another question. Work required may expand to fill
>> the available time, and perhaps more time is now available.
> Given the wide-ranging nature of implementations this "clarification"
> affected, I believe the process failed to provide the necessary safeguards.
> Conformance changes are really significant, and a Corrigendum, no matter
> how much it is presented as harmless clarification, does affect
> conformance.
> The UTC would be well served to formally adopt a process that requires a
> PRI as well as resolutions taken at two separate UTCs to approve any
> Corrigendum.
> There are changes to properties and algorithms that would also benefit
> from such an extended process that has a guaranteed minimum number of
> times for the change to be debated, to surface in minutes and to surface
> in calls for public input, rather than sailing quietly and quickly into
> the standard.
> The threshold for this should really be rather low -- as the standard
> has matured, the number and nature of implementations that depend on it
> have multiplied, to the point where even a diverse membership is no
> guarantee that issues can be correctly identified and averted.
> With the minutes from the UTC only recording decisions, one change, to
> require an initial and a confirming resolution at separate meetings,
> would allow more issues to surface. It would also help if proposal
> documents were updated to reflect the initial discussion, much as it is
> done with character encoding proposals that are updated to address
> additional concerns identified or resolved.
> That said, I could imagine a possible exception for true errata (typos),
> where correcting a clear mistake should not be unnecessarily drawn out,
> so the error can be removed promptly. Such cases usually turn on
> facts: was there an editing mistake, or was there new data about how a
> character is used that makes an original property assignment a mistake
> (rather than a less-than-optimal choice)?
> Despite being called a "clarification" this corrigendum is not in the
> nature of an erratum.
> A./

Exactly.  There should have been a PRI before this was approved.  I read 
the unicore list, and I was not aware of the change until after the 
fact.  The first sentence of your more contemporaneous web page
indicates that you too did not know about this until after the fact, and 
undertook this effort upon finding out about it to understand the 
magnitude and cope with the change, which as Asmus said, is indeed a 
change and not a clarification.
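
For readers unfamiliar with what is at stake, here is a rough sketch of the kind of option Richard describes above ("Replace noncharacters with [U+FFFD]", on by default). This is not Wenlin's code, and the names is_noncharacter, sanitize and replace_noncharacters are made up for illustration. It assumes only the standard definition of the 66 noncharacters: U+FDD0..U+FDEF plus the last two code points of each of the 17 planes.

```python
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    # U+FDD0..U+FDEF, plus U+nFFFE and U+nFFFF for planes 0 through 16.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def sanitize(text: str, replace_noncharacters: bool = True) -> str:
    """Replace noncharacters with U+FFFD, or pass them through unchanged.

    The default (True) mirrors the "replace" preference quoted above;
    passing False gives the pass-through behavior under discussion.
    """
    if not replace_noncharacters:
        return text
    return "".join(
        REPLACEMENT if is_noncharacter(ord(ch)) else ch for ch in text
    )
```

Whether the flag should default to True or False is exactly the question this thread is arguing about; the sketch takes no position beyond copying the default Richard reports.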
