Corrigendum #9

Thu Jul 3 15:48:59 CDT 2014

On 7/3/2014 11:02 AM, Richard COOK wrote:
> On Jul 2, 2014, at 8:02 AM, Karl Williamson <public at khwilliamson.com> wrote:
>
>> Corrigendum #9 has changed this so much that people are coming to me and saying that inputs may very well have non-characters, and that the default should be to pass them through.  Since we have no published wording for how the TUS will absorb Corrigendum #9, I don't know how this will play out.  But this abrupt a change seems wrong to me, and it was done without public input or really adequate time to consider its effects.
> Asmus,
>
> I think you will recall that in late 2012 and early 2013, when the subject of the proposed changes (or clarifications) to text relating to noncharacters first arose, we (at Wenlin) expressed our concerns. Some concerns were grave, and some of the discussion and comments were captured in this web page:
>
> <http://wenlininstitute.org/UnicodeNoncharacters/>
>
> There was much back and forth on the editorial list. Discussion clarified some of the issues for me, and mollified some of my concerns.
>
> At that time we did implement support for noncharacters in Wenlin, controlled by an Advanced Option to:
>
> 	Replace noncharacters with [U+FFFD]
>
> This user preference is turned on by default.
>
> Not sure if revisiting any of our prior discussion would help clarify the evolution of thinking on this issue.
>
> But I did want to mention that the comment “without public input” is not quite correct.

Richard,

"Public input" is best understood as a PRI or similar process, not
discussions by members or other people closely associated with the
project.  In particular, discussions on the editorial list are
invisible to the public.

> As is so often the case, and as the web page above shows, there was input and discussion. Whether the amount of time given to this was really adequate is another question. Work required may expand to fill the available time, and perhaps more time is now available.

Given the wide-ranging nature of the implementations this "clarification"
affected, I believe the process failed to provide the necessary safeguards.

Conformance changes are really significant, and a Corrigendum, no matter
how much it is presented as a harmless clarification, does affect conformance.

The UTC would be well served to formally adopt a process that requires a 
PRI as well as resolutions taken at two separate UTCs to approve any 
Corrigendum.

There are changes to properties and algorithms that would also benefit 
from such an extended process that has a guaranteed minimum number of 
times for the change to be debated, to surface in minutes and to surface 
in calls for public input, rather than sailing quietly and quickly into 
the standard.

The threshold for this should really be rather low -- as the standard 
has matured, the number and nature of implementations that depend on it 
have multiplied, to the point where even a diverse membership is no 
guarantee that issues can be correctly identified and averted.

With the minutes from the UTC only recording decisions, one change, to
require an initial and a confirming resolution at separate meetings,
would allow more issues to surface. It would also help if proposal
documents were updated to reflect the initial discussion, much as is
done with character encoding proposals, which are updated to address
additional concerns identified or resolved.

That said, I could imagine a possible exception for true errata (typos),
where correcting a clear mistake should not be unnecessarily drawn out,
so the error can be removed promptly. Such cases usually turn on
facts: was there an editing mistake, or is there new data about how a
character is used that makes an original property assignment a mistake
(rather than a less than optimal choice)?

Despite being called a "clarification", this corrigendum is not in the
nature of an erratum.

A./
>
> -Richard