The Unicode Standard and ISO
Marcel Schneider via Unicode
unicode at unicode.org
Thu Jun 7 06:25:22 CDT 2018
On Thu, 17 May 2018 09:43:28 -0700, Asmus Freytag via Unicode wrote:
> On 5/17/2018 8:08 AM, Martinho Fernandes via Unicode wrote:
> > Hello,
> > There are several mentions of synchronization with related standards in
> > unicode.org, e.g. in https://www.unicode.org/versions/index.html, and
> > https://www.unicode.org/faq/unicode_iso.html. However, all such mentions
> > never mention anything other than ISO 10646.
> Because that is the standard for which there is an explicit understanding by all involved
> relating to synchronization. There have been occasionally some challenging differences
> in the process and procedures, but generally the synchronization is being maintained,
> something that's helped by the fact that so many people are active in both arenas.
Perhaps the direction of cause and effect is somewhat unclear here. I think it is rather the
strong will to maintain synchronization that helps keep so many people active in both arenas.
If there were similar policies notably for ISO/IEC 14651 (collation) and ISO/IEC 15897
(locale data), ISO/IEC 10646 would be far from standing alone in the field of
> There are really no other standards where the same is true to the same extent.
> > I was wondering which ISO standards other than ISO 10646 specify the
> > same things as the Unicode Standard, and of those, which ones are
> > actively kept in sync. This would be of importance for standardization
> > of Unicode facilities in the C++ language (ISO 14882), as reference to
> > ISO standards is generally preferred in ISO standards.
> One of the areas the Unicode Standard differs from ISO 10646 is that its conception
> of a character's identity implicitly contains that character's properties - and those are
> standardized as well and alongside of just name and serial number.
This is probably why, to date, ISO/IEC 10646 covers character properties by way of
normative references to the Unicode Standard, its Standard Annexes, and the UCD.
Bidi mirroring, for example, is part of ISO/IEC 10646, which specifies in clause 15.1:
“[…] The list of these characters is determined by having the ‘Bidi_Mirrored’ property
set to ‘Y’ in the Unicode Standard. These values shall be determined according to
the Unicode Standard Bidi Mirrored property (see Clause 2).”
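To illustrate what that clause delegates to the Unicode Standard, here is a minimal
sketch that reads the Bidi_Mirrored property from the UCD copy bundled with Python.
The helper name is_bidi_mirrored and the sample characters are my own choices for
illustration, and the answers reflect whichever Unicode version the interpreter ships.

```python
import unicodedata


def is_bidi_mirrored(ch: str) -> bool:
    """Return True if the character has Bidi_Mirrored=Y in Python's bundled UCD."""
    # unicodedata.mirrored() returns 1 for Bidi_Mirrored=Y and 0 otherwise.
    return unicodedata.mirrored(ch) == 1


if __name__ == "__main__":
    # Sample characters chosen only for illustration.
    for ch in ["(", ")", "[", "A", "+"]:
        name = unicodedata.name(ch)
        print(f"U+{ord(ch):04X} {name}: Bidi_Mirrored={'Y' if is_bidi_mirrored(ch) else 'N'}")
```

An implementation of ISO/IEC 10646 has no list of its own to consult here: the set
of mirrored characters is whatever the property data says.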
> Many of these properties have associated with them algorithms, e.g. the bidi algorithm,
> that are an essential element of data interchange: if you don't know which order in
> the backing store is expected by the recipient to produce a certain display order, you
> cannot correctly prepare your data.
> There is one area where standardization in ISO relates to work in Unicode that I can
> think of, and that is sorting.
Yet UCA conforms to ISO/IEC 14651 (where UCA is cited as entry #28 in the bibliography).
The reverse relationship is irrelevant and would be unfair, given that the Consortium
has so far refused to synchronize UCA and ISO/IEC 14651.
Action is needed here.
> However, sorting, beyond the underlying framework,
> ultimately relates to languages, and language-specific data is now housed in CLDR.
> Early attempts by ISO to standardize a similar framework for locale data failed, in
> part because the framework alone isn't the interesting challenge for a repository,
> instead it is the collection, vetting and management of the data.
It also failed in part because the Consortium refused to cooperate, despite
repeated proposals for a merger of both instances.
> The reality is that the ISO model and its organizational structures are not well suited
> to the needs of many important areas where some form of standardization is needed.
> That's why we have organizations like IETF, W3C, Unicode, etc.
> Duplicating all or even part of their effort inside ISO really serves nobody's purpose.
An undesirable side effect of not merging Unicode with ISO/IEC 15897 (locale data) is
that many competent contributors are diverted from monitoring CLDR data, especially for French.
Here, too, there is a pressing need for action.
Thanks in advance.