The Unicode Standard and ISO

Sun Jun 10 10:11:48 CDT 2018

> ... For another part it [sync with ISO/IEC 15897] failed because the Consortium refused to cooperate, despite
repeated proposals for a merger of both instances.

First, ISO/IEC 15897 is built on a data-format specification, ISO/IEC TR 14652, that never achieved the support needed to become an international standard, and has since been withdrawn. (TRs cannot remain TRs forever.) Now, JTC1/SC35 began work four or five years ago to create a data-format specification for this, Approved Work Item 30112. From the outset, Unicode and the US national body (USNB) tried repeatedly to engage with SC35 and SC35/WG5, informing them of UTS #35 (LDML) and CLDR, but were ignored. SC35 appeared to be interested in a pet project and not in what is actually being used in industry. After several failed attempts, Unicode and the USNB gave up trying.

So, any suggestion that Unicode has failed to cooperate or is dropping the ball with regard to locale data and ISO is simply uninformed.

Peter

From: Unicode <unicode-bounces at unicode.org> On Behalf Of Mark Davis via Unicode
Sent: Thursday, June 7, 2018 6:20 AM
To: Marcel Schneider <charupdate at orange.fr>
Cc: UnicodeMailing <unicode at unicode.org>
Subject: Re: The Unicode Standard and ISO

A few facts.

> ... Consortium refused till now to synchronize UCA and ISO/IEC 14651.

ISO/IEC 14651 and Unicode have longstanding cooperation. Ken Whistler could speak to the synchronization level in more detail, but the above statement is inaccurate.

> ... For another part it [sync with ISO/IEC 15897] failed because the Consortium refused to cooperate, despite
repeated proposals for a merger of both instances.

I recall no serious proposals for that.

(And in any event, very unlike the synchrony with 10646 and 14651, ISO 15897 brought no value to the table. Certainly nothing to outweigh the considerable costs of maintaining synchrony. It has a completely inadequate structure for modern system requirements, no particular industry support, and scant content: see Wikipedia for "The registry has not been updated since December 2001".)

Mark

On Thu, Jun 7, 2018 at 1:25 PM, Marcel Schneider via Unicode <unicode at unicode.org> wrote:
On Thu, 17 May 2018 09:43:28 -0700, Asmus Freytag via Unicode wrote:
>
> On 5/17/2018 8:08 AM, Martinho Fernandes via Unicode wrote:
> > Hello,
> >
> > There are several mentions of synchronization with related standards in
> > unicode.org, e.g. in https://www.unicode.org/versions/index.html, and
> > https://www.unicode.org/faq/unicode_iso.html. However, all such mentions
> > never mention anything other than ISO 10646.
>
> Because that is the standard for which there is an explicit understanding by all involved
> relating to synchronization. There have been occasionally some challenging differences
> in the process and procedures, but generally the synchronization is being maintained,
> something that's helped by the fact that so many people are active in both arenas.

Perhaps the cause-effect relationship is somewhat unclear. I think that many people being
active in both arenas is helped by the fact that there is a strong will to maintain synching.

If there were similar policies notably for ISO/IEC 14651 (collation) and ISO/IEC 15897
(locale data), ISO/IEC 10646 would be far from standing alone in the field of
Unicode-ISO/IEC cooperation.

>
> There are really no other standards where the same is true to the same extent.
> >
> > I was wondering which ISO standards other than ISO 10646 specify the
> > same things as the Unicode Standard, and of those, which ones are
> > actively kept in sync. This would be of importance for standardization
> > of Unicode facilities in the C++ language (ISO 14882), as reference to
> > ISO standards is generally preferred in ISO standards.
> >
> One of the areas where the Unicode Standard differs from ISO 10646 is that its conception
> of a character's identity implicitly contains that character's properties - and those are
> standardized as well and alongside of just name and serial number.

This is probably why, to date, ISO/IEC 10646 features character properties by including
normative references to the Unicode Standard, Standard Annexes, and the UCD.
Bidi-mirroring e.g. is part of ISO/IEC 10646 that specifies in clause 15.1:

“[…] The list of these characters is determined by having the ‘Bidi_Mirrored’ property
set to ‘Y’ in the Unicode Standard. These values shall be determined according to
the Unicode Standard Bidi Mirrored property (see Clause 2).”

>
> Many of these properties have associated with them algorithms, e.g. the bidi algorithm,
> that are an essential element of data interchange: if you don't know which order in
> the backing store is expected by the recipient to produce a certain display order, you
> cannot correctly prepare your data.
>
> There is one area where standardization in ISO relates to work in Unicode that I can
> think of, and that is sorting.

Yet UCA conforms to ISO/IEC 14651 (where UCA is cited as entry #28 in the bibliography).
The reverse relationship is irrelevant and would be unfair, given that the Consortium
refused till now to synchronize UCA and ISO/IEC 14651.

Here is a need for action.

> However, sorting, beyond the underlying framework,
> ultimately relates to languages, and language-specific data is now housed in CLDR.
>
> Early attempts by ISO to standardize a similar framework for locale data failed, in
> part because the framework alone isn't the interesting challenge for a repository,
> instead it is the collection, vetting and management of the data.

For another part it failed because the Consortium refused to cooperate, despite
repeated proposals for a merger of both instances.

>
> The reality is that the ISO model and its organizational structures are not well suited
> to the needs of many important areas where some form of standardization is needed.
> That's why we have organizations like IETF, W3C, Unicode etc.
>
> Duplicating all or even part of their effort inside ISO really serves nobody's purpose.

An undesirable side-effect of not merging Unicode with ISO/IEC 15897 (locale data) is
to divert many competent contributors from monitoring CLDR data, especially for French.

Here too is a huge need for action.

Thanks in advance.

Marcel
