Unicode encoding policy

Philippe Verdy verdy_p at wanadoo.fr
Wed Dec 31 01:36:18 CST 2014


One important factor is also stability: some symbols may attract temporary
interest and then be rapidly abandoned for a new flavor, hardly related to
the previously encoded one.
Stability is also needed because UTC resources and work time are limited
and must focus on things that have already been awaited for a long time
(even if there were some difficult discussions, notably when trying to deal
with variants and different usage patterns, or in more complex situations
discovered with difficulty, like text layout, the creation of distinctive
or contextual ligatures, or discussions about some critical character
properties such as word boundaries, or expected specific alignments with
other characters, including those of other scripts).
For that the UTC has a useful tool: the roadmap, which attempts to organize
the standardization work by topics and communities of interest, in order to
avoid duplicate discussions and to create coherent proposals that will also
withstand future additions.
Emojis, however, exist only in relation to themselves, and their coherence
really comes from their adoption on a range of devices or OSes and common
applications.

Large vendors (like Google for Android, Apple for iOS, and Microsoft for
Windows, but also some well-known websites connected to many others, like
Yahoo, Twitter or Facebook and their supported applications running on
various OSes and devices, or Baidu in China, or Mail.ru in Russia) also
want to open up their own sets to offer support to users communicating
from devices/OSes made by other vendors, including in other countries.
There could also be other "killer" apps made available on various OSes and
devices which could benefit from this standardization, such as keyboard
extensions for smartphones/tablets, or sets of generic icons commonly needed
for user interfaces (e.g. the icons that appear in Gmail for rich text
editing or for managing emails and folders; people want to be able to use
similar-looking icons even if their exact design differs in its specifics,
simply because websites and support services will frequently reference them
and people will want to discuss their use in various contexts; the
same is true of typical icons found on popular navigation maps).

People will understand those icons/symbols and will use them because they
understand clearly what they mean in similar kinds of usage. Those symbols
are good candidates for standardization independently of their
site-specific or device-specific look, which can also evolve across
versions (such as the symbols for the buttons at the bottom of Android
displays). Having a standardized character for these evolving icons can
also help application authors describe their own UI and how to use it on a
larger range of devices and versions: users will see the appropriate icon
for their own local device in its current brand and version, but support
pages do not need to be rewritten/modified to show different screenshots.
These visible icons will also work if users have installed a different UI
theme, or if the icons are located somewhere other than where they appear
in basic screenshots made on a few devices in some old versions of their
specific UI. The need for these icons is the same across all these devices
and versions for similar functions. So we have icons/symbols with a similar
"spirit" across a large range of devices for basic functions: e.g. a
telephone handset to place a call, to answer one, or to end a communication.

In fact this is the same kind of thing that has long been used for
icons controlling all audio devices: play, stop, rewind, forward,
pause, power up, power off, enter sleep mode, wake up, mute, volume
up/down, icons for activating/deactivating Wifi or Bluetooth, icons for the
headset or the radio, ejecting a medium, start recording... Look also at a
wide range of TV remote controls. Not all of them use distinctive
glyphs; some are just differentiated by colors, such as the
red/yellow/green/blue buttons used on Teletext remote controls (in my
opinion color is not a requirement, and these could also be buttons with
readable labels in a box, if accessibility is required: this has
recently been standardized for tinting facial emojis by human skin color,
with an interesting proposed alternate representation where the color can
also be represented by a non-ligatured monochromatic glyph).
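
To make the skin tone mechanism mentioned above more concrete, here is a
minimal Python sketch. It assumes the five modifier characters
U+1F3FB..U+1F3FF (part of Unicode 8.0) and uses U+1F466 BOY as an example
base; the helper name tinted_variants is made up for this illustration.
Each variant is simply the base character followed by one modifier: a
renderer that supports the pair may draw a single tinted glyph, while one
that does not can show the base followed by a separate swatch glyph.

```python
import unicodedata

# Example base character (an assumption for this sketch): U+1F466 BOY.
BASE = "\U0001F466"

# The five skin tone modifier characters, U+1F3FB..U+1F3FF.
MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]


def tinted_variants(base):
    """Return the base character followed by each skin tone modifier."""
    return [base + modifier for modifier in MODIFIERS]


for sequence in tinted_variants(BASE):
    # Show the code points of the sequence and the modifier's name.
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in sequence)
    print(code_points, unicodedata.name(sequence[1]))
```

Nothing in the encoded text changes between the two renderings: the same
two code points are stored either way, and only the display differs.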

In all these cases, the demand for these symbols, and their use in various
contexts where they can be tuned locally to match user expectations, are an
excellent reason for standardizing them without breaking their intended
meaning in those specific tuning contexts.

Other interesting sets are those standardized on road signs, or warning
signs on various products. They are frequently international, either
because their meaning is legally binding for road users or because their
informative meaning is about services found almost everywhere in the world
(e.g. luggage disposal, taxis, toilets, showers, hospitals, parking, lunch
places, vehicle categories, tolls on motorways, TV sets, ...). Some of them
are very country-specific (such as the "red carrot" used in France to
signal tobacco retailers). Those symbols are not just present on roads or
on maps but will also be found in published tourism guides, or as
indicators on websites showing coherent sets of generic services, such as
hotels or campsites listing their additional local services.

Those sets are initially inventions, but as soon as their usage expands and
their meaning starts being widely understood in a country, they will leak
to other places with minor variants. But they all have in common that they
were initially not encoded as characters: experimentation and use developed
first, and then came the time for standardizing these variants, more or
less, nationally or internationally. Then they also started being used
in other contexts for which they were not initially meant (e.g. the STOP
sign). They also already had several authorities regulating their use in
specific contexts in which they became mandatory or highly recommended. The
UTC may now encode them: usage is demonstrated, there's stability, and
there's already an authority supporting them and ready to accept their use
by everyone.

2014-12-29 20:46 GMT+01:00 Asmus Freytag <asmusf at ix.netcom.com>:

> On 12/29/2014 10:32 AM, Doug Ewell wrote:
>
>> Asmus Freytag wrote:
>>
>>> The "critical mass" of support is now assumed for currency symbols,
>>> some special symbols like emoji, and should be granted to additional
>>> types of symbols, punctuations and letters, whenever there is an
>>> "authority" that controls normative orthography or notation.
>>>
>>> Whether this is for an orthography reform in some country or addition
>>> to the standard math symbols supported by AMS journals, such external
>>> adoption can signify immediate "critical need" and "critical mass of
>>> option" for the relevant characters.
>>>
>>
>> To me, it is remarkable that the "critical mass of support" argument that
>> is applied, entirely appropriately, to new currency symbols (however
>> misguided the motives for such might be) and math symbols and characters
>> for people's names, is now also applied to BURRITO and UNICORN FACE.
>>
> Does it - in principle - matter what a symbol is used for? If millions
> of happy users choose to communicate by peppering their messages with
> BURRITO and UNICORN FACE is that any less worthy of standardization than if
> thousands (or hundreds) of linguists use some arcane letterform to mark
> pronunciation differences between neighboring dialects on the Scandinavian
> peninsula?
>
> The "critical mass" argument does not (and should not) make value
> judgements, but instead focus on whether the infrastructure exists to make
> a character code widely available pretty much directly after publication,
> and whether there is implicit or explicit demand that would guarantee that
> such code is actually widely used the minute it comes available.
>
> For currency symbols, or for a new letter form demanded by a new or
> revised, but standard, orthography, the demand is created by some
> "authority" creating a requirement for conforming users. Because of that,
> the evaluation of the "critical mass" requirement is straightforward.
>
> Emoji lack an "authority", but they do not lack demand. For better or for
> worse, they have grabbed significant mind share; the number of news
> reports, blogs, social media posts, shared videos and what not that were
> devoted to Emoji simply dwarfs anything reported on currency symbols in a
> comparative time frame. With tracking applications devoted to them, anyone
> can convince themselves, in real time, that the entire repertoire is being
> used, even, as appropriate for such a collection, with a clear
> differentiation by frequency.
>
> Nevertheless, the indication is clear that any emoji that will be added by
> the relevant vendors is going to be used as soon as it comes available.
> Further, as no vendor has a closed ecosystem, to be usable requires
> agreement on how they are coded.
>
> The critical question, and I fully understand that this gives you pause,
> is one of selection. There are hundreds, if not thousands of potential
> additions to the emoji collection, some fear the set is, in principle,
> endless. Lacking an "authority" how does one come to a principled agreement
> on encoding any emoji now, rather than later.
>
> One would run an experiment, which is to say, create an alternate
> environment where users can use non-standard emoji and then the
> Uni-scientists in white lab coats could count the frequency of usage and
> promote the cream off the top to standardized codes.
>
> Or one could run an experiment where one defines a small number of slots,
> say 40, and opens them up for public discussion, and proceeds on that
> basis. Yes, that would turn the UTC into the "authority".
>
> My personal take is that the former approach is inappropriate for
> something that is in high demand and actively supported; the latter I can
> accept, provisionally, as an experiment to try to deal with an evolving
> system. Because of the ability to track, in real time, the use or non-use
> of any of the new additions it would be a true experiment, the outcome of
> which can be accurately measured. If it should lead to the standardization
> of few dozen symbols that prove not as popular as predicted, then we would
> conclude a failure of the experiment, and retire this process. Otherwise,
> I'd have no problem cautiously continuing with it.
>
>> But then, I remember when folks used to cite the WG2 "Principles and
>> Procedures" document for examples of what was and was not a good candidate
>> for encoding. That seems so long ago now.
>>
>
> The P&P, like most by-laws and constitutions, are living documents. In
> this case, they try to capture best practice, without taking from the UTC
> (or WG2) the ability to deal with new or changed situations.
>
> The degree to which emoji have captured the popular imagination is
> unprecedented. It means the game has changed. Let's give the UTC the space
> to work out appropriate coping mechanisms.
>
> A./
>
> PS: this does not mean that, for all other types of code points, the
> existing wording on the P&P can simply be disregarded. In fact, the end
> result will be to see them updated with additional criteria explicitly
> geared towards the kind of high-profile use case we are discussing here.