Encoding colour (from Re: Encoding italic)

Philippe Verdy via Unicode unicode at unicode.org
Mon Feb 11 06:19:53 CST 2019

On Sun, 10 Feb 2019 at 02:33, wjgo_10009 at btinternet.com via Unicode <
unicode at unicode.org> wrote:

> Previously I wrote:
> > A stateful method, though which might be useful for plain text streams
> > in some applications, would be to encode as characters some of the
> > glyphs for indicating colours and the digit characters to go with them
> > from page 5 and from page 3 of the following publication.
> > http://www.users.globalnet.co.uk/~ngo/locse027.pdf
> Thinking about this further, for this application copies of the glyphs
> could be redesigned so as to be square and could be emoji-style and the
> meanings of the characters specifying which colour component is to be
> set could be changed so that they refer to the number previously entered
> using one or more of the special digit characters. Thus the setting of
> colour components could be done in the same reverse notation way that
> the FORTH computer language works.

FORTH is not relevant to this discussion. In any case, Forth is a
stack-based language, similar to PostScript, and works like calculators
that use reverse Polish notation: the operands are pushed from left to
right, and then the operator pops them in reverse order, from right to
left, before pushing its result back on the stack (so "a/b/c" becomes
"/a get /b get div /c get div"). A color is just another operator, like
"rgb(r,g,b)", so the natural order in a stack-based language should
likewise be "/r get /g get /b get rgb".
Note that C/C++ (with the C calling conventions) usually uses a different
order for its stack, pushing parameters from right to left. The exception
is parameters passed in dedicated registers in a fixed order: the first
parameter from the right that fits in a register is then passed not on the
stack but in the "main" accumulator register, possibly in a pair of
registers for long integers or long pointers, or in a different register
for floating-point values if floating-point registers are used.
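
As a rough illustration of the operand order described above, here is a
minimal Python sketch of a postfix evaluator. It is not Forth or
PostScript; the function name eval_rpn, the token names and the sample
values are all made up for this example.

```python
def eval_rpn(tokens, env):
    """Evaluate a list of postfix tokens using an explicit operand stack."""
    stack = []
    for tok in tokens:
        if tok == "div":
            right = stack.pop()  # popped first: the rightmost operand
            left = stack.pop()
            stack.append(left / right)
        elif tok == "rgb":
            b = stack.pop()      # operands come off in reverse: b, g, r
            g = stack.pop()
            r = stack.pop()
            stack.append((r, g, b))
        else:
            stack.append(env[tok])  # an operand name: push its value
    return stack.pop()


# "a/b/c" written in postfix order: a b div c div
quotient = eval_rpn(["a", "b", "div", "c", "div"], {"a": 24, "b": 4, "c": 3})

# "rgb(r,g,b)" written in postfix order: r g b rgb
colour = eval_rpn(["r", "g", "b", "rgb"], {"r": 255, "g": 128, "b": 0})

print(quotient)  # 2.0
print(colour)    # (255, 128, 0)
```

The operands are written in the same left-to-right order as in the infix
form; only the operator moves to the end.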

There is no standard for the order of parameters in stack-based languages.
It is arbitrary and specific to each language, or to specific
implementations of it. So if you want to create your own scripting language
to support your non-standard extension, you can choose any order you want,
but that will still not define a standard for other languages, which have
never been bound to a specific evaluation/encoding order. And do not
pretend it will be part of the Unicode standard, which is not a scripting
language and does not offer an "ABI" for stateful encodings with
arbitrarily long contexts. Unicode has placed very low limits on the
maximum length of lookahead needed to process text; your extension would
not work within these reasonable limits, so it will have limited private
use and cannot be part of TUS.
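
To make the "arbitrarily long context" problem concrete, here is a minimal
Python sketch of such a stateful scheme. Everything in it is an assumption
made for illustration: the private-use code points chosen for the digits
(U+E000..U+E009) and for the three component setters (U+E010..U+E012), and
the helper names colour_at and digits. It is not the proposal's actual
encoding.

```python
DIGIT_BASE = 0xE000                            # hypothetical: U+E000..U+E009 are digits 0-9
SET_R, SET_G, SET_B = 0xE010, 0xE011, 0xE012   # hypothetical component setters


def colour_at(text, index):
    """Return the (r, g, b) state in force at text[index].

    The scheme is stateful, so the answer is only known after replaying
    every control character from the start of the stream.
    """
    colour = {"r": 0, "g": 0, "b": 0}
    number = 0
    for ch in text[:index]:
        cp = ord(ch)
        if DIGIT_BASE <= cp <= DIGIT_BASE + 9:
            number = number * 10 + (cp - DIGIT_BASE)  # accumulate the operand
        elif cp in (SET_R, SET_G, SET_B):
            key = {SET_R: "r", SET_G: "g", SET_B: "b"}[cp]
            colour[key] = number  # postfix: the setter consumes the number
            number = 0
    return (colour["r"], colour["g"], colour["b"])


def digits(n):
    """Spell a non-negative integer with the hypothetical digit characters."""
    return "".join(chr(DIGIT_BASE + int(d)) for d in str(n))


prefix = digits(255) + chr(SET_R) + digits(128) + chr(SET_G)
text = prefix + "x" * 10000 + "y"

print(colour_at(text, len(text) - 1))      # (255, 128, 0)
print(colour_at(text[len(prefix):], 10000))  # (0, 0, 0)
```

The colour of the final "y" depends on characters more than 10000
positions earlier, and a substring that loses the prefix silently loses
the colour as well.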

You may create your "proof of concept" (tested on limited configurations),
but it will just be private.

[And so it should use the PUA for full compatibility and not abuse the other
standardized code points, as your extension would not be compatible with or
conform to the existing rules and limits without amending them, discussing
at length how existing conforming applications can be adapted, and
analyzing the effects if they are not updated. Getting this extension
approved is another matter: it will need to go through the standard process
to be added to the proposals schedule, pass through the two technical
committees, pass the alpha and beta phases, and then the prepublication.
You'll also need to work on the documentation and fix the many quirks found
in it, and then you'll need supporters to pass the vote. If you're not a
UTC member or an ISO member, you will never be able to vote for it
yourself: you will need to convince the voters by listening to their
remarks and refining your specification to match their wishes, and
probably by splitting your proposal into several parts or limiting your
initial goals, leaving the other problematic points for later. What remains
"stable" in your proposal may then not be usable in practice without the
additional extensions still under discussion, and that subset may in fact
stay in the encoding queue for years, until it reaches a point where it
starts being usable for practical problems. Before that, you'll have to
experiment with private use, and you should be ready to accept competing
proposals that are not compatible with yours, and to learn from them in
order to reach an acceptable consensus. Reaching that consensus is the
longest step: initially, most voters will not decide for or against your
proposal if they are not confident enough about the merit of each proposal,
because they want to preserve reasonable compatibility across TUS versions
and with existing applications without adding further problems, notably in
terms of confusability/security. But don't ask them to break the existing
stability rules, which were even harder to formalize: these rules are the
foundation that allowed TUS/ISO 10646 to become a successful worldwide
standard, with lots of applications using it without much trouble and with
much more benefit than the older, non-interoperable legacy encodings.]