Unicode philosophy - technical symbols
Jim DeLaHunt
list+unicode at jdlh.com
Sat Oct 28 04:26:21 CDT 2023
On 2023-10-28 00:14, William_J_G Overington via Unicode wrote:
> Asmus Freytag wrote as follows.
>
>
> > The fact that a symbol is cataloged in some list is itself not
> sufficient reason to consider it a text element in plain text. Which
> would be a necessary requirement for encoding.
>
>
> Yet it is not just "some list", it is an ISO/IEC list.
>
>
> Yet why is considering a symbol as a text element in plain text a
> necessary requirement for encoding? Apart from that rule being the
> existing rule that was made at sometime in the past, possibly under
> different circumstances than those that exist now.
>
It makes perfect sense to me that the Unicode standard exercises
restraint in its role in the ecosystem. And being a "plain text
encoding" seems like a very helpful kind of restraint.
>
>
> Is that rule limiting progress?
>
Most such decisions are tradeoffs. Most such decisions occur in the
context of an ecosystem which includes fonts, text layout software,
shaping engines, input methods, operating systems, user comprehension,
and more. Most decisions impose various costs on various parts of the
ecosystem. So the question to ask is, will such an addition, in context,
lead to benefits which outweigh the costs, and have an advantage over
alternatives?
> Suppose please, for example, that someone is using a desktop
> publishing program to produce a document, an instruction manual for a
> piece of equipment, the document initially stored in a proprietary
> file format, with the person intending to export the text in a PDF
> document.
>
>
> One frameful of text may perhaps start with "Please consider the
> symbol in Figure 1 ..." and another frameful of text may show the
> symbol together with a text caption and text stating that it is Figure 1.
>
I think that this is a weak example, because a desktop publishing
program has an alternative way to display the symbol in Figure 1: as a
graphic. And the PDF format has comprehensive ways to represent graphics
as well as text.

In North America, there used to be a brand of hand tools called
Craftsman. They had a lifetime unconditional replacement guarantee. The
joke used to be that "any Craftsman tool can be used as a hammer". If
you broke your Craftsman screwdriver while banging in nails with the
handle, get it replaced. By the same token, these discussions make me
think that some people believe that any mark on a page should be made
using character and text mechanisms, rather than graphics or other
mechanisms which might be more appropriate.
> Is it reasonable that the symbol is encoded into Unicode as a
> character, notwithstanding that it is not actually in a run of text
> characters? Plane 5 is currently empty, why not use it?
>
IMHO, no, it is not reasonable. It is a small benefit, given the easy
alternative of representing symbols as graphics, and that good layout
tools can embed graphics in runs of text. Lack of Unicode scalar values
is not the constraint. The discussion of encoding this symbol, and all
the infinity of other symbols which can be justified the same way, has
an opportunity cost in UTC decision bandwidth. The benefit of encoding
the symbol is not unlocked until font makers add the symbol to their
fonts, and users update the fonts in their systems. There is a burden to
font makers to add the symbol, to shaping engines to handle the symbol,
to input methods to find a way to input that symbol, to users to learn
that this symbol exists, and so on. And many users will not learn that
the symbol exists, so will get no benefit.

Put down your Craftsman screwdriver, and learn to use the hammer to
drive nails.
>
> William Overington
>
> Saturday 28 October 2023
>
>
>
--
--Jim DeLaHunt, jdlh at jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant, Vancouver, B.C., Canada