Unicode philosophy - technical symbols

Jim DeLaHunt list+unicode at jdlh.com
Sat Oct 28 04:26:21 CDT 2023


On 2023-10-28 00:14, William_J_G Overington via Unicode wrote:
> Asmus Freytag wrote as follows.
>
>
> > The fact that a symbol is cataloged in some list is itself not
> > sufficient reason to consider it a text element in plain text. Which
> > would be a necessary requirement for encoding.
>
>
> Yet it is not just "some list", it is an ISO/IEC list.
>
>
> Yet why is considering a symbol as a text element in plain text a 
> necessary requirement for encoding? Apart from that rule being the 
> existing rule that was made at some time in the past, possibly under 
> different circumstances than those that exist now.
>
It makes perfect sense to me that the Unicode standard exercises 
restraint in its role in the ecosystem. And being a "plain text 
encoding" seems like a very helpful kind of restraint.
>
>
> Is that rule limiting progress?
>
Most such decisions are tradeoffs. Most such decisions occur in the 
context of an ecosystem which includes fonts, text layout software, 
shaping engines, input methods, operating systems, user comprehension, 
and more. Most decisions impose various costs on various parts of the 
ecosystem. So the question to ask is, will such an addition, in context, 
lead to benefits which outweigh the costs, and have an advantage over 
alternatives?


> Suppose please, for example, that someone is using a desktop 
> publishing program to produce a document, an instruction manual for a 
> piece of equipment, the document initially stored in a proprietary 
> file format, with the person intending to export the text in a PDF 
> document.
>
>
> One frameful of text may perhaps start with "Please consider the 
> symbol in Figure 1 ..." and another frameful of text may show the 
> symbol together with a text caption and text stating that it is Figure 1.
>
I think that this is a weak example, because a desktop publishing 
program has an alternative way to display the symbol in Figure 1: as a 
graphic. And the PDF format has comprehensive ways to represent graphics 
as well as text.


In North America, there used to be a brand of hand tools called 
Craftsman. They had a lifetime unconditional replacement guarantee. The 
joke used to be that "any Craftsman tool can be used as a hammer". If 
you broke your Craftsman screwdriver while banging in nails with the 
handle, you could get it replaced. By the same token, these discussions 
make me think that some people believe that any mark on a page should 
be made using character and text mechanisms, rather than graphics or 
other mechanisms which might be more appropriate.


> Is it reasonable that the symbol is encoded into Unicode as a 
> character, notwithstanding that it is not actually in a run of text 
> characters? Plane 5 is currently empty, why not use it?
>
IMHO, no, it is not reasonable. The benefit is small, given the easy 
alternative of representing symbols as graphics, and given that good 
layout tools can embed graphics in runs of text. Lack of Unicode scalar 
values is not the constraint. The discussion of encoding this symbol, 
and the infinity of other symbols which can be justified the same way, 
has an opportunity cost in UTC decision bandwidth. The benefit of 
encoding the symbol is not unlocked until font makers add the symbol to 
their fonts, and users update the fonts in their systems. There is a 
burden on font makers to add the symbol, on shaping engines to handle 
the symbol, on input methods to find a way to input that symbol, on 
users to learn that this symbol exists, and so on. And many users will 
not learn that the symbol exists, so will get no benefit.


Put down your Craftsman screwdriver, and learn to use the hammer to 
drive nails.

>
> William Overington
>
> Saturday 28 October 2023
>
>
>
-- 
.   --Jim DeLaHunt,jdlh at jdlh.com      http://blog.jdlh.com/  (http://jdlh.com/)
       multilingual websites consultant, Vancouver, B.C., Canada