Tag characters and in-line graphics (from Tag characters)

Chris idou747 at gmail.com
Tue Jun 2 21:09:17 CDT 2015

> On 3 Jun 2015, at 11:22 am, Martin J. Dürst <duerst at it.aoyama.ac.jp> wrote:
> On 2015/05/29 11:37, John wrote:
>> If I had a large document that reused a particular character thousands of times,
> Then it would be either a very boring document (containing almost only that same character) or it would be a very large document.

If you have a daughter, look at her Facebook Messenger, and then get back to me.

>> would this HTML markup require embedding that character thousands of times, or could I define the character once at the beginning of the sequence, and then refer back to it in a space efficient way?
> If you want space efficiency, the best thing to do is to use generic compression. Many generic compression methods are available, many of them are widely supported, and all of them will be dealing with your case in a very efficient way.

You can’t ask the entire computing universe to compress everything all the time, and that is what your comment amounts to. The whole point under discussion is how we can encode things so that they can be moved universally between different documents, formats, applications, input fields and platforms without any massaging.
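
To put rough numbers on the compression point, here is a minimal Python sketch. The message body (one emoji repeated 5,000 times) is a made-up example, and zlib from the standard library merely stands in for "generic compression"; neither comes from the thread.

```python
import zlib

# Hypothetical message: one emoji (U+1F600) repeated 5,000 times.
text = "\U0001F600" * 5000

# In UTF-8 this character takes 4 bytes, so the raw size is 4 * 5000.
raw = text.encode("utf-8")

# Generic compression with default settings.
packed = zlib.compress(raw)

print("raw bytes:       ", len(raw))
print("compressed bytes:", len(packed))

# The round trip is lossless.
assert zlib.decompress(packed).decode("utf-8") == text
```

The compressed form is tiny, but only a receiver that knows to decompress it can read it, which is the interoperability question being argued here.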

>> Given that it's been agreed that private use ranges are a good thing,
> That's not agreed upon. I'd say that the general agreement is that the private ranges are of limited usefulness for some very limited use cases (such as designing encodings for new scripts).

They are of limited usefulness precisely because it is pathologically hard to make use of them in their current state of technological evolution. If they were easy to make use of, people would be using them all the time. I’d bet good money that if you surveyed a lot of applications where custom characters are being used, they are not using private use ranges. Now why would that be?
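
As a small illustration of why private use code points are awkward to exchange, the sketch below uses Python's standard unicodedata module. The helper name is_private_use and the sample characters are my own choices for illustration. A private use code point has general category Co and no standard name, so a receiving application learns nothing about its meaning from the code point alone.

```python
import unicodedata

def is_private_use(ch: str) -> bool:
    """Return True if ch lies in a private use range (general category Co)."""
    return unicodedata.category(ch) == "Co"

# Sample characters chosen for illustration: a letter, a BMP private use
# code point, a plane 15 private use code point, and an ordinary emoji.
samples = ["A", "\uE000", "\U000F0000", "\U0001F600"]

for ch in samples:
    # Private use code points have no standard name; fall back to a marker.
    name = unicodedata.name(ch, "<no standard name>")
    print(f"U+{ord(ch):04X}  private_use={is_private_use(ch)}  name={name}")
```

Any meaning for those code points has to come from a private agreement outside the text itself, such as a shared font or a side-channel convention.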

>> and given that we can agree that exchanging data is a good thing,
> Yes, but there are many other ways to do that besides Unicode. And for many purposes, these other ways are better suited.

The point is to have a universally recognised way. Of course you, I or anybody else could design many good ways to solve any problem we might come up with. That doesn’t mean any of them will interoperate with anybody else's, though.

>> maybe something should bring those two things together. Just a thought.
> Just a 'non sequitur'.
> Regards,   Martin.
