Ecma-48 proposed styling controls update updated & math expression representation proposal update

Asmus Freytag asmusf at ix.netcom.com
Fri Jan 12 17:58:48 CST 2024


ECMA-48 is not plain text. It is a form of markup that uses syntax 
characters other than those from the printable ASCII range, but that's 
about the only distinction.

It's different from a true binary format as well, which would use things 
like addresses and lengths to mark the location of text runs and styling 
info. Instead, like any other markup, it uses character codes inserted 
into the data stream.
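
To make the contrast concrete, here is a minimal sketch that converts an in-band ECMA-48 stream into the out-of-band shape a binary format would use: plain text plus (offset, length, style) runs. The function name to_runs is made up for illustration. It assumes only the 7-bit "ESC [" form of CSI, and it records the parameters of the most recent SGR sequence rather than tracking cumulative attribute state.

```python
import re

# CSI sequence in 7-bit form: ESC [ , parameter bytes 0x30-0x3F,
# intermediate bytes 0x20-0x2F, one final byte 0x40-0x7E.
CSI = re.compile(r"\x1b\[([0-9:;<=>?]*)[ -/]*[@-~]")


def to_runs(stream):
    """Return (plain_text, runs); each run is (offset, length, sgr_params).

    Hypothetical helper: the style of a run is simply the parameter
    string of the last SGR ("m") sequence seen, "" if there was none.
    """
    pieces, runs = [], []
    pos = 0        # read position in the in-band stream
    out_len = 0    # length of plain text emitted so far
    current = ""   # parameters of the most recent SGR sequence
    run_start = 0  # offset in the plain text where the current run began

    def close_run():
        if out_len > run_start:
            runs.append((run_start, out_len - run_start, current))

    for m in CSI.finditer(stream):
        chunk = stream[pos:m.start()]
        pieces.append(chunk)
        out_len += len(chunk)
        if m.group(0).endswith("m"):  # SGR: a style change starts a new run
            close_run()
            current = m.group(1)
            run_start = out_len
        pos = m.end()                 # other CSI sequences are just dropped

    tail = stream[pos:]
    pieces.append(tail)
    out_len += len(tail)
    close_run()
    return "".join(pieces), runs
```

The in-band form carries the same information as the runs; it just stores it at the point of use instead of in a side table.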

Now that we have that out of the way, let's look at the clipboard.

The clipboard contains both data and metadata. Telling a recipient 
that the data is in HTML format lets it display the data as rich text, 
instead of as HTML source. The same is true for RTF or ECMA-48.

The same data can be present in multiple formats on the clipboard. 
That's what's behind the ability to paste "just the text" from a copied 
section, discarding the styling.

Logically, for that to work, either the sender or the recipient of the 
clipboard data must understand what the "just the text" part of the data 
represents and how to discard the styling. It's been too long, but from 
what I remember, it was the sender that had the option of offering 
multiple formats and the recipient could pick any that it understood.

That's the only logical approach, because only the sender can be assumed 
to know the format the data is in. The receiver could do post-processing 
only on data formats already known to it.
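
As a rough sketch of that negotiation, assume the clipboard is a mapping from format name to data, filled in by the sender, and the recipient walks its own preference list. The function pick_format and the format name "text/x-ecma48" are hypothetical; no real clipboard API is being modeled here.

```python
def pick_format(offered, understood):
    """Return (name, data) for the recipient's most preferred offered format.

    offered    -- dict of format name -> data, as published by the sender
    understood -- format names the recipient can handle, best first
    Returns None when the two sides have no format in common.
    """
    for name in understood:
        if name in offered:
            return name, offered[name]
    return None


# The sender knows its own data, so it publishes every rendering it can make.
offered = {
    "text/html": "<b>hi</b>",
    "text/x-ecma48": "\x1b[1mhi\x1b[0m",  # hypothetical format name
    "text/plain": "hi",
}
```

A recipient that knows only RTF and plain text would call pick_format(offered, ["text/rtf", "text/plain"]) and get the plain-text rendering, never seeing the ECMA-48 bytes at all.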

Your ECMA-48 terminal app would presumably want to offer both the 
ECMA-48 stream, with suitable metadata defining it as such, as well as 
a plain-text stream, which discards the styling.
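
A minimal sketch of what such a sender might publish, assuming the selection holds only 7-bit "ESC [" control sequences. The function clipboard_payload and the format name "text/x-ecma48" are invented for illustration.

```python
import re

# Any 7-bit CSI sequence: ESC [ , parameter bytes 0x30-0x3F,
# intermediate bytes 0x20-0x2F, final byte 0x40-0x7E.
CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def clipboard_payload(selection):
    """Build both renderings of the selection on the sending side."""
    return {
        "text/x-ecma48": selection,            # hypothetical format name
        "text/plain": CSI.sub("", selection),  # same text, styling discarded
    }
```

The stripping happens in the sender because it is the one party known to understand the syntax; a recipient asking for plain text gets it without needing to know ECMA-48 exists.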

For nested styling syntax I don't know whether sending applications 
would perform an "auto close" of any open styling commands when 
packaging up the selected text, or whether that would be done by the 
receiving app, assuming it understands the format. The problem of how 
to handle selection at the boundary of a style run, when the style 
commands themselves are not visible to the user, is the same for markup 
languages as for ECMA-48.
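
If the sender were the one to do it, the packaging step might look like the sketch below. This is one possible policy, not a description of what any existing application does. The function package_selection is hypothetical, it assumes the selection offsets never split a control sequence, and it tracks only the most recent SGR sequence instead of the full cumulative attribute state.

```python
import re

SGR = re.compile(r"\x1b\[([0-9;]*)m")  # 7-bit SGR sequences only
RESET = "\x1b[0m"


def _last_style(matches, initial=""):
    """Return the last non-reset SGR sequence among matches, "" after a reset."""
    style = initial
    for m in matches:
        style = "" if m.group(1) in ("", "0") else m.group(0)
    return style


def package_selection(stream, start, end):
    """Copy stream[start:end] so that it is self-contained.

    Re-opens the style that was active at `start` (its SGR sequence lies
    outside the selection) and appends a reset if a style is still open
    at `end`.
    """
    active = _last_style(SGR.finditer(stream, 0, start))
    body = stream[start:end]
    still_open = _last_style(SGR.finditer(body), initial=active)
    return active + body + (RESET if still_open else "")
```

Selecting only the word "text" out of "a", CSI 31m, "red text", CSI 0m, "z" would then yield CSI 31m, "text", CSI 0m, even though neither sequence was inside the user's visible selection.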

Nothing new to see here, move right along.

A./

On 1/12/2024 3:26 PM, Marius Spix via Unicode wrote:
> Applications like Word or web browsers are able to preserve formatting
> by using rich text formats like HTML or RTF in the clipboard. ECMA-48
> proposed styling controls work on the plaintext layer, independently
> from the application, as long as the renderer (e.g. Uniscribe or HarfBuzz)
> supports them. That would require the clipboard handler of the
> operating system to be aware of these sequences.
>
>
> On Fri, 12 Jan 2024 22:08:19 +0000,
> Doug Ewell <doug at ewellic.org> wrote:
>
>> Eli Zaretskii wrote:
>>
>>> Sorry, I'm probably missing something, because I don't see the
>>> relevance.  My point is that copy/paste through the clipboard uses
>>> formats that are not plain text, and encode the styles and typefaces
>>> by using methods that are not compatible with plain text.
>> I think Marius will have to address what he meant, as you and I are
>> talking past each other.
>>
>> If ECMA-48 markup is part of the plain-text stream, and it is copied
>> from one app to another in a plain-text Clipboard, then all of the
>> ECMA-48 sequences should survive the transit.
>>
>>>> Alternatively, why is the stated user-experience problem for
>>>> ECMA-48 not a problem for Word?
>>> I thought I answered that?  Or what do you mean by "user
>>> experience"?
>> That question was semi-rhetorical, and was for Marius, who again will
>> need to respond. I thought he was talking about the human user trying
>> to select text to be copied, and inadvertently failing to select a
>> starting or ending ECMA-48 sequence because they are not
>> human-visible.
>>
>>> If pasting between applications, the answer is again clipboard
>>> format that is not plain text.  If you copy plain text, the
>>> formatting is lost.
>> Wait: are we saying that ECMA-48 sequences like CSI 31m are plain
>> text, or that they are not?
>>
>> --
>> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
>>
>