Unicode "no-op" Character?

Tue Jul 2 23:08:59 CDT 2019

I don’t think you understood me at all. I can packetize a string with any character that is guaranteed not to appear in the text. Suggestions of TAB or EQUALS don’t even meet that simple criterion; they often appear in text. They require some kind of special escaping mechanism.
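To make the escaping cost concrete, here is a minimal sketch of what using TAB as the packet boundary would involve. The helper names `escape` and `unescape` and the backslash escaping scheme are assumptions for illustration only; nothing in this thread specifies them.

```python
def escape(text: str) -> str:
    """Escape backslashes first, then literal TABs, so TAB is free to act as a boundary."""
    return text.replace("\\", "\\\\").replace("\t", "\\t")


def unescape(escaped: str) -> str:
    """Reverse escape(): a backslash followed by 't' is a TAB, otherwise it yields the next character."""
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\" and i + 1 < len(escaped):
            nxt = escaped[i + 1]
            out.append("\t" if nxt == "t" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical payloads; the first contains a real TAB and a literal backslash.
packets = ["a\tb\\t", "plain text"]
wire = "\t".join(escape(p) for p in packets)       # a raw TAB now only ever means "boundary"
decoded = [unescape(p) for p in wire.split("\t")]
```

Every producer and every consumer of the string has to agree on this scheme, which is exactly the extra machinery a character guaranteed not to appear in the text would avoid.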

But assume my string uses a chosen character to mark packet boundaries. Before I send it out, I want to show the string to the user. I can’t just throw it into a display method; the user would see TABs or EQUALS signs or UNKNOWN GLYPHs all over the place. I don’t want that. So now I have to make a new copy of the string with my special boundary character removed, then display that copy. Or I could keep the original string, from before I added the packet boundaries, but only if I predict or assume ahead of time that I will need to display it, which in reality I might not. Either way, that means two copies of the string, one of which might be a waste. More code. More processing.
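A rough sketch of that workflow, assuming U+001F (INFORMATION SEPARATOR ONE) as the boundary character and a fixed chunk size; both choices, and the names `packetize` and `for_display`, are illustrative assumptions rather than anything proposed in this thread.

```python
BOUNDARY = "\u001f"  # assumption: this character never occurs in the payload


def packetize(text: str, size: int) -> str:
    """Insert BOUNDARY between consecutive chunks of `size` code points."""
    if BOUNDARY in text:
        raise ValueError("payload already contains the boundary character")
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    return BOUNDARY.join(chunks)


def for_display(marked: str) -> str:
    """Build a second copy of the string with every boundary removed."""
    return marked.replace(BOUNDARY, "")


marked = packetize("hello world", 4)   # the copy that goes out on the wire
shown = for_display(marked)            # the extra copy made only for the user
```

The call to `for_display` is the extra copy and extra pass over the string being complained about here; a character that display code already ignored would make it unnecessary.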

I can do all that. But why?

This thread is about a tool for convenience. I don’t “need” it, in the sense that a task is insoluble without it. I’m a programmer, I know how to code. I “want” it, because a tool like that would make some tasks much faster and simpler. Your proposed solution doesn’t.

From: Philippe Verdy [mailto:verdy_p at wanadoo.fr] 
Sent: Saturday, June 29, 2019 15:47
To: Sławomir Osipiuk
Cc: Shawn Steele; unicode Unicode Discussion
Subject: Re: Unicode "no-op" Character?

If you want to "packetize" arbitrarily long Unicode text, you don't need any new magic character. Just prepend your packet with a base character used as a syntactic delimiter, one that does not combine with what follows in any normalization.
