public at khwilliamson.com
Sat Jun 7 23:19:51 CDT 2014
On 06/02/2014 11:00 AM, Shawn Steele wrote:
> To further my understanding, can someone provide examples of how these are used in actual practice? I can't think of any offhand and the closest I get is like the old escape characters to get a dot matrix printer to shift modes, or old word processor internal formatting sequences.
Here's an example of a possible use. Twenty-some years ago I wrote a
front-end to the Unix diff utility. Showing the differences between
files (usually 2 versions of the same program's code) is an extremely
common programming activity. I do it many times a day. One reason is
to try to find out why a bug has crept in. In doing so, there are some
differences that are not relevant to the task at hand, and their being
shown is a significant distraction. For example, in programming, one
might have renamed a variable (identifier) because its purpose has
changed somewhat and the name should accurately reflect its new function
so the reader is not subconsciously misled. It would be nice to be able
to suppress the variable name changes from the difference display.
There could be thousands of them. By changing the name in each file
version to the same noncharacter during the diff, these differences
won't be displayed, and there would not be any possible conflict with
the input files having that noncharacter in them. (For display, the
noncharacter is changed back to the original name in its respective
file.) Further, one might want to ignore the name changes of two
variables: just use a second noncharacter for the second variable, and
so on, up to the 66 noncharacters that exist.
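
A rough sketch of that idea in Python, not the actual front-end: the
function names (mask, diff_ignoring_renames) and the choice of difflib
in place of the Unix diff utility are assumptions made for illustration.
Each renamed identifier is replaced by the same noncharacter in both
versions before comparing, and the output is taken from the original
lines, which has the same effect as mapping the noncharacter back.

```python
import difflib
import re

# The 66 Unicode noncharacters: U+FDD0..U+FDEF plus the last two code
# points of each of the 17 planes (U+FFFE, U+FFFF, U+1FFFE, ...).
NONCHARACTERS = [chr(cp) for cp in range(0xFDD0, 0xFDF0)] + [
    chr(plane * 0x10000 + low)
    for plane in range(17)
    for low in (0xFFFE, 0xFFFF)
]


def mask(lines, names):
    """Replace the i-th name (whole words only) with the i-th noncharacter."""
    masked = []
    for line in lines:
        for name, nonchar in zip(names, NONCHARACTERS):
            line = re.sub(rf"\b{re.escape(name)}\b", nonchar, line)
        masked.append(line)
    return masked


def diff_ignoring_renames(old_lines, new_lines, renames):
    """Diff two versions, hiding lines that differ only by the given renames.

    renames is a list of (old_name, new_name) pairs (hypothetical input
    format); at most 66 pairs can be handled, one per noncharacter.
    """
    if len(renames) > len(NONCHARACTERS):
        raise ValueError("only 66 noncharacters are available")
    old_masked = mask(old_lines, [old for old, _ in renames])
    new_masked = mask(new_lines, [new for _, new in renames])

    matcher = difflib.SequenceMatcher(None, old_masked, new_masked,
                                      autojunk=False)
    output = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Show the unmasked text of each file, not the noncharacters.
        output += ["-" + line for line in old_lines[i1:i2]]
        output += ["+" + line for line in new_lines[j1:j2]]
    return output
```

Called with the two versions as lists of lines and a list such as
[("count", "total")], this reports only the lines that still differ
once the rename is discounted.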
I wrote this long before noncharacters were available. What I do
instead is scan the files for rarely used characters until I find enough
that aren't in the files. For example, U+009F is unlikely to appear.
Scanning the files takes time. This step could be omitted by using
noncharacters, provided they are known to be illegal in the input.
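
For comparison, the scanning step might look something like the sketch
below. The function name find_unused_characters and the candidate list
passed to it are hypothetical; the real program's candidate order is not
given here. The cost is a full pass over every input file, which is the
step that goes away if the placeholders are guaranteed not to occur.

```python
def find_unused_characters(texts, needed, candidates):
    """Return the first `needed` candidates that occur in none of `texts`."""
    used = set()
    for text in texts:
        # One full pass over each file's contents.
        used.update(text)
    free = [c for c in candidates if c not in used]
    if len(free) < needed:
        raise ValueError("not enough unused candidate characters")
    return free[:needed]


# Assumed candidate list: the C1 controls, starting from U+009F.
C1_CONTROLS = [chr(cp) for cp in range(0x9F, 0x7F, -1)]
```

For example, find_unused_characters([old_text, new_text], 2, C1_CONTROLS)
would pick two placeholders, where old_text and new_text stand for the
contents of the two files being compared.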