Why do binary files contain text but text files don't contain binary?

Costello, Roger L. via Unicode unicode at unicode.org
Fri Feb 21 09:53:52 CST 2020


Based on private correspondence, I now realize that this statement:

> Text files do not contain binary

is not correct.



Text files may indeed contain binary (i.e., bytes that are not interpretable as characters). Specifically, text files may contain newlines, tabs, and a few other invisible control codes.
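
One rough way to inspect how such bytes are classified is sketched below. It assumes Python's standard `unicodedata` module, which is not part of this thread, and the helper name `category_of_byte` is made up for illustration. It decodes a single ASCII byte and reports the Unicode general category assigned to the resulting code point.

```python
import unicodedata

def category_of_byte(byte_value: int) -> str:
    """Decode one ASCII byte and return its Unicode general category.

    Hypothetical helper for illustration; only valid for values 0x00-0x7F.
    """
    ch = bytes([byte_value]).decode("ascii")
    return unicodedata.category(ch)

# Newline (0x0A), tab (0x09), and the letter A (0x41)
for value in (0x0A, 0x09, 0x41):
    print(hex(value), category_of_byte(value))
```

In the Unicode character database, "Cc" is the category for control codes and "Lu" is the category for uppercase letters, so the output shows whether a given byte maps to a visible letter or to an invisible control code.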



Question: "characters" are defined as only the visible things, right?



I conclude:



Binary files may contain arbitrary text.

Text files may contain binary, but only a restricted set of binary.



Do you agree?



/Roger


From: Costello, Roger L. <costello at mitre.org>
Sent: Friday, February 21, 2020 7:22 AM
To: unicode at unicode.org
Subject: Why do binary files contain text but text files don't contain binary?

Hi Folks,

There are binary files and there are text files.

Binary files often contain portions that are text. For example, the start of Windows executable files is the text MZ.
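
As a minimal sketch of the MZ example, assuming we already have the leading bytes of a file in memory, the check below compares the first two bytes with the ASCII letters M and Z. The function name `has_mz_signature` and the sample byte strings are invented for illustration and are not taken from any real executable.

```python
def has_mz_signature(data: bytes) -> bool:
    """Return True if the data begins with the two ASCII bytes 'M' and 'Z'."""
    return data[:2] == b"MZ"

# Made-up sample data: the first starts with "MZ", the second does not.
sample_with_mz = b"MZ\x90\x00\x03\x00"
sample_without_mz = b"\x7fELF\x02\x01"

print(has_mz_signature(sample_with_mz))
print(has_mz_signature(sample_without_mz))
```

For a file on disk, the same check could be applied to the result of opening the file in binary mode and reading its first two bytes.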

To the best of my knowledge, text files never contain binary, i.e., bytes that cannot be interpreted as characters. (Of course, text files may contain a text-encoding of binary, such as base64-encoded text.)
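
The distinction drawn here can be sketched as follows, under the assumption that "interpretable as characters" means "decodes without error as UTF-8" (other encodings would give other answers). The helper name `is_valid_utf8` and the sample bytes are hypothetical. The sketch also shows that the base64 encoding of the same bytes does decode as text.

```python
import base64

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without error."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Made-up sample: 0xFF can never appear in well-formed UTF-8.
raw = b"\xff\xfe\x00"
encoded = base64.b64encode(raw)  # base64 output uses only ASCII characters

print(is_valid_utf8(raw))
print(is_valid_utf8(encoded))
```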

Why the asymmetry?

/Roger