Does regular Unicode have a character that looks like a space to a human yet is not treated as a space by software please?
Jukka K. Korpela
jkorpela at cs.tut.fi
Thu Mar 27 03:42:20 CDT 2014
2014-03-27 10:13, William_J_G Overington wrote:
> Does regular Unicode have a character that looks like a space to a
> human yet is not treated as a space by software please?
It depends, among other things, on what you mean by “space”.
There’s U+00A0 NO-BREAK SPACE, which surely isn’t the same as U+0020
SPACE, but might be called a space. Programs can do different things to it.
> Please consider my use of U+E001 in the following thread.
As far as I can see, the question is about indenting text in e-books.
What I do in my e-books is a simple CSS setting, margin-left (or
padding-left) with a suitable value. There are many other ways too.
Or you could even use a sequence of U+00A0 characters at the start of a
line. There is no exact definition of what should happen, but in
practice, HTML user agents, including e-book readers, treat U+00A0 as
yet another graphic character, which just happens to have an empty
glyph. Well, they may also seem to be honoring the non-breaking
property, but this might be just incidental (they generally don’t break
before or after graphic characters except whitespace characters, and
U+00A0 is by HTML definition not whitespace).
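As a rough illustration of how differently programs can treat U+00A0, here
is a minimal Python sketch. It assumes the ASCII whitespace set listed in the
current HTML standard (TAB, LF, FF, CR, SPACE); the helper names
is_html_whitespace, indent_with_nbsp and html_style_split are made up for
this example and do not come from any library. It shows only what an
HTML-style whitespace test and Python's own string methods do, not what any
particular e-book reader does.

```python
import re
import unicodedata

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# ASCII whitespace as listed in the current HTML standard (assumption stated
# in the text above): TAB, LF, FF, CR and SPACE. U+00A0 is not in this set.
HTML_WHITESPACE = frozenset("\t\n\f\r ")


def is_html_whitespace(ch: str) -> bool:
    """Return True if ch is one of the HTML ASCII whitespace characters."""
    return ch in HTML_WHITESPACE


def indent_with_nbsp(line: str, width: int) -> str:
    """Hypothetical helper: prefix a line with `width` NO-BREAK SPACEs."""
    return NBSP * width + line


def html_style_split(text: str) -> list:
    """Split on runs of HTML ASCII whitespace only, leaving U+00A0 alone."""
    return [part for part in re.split(r"[\t\n\f\r ]+", text) if part]


if __name__ == "__main__":
    line = indent_with_nbsp("First line", 4)
    print(repr(line))
    print(is_html_whitespace(NBSP))    # HTML-style whitespace test
    print(NBSP.isspace())              # Python's Unicode-aware test
    print(unicodedata.category(NBSP))  # general category of U+00A0
    print(html_style_split(line))      # indentation survives this split
    print(line.split())                # Python's split() discards it
```

The point of the comparison is that the same character is kept as ordinary
content by the HTML-style split but treated as a separator by Python's
str.split(), so "not treated as a space by software" depends on which
software is asked.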
There are also other characters that can be called “spaces”, such as
U+2002 EN SPACE. But they have properties similar to the properties of
U+0020 SPACE, so we can expect some programs to handle them the same way
as SPACE, in some respect. Sorry for this vagueness, but it reflects the
vagueness of the question.
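To make "similar properties" a little more concrete, the following sketch
prints what Python's unicodedata module and str.isspace() report for the
characters mentioned in this thread. The function name describe and the
CANDIDATES list are only illustrative, and the output reflects one
programming language's view of the Unicode data, not a general rule for all
software.

```python
import unicodedata

# Characters mentioned in the thread: SPACE, NO-BREAK SPACE, EN SPACE,
# and the private-use code point U+E001.
CANDIDATES = ["\u0020", "\u00a0", "\u2002", "\ue001"]


def describe(ch: str) -> dict:
    """Collect a few Unicode properties of a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        # Private-use characters have no name, so supply a default.
        "name": unicodedata.name(ch, "(no name)"),
        "category": unicodedata.category(ch),
        "isspace": ch.isspace(),
    }


if __name__ == "__main__":
    for ch in CANDIDATES:
        info = describe(ch)
        print(info["code_point"], info["name"], info["category"], info["isspace"])
```

In this view, the three space characters share the general category Zs
(space separator), while U+E001 is a private-use character (category Co)
that Python does not classify as whitespace.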