Does regular Unicode have a character that looks like a space to a human yet is not treated as a space by software please?

Jukka K. Korpela jkorpela at cs.tut.fi
Thu Mar 27 03:42:20 CDT 2014


2014-03-27 10:13, William_J_G Overington wrote:

> Does regular Unicode have a character that looks like a space to a
> human yet is not treated as a space by software please?

It depends, among other things, on what you mean by “space”.

There’s U+00A0 NO-BREAK SPACE, which surely isn’t the same as U+0020 
SPACE, but might be called a space. Programs can do different things to 
different characters.
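
To illustrate how one program can treat the two characters differently depending on a setting, here is a small Python sketch. The function names and the sample string are mine, made up for illustration. In Python 3, `\s` in a string pattern matches U+00A0 by default, but not when the `re.ASCII` flag is given.

```python
import re

def split_unicode(text):
    # Default str patterns: \s matches Unicode whitespace, including U+00A0.
    return re.split(r"\s+", text)

def split_ascii(text):
    # With re.ASCII, \s matches only space, \t, \n, \r, \f and \v.
    return re.split(r"\s+", text, flags=re.ASCII)

sample = "10\u00a0km north"  # hypothetical sample with a NO-BREAK SPACE
print(split_unicode(sample))
print(split_ascii(sample))
```

The first call splits at the no-break space; the second keeps "10" and "km" together as one item.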

> Please consider my use of U+E001 in the following thread.
>
> https://community.serif.com/forum/pageplus/9646/formatting-poetry-for-e-books

As far as I can see, the question is about indenting text in e-books. 
What I do in my e-books is a simple CSS setting, margin-left (or 
padding-left) with a suitable value. There are many other ways too.

Or you could even use a sequence of U+00A0 characters at the start of a 
line. There is no exact definition of what should happen, but in 
practice, HTML user agents, including e-book readers, treat U+00A0 as 
yet another graphic character, which just happens to have an empty 
glyph. Well, they may also seem to be honoring the non-breaking 
property, but this might be just incidental (they generally don’t break 
before or after graphic characters except whitespace characters, and 
U+00A0 is by HTML definition not whitespace).
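
As a rough sketch of why leading U+00A0 characters survive: HTML counts only TAB, LF, FF, CR and SPACE as whitespace, so a routine that collapses just those leaves U+00A0 alone. The Python below is my own simplified imitation of such collapsing, not what any particular user agent actually runs, and the function names are invented.

```python
import re

# Whitespace in the HTML sense: TAB, LF, FF, CR, SPACE. U+00A0 is not included.
HTML_WHITESPACE = "\t\n\f\r "

def collapse_html_whitespace(text):
    # Replace each run of HTML whitespace with one space, then trim the ends.
    collapsed = re.sub(r"[\t\n\f\r ]+", " ", text)
    return collapsed.strip(HTML_WHITESPACE)

def collapse_unicode_whitespace(text):
    # For contrast: str.split() with no argument also splits on U+00A0.
    return " ".join(text.split())

line = "\u00a0\u00a0\u00a0And miles to   go\n"  # hypothetical indented verse line
print(repr(collapse_html_whitespace(line)))
print(repr(collapse_unicode_whitespace(line)))
```

The first function keeps the three leading U+00A0 characters, so the indentation is preserved; the second, which uses a broader idea of whitespace, drops them.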

There are also other characters that can be called “spaces”, such as 
U+2002 EN SPACE. But they have properties similar to the properties of 
U+0020 SPACE, so we can expect some programs to handle them the same way 
as SPACE, in some respect. Sorry for this vagueness, but it reflects the 
vagueness of the question.
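
To make the comparison of properties a bit more concrete, here is a minimal Python sketch that prints a few of them for the characters mentioned in this thread, using the standard `unicodedata` module. The helper name `describe` is made up, and the results depend on the Unicode data shipped with your Python version.

```python
import unicodedata

def describe(ch):
    # Collect a few properties that programs commonly consult.
    return {
        "name": unicodedata.name(ch, "<unnamed>"),  # private use has no name
        "category": unicodedata.category(ch),
        "isspace": ch.isspace(),
        "nfkc_is_space": unicodedata.normalize("NFKC", ch) == " ",
    }

for ch in ("\u0020", "\u00a0", "\u2002", "\ue001"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

SPACE, NO-BREAK SPACE and EN SPACE all come out with General Category Zs, whereas U+E001 is a private use character (Co) that Python does not regard as whitespace.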

Yucca





