UTF-8 display (was: Re: a mug)

Marcel Schneider charupdate at orange.fr
Tue Jul 21 03:46:33 CDT 2015


On 13 Jul 2015, at 11:28, I wrote:

> The only time I saw UTF-8 like on the T-shirt, was when opening UTF-8 files that didn't specify charset=UTF-8. The thing to do was to add the charset in the file header.

Now I see that this issue is much trickier. I have just stumbled on an error page served in place of (or at the URL of) http://www-01.ibm.com/software/globalization/topics/keyboards/physical.jsp where I read:
Our apologies…
while the source as displayed by Firefox shows:
charset=utf-8

Our apologies
(The markup comes from the header 1 tags.)

The trick is that the real HTML file as saved by Zotero contains:

Our apologies…
(with a U+2026)
and is encoded in...
charset=windows-1252

Once I changed this to utf-8, the page displayed correctly:
Our apologies…
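
A minimal sketch of the mismatch described above, assuming the saved file really holds UTF-8 bytes while its header declares windows-1252. The helper names `misdecode` and `repair` are hypothetical, made up for this illustration; only Python's standard codecs are used.

```python
# Hypothetical helpers illustrating a UTF-8 / windows-1252 mismatch.

def misdecode(text: str, declared: str = "windows-1252") -> str:
    """Encode text as UTF-8, then decode the bytes with the (wrong) declared charset."""
    return text.encode("utf-8").decode(declared)


def repair(garbled: str, declared: str = "windows-1252") -> str:
    """Undo the wrong decoding: recover the original bytes and decode them as UTF-8."""
    return garbled.encode(declared).decode("utf-8")


original = "Our apologies\u2026"   # ends with U+2026 HORIZONTAL ELLIPSIS
garbled = misdecode(original)      # the ellipsis becomes three unrelated characters
restored = repair(garbled)

print(garbled)
print(restored)
```

U+2026 is the three bytes E2 80 A6 in UTF-8, and each of those bytes is a separate character in windows-1252, which is where the extra characters come from. The round trip in `repair` is not guaranteed for every string, since a few byte values are unassigned in Python's windows-1252 codec.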

This may be why people are puzzled by UTF-8 to the extent we've seen.

So I would like to apologize to the List, and ask whether anyone could help us identify the real problem (browsers, web editors, or something else) and how to fix it. I don't think it is merely an HTML issue, as it concerns the Unicode Transformation Format.

Best regards,

Marcel