Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8
Alastair Houghton via Unicode
unicode at unicode.org
Tue May 16 11:13:33 CDT 2017
On 16 May 2017, at 17:07, Hans Åberg <haberg-1 at telia.com> wrote:
>>>> HFS(+), NTFS and VFAT long filenames are all encoded in some variation on UCS-2/UTF-16. ...
>>> The filesystem directory is using octet sequences and does not bother passing over an encoding, I am told. Someone may remember one that used UTF-16 directly, but I think it may not be current.
>> No, that’s not true. All three of those systems store UTF-16 on the disk (give or take).
> I am not speaking about what they store, but how the filesystem identifies files.
Well, quite clearly none of those systems treat the UTF-16 strings as binary either - they’re case insensitive, so how could they? HFS+ even normalises strings using a variant of a frozen version of the normalisation spec.
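To illustrate the point, here is a minimal Python sketch of why a case-insensitive, normalising filesystem cannot identify files by the raw code units alone. It is not the actual HFS+, NTFS or VFAT algorithm: `binary_key` and `folded_key` are hypothetical helpers, and current NFD plus `str.casefold()` stand in for whatever frozen normalisation variant and case tables a real filesystem uses.

```python
import unicodedata


def binary_key(name: str) -> bytes:
    """Identify a file by its raw UTF-16 code units (pure binary comparison)."""
    return name.encode("utf-16-le")


def folded_key(name: str) -> str:
    """Identify a file after decomposition and case folding.

    Assumption: NFD + casefold approximates a case-insensitive,
    normalising lookup; real filesystems use their own tables.
    """
    decomposed = unicodedata.normalize("NFD", name)
    # Re-normalise after folding, since folding can change the decomposition.
    return unicodedata.normalize("NFD", decomposed.casefold())


# Same name as a user sees it, spelled two different ways.
a = "\u00c5berg.txt"      # precomposed A WITH RING ABOVE
b = "a\u030aBERG.TXT"     # 'a' + COMBINING RING ABOVE, different case

print(binary_key(a) == binary_key(b))   # compared as binary
print(folded_key(a) == folded_key(b))   # compared as the filesystem might
```

Under these assumptions the two spellings differ as octet sequences but map to the same key once folded and decomposed, which is the behaviour a purely binary directory lookup could not provide.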