Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8
Hans Åberg via Unicode
unicode at unicode.org
Tue May 16 10:44:54 CDT 2017
> On 16 May 2017, at 17:30, Alastair Houghton via Unicode <unicode at unicode.org> wrote:
>
> On 16 May 2017, at 14:23, Hans Åberg via Unicode <unicode at unicode.org> wrote:
>>
>> You don't. You have a filename, which is an octet sequence of unknown encoding, and want to deal with it. Therefore, valid Unicode transformations of the filename may result in it no longer being reachable.
>>
>> It only matters that the correct octet sequence is handed back to the filesystem. All current filesystems, as far as experts could recall, use octet sequences at the lowest level; whatever encoding is used is built in a layer above.
>
> HFS(+), NTFS and VFAT long filenames are all encoded in some variation on UCS-2/UTF-16. ...
The filesystem directory uses octet sequences and does not bother passing along an encoding, I am told. Someone could remember one that used UTF-16 directly, but I think it may not be current.
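
To illustrate the point about handing the same octet sequence back, here is a minimal sketch in Python. It is an assumption-laden illustration, not anything proposed in this thread: the filename `raw_name` is hypothetical, and the lossless path relies on Python's `surrogateescape` error handler, which is only one way to keep the original octets recoverable. It works purely in memory and does not touch a real filesystem.

```python
# Hypothetical filename: an octet sequence containing 0xFF,
# which can never appear in well-formed UTF-8.
raw_name = b"report-\xff.txt"

# Lossy path: the ill-formed octet is replaced by U+FFFD, so encoding
# the result again produces a different octet sequence. Handing that
# back to the filesystem would not reach the original file.
lossy = raw_name.decode("utf-8", errors="replace")
lossy_roundtrip = lossy.encode("utf-8")

# Lossless path: surrogateescape maps each undecodable octet to a lone
# surrogate code point and maps it back to the same octet on encoding.
text = raw_name.decode("utf-8", errors="surrogateescape")
exact_roundtrip = text.encode("utf-8", errors="surrogateescape")

print(lossy_roundtrip == raw_name)
print(exact_roundtrip == raw_name)
```

The design choice being illustrated is simply that any transformation applied to the name has to be reversible to the original octets, whichever mechanism is used to achieve that.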