Re: RE: What to do if a legacy compatibility character is defective?

Thu Dec 4 14:15:27 CST 2025

On 12/4/2025 4:35 AM, piotrunio-2004 at wp.pl via Unicode wrote:
> I have investigated the situation further and it seems that the defect in 
> the Unicode 13.0—17.0 mapping is even more fundamental than I 
> previously thought. In particular, the proposal L2/25-037 does not 
> acknowledge the proposal L2/00-159, which had already been 
> incorporated into Unicode 3.2. In that proposal, the description of 
> characters U+23B8 (LEFT VERTICAL BOX LINE) and U+23B9 (RIGHT VERTICAL 
> BOX LINE) exactly matches the proposed characters L2/25-037:1FBFC (BOX 
> DRAWINGS LIGHT LEFT EDGE) and L2/25-037:1FBFD (BOX DRAWINGS LIGHT 
> RIGHT EDGE). In both proposals, those two characters are specified to 
> be aligned to left or right edge, span the entire edge (extending to 
> the top and bottom), and match the thickness of Box Drawings Light 
> lines. The description of the characters U+23BA (HORIZONTAL SCAN 
> LINE-1) and U+23BD (HORIZONTAL SCAN LINE-9) also exactly matches the 
> proposed characters L2/25-037:1FBFA (BOX DRAWINGS LIGHT TOP EDGE) and 
> L2/25-037:1FBFB (BOX DRAWINGS LIGHT BOTTOM EDGE). In both proposals, 
> those two characters are specified to be aligned to top and bottom 
> edges, span the entire edge (extending to the left and right), and 
> match the thickness of Box Drawings Light lines. However, the proposal 
> L2/00-159 had already set precedent for usage of [U+23BA, U+23BD, 
> U+23B8, U+23B9] (and not the 1÷8 blocks or 1÷4 blocks) in mapping to 
> certain platforms such as The Heath/Zenith 19 Graphics Character Set 
> and The DEC Special Graphics Character Set. This contrasts with the 
> usage of 1÷8 blocks [U+2594, U+2581, U+258F, U+2595] and other related 
> 1÷8 or 7÷8 block characters in the mapping to PETSCII and Apple II. 
> Therefore there is a discrepancy between the legacy platforms added in 
> Unicode 3.2 (which use the box drawing lines 23B8, 23B9, 23BA, 23BD) 
> and the legacy platforms added in Unicode 13.0—17.0 (which use 1÷8 
> blocks 2594, 2581, 258F, 2595).
>
> On 25 October 2025 at 10:27, piotrunio-2004 at wp.pl via Unicode 
> <unicode at corp.unicode.org> wrote:
>
>
>     On 25 October 2025 at 08:29, Asmus Freytag via Unicode
>     <unicode at corp.unicode.org> wrote:
>
>         Again, the identity of the Unicode character is given by
>         encoding the intended mappings. If Unicode decides to map the
>         same character to similar characters on different platforms,
>         that is not a problem, as long as implementers know that the
>         intent is to use a platform-specific rendering (and not assume
>         that there is only one possible rendering per character).
>
>         If you feel that the guidance available to implementers in the
>         text of the standard or in an annotation of the nameslist is
>         not sufficient, then the remedy would be to ask for the
>         explanation to be updated. We are unfortunately locked in as
>         far as character names are concerned, but we can add a note
>         (best in the text of the standard) that explains that
>         emulators for some systems will need an adjusted design so a
>         sequence or other arrangement of these characters looks correct.
>
>     Indeed the character names cannot be changed due to stability
>     policies. An explanation note has been provided for U+1FB81 that
>     claims "The lines corresponding to 3 and 5 are not actually block
>     elements, but can show any horizontally repeating pattern", but
>     still implicitly enforces 1÷8 blocks for top and bottom. However,
>     this doesn't address other cases such as the PETSCII C64
>     variation. And if 1FB70—1FB81 1FBB5—1FBB8 1FBBC were all noted to
>     no longer require exact 1÷8 blocks, that would also not remedy the
>     issue because it would introduce an inconsistency with the
>     existing 1÷8 or 7÷8 block characters 2581 2589 258F 2594—2595,
>     which already have established compatibility precedents that
>     require the exact fraction, but are also used in the Unicode 13.0
>     mapping to PETSCII and Apple II character sets despite those
>     platforms using varying thickness (consistent with light box
>     drawings, except for the 1÷8 top and bottom blocks in C64, where
>     the 1÷4 top and bottom blocks are made consistent instead).
>
>
>
>
What is missing is an actual proposal. That is, not just analysis or 
exposition, but actual proposed wording or proposed encoding that would 
fix the issue.

That would need to be provided as a UTC document (aka L2 document) 
submission, with the analysis appended in a background section.

A./

PS: I am not convinced that platform-specific mappings (glyphs) are an 
issue, because the scenario where these data are reliably transferred 
*between* legacy implementations can't have existed then, so it's 
questionable why it needs to be perfect today. My assumption would be 
that the use case is lossless round trip from (each) legacy emulator to 
Unicode and back. Having PETSCII / Apple II specific characters does not 
improve things, because any data stream containing those could not be 
displayed on any other emulator. This is different from legacy 
characters mapped to letters and common text symbols because we have an 
expectation that we can share text across devices (or emulators).
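
To make the round-trip use case concrete, here is a minimal Python sketch. 
It assumes a small, illustrative subset of the DEC Special Graphics 
mapping (the five horizontal scan-line codes 0x6F to 0x73); the table and 
the helper names decode, encode, and round_trips are hypothetical and not 
taken from any published mapping file.

```python
# Illustrative subset only: the five DEC Special Graphics scan-line codes.
# The table is an assumption for this sketch, not an official mapping file.
DEC_TO_UNICODE = {
    0x6F: "\u23BA",  # HORIZONTAL SCAN LINE-1
    0x70: "\u23BB",  # HORIZONTAL SCAN LINE-3
    0x71: "\u2500",  # BOX DRAWINGS LIGHT HORIZONTAL (scan line 5)
    0x72: "\u23BC",  # HORIZONTAL SCAN LINE-7
    0x73: "\u23BD",  # HORIZONTAL SCAN LINE-9
}

# The reverse table is only well defined if the forward table is one-to-one.
UNICODE_TO_DEC = {ch: code for code, ch in DEC_TO_UNICODE.items()}


def decode(data: bytes) -> str:
    """Map legacy bytes to Unicode characters (KeyError on unmapped bytes)."""
    return "".join(DEC_TO_UNICODE[b] for b in data)


def encode(text: str) -> bytes:
    """Map Unicode characters back to legacy bytes."""
    return bytes(UNICODE_TO_DEC[ch] for ch in text)


def round_trips(data: bytes) -> bool:
    """True if legacy -> Unicode -> legacy reproduces the input exactly."""
    return encode(decode(data)) == data
```

The round trip depends only on each platform's table being one-to-one, 
not on how any given emulator draws the glyphs.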