Re: Odp: RE: What to do if a legacy compatibility character is defective?
piotrunio-2004@wp.pl
piotrunio-2004 at wp.pl
Thu Dec 4 16:37:29 CST 2025
On 4 December 2025 at 21:28, Asmus Freytag via Unicode <unicode at corp.unicode.org> wrote:
On 12/4/2025 4:35 AM, piotrunio-2004 at wp.pl via Unicode wrote:
I have investigated the situation further and it seems
that the defect in the Unicode 13.0 to 17.0 mapping is even more
fundamental than I previously thought. In particular, the
proposal L2/25-037 does not acknowledge the proposal
L2/00-159, which had already been incorporated into
Unicode 3.2. In that proposal, the description of
characters U+23B8 (LEFT VERTICAL BOX LINE) and U+23B9
(RIGHT VERTICAL BOX LINE) exactly matches the proposed
characters L2/25-037:1FBFC (BOX DRAWINGS LIGHT LEFT EDGE)
and L2/25-037:1FBFD (BOX DRAWINGS LIGHT RIGHT EDGE). In
both proposals, those two characters are specified to be
aligned to left or right edge, span the entire edge
(extending to the top and bottom), and match the thickness
of Box Drawings Light lines. The description of the
characters U+23BA (HORIZONTAL SCAN LINE-1) and U+23BD
(HORIZONTAL SCAN LINE-9) also exactly matches the proposed
characters L2/25-037:1FBFA (BOX DRAWINGS LIGHT TOP EDGE)
and L2/25-037:1FBFB (BOX DRAWINGS LIGHT BOTTOM EDGE). In
both proposals, those two characters are specified to be
aligned to top and bottom edges, span the entire edge
(extending to the left and right), and match the thickness
of Box Drawings Light lines. However, the proposal
L2/00-159 had already set a precedent for the use of [U+23BA,
U+23BD, U+23B8, U+23B9] (and not the 1÷8 blocks or 1÷4
blocks) in mapping to certain platforms such as The
Heath/Zenith 19 Graphics Character Set and The DEC Special
Graphics Character Set. This contrasts with the usage of
1÷8 blocks [U+2594, U+2581, U+258F, U+2595] and other
related 1÷8 or 7÷8 block characters in the mapping to
PETSCII and Apple II. Therefore there
is a discrepancy between the legacy platforms added
in Unicode 3.2 (which use the box drawing lines
23B8, 23B9, 23BA, 23BD) and the legacy platforms
added in Unicode 13.0—17.0 (which use 1÷8 blocks
2594, 2581, 258F, 2595).

On 25 October 2025 at 10:27, piotrunio-2004 at wp.pl via Unicode <unicode at corp.unicode.org> wrote:
On 25 October 2025 at 08:29, Asmus Freytag via Unicode <unicode at corp.unicode.org> wrote:
Again, the identity of the
Unicode character is given by
encoding the intended mappings.
If Unicode decides to map the
same character to similar
characters on different
platforms, that is not a
problem, as long as implementers
know that the intent is to use a
platform-specific rendering (and
not assume that there is only
one possible rendering per
character).

If you feel that the guidance
available to implementers in the
text of the standard or in an
annotation of the nameslist is
not sufficient, then the remedy
would be to ask for the
explanation to be updated. We
are unfortunately locked in as
far as character names are
concerned, but we can add a note
(best in the text of the
standard) that explains that
emulators for some systems will
need an adjusted design so a
sequence or other arrangement of
these characters looks correct.

Indeed the character names cannot be
changed due to stability policies. An
explanation note has been provided for
U+1FB81 that claims "The lines
corresponding to 3 and 5 are not
actually block elements, but can show any
horizontally repeating pattern", but still implicitly
enforces 1÷8 blocks for top and bottom.
However, this doesn't address other cases
such as the PETSCII C64 variation. And
if 1FB70—1FB81 1FBB5—1FBB8 1FBBC were all
noted to no longer require exact 1÷8
blocks, that would also not remedy the
issue because it would introduce an
inconsistency with the existing 1÷8 or 7÷8
block characters 2581 2589 258F 2594—2595,
which already have established
compatibility precedents that require the
exact fraction, but are also used in the
Unicode 13.0 mapping to PETSCII and Apple
II character sets despite those platforms
using varying thickness (consistent with
light box drawings, except for the 1÷8 top
and bottom blocks in C64, where the 1÷4
top and bottom blocks are made consistent
instead).

What is missing is an actual proposal. That
is, not just analysis or exposition, but actual proposed wording
or proposed encoding that would fix the issue.

That would need to be provided as a UTC
document (aka L2 document) submission, with the analysis
appended in a background section.

A./

PS: I am not convinced that
platform-specific mappings (glyphs) are an issue, because the
scenario where these data are reliably transferred *between*
legacy implementations can't have existed then, so it's
questionable why it needs to be perfect today. My assumption
would be that the use case is lossless round trip from (each)
legacy emulator to Unicode and back. Having PETSCII / Apple II
specific characters does not improve things, because any data
stream containing those could not be displayed on any other
emulator. This is different from legacy characters mapped to
letters and common text symbols because we have an expectation
that we can share text across devices (or emulators).

I have a draft of a follow-up to L2/25-037 that analyzes the
character sets thoroughly with the additional context provided
by the L2/00-159 characters (including the particularly complex
relationship between box drawings, 1÷8 blocks, and 1÷4 blocks in
PETSCII), provides additional explanation and screenshot
evidence of the HP 264x character in both isolated and connected
usage, and concludes that 23 characters (that is, all of those
in L2/25-037 except for the 4 that were already added by
L2/00-159) should be added. However, the SEW announced that they
will not be discussing these characters any further, so how
could any follow-up to the proposal possibly get incorporated
into Unicode?
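
For reference, the two sets of edge characters contrasted in this
thread can be listed side by side, together with a check of the
lossless round-trip property mentioned in the PS. This is only a
minimal sketch: the code points are the ones named above, but the
helper names and the legacy byte values in HYPOTHETICAL_TABLE are
placeholders invented for illustration, not values taken from any
actual platform mapping.

```python
import unicodedata

# Edge -> (character used by the Unicode 3.2 era mappings,
#          1/8 block used by the Unicode 13.0+ mappings),
# as listed in the message above.
EDGE_PAIRS = {
    "top": ("\u23BA", "\u2594"),
    "bottom": ("\u23BD", "\u2581"),
    "left": ("\u23B8", "\u258F"),
    "right": ("\u23B9", "\u2595"),
}


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def round_trips(table):
    """True if every legacy code maps to a distinct Unicode character,
    so the table can be inverted without loss."""
    return len(set(table.values())) == len(table)


# Hypothetical legacy codes (assumption, for illustration only).
HYPOTHETICAL_TABLE = {
    0x01: "\u23BA",
    0x02: "\u23BD",
    0x03: "\u23B8",
    0x04: "\u23B9",
}

if __name__ == "__main__":
    for edge, (line, block) in EDGE_PAIRS.items():
        print(f"{edge:6}  {describe(line)}  vs.  {describe(block)}")
    print("round trip possible:", round_trips(HYPOTHETICAL_TABLE))
```

A table in which two legacy codes share one Unicode character
would fail the round_trips check, which is the situation a
lossless legacy-to-Unicode-and-back mapping has to avoid.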