A last missing link for interoperable representation

Marcel Schneider via Unicode unicode at unicode.org
Tue Jan 15 16:16:16 CST 2019


On 15/01/2019 13:25, Philippe Verdy via Unicode wrote:
> 
> Note that even if this NNBSP character is not mapped in a font, it
> should be rendered correctly with all modern renderers (the mapping
> is necessary only when a font design wants to tune its metrics,
> because its width varies between 1/8 and 1/6 em (the narrow space is
> a bit narrower in traditional English typography than in French, so
> typical English design set it at about 1/8 em, typical French design
> set it at 1/6 em, and neutral fonts may set it somewhere in the
> middle); the measure in em may however vary with some fonts (notably
> those using "narrow" or "wide" letters by default (because the font
> size in em indicates only its height) and in decorated/cursive styles
> (e.g. fonts with swashes need a higher line gap, the font design of
> the em size may be smaller than for modern simplified styles for
> display).
> 
> But a renderer should have no problem using a default metric for all
> whitespace characters, which actually don't need any glyph to be
> drawn: all that is needed is metrics; everything else, including
> character properties like breaking, is inferred by the renderer
> independently of the font and other per-language tuning, or controlled
> by styling effects applied on top of the font.

Indeed, since every Unicode implementation must rely on the character
properties, and given that keeping that property data up to date is
straightforward, there is really no point in displaying a .notdef box
in lieu of any whitespace character.
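
As a rough illustration of that point, here is a minimal Python sketch
of a property-driven fallback. The function name `fallback_advance` and
the choice of half the SPACE advance for NNBSP are assumptions made for
this example only (the half-width figure is one possible default, not a
value taken from any font or renderer); the general category lookup
uses the standard `unicodedata` module.

```python
import unicodedata
from typing import Optional

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

def fallback_advance(ch: str, space_advance: float) -> Optional[float]:
    """Hypothetical helper: advance width to use when the font maps no glyph.

    Returns None when the character is not a space separator, in which
    case the caller would fall back to .notdef as usual.
    """
    # General category Zs covers SPACE, NBSP, NNBSP, THIN SPACE, etc.
    if unicodedata.category(ch) != "Zs":
        return None
    if ch == NNBSP:
        # Assumption for this sketch: half the advance of SPACE.
        return space_advance / 2
    # Assumption: every other unmapped space separator reuses SPACE metrics.
    return float(space_advance)
```

A caller would consult such a helper only after the font lookup fails,
so a font that does map NNBSP keeps full control over its width.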

As a consequence, prior to assessing the impact of the group separator
migration from (wrong) <NBSP> to (correct) <NNBSP> on implementations
and interoperability, Unicode would be well advised to start assessing
the impact of implementations (and, of course, the backing vendors) on
correct rendering of <NNBSP>, and on the related usability and
interoperability of the digital representation of those many locales
that should rely on <NNBSP>.

> 
> A renderer may expand the kerning/approach if needed for example to
> generate "hollow" or "shadow" effects, or to generate synthetic
> weights, including with "variable" fonts support, typically the
> renderer will base the metrics of all missing/unmapped whitespaces
> from the metrics given to the normal SPACE or NBSP which are
> typically both mapped to the same glyph; NNBSP will be synthesized
> easily using half the advance width of SPACE, and it's fine;
> renderers can also synthesize all other whitespaces for ideographic
> usages, or will adapt the rendering if instructed to synthesize a
> monospaced variant: here there's a choice for NNBSP to be rendered
> like NBSP, typically for French as it is normally a bit wider, or as
> a zero-width space like in English, or contextually for example
> zero-width near punctuations or NBSP between letters/digits).

In a monospaced font, NNBSP normally has the width of a character
cell, but it was designed for proportional fonts, and there it must
not have the width of a digit, as that would defeat the intended
effect. The group separator must never be as wide as a full digit,
not even digit 1 where digits are variable-width; it should be just
a slight gap ensuring readability. The same applies to digit groups
after the decimal separator, as per ISO 80000.
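
To make the grouping concrete, here is a small Python sketch that
inserts <NNBSP> between groups of three digits on both sides of the
decimal separator. The function name `group_digits` and its parameters
are hypothetical, and the input is assumed to be an unsigned string of
digits with at most one decimal separator.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

def group_digits(digits: str, decimal_sep: str = ",", sep: str = NNBSP) -> str:
    """Group digits in threes, counting outward from the decimal separator."""
    integer, _, fraction = digits.partition(decimal_sep)
    # Integer part: groups of three counted from the right.
    head = len(integer) % 3
    int_groups = [integer[:head]] if head else []
    int_groups += [integer[i:i + 3] for i in range(head, len(integer), 3)]
    result = sep.join(int_groups)
    if fraction:
        # Fractional part: groups of three counted from the left.
        frac_groups = [fraction[i:i + 3] for i in range(0, len(fraction), 3)]
        result += decimal_sep + sep.join(frac_groups)
    return result
```

For example, `group_digits("1234567,89012")` yields the digits
1 234 567,890 12 with U+202F between the groups.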

Between punctuation marks, <NNBSP> must not be zero-width, as it is
used in English to separate closing single and double quotation marks
when a nested quotation ends the first-level quotation. I don’t think
English uses <NNBSP> elsewhere around punctuation, except around
dashes where the applied style manual calls for it, but Canadian
French does, contrary to an urban legend saying it doesn’t. It merely
prefers not to space off punctuation *if* <NNBSP> is unavailable.
That is further proof that <NBSP> is inappropriate for spacing off
tall punctuation marks.
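
For illustration only, a Python sketch of that spacing: it replaces an
ordinary space or <NBSP> in front of a tall punctuation mark with
<NNBSP>. The function name and the default set of marks ("!?;") are
assumptions for this example; which marks take the narrow space depends
on the style manual applied.

```python
import re

NBSP = "\u00a0"   # NO-BREAK SPACE
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

def space_off_tall_punctuation(text: str, marks: str = "!?;") -> str:
    """Replace SPACE/NBSP runs before the given marks with one NNBSP.

    Marks that are not preceded by any space are left untouched.
    """
    pattern = re.compile("[ " + NBSP + "]+(?=[" + re.escape(marks) + "])")
    return pattern.sub(NNBSP, text)
```

Only existing spaces are converted, so text that deliberately omits the
space (the fallback preference mentioned above) passes through as is.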

> 
> Fonts only specify defaults that alter the rendering produced by a
> renderer, but a renderer is not required to use all infos and all
> glyphs in a specific font, it has to adapt to the context and choose
> what is more relevant and which kind of data it recognizes and
> implements/uses at runtime. The font just provides the best settings
> according to the font designer, if all features are enabled, but most
> work is done by the renderer (and fonts are completely unaware of
> the actual encoding of documents, fonts are only a database
> containing multiple features/settings, all of them being optional
> and selectable individually).

Good point, indeed. Currently we are too concerned with fonts, while
it is actually all up to the renderer. Today, as most devices are
permanently connected to the internet, a decent rendering engine
could just as well fetch missing glyphs from an online repository,
such as Google Fonts or the application vendor’s website. All that
missing-glyph whining seems completely outdated and very detrimental
to the user experience. It is so anachronistic that people shouldn’t
be surprised about suspicions of intentional bugs for the purpose of
unlawful lobbying by messing up user experience outside of certain
DTP applications. The French locale is the most heavily impacted
victim of those practices.

> 
> If your fonts behave incorrectly on your system because it does not
> map any glyph for NNBSP, don't blame the font or Unicode about this
> problem, blame the renderer (or the application or OS using it, maybe
> they are very outdated and were not aware of these features, they
> are probably based on old versions of Unicode when NNBSP was still
> not present even if it was requested since very long at least for
> French and even English, before even Unicode, and long before
> Mongolian was then encoded, only in Unicode and not in any known
> supported legacy charset: Mongolian was specified by borrowing the
> same NNBSP already designed for Latin, because the Mongolian space
> had no known specific behavior: the encoded whitespaces in Unicode
> are completely script-neutral, they are generic, and are even
> BiDi-neutral, they are all usable with any script).
> 

Completely agreed. If I blame Unicode, it’s for keeping the NNBSP out
of the Standard for almost a decade, which translates to two decades
of delay due both to the loss of momentum after the early rush and to
people who keep bullying the NNBSP 20 years after it was encoded, even
though it is now widely supported. Also, ignorance about NNBSP is
still abysmal, even though the very popular style manual of the
French Imprimerie Nationale explicitly requires its use:

                     EXCLAMATION MARK
espace fine insécable      !      justifying space

(quoted/translated from figure p. 149; ISBN 9782743304829).


Many thanks to all who took part in this thread, which has been very
instructive and has brought up many new insights, and likewise to
those spinning off child threads and sharing material.
Keep up the good work and be successful!

Best regards,

Marcel

