Fonts and Unicode conformance (was Re: Use of tag ,,,)
Giacomo Catenazzi
cate at cateee.net
Thu May 9 00:33:15 CDT 2024
On 09.05.2024 06:26, Doug Ewell via Unicode wrote:
> Mark E. Shoulson wrote:
>
>> I regularly (amuse myself and) make fonts render "www" as a ligature,
>> etc.
>
> Microsoft’s Cascadia Code does this sort of thing on the regular, which to me is a great reason to use Cascadia Mono instead:
>
> https://github.com/microsoft/cascadia-code?tab=readme-ov-file#font-features
Fira is similar, but I think they changed the default. And an IDE should
have an option to turn it off (IMHO: the default should be off).
But that hints at an additional feature of fonts: they may include the
same character twice or more, with different glyphs or spacing,
selectable with font options (which are often not easily accessible to
writers). Very common are "normal digits" and "tabular digits". In some
cases Unicode provides some control: variation selectors, e.g. for digit 0
(but why only one way? I can force the *slash* but I cannot force its
absence).
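As a small illustration of the digit 0 case, here is a Python sketch that
builds the variation sequence in question. It assumes the standardized
sequence is DIGIT ZERO followed by VARIATION SELECTOR-1 (U+FE00); the
helper name describe() is made up for this example. Whether the slash
actually shows up still depends on the font and the renderer.

```python
import unicodedata

ZERO = "\u0030"  # DIGIT ZERO
VS1 = "\uFE00"   # VARIATION SELECTOR-1

# Assumed standardized variation sequence asking for the slashed form of zero.
# A font that does not support it will simply show its default zero.
slashed_zero = ZERO + VS1


def describe(text):
    """Hypothetical helper: list code point and character name for each character."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for line in describe(slashed_zero):
    print(line)

# The selector is a separate code point, so the "same" digit is now two
# code points long, and NFC normalization keeps the selector in place.
print(len(slashed_zero))
print(unicodedata.normalize("NFC", slashed_zero) == slashed_zero)
```

There is no counterpart sequence shown here for "never slashed", which is
exactly the one-way limitation mentioned above.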
If there were ever a "Unifont", I assume that, just for the Latin script, a
printed version would take the space of many multi-volume encyclopedias.
In any case, the funniest/most annoying part is Turkish support: same
character (and same script: Latin) but different glyph (compared to other
languages), and also the contrary: same glyph (Turkish vs. most of the
other languages using the Latin script) but different Unicode character
(and also read as a different character). The scope of Unicode is
interchange and semantics (and that is already a huge task).
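One way to see the Turkish situation at the character level (this is my
reading of it, with the dotted and dotless i as the example) is to look at
the four letters involved and at what a locale-independent case mapping
does with them. The sketch below uses Python's built-in str.upper() and
str.lower(), which do not know about Turkish; turkish_upper() and
turkish_lower() are hypothetical helpers written only for this
illustration, not a complete Turkish casing implementation.

```python
import unicodedata

# The four "i" letters that matter for Turkish.
letters = ["I", "i", "\u0130", "\u0131"]
for ch in letters:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Python's default case mapping is locale-independent, so it pairs
# "i" with "I", which is wrong for Turkish text.
print("i".upper() == "I")
print("\u0131".upper() == "I")  # the dotless i also uppercases to plain I


def turkish_upper(text):
    """Hypothetical helper: uppercase with the Turkish i/İ and ı/I pairing."""
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkish_lower(text):
    """Hypothetical helper: lowercase with the Turkish İ/i and I/ı pairing."""
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


print(turkish_upper("diyarbak\u0131r"))
print(turkish_lower("D\u0130YARBAKIR"))
```

The plain-text code points carry the distinction; which glyph or case
pairing is right for a given language is left to the implementation.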
Note: the compiler community uses the same method: they define the
semantics, and the rest is left as "quality of implementation". You cannot
rule on everything (and in every detail): users should choose sensible
compilers (e.g. no compiler will allow an infinitely long source file, but
is a compiler conformant if it accepts only files shorter than 10
Unicode code points?). Or if its implementation of dynamic memory is just
"fail with no memory left".
giacomo