Fonts and Unicode conformance (was Re: Use of tag ,,,)

Giacomo Catenazzi cate at cateee.net
Thu May 9 00:33:15 CDT 2024


On 09.05.2024 06:26, Doug Ewell via Unicode wrote:
> Mark E. Shoulson wrote:
> 
>> I regularly (amuse myself and) make fonts render "www" as a ligature,
>> etc.
> 
> Microsoft’s Cascadia Code does this sort of thing on the regular, which to me is a great reason to use Cascadia Mono instead:
> 
> https://github.com/microsoft/cascadia-code?tab=readme-ov-file#font-features

Fira is similar, but I think they changed the default. And an IDE
should have an option to turn it off (IMHO: the default should be off).

But that hints at an additional feature of fonts: they may include the
same character twice or more, with different glyphs or spacing,
selectable through font options (which are often not easily accessible
to writers). Very common are "normal digits" vs. "tabular digits". In
some cases Unicode provides some control: variation selectors, e.g. for
digit 0 (but why only one way? I can force the *slash*, but I cannot
force its absence).
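To make the digit 0 case concrete, here is a minimal Python sketch. It
assumes the sequence meant above is DIGIT ZERO followed by VARIATION
SELECTOR-1 (U+0030 U+FE00), which as far as I know is the standardized
variation sequence for the zero with a short diagonal stroke. The helper
name describe() is only for illustration, and whether the slash is
actually drawn still depends on the font and renderer.

```python
import unicodedata

ZERO = "\u0030"  # DIGIT ZERO
VS1 = "\uFE00"   # VARIATION SELECTOR-1

# Assumed sequence: digit zero followed by VS1 requests the slashed form.
slashed_zero = ZERO + VS1


def describe(text):
    """Return 'U+XXXX NAME' for every code point in text (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# The request travels as two code points in the plain text itself.
print(describe(slashed_zero))

# Canonical normalization leaves the variation selector in place.
print(unicodedata.normalize("NFC", slashed_zero) == slashed_zero)
```

The selector is part of the interchanged text, not of the font settings,
which is why it survives copy and paste while a font option does not.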

If there were a "Unifont", I assume that, just for the Latin script, a
printed version would take the space of many multi-volume encyclopedias.

In any case, the funniest/most annoying part is Turkish support: the
same character (and the same script: Latin) but a different glyph
(compared to other languages), and also the contrary: the same glyph
(Turkish vs. most of the other languages using the Latin script) but a
different Unicode character (which is also read as a different
character). The scope of Unicode is interchange and semantics (and that
is already a huge task).
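As an illustration, and assuming the Turkish point is about dotted and
dotless i, the following Python sketch lists the four code points
involved and shows that Python's built-in case mapping is not
locale-aware. The helpers turkish_upper() and turkish_lower() are
hypothetical names, not a library API, and cover only these four
letters.

```python
import unicodedata

# The four "i" letters: two shared with most Latin-script languages,
# two used by Turkish (and a few other languages).
for ch in "I" "i" "\u0130" "\u0131":
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))


def turkish_upper(text):
    """Uppercase with Turkish i rules (hypothetical helper)."""
    # Map dotted i to capital I with dot above before the default mapping.
    return text.replace("i", "\u0130").upper()


def turkish_lower(text):
    """Lowercase with Turkish i rules (hypothetical helper)."""
    # Map capital I to dotless i, and dotted capital I to plain i.
    return text.replace("I", "\u0131").replace("\u0130", "i").lower()


# Default mapping: both lowercase letters collapse onto the same capital.
print("i".upper() == "\u0131".upper() == "I")

# Turkish-style mapping keeps the two letters apart.
print(turkish_upper("istanbul"), turkish_lower("ISPARTA"))
```

The text carries the same code points in both cases; which mapping is
correct depends on the language, which the plain text does not record.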

Note: the compiler community uses the same method: they define the
semantics, and the rest is left as "quality of implementation". You
cannot rule on everything (and in every detail); users should choose
sensible compilers. E.g. no compiler will allow an infinitely long
source file, but is a compiler conformant if it accepts only files
shorter than 10 Unicode code points? Or if its implementation of dynamic
memory is just "fail with no memory left"?

giacomo

