Re: Why is (vv) -> (w) not amongst the Confusables?

Giacomo Catenazzi cate at cateee.net
Wed Nov 29 01:40:08 CST 2023


Confusable is a very important property for security (phishing, 
scamming, etc.), and it is also important because it is used by 
non-"Unicode" people. Other Unicode properties may require some 
knowledge of the Unicode standard/book, and so of script and language 
differences, but confusables should be usable by people without deep 
knowledge of e.g. Cyrillic or Indic scripts.

In any case, I think we should be careful about adding confusables 
between characters of the same script: it is OK where the glyphs are or 
were the same (zero and O, one and el on typewriters, possibly I and 
el), or where languages mix within a script (in Turkish, the dotless I 
is not the uppercase of i). Otherwise the font/script designer should 
make things readable. If we can confuse w and vv, I think we must 
change the font.
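
A minimal sketch of how such a confusables check can work, assuming a 
tiny hand-written mapping rather than the real Unicode data file. The 
names PROTOTYPES, skeleton and confusable are hypothetical, and the 
table holds only the examples mentioned in this thread.

```python
import unicodedata

# Hypothetical, hand-written excerpt for illustration only.
# The real confusables data is far larger and is not reproduced here.
PROTOTYPES = {
    "m": "rn",  # "rn" vs "m", the pair cited in the original question
    "0": "O",   # zero vs capital O
    "1": "l",   # one vs el
    "I": "l",   # capital I vs el
}


def skeleton(text, prototypes=PROTOTYPES):
    """Map each character to its prototype, normalizing before and after."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(prototypes.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def confusable(a, b, prototypes=PROTOTYPES):
    """Two strings are treated as confusable if their skeletons are equal."""
    return skeleton(a, prototypes) == skeleton(b, prototypes)


# Whether "w" and "vv" collide depends entirely on whether the table
# carries that entry, which is the policy question discussed here.
WITH_VV = {**PROTOTYPES, "w": "vv"}
```

With this toy table, confusable("modern", "rnodern") holds, while "www" 
and "vvvvvv" match only when the WITH_VV table is passed in.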

Note: cursive writing is an exception: in cursive, many characters are 
confusable.

PS: was W3C maintaining the confusable list?

cate



On 28 Nov 2023 19:38, piotrunio-2004 at wp.pl via Unicode wrote:
> The confusables property is rather subjective and not very commonly 
> used, and due to the very large size of the Unicode character set it 
> is highly likely to be incomplete at all times and should not be 
> exclusively relied on. For instance, in my opinion U+23AE and U+2502 
> are identical glyphs in virtually all typographically meaningful uses 
> (especially due to the CP437/WGL4 heritage of the U+2320 and U+2321 
> characters), and yet they're not linked in there. Since there are no 
> objective criteria for what qualifies as a 'confusable', it doesn't 
> seem appropriate to rely on that. I myself wouldn't like to rely on 
> official normalization and case folding rules, let alone confusables.
> 
> 
> On 28 November 2023 18:04, Richter, Andrew MR via Unicode 
> <unicode at corp.unicode.org> wrote:
> 
> 
>     *OFFICIAL*
> 
> 
> 
>     Hi Unicode ML,
> 
> 
> 
>     I’m trying to determine why “vv” (two of the letter “v”) as a
>     confusable for “w” (a single letter “w”) is not included in the
>     latest Confusables list whereas “rn” (“r” followed by “n”) is
>     included as a confusable for “m” (a single letter “m”)? It looks
>     like it was included up to version 9.0 but was removed from
>     version 10.0 onwards.
> 
> 
> 
>     IMPORTANT: This email remains the property of the Department of
>     Defence. Unauthorised communication and dealing with the information
>     in the email may be a serious criminal offence. If you have received
>     this email in error, you are requested to contact the sender and
>     delete the email immediately.
> 
> That is highly likely a fatal messaging error because this is a public 
> mailing list.

