UAX 31 for C++ Identifiers

Asmus Freytag (c) asmusf at ix.netcom.com
Sat Jun 20 01:44:59 CDT 2020


My meta point had been about possibly different levels of security issues
between compile time and runtime.
A./

On 6/19/2020 8:22 PM, Steve Downey wrote:
> On Fri, Jun 19, 2020 at 10:44 PM Asmus Freytag via Unicode
> <unicode at unicode.org> wrote:
>> In source code, having ambiguous identifiers may not be worse than C-style obfuscation.
>>
> Until recently (the latest release, 10.1), gcc rejected much of the allowed
> Unicode in UTF-8 input, even in places where it would allow \u
> universal-character-names. So this all becomes easier now. As a
> Standard, we should have handled this better earlier, but the second
> best time is now. The XID_ properties make this a lot more palatable
> w.r.t. stability, though, and I'm not going to second-guess people from 10
> or 20 or more years ago too much. Ambiguity in external identifiers
> is already ill-formed, no diagnostic required, which means broken but
> in ways that compilers can't treat as undefined.
>
>> But with module names, etc. you may run into security issues if naming allows / facilitates spoofing.
>>
> I, and other people doing tools, both won and lost this battle
> already. Module names in source do not correspond with anything
> physical. `import some.module` connects you to whatever exported
> `some.module` by magic as far as the standard is concerned. We're
> working on the actual mechanics as a Technical Report, and compiler
> vendors are participating and aren't, as far as I can tell, more
> insane than the average infrastructure engineer. So I have hope.
>
> Mapping anything to file paths is fraught beyond belief, and there are
> many experienced engineers providing war stories and parades of
> horribles, although I'd personally like to have more stories to tell.
>
> The entire disconnect between logical and physical is actually
> hopeful, in a way that `#include <ha/hahahahaha.h>` isn't, even though
> we have a lot of understanding of how that maps to filesystem
> searches.
>
> That is the province of WG21/SG15, in which I also participate.
>
> I suspect that trying to fix up anything with #include is infeasible,
> since it's currently the wild west, changes will break things, and C++
> depends in practice on system-provided headers that at best conform to
> old C standards.
>
> Thanks!
>
> -SMD
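
To make the spoofing concern about module names concrete, here is a minimal sketch, not anything taken from the standard or from a compiler. It assumes names are held as UTF-8 byte strings; the helpers `is_ascii_only`, `latin_name`, `lookalike_name` and `check_examples` are hypothetical names introduced only for this illustration. It shows two spellings of `some.module` that many fonts render identically but that compare unequal byte for byte, so a tool matching names by code point would treat them as different modules.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (an assumption for this sketch, not a standard API):
// reports whether a UTF-8 encoded name consists solely of 7-bit ASCII bytes.
inline bool is_ascii_only(const std::string& utf8_name) {
    for (unsigned char byte : utf8_name) {
        if (byte > 0x7F) {
            return false;
        }
    }
    return true;
}

// The name as it appears in the quoted message, spelled with Latin letters.
inline std::string latin_name() {
    return "some.module";
}

// The same visible name, with the first 'o' replaced by
// U+043E CYRILLIC SMALL LETTER O (UTF-8 bytes D0 BE).
inline std::string lookalike_name() {
    return std::string("s") + "\xD0\xBE" + "me.module";
}

// The two names look alike on screen but are different byte sequences.
inline void check_examples() {
    assert(latin_name() != lookalike_name());
    assert(is_ascii_only(latin_name()));
    assert(!is_ascii_only(lookalike_name()));
}
```

An ASCII-only test like this is far cruder than the XID_ properties or a real confusables check; it only demonstrates that visually identical names need not be equal.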

