Need reference to good ABNF for \uXXXX syntax

Doug Ewell doug at ewellic.org
Fri Apr 16 15:38:11 CDT 2021


Again, the object of this exercise was not to redefine the CDDL syntax, but to find good, debugged ABNF to describe the existing syntax.

It does, however, seem reasonable that backslash (%x5C) should be included in the list. Also, as Rebecca pointed out, solidus (%x2F) should apparently be changed to single-quote (%x27). These are helpful corrections, but orthogonal to the question of how best to represent the \u syntax in ABNF.
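
To make the quoted SESC rule concrete, here is a rough Python sketch that turns it into a regular expression. The hexchar rule is not shown in the quoted mail, so the definition used here (one non-surrogate 4-hex-digit value, or a high surrogate followed by "\u" and a low surrogate) is an assumption, and the names SESC_RE and is_sesc are made up for illustration. It mirrors the rule as quoted below, without the single-quote change.

```python
import re

# ASSUMPTION: the real hexchar rule is not quoted in this thread. This
# sketch treats it as one non-surrogate value or a surrogate pair.
HEX = r"[0-9A-Fa-f]"
NON_SURROGATE = rf"(?:[0-9A-CEFa-cef]{HEX}{{3}}|[Dd][0-7]{HEX}{{2}})"
HIGH = rf"[Dd][89ABab]{HEX}{{2}}"
LOW = rf"[Dd][C-Fc-f]{HEX}{{2}}"
HEXCHAR = rf"(?:{NON_SURROGATE}|{HIGH}\\u{LOW})"

# The single-character escapes, taken from the hex values in the quoted rule.
SIMPLE_CODES = (0x22, 0x2F, 0x5C, 0x62, 0x66, 0x6E, 0x72, 0x74)
SIMPLE_CLASS = "".join(re.escape(chr(c)) for c in SIMPLE_CODES)

# SESC = "\" ( simple escape / (%x75 hexchar) )
SESC_RE = re.compile(rf"\\(?:[{SIMPLE_CLASS}]|u{HEXCHAR})")


def is_sesc(text: str) -> bool:
    """Return True if text is exactly one escape under the sketched rule."""
    return SESC_RE.fullmatch(text) is not None


# The hex values stand for these plain ASCII characters.
simple_chars = [chr(c) for c in SIMPLE_CODES]
```

Under these assumptions, is_sesc(r"\n") and is_sesc(r"\uD83D\uDE00") are accepted, while a lone surrogate such as r"\uD83D" and an uppercase r"\B" are rejected.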

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


-----Original Message-----
From: Kent Karlsson <kent.b.karlsson at bahnhof.se> 

> On 16 Apr 2021, at 17:25, Doug Ewell via Unicode <unicode at unicode.org> wrote:
> 
> Martin J. Dürst wrote:
> 
>> What bothers me in this grammar is that the first "\u" isn't anywhere 
>> in sight, but the second one is there. It would be much clearer if 
>> either the first "\u" is at the start of hexchar, i.e.
> 
> Sorry, I neglected to include this line, which precedes everything I did quote:
> 
> SESC = "\" ( %x22 / %x2F / %x5C / %x62 / %x66 / %x6E / %x72 / %x74 /
>             (%x75 hexchar) )

1) Why are some "very plain letters in ASCII" given as hex escapes here? Especially since the not-so-plain "\" (it is used as an escape, which is the point here…) has not warranted a hex escape. (The grammar even uses it to escape ", which is a bit ironic.)

2) Apart from the second line there, these have nothing to do with "\u" escapes. In addition, the set of these other escapes varies (a bit) by programming language (or other context), and technically they aren't needed when \u escapes are allowed (though they are still practical).

/Kent K




