Need reference to good ABNF for \uXXXX syntax
Martin J. Dürst
duerst at it.aoyama.ac.jp
Wed Apr 14 18:50:43 CDT 2021
On 2021-04-15 01:41, Doug Ewell via Unicode wrote:
> Is anyone aware of an existing RFC or other specification that includes complete, correct, and clear ABNF for Unicode escape sequences using the UTF-16 encoding scheme?
> \uD801\uDC02 (NOT: \U00010402)
> This type of sequence is described in Section 6.3 of RFC 5137, but that RFC does not recommend this syntax and does not include ABNF for it.
> "Correct" implies, for instance, that the ABNF excludes unpaired surrogates.
> To be clear, I'm NOT looking for someone on this list to contribute their own code, but rather a pointer to code that is already published, and easy for another document, such as an I-D, to reference.
So I guess you are looking for something like the regular expression on
https://www.w3.org/International/questions/qa-forms-utf-8, but for the
above syntax (rather than byte sequences in UTF-8) and in ABNF.
The closest I was able to come up with from memory may be
https://tools.ietf.org/html/rfc5137, but it's not exactly what you want.
I'd guess it might be quicker for you to put something together on your
own (and then maybe run it by this list).
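As a rough starting point for such a draft (a sketch only, not a citable
specification), the same idea as the W3C UTF-8 pattern can be written as a
Python regular expression over \uXXXX escapes. It splits the four-digit
space into non-surrogate, high-surrogate, and low-surrogate ranges, and
only accepts a high surrogate when a low surrogate follows immediately.
All names below (HEX, NON_SURROGATE, HIGH, LOW, is_well_formed) are made
up for this illustration, and the assumption that the input consists of
escapes only is mine.

```python
import re

# Any single hexadecimal digit, upper or lower case.
HEX = r"[0-9A-Fa-f]"

# \uXXXX outside D800..DFFF: first digit is not D, or it is D followed by 0-7.
NON_SURROGATE = rf"\\u(?:[0-9A-CE-Fa-ce-f]{HEX}{{3}}|[Dd][0-7]{HEX}{{2}})"

# High (leading) surrogate: D800..DBFF.
HIGH = rf"\\u[Dd][89ABab]{HEX}{{2}}"

# Low (trailing) surrogate: DC00..DFFF.
LOW = rf"\\u[Dd][C-Fc-f]{HEX}{{2}}"

# One well-formed unit: a non-surrogate escape or a complete surrogate pair.
ESCAPE = rf"(?:{NON_SURROGATE}|{HIGH}{LOW})"

# Assumption: the whole input consists of escapes and nothing else.
ESCAPES_ONLY = re.compile(rf"(?:{ESCAPE})*")


def is_well_formed(text: str) -> bool:
    """Return True if text is a sequence of \\uXXXX escapes with no unpaired surrogate."""
    return ESCAPES_ONLY.fullmatch(text) is not None
```

With this, is_well_formed(r"\uD801\uDC02") holds, while a lone r"\uD801" or
a reversed pair r"\uDC02\uD801" is rejected. The three ranges should carry
over to ABNF rules fairly directly, since each is just a choice of hex
digits in fixed positions.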
> Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org