Need reference to good ABNF for \uXXXX syntax

Martin J. Dürst duerst at
Wed Apr 14 18:50:43 CDT 2021

Hello Doug,

On 2021-04-15 01:41, Doug Ewell via Unicode wrote:
> Is anyone aware of an existing RFC or other specification that includes complete, correct, and clear ABNF for Unicode escape sequences using the UTF-16 encoding scheme?
> Examples:
> \u0041
> \u3042
> \uD801\uDC02  (NOT: \U00010402)
> This type of sequence is described in Section 6.3 of RFC 5137, but that RFC does not recommend this syntax and does not include ABNF for it.
> "Correct" implies, for instance, that the ABNF excludes unpaired surrogates.
> To be clear, I'm NOT looking for someone on this list to contribute their own code, but rather a pointer to code that is already published, and easy for another document, such as an I-D, to reference.

So I guess you are looking for something like the regular expression on, but for the 
above syntax (rather than byte sequences in UTF-8) and in ABNF.

The closest I was able to come up with from memory may be, but it's not exactly what you want. 
I'd guess it might be quicker for you to put something together on your 
own (and then maybe run it by this list).
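
For anyone who does put something together, the constraint that matters is the one Doug names: a high surrogate must be followed immediately by a low surrogate, and neither may appear alone. Below is a minimal sketch of that constraint as a Python regular expression. It is not ABNF and is not taken from any RFC; the names HEX, NON_SURROGATE, HIGH, LOW, ESCAPE and is_valid_escape are made up for illustration.

```python
import re

# Illustrative names only; none of these come from a published specification.
HEX = r"[0-9A-Fa-f]"

# \uXXXX where XXXX is 0000-D7FF or E000-FFFF (any BMP value that is not a surrogate).
NON_SURROGATE = rf"\\u(?:[0-9A-Ca-cE-Fe-f]{HEX}{{3}}|[Dd][0-7]{HEX}{{2}})"

# \uXXXX where XXXX is D800-DBFF (high surrogate).
HIGH = rf"\\u[Dd][89ABab]{HEX}{{2}}"

# \uXXXX where XXXX is DC00-DFFF (low surrogate).
LOW = rf"\\u[Dd][C-Fc-f]{HEX}{{2}}"

# One escaped character: a non-surrogate escape, or a high/low pair in that order.
ESCAPE = re.compile(rf"(?:{NON_SURROGATE}|{HIGH}{LOW})")


def is_valid_escape(text: str) -> bool:
    """Return True if text is exactly one well-formed escape (single or paired)."""
    return ESCAPE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in (r"\u0041", r"\u3042", r"\uD801\uDC02", r"\uD801", r"\uDC02\uD801"):
        print(sample, is_valid_escape(sample))
```

An ABNF version would presumably need the same three alternatives, spelled out as hex-digit ranges, with the surrogate pair as a single two-part rule so that neither half can match on its own.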

Regards,   Martin.

> --
> Doug Ewell, CC, ALB | Lakewood, CO, US |
