Pure Regular Expression Engines and Literal Clusters

Hans Åberg via Unicode unicode at unicode.org
Sun Oct 13 03:04:34 CDT 2019


> On 13 Oct 2019, at 00:37, Richard Wordingham via Unicode <unicode at unicode.org> wrote:
> 
> On Sat, 12 Oct 2019 21:36:45 +0200
> Hans Åberg via Unicode <unicode at unicode.org> wrote:
> 
>>> On 12 Oct 2019, at 14:17, Richard Wordingham via Unicode
>>> <unicode at unicode.org> wrote:
>>> 
>>> But remember that 'having longer first' is meaningless for a
>>> non-deterministic finite automaton that does a single pass through
>>> the string to be searched.  
>> 
>> It is possible to identify all submatches deterministically in linear
>> time without backtracking — I a made an algorithm for that.
> 
> That's impressive, as the number of possible submatches for a*(a*)a* is
> quadratic in the string length.

That is probably after the possibilities in the matching graph have been expanded, which can even be exponential. As an analogy, think of a polynomial product, I compute the product, not the expansion.




More information about the Unicode mailing list