Pure Regular Expression Engines and Literal Clusters
Hans Åberg via Unicode
unicode at unicode.org
Sun Oct 13 03:04:34 CDT 2019
> On 13 Oct 2019, at 00:37, Richard Wordingham via Unicode <unicode at unicode.org> wrote:
> On Sat, 12 Oct 2019 21:36:45 +0200
> Hans Åberg via Unicode <unicode at unicode.org> wrote:
>>> On 12 Oct 2019, at 14:17, Richard Wordingham via Unicode
>>> <unicode at unicode.org> wrote:
>>> But remember that 'having longer first' is meaningless for a
>>> non-deterministic finite automaton that does a single pass through
>>> the string to be searched.
>> It is possible to identify all submatches deterministically in linear
>> time without backtracking; I made an algorithm for that.
> That's impressive, as the number of possible submatches for a*(a*)a* is
> quadratic in the string length.
That count is probably taken after the possibilities in the matching graph have been expanded; the number of expanded possibilities can even be exponential. As an analogy, think of a polynomial product: I compute the product, not the expansion.
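
A rough illustration of the analogy, in Python. This is only a sketch of the counting argument, not the algorithm mentioned above; all function names are hypothetical. The first function enumerates the group spans for a*(a*)a* on a string of n a's, which grows quadratically. The other two contrast a fully expanded product (1 + x_1)...(1 + x_n), which has 2^n terms, with evaluating the same product from its n factors.

```python
from itertools import product


def count_group_spans(n):
    """Count the distinct (start, end) spans the group in a*(a*)a*
    can take on a string of n a's, by plain enumeration."""
    return sum(1 for start in range(n + 1) for end in range(start, n + 1))


def count_expanded_terms(n):
    """Count the monomials in the full expansion of
    (1 + x_1)(1 + x_2)...(1 + x_n): one per choice of 0 or 1 per factor."""
    return sum(1 for _ in product((0, 1), repeat=n))


def evaluate_factored(values):
    """Evaluate (1 + v_1)(1 + v_2)...(1 + v_n) from its factors,
    using one multiplication per factor and no expansion."""
    result = 1
    for v in values:
        result *= 1 + v
    return result
```

For n = 10 the expansion has 1024 terms, while the factored form is evaluated with 10 multiplications.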