Add compile-time validation of regular expressions. #2281

Draft
rsmmr wants to merge 2 commits into main from topic/robin/gh-2131-capture-check

rsmmr commented Mar 18, 2026

Until now, pattern errors were caught only at runtime. We now report
the following at compile time whenever it can be determined statically:

  • Patterns that don't compile (e.g., due to syntax errors)
  • Use of $N referring to a capture group that exceeds the number a pattern defines.
  • Use of $N with a regular expression specifying multiple patterns
    for parallel matching (in which case group numbers are ill-defined).
  • Use of $N with a &nosub field.
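
As a rough illustration of the checks listed above, here is a minimal Python sketch. It is not the code from this PR: it uses Python's `re` module as a stand-in for the actual regular expression engine, and the function `check_capture_use`, its parameters, the order of the checks, and the diagnostic wording are all hypothetical.

```python
import re
from typing import Optional, Sequence


def check_capture_use(patterns: Sequence[str], n: int, nosub: bool = False) -> Optional[str]:
    """Return a diagnostic for a `$n` reference, or None if nothing is wrong.

    Hypothetical helper: `patterns` holds the pattern(s) of one regular
    expression, `n` is the capture group index used, and `nosub` mirrors
    a `&nosub` attribute on the field.
    """
    if not patterns:
        raise ValueError("at least one pattern is required")

    # Check 1: every pattern must compile.
    compiled = []
    for p in patterns:
        try:
            compiled.append(re.compile(p))
        except re.error as exc:
            return f"pattern {p!r} does not compile: {exc}"

    # Check 2: captures are unavailable on a &nosub field.
    if nosub:
        return f"${n} cannot be used with a &nosub field"

    # Check 3: group numbers are ill-defined for parallel matching.
    if len(compiled) > 1:
        return f"${n} is ill-defined with multiple parallel patterns"

    # Check 4: the index must not exceed the number of groups defined.
    groups = compiled[0].groups
    if n > groups:
        return f"${n} exceeds the {groups} capture group(s) the pattern defines"

    return None
```

For example, `check_capture_use(["(a)(b)"], 3)` yields an out-of-range diagnostic, while `check_capture_use(["(a)(b)"], 2)` returns `None`.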

Closes #2131.

  • Add a couple of static analysis methods to regular expression patterns.
  • Add compile-time validation of regular expressions.

codspeed-hq bot commented Mar 18, 2026

Merging this PR will not alter performance

✅ 28 untouched benchmarks


Comparing topic/robin/gh-2131-capture-check (c6917d9) with main (1c5b216)

rsmmr force-pushed the topic/robin/gh-2131-capture-check branch 2 times, most recently from d9c35a0 to 23c0565 (March 19, 2026 12:40)
The new static analysis methods make it possible to (1) validate a
regular expression and (2) determine its number of capture groups.
They work by temporarily compiling the patterns.
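
To make the "temporarily compile" idea concrete, here is a small Python sketch. It only illustrates the approach and is not the implementation in this commit: Python's `re` module stands in for the real engine, and the names `is_valid` and `capture_group_count` are hypothetical.

```python
import re


def is_valid(pattern: str) -> bool:
    """Report whether the pattern compiles; the compiled object is discarded."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def capture_group_count(pattern: str) -> int:
    """Compile the pattern just long enough to read its number of capture groups."""
    return re.compile(pattern).groups
```

With these, `is_valid("a(b")` is false because of the unbalanced parenthesis, and `capture_group_count("(a)(b(c))")` counts three groups, while a non-capturing group such as `(?:a)` contributes none.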

Also includes a tiny cleanup removing an unused function parameter.
rsmmr force-pushed the topic/robin/gh-2131-capture-check branch from 23c0565 to 2de6e90 (March 19, 2026 13:01)
Until now, pattern errors were caught only at runtime. We now report
the following at compile time whenever it can be determined statically:

- Patterns that don't compile (e.g., due to syntax errors)
- Use of `$N` referring to a capture group that exceeds the number a pattern defines.
- Use of `$N` with a regular expression specifying multiple patterns
  for parallel matching (in which case group numbers are ill-defined).
- Use of `$N` with a `&nosub` field.

Closes #2131.
rsmmr force-pushed the topic/robin/gh-2131-capture-check branch from 2de6e90 to c6917d9 (March 19, 2026 13:37)
Linked issue (may be closed by merging this pull request): Statically detect out-of-range regex captures