Skip to content

A PCRE2 compatible regular expression engine written in Lean 4.

License

MIT and 2 other licenses found

Licenses found

MIT
LICENSE
MIT
LICENSE-RUST-REGEX
Unknown
LICENSE-UNICODE
Notifications You must be signed in to change notification settings

bergmannjg/regex

Repository files navigation

Lean 4 Regex

A regular expression engine written in Lean 4.

This library is based on the Rust regex crate and extended for compatibility with PCRE2 (see Syntax).

Main restrictions:

  • focus on unicode data,
  • simple api and few configurations,
  • BoundedBacktracker as the single regex engine,
  • no optimizations.

Installation

Add the following dependency to lakefile.lean:

require Regex from git
  "/~https://github.com/bergmannjg/regex" @ "main"

Documentation

The main documentation is in Regex.lean

Example

Get captures of "∀ (n : Nat), 0 ≤ n" :

def Main : IO Unit := do
  let re := regex% r"^\p{Math}\s*.(?<=\()([a-z])[^,]+,\s*(\p{Nd})\s*(\p{Math})\s*\1$"
  let captures := Regex.captures "∀ (n : Nat), 0 ≤ n" re
  IO.println s!"{captures}"

Output is

fullMatch: '∀ (n : Nat), 0 ≤ n', 0, 22
groups: #[(some ('n', 5, 6)), (some ('0', 15, 16)), (some ('≤', 17, 20))]

Api

  • regex% : notation to build the regular expression at compile time
  • captures : searches for the first match of the regular expression

Components of regular expression:

  • \p{Math} : match all characters with the Math property
  • (?<=\() : lookbehind of char '('
  • (\p{Nd}) : capturing group of numeric characters
  • \1 : backreference to first capturing group

Test

The library is tested with the testdata of the PCRE2 library and the testdata of the Rust Regex crate.

The tests are run with

lake test

License

This project is licensed under the MIT license.

The data in Regex/Unicode/data/ is licensed under the Unicode License Agreement (LICENSE-UNICODE).

The .toml files in testdata/rust are licensed under the (LICENSE-RUST-REGEX).

About

A PCRE2 compatible regular expression engine written in Lean 4.

Topics

Resources

License

MIT and 2 other licenses found

Licenses found

MIT
LICENSE
MIT
LICENSE-RUST-REGEX
Unknown
LICENSE-UNICODE

Stars

Watchers

Forks

Packages

No packages published

Languages