Bump regex from 1.9.6 to 1.10.1 #7980

Closed
wants to merge 1 commit into from

Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Oct 16, 2023

Bumps regex from 1.9.6 to 1.10.1.

Changelog

Sourced from regex's changelog.

1.10.1 (2023-10-14)

This is a new patch release with a minor increase in the number of valid patterns and a broadening of some literal optimizations.

New features:

  • FEATURE 04f5d7be: Loosen ASCII-compatible rules such that regexes like (?-u:☃) are now allowed.

Performance improvements:

  • PERF 8a8d599f: Broaden the reverse suffix optimization to apply in more cases.

1.10.0 (2023-10-09)

This is a new minor release of regex that adds support for start and end word boundary assertions. That is, \< and \>. The minimum supported Rust version has also been raised to 1.65, which was released about one year ago.

The new word boundary assertions are:

  • \< or \b{start}: a Unicode start-of-word boundary (\W|\A on the left, \w on the right).
  • \> or \b{end}: a Unicode end-of-word boundary (\w on the left, \W|\z on the right).
  • \b{start-half}: half of a Unicode start-of-word boundary (\W|\A on the left).
  • \b{end-half}: half of a Unicode end-of-word boundary (\W|\z on the right).

The \< and \> are GNU extensions to POSIX regexes. They have been added to the regex crate because they enjoy somewhat broad support in other regex engines as well (for example, vim). The \b{start} and \b{end} assertions are aliases for \< and \>, respectively.

The \b{start-half} and \b{end-half} assertions are not found in any other regex engine (although regex engines with general look-around support can certainly express them). They were added principally to support the implementation of word matching in grep programs, where one generally wants to be a bit more flexible in what is considered a word boundary.
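The four assertions described above can be sketched as plain predicates on a byte offset. This is only an illustration of the left/right conditions listed in the changelog, not the regex crate's implementation: the helper names (`is_word`, `before`, `after`, `word_start`, `word_end`, `start_half`, `end_half`) are hypothetical, and `is_word` is a rough stand-in for Unicode `\w` built from the standard library.

```rust
/// Rough stand-in for Unicode `\w` (assumption: the regex crate's real
/// definition covers more code points than this std-only check).
fn is_word(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// The character ending just before byte offset `at`, if any.
/// `at` must lie on a char boundary, otherwise slicing panics.
fn before(hay: &str, at: usize) -> Option<char> {
    hay[..at].chars().next_back()
}

/// The character starting at byte offset `at`, if any.
fn after(hay: &str, at: usize) -> Option<char> {
    hay[at..].chars().next()
}

/// `\b{start-half}`: `\W|\A` on the left; the right side is unconstrained.
fn start_half(hay: &str, at: usize) -> bool {
    before(hay, at).map_or(true, |c| !is_word(c))
}

/// `\b{end-half}`: `\W|\z` on the right; the left side is unconstrained.
fn end_half(hay: &str, at: usize) -> bool {
    after(hay, at).map_or(true, |c| !is_word(c))
}

/// `\<` or `\b{start}`: `\W|\A` on the left and `\w` on the right.
fn word_start(hay: &str, at: usize) -> bool {
    start_half(hay, at) && after(hay, at).map_or(false, is_word)
}

/// `\>` or `\b{end}`: `\w` on the left and `\W|\z` on the right.
fn word_end(hay: &str, at: usize) -> bool {
    before(hay, at).map_or(false, is_word) && end_half(hay, at)
}
```

In "a -b", offset 2 (just before the hyphen) satisfies `start_half` but not `word_start`, which is the extra flexibility the half assertions give a grep-style word match.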

New features:

... (truncated)

Commits
  • 5dff4bd 1.10.1
  • d242ede deps: bump regex-automata to 0.4.2
  • 488604d regex-automata-0.4.2
  • ee01ec2 deps: bump regex-syntax to 0.8.2
  • 1dbeee7 regex-syntax-0.8.2
  • 049d063 changelog: 1.10.1
  • 8a8d599 automata/meta: tweak reverse suffix prefilter strategy
  • 04f5d7b syntax: loosen ASCII compatible rules
  • cfd0ca2 automata/meta: force some prefilter inlining
  • 25ad29f bench: add a redirect
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the internal (An internal refactor or improvement) label Oct 16, 2023
Bumps [regex](https://github.com/rust-lang/regex) from 1.9.6 to 1.10.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](rust-lang/regex@1.9.6...1.10.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/cargo/regex-1.10.1 branch from 9e1c7ec to e32af55 Compare October 16, 2023 08:46
@charliermarsh charliermarsh self-requested a review October 16, 2023 13:48
@charliermarsh
Member

@BurntSushi - I haven't investigated yet but in case you find it surprising, we see some behavior changes here. I'm guessing it's related to this (weird) code we have to hackily escape curly braces by surrounding them in an extra set of curly braces:

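// Imports assumed for this excerpt; `Lazy` is taken to be `once_cell::sync::Lazy`.
use once_cell::sync::Lazy;
use regex::{Captures, Regex};

// Group 1 matches a named Unicode escape such as \N{SNOWMAN}; group 2 matches a lone brace.
static CURLY_BRACES: Lazy<Regex> = Lazy::new(|| Regex::new(r"(\\N\{[^}]+})|([{}])").unwrap());

pub(super) fn curly_escape(text: &str) -> String {
    // Match all curly braces. This will include named unicode escapes (like
    // \N{SNOWMAN}), which we _don't_ want to escape, so take care to preserve them.
    CURLY_BRACES
        .replace_all(text, |caps: &Captures| {
            if let Some(match_) = caps.get(1) {
                // Named escape: keep it verbatim.
                match_.as_str().to_string()
            } else if &caps[2] == "{" {
                // Lone opening brace: double it.
                "{{".to_string()
            } else {
                // Lone closing brace: double it.
                "}}".to_string()
            }
        })
        .to_string()
}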
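To make the intended behavior of curly_escape concrete, here is a regex-free sketch that should give the same output for the pattern above. It is a swapped-in technique (a hand-written scan using only the standard library), not code from this repository, and the name `curly_escape_reference` is hypothetical. A function like this could serve as an oracle when checking whether a regex upgrade changes results.

```rust
/// Regex-free sketch of what `curly_escape` is meant to do: keep named
/// Unicode escapes like `\N{SNOWMAN}` verbatim and double every other brace.
/// (Hypothetical helper, written only to illustrate the expected output.)
fn curly_escape_reference(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while !rest.is_empty() {
        // Mirror `\\N\{[^}]+}`: `\N{`, at least one non-`}` character, then `}`.
        if let Some(body) = rest.strip_prefix("\\N{") {
            if let Some(close) = body.find('}') {
                if close > 0 {
                    // 3 bytes for `\N{`, the name, and 1 byte for `}`.
                    let len = 3 + close + 1;
                    out.push_str(&rest[..len]);
                    rest = &rest[len..];
                    continue;
                }
            }
        }
        // Otherwise handle one character: double braces, copy anything else.
        let c = rest.chars().next().unwrap();
        match c {
            '{' => out.push_str("{{"),
            '}' => out.push_str("}}"),
            _ => out.push(c),
        }
        rest = &rest[c.len_utf8()..];
    }
    out
}
```

With this sketch, `a{b}` becomes `a{{b}}`, while `\N{SNOWMAN} {x}` becomes `\N{SNOWMAN} {{x}}`: the named escape is preserved and only the standalone braces are doubled.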

@charliermarsh
Member

(No obligation to do anything here, only flagging if helpful.)

BurntSushi added a commit to rust-lang/regex that referenced this pull request Oct 16, 2023
This reverts commit 8a8d599 and
includes a regression test, as well as a tweak to a log message.

Essentially, the broadening was improper. We have to be careful when
dealing with suffixes as opposed to prefixes. Namely, my logic
previously was that the broadening was okay because we were already
doing it for the reverse inner optimization. But the reverse inner
optimization works with prefixes, not suffixes. So the comparison wasn't
quite correct.

This goes back to only applying the reverse suffix optimization when
there is a non-empty single common suffix.

Fixes #1110
Ref astral-sh/ruff#7980
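For readers unfamiliar with the optimization being reverted, the general idea of a reverse suffix search with a single non-empty common suffix can be sketched as follows. This is a toy for the one fixed pattern `[a-z]+ing` on ASCII text, with a hypothetical function name, and it is not the regex crate's implementation (which, as I understand it, drives automata rather than hand-written loops). It only shows the shape of the strategy: find the literal suffix quickly, scan backward to locate the match start, then scan forward to settle the match end.

```rust
/// Toy reverse-suffix search for the fixed pattern `[a-z]+ing`.
/// Returns the (start, end) byte offsets of the leftmost match, if any.
/// (Hypothetical sketch; the suffix "ing" plays the role of the single
/// common suffix that every match must end with.)
fn find_lower_ing(hay: &str) -> Option<(usize, usize)> {
    const SUFFIX: &str = "ing";
    let bytes = hay.as_bytes();
    let mut from = 0;
    // Step 1: a fast literal search proposes candidate suffix positions.
    while let Some(off) = hay[from..].find(SUFFIX) {
        let cand = from + off;
        // Step 2: scan backward from the candidate to find the match start.
        let mut start = cand;
        while start > 0 && bytes[start - 1].is_ascii_lowercase() {
            start -= 1;
        }
        if start < cand {
            // Step 3: scan forward from the start to find the greedy end,
            // since the first suffix occurrence need not be the last one.
            let mut run_end = cand + SUFFIX.len();
            while run_end < bytes.len() && bytes[run_end].is_ascii_lowercase() {
                run_end += 1;
            }
            let last = hay[start..run_end].rfind(SUFFIX)?;
            return Some((start, start + last + SUFFIX.len()));
        }
        // No `[a-z]` before this candidate, so it cannot end a match; move on.
        from = cand + 1;
    }
    None
}
```

The sketch depends on every match ending in the same literal, which is the condition the commit message above restores.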
@BurntSushi
Member

BurntSushi commented Oct 16, 2023

Nice find! (It seems I am doomed to never escape the pit of literal optimization correctness bugs.) I reported it here and fixed it here. It's now fixed in regex 1.10.2.

@charliermarsh
Member

Amazing, thanks @BurntSushi :)

@dependabot @github
Contributor Author

dependabot bot commented on behalf of github Oct 16, 2023

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. You can also ignore all major, minor, or patch releases for a dependency by adding an ignore condition with the desired update_types to your config file.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.

@dependabot dependabot bot deleted the dependabot/cargo/regex-1.10.1 branch October 16, 2023 16:20
charliermarsh added a commit that referenced this pull request Oct 16, 2023
Recreating #7980 with regex's
latest fix.
Labels
internal An internal refactor or improvement
2 participants