implement better code generation for the regex plugin #26

Closed
rust-highfive opened this issue Jan 25, 2015 · 2 comments
@rust-highfive

Issue by comex
Thursday May 08, 2014 at 05:23 GMT

For earlier discussion, see rust-lang/rust#14029

This issue was labelled with: A-libs in the Rust repository


Consider this code:

#![feature(phase)]
extern crate regex;
#[phase(syntax)]
extern crate regex_macros;

pub fn is_all_a(s: &str) -> bool {
    return regex!("^a+$").is_match(s);
}

Ideally this would optimize away to a small function that just iterates over the string and checks for characters other than 'a'.
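As a point of reference, the "small function" described above might look roughly like the following when written by hand. This is only a sketch: the name `is_all_a_manual` is hypothetical, and it assumes `$` matches only at the very end of the input (no special handling of a trailing newline).

```rust
/// Hand-written stand-in for `regex!("^a+$").is_match(s)`.
/// `a+` needs at least one character, so the empty string must not match.
pub fn is_all_a_manual(s: &str) -> bool {
    // 'a' is a single ASCII byte, so scanning bytes is enough here.
    !s.is_empty() && s.bytes().all(|b| b == b'a')
}
```

A function like this allocates nothing and makes no indirect calls, which is the baseline the generated code is being compared against.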

Instead, it:

  • calls malloc several times to start out;
  • goes through an indirect call to the 'exec' function unless LTO is enabled. This might not usually be a big deal, but I would like to eventually be able to efficiently match a regex on a single character in lieu of writing out all the possibilities manually;
  • inside 'exec', even with LTO (and -O) enabled, makes many non-inlined calls, including to malloc, char_range_at, char_range_at_reverse, etc.
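To illustrate the single-character case mentioned in the list, "writing out all the possibilities manually" for a class such as `[0-9a-fA-F]` could look like the sketch below. The function name and the choice of character class are hypothetical, picked only for illustration, and the code uses current Rust syntax rather than the 0.11-pre syntax of the original report.

```rust
/// Hypothetical manual replacement for matching one character
/// against the class `[0-9a-fA-F]`.
pub fn is_hex_digit(c: char) -> bool {
    // Each range is spelled out by hand; a compile-time regex would
    // ideally generate a comparison chain about as cheap as this one.
    matches!(c, '0'..='9' | 'a'..='f' | 'A'..='F')
}
```

The wish expressed in the issue is that a compile-time regex on a single character would cost no more than this kind of direct comparison.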

Without LTO, it generates about 7kb of code for one regex, or 34kb if I put 8 regexes in that function. Not the end of the world, but it adds up.

I recognize the regex implementation is new, but I thought this was worth filing anyway as room for improvement.

rustc 0.11-pre-nightly (2dcbad5 2014-05-06 22:01:43 -0700)

@BurntSushi changed the title from "regex is less efficient than it could be" to "implement better code generation for the regex plugin" on Jun 19, 2015
@BurntSushi
Member

In my latest round of performance tweaks, I added more analysis on the compiled regex program, particularly for discovering prefixes. We could do more of this analysis, particularly when generating code for the regex! macro.

I think a more concrete item here is to go and study Ragel and see if we can learn something from how it does code generation for matching a regex.
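For a rough idea of what state-machine style generated code could look like for `^a+$`, here is a small hand-written DFA. It is not Ragel's actual output, nor anything the plugin produced; the `State` enum and the function name `is_all_a_dfa` are assumptions made for this sketch.

```rust
/// States of a tiny DFA for `^a+$` (hypothetical names).
#[derive(Clone, Copy, PartialEq)]
enum State {
    Start, // nothing consumed yet
    SeenA, // one or more 'a' consumed; the only accepting state
    Dead,  // saw something other than 'a'; can never match
}

pub fn is_all_a_dfa(s: &str) -> bool {
    let mut state = State::Start;
    for b in s.bytes() {
        // The transition function is a plain match, so the compiler
        // can lower it to a few comparisons with no heap allocation.
        state = match (state, b) {
            (State::Start, b'a') | (State::SeenA, b'a') => State::SeenA,
            _ => State::Dead,
        };
        if state == State::Dead {
            return false; // bail out early once a match is impossible
        }
    }
    state == State::SeenA
}
```

Code of this shape needs no runtime regex program or interpreter loop, which is the property that makes the code-generation approach attractive.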

@BurntSushi
Member

The plugin was removed.

There may come a day when compile-time regexes make a comeback, but that day is neither today nor in the immediate future.
