add a regexp crate to the Rust distribution by BurntSushi · Pull Request #42 · rust-lang/rfcs

A couple of other ideas that I have had with regard to regular expressions are:

I would expect these to produce somewhat larger compiled code, but code that runs more efficiently. I'm not sure whether that is a good trade-off.

Anyway, these lead to something like this:

re!(FancyIdentifier, r"^(?P<letters>[a-z]+)(?P<numbers>[0-9]+)?$")

expanding to something approximating this, plus quite a bit more (I recognise that it isn't a valid expansion in a static value and has various other issues, but it gives the general idea of what I think would be really nice):

struct FancyIdentifier<'a> {
    all: &'a str,
    letters: &'a str,
    numbers: Option<&'a str>,
}

impl<'a> Index<uint, Option<&'a str>> for FancyIdentifier<'a> {
    fn index(&'a self, index: &uint) -> Option<&'a str> {
        if *index == 0u {
            Some(self.all)
        } else if *index == 1u {
            Some(self.letters)
        } else if *index == 2u {
            self.numbers
        } else {
            fail!("no such group {}", *index);
        }
    }
}

impl<'a> FancyIdentifier<'a> {
    pub fn captures<'t>(text: &'t str) -> Option<FancyIdentifier<'t>> {
        let mut chars = text.chars();
        loop {
            // go through, byte/char by byte/char, keeping track of position
            if b < 'a' || b > 'z' {
                return None;
            }
        }
        loop {
            // … get numbers in much the same way …
        }
        Some(FancyIdentifier {
            all: text,
            letters: letters,
            numbers: numbers,
        })
    }
}

This allows nicer usage:

// `captures` returns an Option, so unwrap the match before reading its fields.
let foo12 = FancyIdentifier::captures("foo12").unwrap();
assert_eq!(foo12.letters, "foo");
assert_eq!(foo12.numbers, Some("12"));
assert_eq!(foo12[0], Some("foo12"));
assert_eq!(foo12[2], Some("12"));
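
As a rough illustration of what the generated matcher for this one pattern could look like, here is a hand-written sketch in current Rust syntax (the snippets above use the pre-1.0 dialect). It is not the output of any real `re!` macro. Two things here are assumptions rather than part of the proposal: the byte-scanning approach, and the `get` method. `get` stands in for the `Index` impl because today's `Index::index` must return a reference, so it cannot hand back an `Option<&str>` by value.

```rust
/// Hand-written stand-in for what `re!` might generate for
/// r"^(?P<letters>[a-z]+)(?P<numbers>[0-9]+)?$".
#[derive(Debug)]
struct FancyIdentifier<'a> {
    all: &'a str,
    letters: &'a str,
    numbers: Option<&'a str>,
}

impl<'a> FancyIdentifier<'a> {
    /// Matches the whole of `text` against the pattern, or returns `None`.
    pub fn captures(text: &'a str) -> Option<FancyIdentifier<'a>> {
        // [a-z]+ : find where the leading run of lowercase ASCII letters ends.
        let split = text
            .bytes()
            .position(|b| !b.is_ascii_lowercase())
            .unwrap_or(text.len());
        if split == 0 {
            return None; // at least one letter is required
        }
        // Every byte before `split` is ASCII, so this is a char boundary.
        let (letters, rest) = text.split_at(split);

        // ([0-9]+)?$ : the remainder must be empty or consist only of digits.
        let numbers = if rest.is_empty() {
            None
        } else if rest.bytes().all(|b| b.is_ascii_digit()) {
            Some(rest)
        } else {
            return None;
        };

        Some(FancyIdentifier { all: text, letters, numbers })
    }

    /// Positional group access (hypothetical replacement for `Index`).
    /// Panics on an unknown group, mirroring the `fail!` in the sketch above.
    pub fn get(&self, index: usize) -> Option<&'a str> {
        match index {
            0 => Some(self.all),
            1 => Some(self.letters),
            2 => self.numbers,
            _ => panic!("no such group {}", index),
        }
    }
}
```

With this sketch, `FancyIdentifier::captures("foo12")` yields a match whose `letters` is `"foo"` and whose `numbers` is `Some("12")`, while inputs such as `"12"` or `"foo12x"` yield `None`. Because the pattern is fixed at compile time, no regex engine or capture table is needed at run time, which is where the hoped-for efficiency would come from.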

I expect this would be rather difficult to implement, too. Still, I thought I'd toss the idea into the ring: I haven't seen it suggested, and it has been sitting in my mind for the whole of this discussion.