add a regexp crate to the Rust distribution by BurntSushi · Pull Request #42 · rust-lang/rfcs
A couple of other ideas that I have had with regard to regular expressions are:
- Truly compiled regular expressions: as in, no dependency on a regexp library at runtime at all, but rather expanding it to approximately what a person might have written by hand without a regular expressions library.
- Create anonymous structs for matches, with direct field access (or indexed access) for groups.
I would expect that these would lead to somewhat larger compiled code, but to code that should run more efficiently. I'm not sure if it's a good trade-off or not.
Anyway, these lead to something like this:
re!(FancyIdentifier, r"^(?P<letters>[a-z]+)(?P<numbers>[0-9]+)?$")
expanding to something approximating this, plus quite a bit more (I recognise that it isn't a valid expansion in a static value and has various other issues, but it gives the general idea of what I think would be really nice):
struct FancyIdentifier<'a> {
    all: &'a str,
    letters: &'a str,
    numbers: Option<&'a str>,
}
impl<'a> Index<uint, Option<&'a str>> for FancyIdentifier<'a> {
    fn index(&'a self, index: &uint) -> Option<&'a str> {
        if *index == 0u {
            Some(self.all)
        } else if *index == 1u {
            Some(self.letters)
        } else if *index == 2u {
            self.numbers
        } else {
            fail!("no such group {}", *index);
        }
    }
}
impl<'a> FancyIdentifier<'a> {
    pub fn captures<'t>(text: &'t str) -> Option<FancyIdentifier<'t>> {
        let mut chars = text.chars();
        loop {
            // go through, byte/char by byte/char, keeping track of position;
            // `b` stands for the current byte/char taken from `chars`
            if b < 'a' || b > 'z' {
                return None;
            }
        }
        loop {
            // … get numbers in much the same way …
        }
        // `letters` and `numbers` are the slices of `text` found by the loops above
        Some(FancyIdentifier {
            all: text,
            letters: letters,
            numbers: numbers,
        })
    }
}
This allows nicer usage:
let foo12 = FancyIdentifier::captures("foo12").unwrap();
assert_eq!(foo12.letters, "foo");
assert_eq!(foo12.numbers, Some("12"));
assert_eq!(foo12[0], Some("foo12"));
assert_eq!(foo12[2], Some("12"));
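To make the idea a bit more concrete, here is a rough hand-written sketch of what such an expansion could look like in present-day Rust. It is only an illustration under some assumptions, not what a real `re!` macro would necessarily generate: the byte-wise scanning strategy is my own choice, and the `group` method is a hypothetical stand-in for the `Index` impl above, since the current `Index` trait has to return a reference and so cannot hand back a freshly built `Option<&str>`.

```rust
/// Hand-written stand-in for what `re!(FancyIdentifier, ...)` might expand to.
#[derive(Debug, PartialEq)]
pub struct FancyIdentifier<'a> {
    pub all: &'a str,
    pub letters: &'a str,
    pub numbers: Option<&'a str>,
}

impl<'a> FancyIdentifier<'a> {
    /// Matches ^(?P<letters>[a-z]+)(?P<numbers>[0-9]+)?$ with no regex engine.
    pub fn captures(text: &'a str) -> Option<FancyIdentifier<'a>> {
        let bytes = text.as_bytes();

        // [a-z]+ : find where the leading run of lowercase ASCII letters ends.
        let letters_end = bytes
            .iter()
            .position(|b| !b.is_ascii_lowercase())
            .unwrap_or(bytes.len());
        if letters_end == 0 {
            return None; // at least one letter is required
        }

        // ([0-9]+)? : measure the run of ASCII digits that follows, if any.
        let rest = &bytes[letters_end..];
        let digits_len = rest
            .iter()
            .position(|b| !b.is_ascii_digit())
            .unwrap_or(rest.len());

        // $ : nothing may follow the optional digits.
        if letters_end + digits_len != bytes.len() {
            return None;
        }

        // Everything matched is ASCII, so these slices fall on char boundaries.
        Some(FancyIdentifier {
            all: text,
            letters: &text[..letters_end],
            numbers: if digits_len == 0 {
                None
            } else {
                Some(&text[letters_end..])
            },
        })
    }

    /// Indexed access to groups (hypothetical replacement for `Index`).
    pub fn group(&self, index: usize) -> Option<&'a str> {
        match index {
            0 => Some(self.all),
            1 => Some(self.letters),
            2 => self.numbers,
            _ => panic!("no such group {}", index),
        }
    }
}
```

With this sketch the field access works as in the usage above, and the indexed form becomes `foo12.group(0)` and `foo12.group(2)` rather than `foo12[0]` and `foo12[2]`.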
I expect this would be rather difficult to implement, too. Still, just thought I'd toss the idea into the ring as I haven't seen it suggested, but it's been sitting in my mind the whole time the discussion has gone on.