Regular Expressions' Journal
5:29 pm
[owenblacker]
For reasons too dull to explain, I’m trying to use regular expressions to postprocess an HTML-stream. I want to find all anchor (<a/>) tags that link within our site, in this case using the domain name.
My regular expression looks right to me, but .Net is convinced I don’t have enough close-parentheses. I’ve added line breaks for clarity:
(?<=<a[^>]* href=['"]?)
(?https?://[a-z0-9.-]uswitch.[a-z]+/[-\w_,.%/]+)
(???[-\w&=])
(?#?[-\w&=~])
(?=['"]?[^>]>)
I’ve tested both the above code with the line breaks removed and the original code (which is compiled with RegexOptions.IgnorePatternWhitespace and has embedded comments for ease of maintenance). Each time, I get a System.ArgumentException: parsing "..." - Not enough )'s.
That’s despite my being quite certain the parentheses are perfectly matched.
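For anyone wanting to check the count mechanically rather than by eye, here is a minimal sketch in Python (not .Net). The helper name paren_balance is hypothetical. It assumes a backslash escapes the next character, that parentheses inside a [...] character class are literal, and, when extended=True, that an unescaped # outside a character class starts a comment running to the end of the line, which is how RegexOptions.IgnorePatternWhitespace is documented to behave. It does not treat (?#...) inline comments specially, nor a literal ] at the start of a class, so take its answer as a hint rather than a verdict.

```python
def paren_balance(pattern: str, extended: bool = False) -> int:
    """Return unescaped '(' count minus unescaped ')' count.

    Parentheses inside [...] character classes are ignored. With
    extended=True, an unescaped '#' outside a class is assumed to start
    a comment that runs to the end of the line.
    """
    balance = 0
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif extended and ch == "#":
            # Assumed comment: jump to the end of the current line.
            newline = pattern.find("\n", i)
            if newline == -1:
                break
            i = newline
        elif ch == "(":
            balance += 1
        elif ch == ")":
            balance -= 1
        i += 1
    return balance


if __name__ == "__main__":
    sample = "(#?x)"  # made-up pattern, not the one from this post
    print(paren_balance(sample))                  # plain mode
    print(paren_balance(sample, extended=True))   # '#' treated as a comment
```

A positive result means the parser would see more opening than closing parentheses. Running the same pattern with and without extended=True shows whether a stray # could be hiding a close-parenthesis from the parser.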
Anyone?
Cross-posted to ms_dot_net.
Current Mood: frustrated