[wplug] regex help

Embery, Nathan Nathan.Embery at crowncastle.com
Mon Jan 12 15:21:51 EST 2004


Greediness isn't that bad once you get the feel for it.

Anytime you use a * Perl will try to match the preceding character 0 or
more times, as many times as it can. So '.*' will match the entire string
if you let it. '.*a' will match up to the last a in the string. '.*?a'
will match only up to the first a, while '.*a?' would also match the whole
line (everything, optionally followed by one a)... fun stuff.
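
The four patterns above can be tried side by side. This is only a small
sketch, written in Python rather than Perl on the assumption that its re
module treats *, *? and ? the same way Perl does for these patterns; the
sample string "banana bread" is made up for illustration.

```python
import re

# Hypothetical sample string, chosen because it contains several a's.
text = "banana bread"

# '.*' is greedy: it takes as many characters as it can.
greedy_all = re.match(r".*", text).group()

# '.*a' grabs everything, then backs off to the LAST a.
greedy_a = re.match(r".*a", text).group()

# '.*?a' is non-greedy: it stops at the FIRST a.
lazy_a = re.match(r".*?a", text).group()

# '.*a?' lets '.*' eat the whole line; the optional a then matches nothing.
optional_a = re.match(r".*a?", text).group()

for name, value in [(".*", greedy_all), (".*a", greedy_a),
                    (".*?a", lazy_a), (".*a?", optional_a)]:
    print(f"{name!r:8} matched {value!r}")
```

The equivalent Perl patterns would be /.*/, /.*a/, /.*?a/ and /.*a?/.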


On Mon, 12 Jan 2004, Embery, Nathan wrote:

> Close. That will match everything up until the last ">" that the string
> contains, which might not be what you want. You probably want to add the
> non-greedy modifier on like this:
> /<td.*?>/
> /<td[^>]*>/

I tested both with <td align=foo><br> and they both matched just the <td>
tag. Mine grabs every non-> character, and the > matches the close of the <td>.

I've just recently started working with the non-greedy modifier, and what
I'm not clear on is how it is non-greedy. Since yours worked
on my test string, I'm guessing that the .*? matches until it finds the
first token after it, in this case a >. The other way I could foresee
it working is for the .* to grab until the end of the string and, when that
match fails, move backwards, returning characters to the unmatched
group. With this method, it would match the whole thing.
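
For what it's worth, in a backtracking engine such as Perl's, the first
guess describes the non-greedy .*? (take as little as possible, then grow
one character at a time until the rest of the pattern matches), and the
second describes the plain greedy .* (take everything, then give characters
back). A rough sketch of the difference, again in Python on the assumption
that its re module behaves like Perl here; the second test string is
invented for illustration.

```python
import re

html = "<td align=foo><br>"

# Greedy: '.*' runs to the end of the string, then backs up to the last '>'.
greedy = re.search(r"<td.*>", html).group()

# Non-greedy: '.*?' expands one character at a time until a '>' follows.
lazy = re.search(r"<td.*?>", html).group()

# Negated class: '[^>]*' can never cross a '>' at all.
negated = re.search(r"<td[^>]*>", html).group()

# The last two differ once something must follow the '>'.
# Hypothetical input: only the second cell is followed by an 'x'.
cells = "<td a>y<td b>x"
lazy_x = re.search(r"<td.*?>x", cells).group()
negated_x = re.search(r"<td[^>]*>x", cells).group()

print(greedy, lazy, negated, lazy_x, negated_x, sep="\n")
```

On the original test string the non-greedy and negated-class patterns agree,
but the negated class is the stricter of the two: .*? will still run past a
> if that is the only way the rest of the pattern can match.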

The regex I'm writing now is trying to match parts of a webpage where it
seems that on each line everything is optional. :(

-james

