[wplug] regex help
Embery, Nathan
Nathan.Embery at crowncastle.com
Mon Jan 12 15:21:51 EST 2004
Greediness isn't that bad once you get a feel for it.
Any time you use a *, perl will try to match the preceding character (or
group) 0 or more times, as many times as it can. So '.*' will match the
entire string if you let it. '.*a' will match up to the last a in the
string. '.*?a' will match only up to the first a, while '.*a?' would also
match the whole line (everything, followed by 0 or 1 a)... fun stuff.
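If you want to see those four cases side by side, here's a quick sketch. The test string "banana bread" and the little matched() helper are just made up for illustration; nothing here is specific to the original problem.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text a pattern matches, or undef on no match.
sub matched {
    my ($str, $re) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

my $s = "banana bread";   # made-up test string

# [label, pattern] pairs, in the same order as the explanation above.
my @cases = (
    [ '.*',   qr/.*/   ],   # greedy: takes the whole line
    [ '.*a',  qr/.*a/  ],   # greedy: gives back only as far as the last a
    [ '.*?a', qr/.*?a/ ],   # non-greedy: stops at the first a
    [ '.*a?', qr/.*a?/ ],   # .* takes everything, then a? matches zero a's
);

for my $case (@cases) {
    my ($label, $re) = @$case;
    printf "%-5s => '%s'\n", $label, matched($s, $re);
}
```

The second and third lines are the interesting ones: '.*a' comes back as 'banana brea' and '.*?a' as just 'ba'.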
On Mon, 12 Jan 2004, Embery, Nathan wrote:
> Close. That will match everything up until the last ">" that the string
> contains, which might not be what you want. You probably want to add the
> non-greedy modifier on like this:
> /<td.*?>/
> /<td[^>]*>/
I tested both with <td align=foo><br> and they both matched just the <td>
tag. Mine grabs every non-> character, and the > matches the close of the <td>.
I've just recently started working with the non-greedy modifier, and what
I'm not clear on is how it is non-greedy. Since yours worked
on my test string, I'm guessing that the .* matches until it finds the
first token after the .*?, in this case a >. The other way I could foresee
it working is for the .* to grab until the end of the string and, when that
match fails, move backwards, returning characters to the unmatched
group. With this method, it would match the whole thing.
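For what it's worth, my understanding from the perlre docs is that both guesses describe real behavior, just for different quantifiers. A plain greedy .* grabs to the end of the line and then gives characters back one at a time until the rest of the pattern matches. The non-greedy .*? goes the other way: it starts with nothing and takes one more character at a time until the rest of the pattern matches. A rough sketch using the test string from above (the matched() helper is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: return the text a pattern matches, or undef on no match.
sub matched {
    my ($str, $re) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

my $html = '<td align=foo><br>';   # test string from the message above

# Greedy: .* runs to the end of the line, then backs off to the LAST >.
my $greedy  = matched($html, qr/<td.*>/);

# Non-greedy: .*? grows one character at a time and stops at the FIRST >.
my $lazy    = matched($html, qr/<td.*?>/);

# Negated class: [^>]* can never step over a >, so it also stops at the first.
my $negated = matched($html, qr/<td[^>]*>/);

print "greedy : $greedy\n";
print "lazy   : $lazy\n";
print "negated: $negated\n";
```

On this string the greedy version returns the whole '<td align=foo><br>', while the other two both return just '<td align=foo>'.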
The regex I'm writing now is trying to match parts of a webpage where it
seems that everything on each line is optional. :(
-james