[wplug] regex help
Mike Kuentz (2)
JunkEmail at rapidigm.com
Mon Jan 12 15:10:40 EST 2004
I was having a similar problem a while back and found this paragraph
that seems to explain it well:
"What can the trailing '?' be modifying? if * means 0 or more
of the previous thing and ? means 0 or 1 of the previous thing then what
logical meaning has '*?' ? 0 or or 0 or 1 of the previous thing? useful
semantics, eh? so larry [Wall] in his infinite wisdom (all bow down to larry
now :-) made that useless combination into something VERY valuable in
the regex world which is the ability to choose greediness and we all
thank him for it. it has saved many a regex from being more
complicated..."
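
To make the greediness difference concrete, here is a minimal sketch in
Python (my choice of language, not something from the thread; its re module
uses the same *? syntax). It runs a greedy pattern and the two patterns
quoted below against James's test string. The variable names are just for
illustration.

```python
import re

# Test string taken from James's message.
html = "<td align=foo><br>"

# Greedy: .* first grabs the rest of the line, then backtracks
# until the final ">" can match, so it ends at the LAST ">".
greedy = re.search(r"<td.*>", html).group(0)

# Non-greedy: .*? starts with nothing and grows one character
# at a time, so it stops at the FIRST ">" it can reach.
lazy = re.search(r"<td.*?>", html).group(0)

# Negated class: [^>]* can never cross a ">", so greediness
# makes no difference to where the match ends.
negated = re.search(r"<td[^>]*>", html).group(0)

print(greedy)   # <td align=foo><br>
print(lazy)     # <td align=foo>
print(negated)  # <td align=foo>
```

So the non-greedy form does not run to the end of the string and back up; it
tries the shortest possibility first and only extends when the rest of the
pattern fails to match.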
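Also, you'd probably want:
/<td\s.*?>/
because the other two will also match something like <tdfoo>, which isn't
what you're looking for. Note that this version requires at least one
whitespace character after "td", so it will not match a bare <td>.
\s matches a single whitespace character (space, tab, newline). See
<http://www.contactor.se/~dast/mail2sms/regex.shtml> for a good regex
guide.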
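
As a quick check of the <tdfoo> point, here is a small Python sketch (again
my choice of language) that tries the three patterns from this thread against
a few sample tags. The fourth pattern, labeled optional_attrs, is a
hypothetical variant of my own that is not from the thread; it makes the
whitespace-plus-attributes part optional so that a bare <td> still matches.

```python
import re

patterns = {
    "lazy": r"<td.*?>",         # from the thread
    "negated": r"<td[^>]*>",    # from the thread
    "with_space": r"<td\s.*?>", # from the thread, requires whitespace
    # Hypothetical variant (not from the thread): the attribute
    # part is optional, but must start with whitespace if present.
    "optional_attrs": r"<td(?:\s[^>]*)?>",
}

samples = ["<td align=foo>", "<tdfoo>", "<td>"]

# For each pattern, record whether it matches each sample tag.
results = {
    name: [bool(re.search(rx, s)) for s in samples]
    for name, rx in patterns.items()
}

for name, hits in results.items():
    print(name, hits)
# lazy [True, True, True]
# negated [True, True, True]
# with_space [True, False, False]
# optional_attrs [True, False, True]
```

Which one you want depends on whether the pages you are matching ever contain
a <td> with no attributes.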
Mike
> -----Original Message-----
> From: wplug-admin at wplug.org [mailto:wplug-admin at wplug.org] On
> Behalf Of James O'Kane
> Sent: Monday, January 12, 2004 2:54 PM
> To: 'wplug at wplug.org'
> Subject: RE: [wplug] regex help
>
>
> On Mon, 12 Jan 2004, Embery, Nathan wrote:
>
> > Close. That will match everything up until the last ">" that the string
> > contains, which might not be what you want. You probably want to add the
> > non-greedy modifier, like this:
> > /<td.*?>/
> > /<td[^>]*>/
>
> I tested both with <td align=foo><br> and they both matched just the <td>
> tag. Mine grabs every character that is not a >, and the final > matches
> the close of the <td> tag.
>
> I've just recently started working with the non-greedy modifier, and what
> I'm not clear on is how it is non-greedy. Since yours worked on my test
> string, I'm guessing that the .* matches until it finds the first token
> after the .*?, in this case a >. The other way I could foresee it working
> is for the .* to grab until the end of the string; when that match fails,
> it moves backwards, returning characters to the unmatched group. With this
> method, it would match the whole thing.
>
> The regex I'm writing now is trying to match parts of a webpage where it
> seems that everything on each line is optional. :(
>
> -james