[wplug] regex help

Mike Kuentz (2) JunkEmail at rapidigm.com
Mon Jan 12 15:10:40 EST 2004


I was having a similar problem a while back and found this paragraph
that seems to explain it well:

	"What can the trailing '?' be modifying? if * means 0 or more
of the previous thing and ? means 0 or 1 of the previous thing then what
logical meaning has '*?' ?  0 or more of 0 or 1 of the previous thing?
useful semantics, eh? so larry [Wall] in his infinite wisdom (all bow
down to larry now :-) made that useless combination into something VERY
valuable in
the regex world which is the ability to choose greediness and we all
thank him for it. it has saved many a regex from being more
complicated..."
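
To make the greedy/non-greedy difference concrete, here is a minimal
sketch in Python, whose re module uses the same '*' and '*?' syntax as
Perl. The test string is the one James used in the message quoted
below; the plain greedy pattern <td.*> is an assumption added here
purely for contrast.

```python
import re

# Test string taken from the quoted message further down the thread.
html = "<td align=foo><br>"

# Greedy: .* first grabs everything to the end of the string, then gives
# characters back until the final '>' can match, so it ends at the LAST '>'.
greedy = re.search(r"<td.*>", html).group(0)

# Non-greedy: .*? starts with nothing and grows one character at a time,
# so the match ends at the FIRST '>' it reaches.
lazy = re.search(r"<td.*?>", html).group(0)

print(greedy)  # <td align=foo><br>
print(lazy)    # <td align=foo>
```

So the first guess in the quoted message is the right picture for '.*?',
while the "grab to the end and move backwards" description is how the
plain greedy '.*' behaves.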


Also, you'd want to do:
/<td\s.*?>/
because the other two will also match <tdfoo>, which isn't what you're
looking for.
The \s matches a single whitespace character.  See
<http://www.contactor.se/~dast/mail2sms/regex.shtml> for a good regex
guide.
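
A quick way to check these claims is to run all three patterns against
a few sample strings. The sketch below does that in Python. The sample
list, the helper name first_match, and the fourth pattern are
assumptions added for illustration and are not from the thread; the
fourth pattern makes the whitespace-plus-attributes part optional so a
bare <td> is still matched.

```python
import re

patterns = {
    "lazy":        r"<td.*?>",            # from the thread
    "negated":     r"<td[^>]*>",          # from the thread
    "with_space":  r"<td\s.*?>",          # suggested above
    "optional_ws": r"<td(?:\s[^>]*)?>",   # hypothetical variant, not from the thread
}
samples = ["<td align=foo><br>", "<tdfoo>", "<td>"]

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

results = {name: [first_match(p, s) for s in samples]
           for name, p in patterns.items()}

for name, row in results.items():
    print(f"{name:12} {row}")
```

The first two patterns accept <tdfoo>; the \s version rejects it, but
because \s requires at least one whitespace character it also rejects a
bare <td>. The optional variant rejects <tdfoo> and accepts both forms
of the tag.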

Mike


> -----Original Message-----
> From: wplug-admin at wplug.org [mailto:wplug-admin at wplug.org] On 
> Behalf Of James O'Kane
> Sent: Monday, January 12, 2004 2:54 PM
> To: 'wplug at wplug.org'
> Subject: RE: [wplug] regex help
> 
> 
> On Mon, 12 Jan 2004, Embery, Nathan wrote:
> 
> > Close. That will match everything up until the last ">" that the
> > string contains, which might not be what you want. You probably want
> > to add the non-greedy modifier on, like this:
> > /<td.*?>/
> > /<td[^>]*>/
> 
> I tested both with <td align=foo><br> and they both matched just the
> <td> tag. Mine grabs every non-> character, and the > matches the close
> of the <td>.
> 
> I've just recently started working with the non-greedy modifier, and
> what I'm not clear on is how it is non-greedy. Since yours worked on my
> test string, I'm guessing that the .* matches until it finds the first
> token after the .*?, in this case a >. The other way I could foresee it
> working is for the .* to grab until the end of the string; when that
> match fails, it moves backwards, returning characters to the unmatched
> group. With this method, it would match the whole thing.
> 
> The regex I'm writing now is trying to match parts of a webpage where
> it seems that on each line everything is optional. :(
> 
> -james
> 
> 