[wplug] Text searching

Steve Kudlak chromexa at ovis.net
Mon Mar 17 11:38:56 EST 2003


Thanks for the info. I skimmed these and they all
look useful. They are also the kind of thing that helps
get Linux out of being a sort of "techies only" system.
It is also worth looking at another O'Reilly title,
Developing Bioinformatics Computer Skills.
Here is a pointer to a chapter from that book.

http://www.oreilly.com/catalog/bioskills/chapter/ch01.html
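
For Doug's original question, here is a minimal Python sketch. It is not
taken from either book. The function names are made up for illustration,
and it assumes 60 letters per line with numbering that starts at 1, as in
his sample. It strips the numbers and spaces, searches the joined
sequence, and maps each hit back to the numbered line it starts on.

```python
import re


def load_sequence(lines):
    """Join the letters of a numbered, space-separated sequence listing."""
    # Drop everything that is not a letter: line numbers, spaces, newlines.
    return "".join(re.sub(r"[^A-Za-z]", "", line) for line in lines).lower()


def find_all(sequence, query):
    """Return the 1-based start position of every match, overlaps included."""
    query = re.sub(r"[^A-Za-z]", "", query).lower()
    if not query:
        return []
    positions = []
    start = sequence.find(query)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(query, start + 1)
    return positions


def line_label(position, letters_per_line=60):
    """Number printed at the start of the line holding this position.

    Assumes every line carries letters_per_line letters and the first
    line is labelled 1.
    """
    return (position - 1) // letters_per_line * letters_per_line + 1


# The two sample lines from the original question.
sample = [
    "1       atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg",
    "61     atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg",
]
seq = load_sequence(sample)
hits = find_all(seq, "taggatacaataggatacaataggatacaat")
for pos in hits:
    print(pos, "starts on the line numbered", line_label(pos))
```

For a real file, something like load_sequence(open("genome.txt")) should
work the same way (the file name is only a placeholder). Because the line
breaks are removed before searching, a match that runs from the end of one
line onto the next is still found.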

Have Fun,
Sends Steve


Paul Cantalupo wrote:

> Doug Green wrote:
> >
> > Hi all-
> >
> > I have some large text files that I need to search. They are genomic
> > sequences written with a 4-letter alphabet in blocks of 10 letters,
> > separated by a space. There are 6 such blocks on a line, and each line
> > is numbered with the position of its first letter (maybe 20,000+ lines
> > per file?).
> > Essentially, the format looks like this (obviously, the content is
> > different):
> >
> > 1       atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg
> > 61     atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg atacaatagg
> >
> > I need to be able to search within this kind of text file for a string
> > of letters that is maybe 30-40 letters long, ignoring the spaces and
> > numbers. The whole point is that I need to locate the position of my
> > search string within the original text. Is there some fancy way to
> > grep the file, ignoring spaces and numbers? Or to somehow filter out
> > the spaces and numbers, creating a new file (maybe some cat option
> > piped into a new file??)?
> >
> > Any help/suggestions are greatly appreciated! Thanks!
> >
> > Doug
>
> Doug,
>
> I don't have a nifty filter command for you but you can find the
> solution to this problem somewhere in the O'Reilly book "Beginning Perl
> for Bioinformatics" (http://www.oreilly.com/catalog/begperlbio/). It is
> at the Carnegie Library. The book provides an open source Perl module,
> BeginPerlBioinfo.pm, at
> http://examples.oreilly.com/begperlbio/BeginPerlBioinfo.pm.
>
> Also, if you want more powerful Perl stuff for Bioinformatics, check out
> www.bioperl.org.
>
> Good luck,
>
> Paul
>
> _______________________________________________
> wplug mailing list
> wplug at wplug.org
> http://www.wplug.org/mailman/listinfo/wplug