[Bioperl-l] Findpatterns

Fernandez-Capetillo, Oscar (NIH/NCI) fernando@mail.nih.gov
Sun, 24 Nov 2002 12:37:26 -0500


Hi there,
Not sure I need the Bio part of Perl to do this. Anyway, maybe somebody has
tried this before and can help me out.
I am trying to search for a short nucleotide pattern in the
human and mouse genome databases. Specifically, I want to find where the
pattern occurs in each genome. The pattern is of
medium complexity (I can only represent it as a regular expression; otherwise
the number of combinations would be huge). Let's say something like:
ACTCTATCANNNNNNNNNNNNNNACTATCTTGGCATCGACNNNNNNNNCATGCTAGCATCGGG
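
To make the pattern above concrete, here is a minimal sketch of the kind of
search I mean, in Python rather than Perl. It assumes the genome is available
locally as FASTA text, that N stands for any one of A, C, G or T, and that hits
on both strands are wanted. The function names (iupac_to_regex, find_pattern,
scan_fasta) are made up for illustration; they are not part of Bioperl or GCG.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table for IUPAC codes (e.g. R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def iupac_to_regex(pattern):
    """Translate an IUPAC pattern such as 'ACN' into a regex string."""
    return "".join(IUPAC[base] for base in pattern.upper())


def reverse_complement(pattern):
    """Reverse-complement an IUPAC pattern."""
    return pattern.upper().translate(COMPLEMENT)[::-1]


def find_pattern(sequence, pattern):
    """Return (start, end, strand) hits as 0-based, half-open coordinates.

    Minus-strand hits are found by searching the forward sequence for the
    reverse complement of the pattern, so all coordinates are forward ones.
    """
    sequence = sequence.upper()
    hits = []
    for strand, pat in (("+", pattern), ("-", reverse_complement(pattern))):
        # The lookahead makes overlapping matches visible as well.
        regex = re.compile("(?=(" + iupac_to_regex(pat) + "))")
        for match in regex.finditer(sequence):
            hits.append((match.start(1), match.end(1), strand))
    return hits


def scan_fasta(lines, pattern):
    """Yield (record_name, start, end, strand) for each hit in FASTA lines.

    Each record is joined into one string first, so matches that span
    line breaks in the file are not missed.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                for hit in find_pattern("".join(chunks), pattern):
                    yield (name,) + hit
            name, chunks = (line[1:].split() or [""])[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        for hit in find_pattern("".join(chunks), pattern):
            yield (name,) + hit
```

For example, find_pattern("GGACTTTGAAGG", "ACNNTG") returns [(2, 8, "+")], and
scan_fasta can be handed an open file handle for one chromosome at a time. A
pattern that equals its own reverse complement is reported once per strand.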
I know that years ago the freely usable GCG package had a tool named
findpatterns that could do this. Unfortunately, people are not only
driven by the sake of science, and GCG is now commercial (I wonder what would
have happened if Newton had patented differential calculus). So I am
all alone. I could try a Blast search for short sequences, but it does not accept
ambiguities such as NNNNN...
I'd like to run it against both the mouse and human genome databases, which I
don't think I can access through the bioperl interface.
I'd appreciate any help. Thanks,
Oskar
Oskar Fernandez-Capetillo, Ph. D.
NCI Build., 10 Room 4A01
National Institutes of Health
10 Center Drive
Bethesda, MD
20892, 1360

Phone: 301-496-4673
Fax:      301-496-0887
e-mail: fernando@mail.nih.gov
www: http://usuarios.lycos.es/h2ax/