[Bioperl-l] Small word sizes with BLAST (WU, NCBI)

Joseph Bedell jbedell at oriongenomics.com
Wed Mar 3 15:59:16 EST 2004


Hi Andrew,

I'm cross-posting your question to the Sequence Search Mailing List
(SSML). This should be a good place for a discussion of your problem.

https://bioinformatics.org/mailman/listinfo/ssml-general

Are you looking only for 5-7 bp matches with no extension? How big is
your oligo? One parameter that would need adjustment is E, which should
probably be set outrageously high (1e10?). Can you share the seq3.fasta
sequence? I could also try blasting it against RefSeq, or against some
sequence that you know it should hit.
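
For checking the 'missing' hits independently of BLAST, exact short
matches can be enumerated directly. The following is only a minimal
sketch in Python, not part of WU-BLAST, NCBI-BLAST, or Bioperl; the
function names and the default k=7 are assumptions chosen for
illustration. It indexes every k-mer of the oligo and scans a target
sequence for exact occurrences.

```python
# Hypothetical helpers for illustration; not taken from any BLAST package.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shared_kmers(oligo, target, k=7):
    """Return (oligo_pos, target_pos, kmer) for every exact k-mer
    shared by the oligo and the target (0-based start positions)."""
    oligo = oligo.upper()
    target = target.upper()

    # Index every k-mer of the short oligo by its start position(s).
    index = {}
    for i in range(len(oligo) - k + 1):
        index.setdefault(oligo[i:i + k], []).append(i)

    # Slide along the target and look each k-mer up in the index.
    hits = []
    for j in range(len(target) - k + 1):
        kmer = target[j:j + k]
        for i in index.get(kmer, ()):
            hits.append((i, j, kmer))
    return hits
```

Only the given strand is scanned; to cover the other strand as well, the
same call can be repeated with reverse_complement(oligo). Comparing this
list against the BLAST report for one known target should show whether
hits are really being dropped.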

Regards,
Joey

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Joseph A Bedell, Ph.D.
Director, Bioinformatics
Orion Genomics, LLC
4041 Forest Park Ave.
St. Louis, MO 63108
Office:(314)615-6979; Fax:(314)615-6975
Mobile:(314)518-1343
http://www.oriongenomics.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

>-----Original Message-----
>From: bioperl-l-bounces at portal.open-bio.org
>[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Andrew Walsh
>Sent: Wednesday, March 03, 2004 1:52 PM
>To: bioperl-l at portal.open-bio.org
>Subject: [Bioperl-l] Small word sizes with BLAST (WU, NCBI)
>
>Hello,
>
>My question is not really related to a specific Bioperl library, so I
>apologize.  If there is a specific 'BLAST' newsgroup, I will be happy
>to post there.  But I was hoping somebody on the Bioperl list had some
>experience doing nucleic acid searches with small word sizes.
>
>I would like to search for small (5-7 bp) matches between an oligo
>sequence and a database of ~100,000 mRNAs.  I've tried doing this with
>WU-BLAST and NCBI-BLAST.  NCBI-BLAST does not allow word sizes below 7,
>so I've tried lots of different command line parameters for WU-BLAST.
>
>I've tried these searches with versions 2.0a19 (alpha) and 2.0 of
>WU-BLAST.
>
>I get quite strange results when I start lowering the word size below
>the default (11).  For example, with the alpha version, I get more hits
>with a word size of 10 than I do with a word size of 7.  With the beta
>version, I get the same number of hits with word sizes 10 and 7.  I've
>checked this by hand, and the 'missing' hits do in fact contain
>stretches of 7 contiguous matching bp.
>
>Here is an example of one of the command lines I've tried running:
>blastn human_refseq.fasta seq3.fasta W=5 S=5 M=1 V=100000 B=100000
>
>I've tried adjusting every parameter I thought would affect the search
>results, but I still cannot recover the 'missing' hits.
>
>Maybe BLAST is the wrong tool for this.  I'd just like something that's
>fast.  If anyone has some advice, it would be greatly appreciated.
>
>Thanks a lot,
>
>Andrew
>



