[Bioperl-l] Query for parsing BLAST with BioGraphics

Lincoln Stein lstein@cshl.org
Tue, 17 Dec 2002 15:48:52 -0500


As documented in the HOWTO, the script filters out all blast hits with 
e-values greater than 1E-20.  None of your hits are even remotely significant. 
The minimum e-value is 2.4, meaning that you would expect more than 2 hits 
this good in the database by chance alone.
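
To see where an e-value like 2.4 comes from, here is a rough sketch (in Python, not part of the HOWTO script) that applies the standard Karlin-Altschul formula E = K * N * exp(-Lambda * S) to the numbers printed at the bottom of the quoted report. The function name expect_value is made up for illustration, and because the report prints Lambda and K rounded, the result only approximates the reported value.

```python
import math

def expect_value(raw_score, search_space, lam, k):
    """Karlin-Altschul expectation: E = K * N * exp(-lambda * S).

    raw_score    -- raw alignment score S (the number in parentheses
                    after the bit score, e.g. "28.2 bits (14)")
    search_space -- effective search space N from the report footer
    lam, k       -- the Lambda and K statistical parameters
    """
    return k * search_space * math.exp(-lam * raw_score)

# Values copied from the footer of the quoted BLAST report (rounded there).
LAMBDA = 1.37
K = 0.711
SEARCH_SPACE = 758966957

e_best = expect_value(14, SEARCH_SPACE, LAMBDA, K)  # the two top hits
e_weak = expect_value(13, SEARCH_SPACE, LAMBDA, K)  # the remaining four hits

print(f"raw score 14: E is about {e_best:.1f}")
print(f"raw score 13: E is about {e_weak:.1f}")
```

With these rounded parameters the top hits come out at roughly 2.5 and the weaker ones at roughly 10, in line with the 2.4 and 9.4 in the report. Either way the values are above 1, i.e. matches of this quality are expected by chance in a database of this size.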

If you want to show all blast hits, no matter how meaningless, just find the 
place where the script filters out blast scores by significance, and comment 
it out.
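
The effect of such a filter can be illustrated with a small sketch. This is Python, not the HOWTO's Perl, and the function name filter_hits and its max_evalue parameter are hypothetical; the hit list is simply the six accessions and e-values from the quoted report. Passing None as the cutoff plays the role of commenting the filter out.

```python
# (accession, e-value) pairs taken from the quoted BLAST report.
HITS = [
    ("AE000180", 2.4),
    ("AE000161", 2.4),
    ("AE000454", 9.4),
    ("AE000418", 9.4),
    ("AE000241", 9.4),
    ("AE000111", 9.4),
]

def filter_hits(hits, max_evalue=1e-20):
    """Keep only hits whose e-value is below max_evalue.

    Passing max_evalue=None disables the filter and keeps every hit,
    which corresponds to commenting out the significance test.
    """
    if max_evalue is None:
        return list(hits)
    return [(name, e) for name, e in hits if e < max_evalue]

strict = filter_hits(HITS)                       # cutoff of 1E-20
unfiltered = filter_hits(HITS, max_evalue=None)  # filter disabled

print(len(strict), "hits pass the 1E-20 cutoff")
print(len(unfiltered), "hits are kept with the filter disabled")
```

With the 1E-20 cutoff nothing from this report survives, which is why no hits are drawn; with the filter disabled all six are kept.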

Lincoln

PS: in the future, it would help to turn off word-wrap in your e-mail message, 
since both the script and the example blast file were wrapped, which made them 
hard to read.  There is also a formal bug report system for bioperl, accessed 
via the "Bugs" link on the home page.  I recommend you use it.

On Tuesday 17 December 2002 03:40 am, Soumyadeep nandi wrote:
> BLASTN 2.2.3 [Apr-24-2002]
>
>
> Reference: Altschul, Stephen F., Thomas L. Madden,
> Alejandro A. Schaffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
> Lipman (1997),
> "Gapped BLAST and PSI-BLAST: a new generation of
> protein database search
> programs",  Nucleic Acids Res. 25:3389-3402.
>
> Query= test query
>          (178 letters)
>
> Database: /home/soumya/Application/BLAST/data/ecoli.nt
>
>            400 sequences; 4,662,239 total letters
>
> Searching.done
>
>                                                      
>           Score    E
> Sequences producing significant alignments:          
>           (bits) Value
>
> gb|AE000180.1|AE000180 Escherichia coli K-12 MG1655
> section 70 o...    28   2.4  
> gb|AE000161.1|AE000161 Escherichia coli K-12 MG1655
> section 51 o...    28   2.4  
> gb|AE000454.1|AE000454 Escherichia coli K-12 MG1655
> section 344 ...    26   9.4  
> gb|AE000418.1|AE000418 Escherichia coli K-12 MG1655
> section 308 ...    26   9.4  
> gb|AE000241.1|AE000241 Escherichia coli K-12 MG1655
> section 131 ...    26   9.4
> gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655
> section 1 of...    26   9.4  
>
> >gb|AE000180.1|AE000180 Escherichia coli K-12 MG1655
>
> section 70 of 400 of the complete genome
>           Length = 11022
>
>  Score = 28.2 bits (14), Expect = 2.4
>  Identities = 14/14 (100%)
>  Strand = Plus / Minus
>
>                          
> Query: 94   tcatctgctcgcgt 107
>             ||||||||||||||
> Sbjct: 4297 tcatctgctcgcgt 4284
>
> >gb|AE000161.1|AE000161 Escherichia coli K-12 MG1655
>
> section 51 of 400 of the complete genome
>           Length = 16170
>
>  Score = 28.2 bits (14), Expect = 2.4
>  Identities = 14/14 (100%)
>  Strand = Plus / Plus
>
>
> Query: 126   tagctacgatagct 139
>              ||||||||||||||
> Sbjct: 11520 tagctacgatagct 11533
>
> >gb|AE000454.1|AE000454 Escherichia coli K-12 MG1655
>
> section 344 of 400 of the complete genome
>           Length = 12175
>
>  Score = 26.3 bits (13), Expect = 9.4
>  Identities = 13/13 (100%)
>  Strand = Plus / Minus
>
>                          
> Query: 153   catatccattagc 165
>              |||||||||||||
> Sbjct: 10774 catatccattagc 10762
>
> >gb|AE000418.1|AE000418 Escherichia coli K-12 MG1655
>
> section 308 of 400 of the complete
>            genome
>           Length = 10776
>
>  Score = 26.3 bits (13), Expect = 9.4
>  Identities = 13/13 (100%)
>  Strand = Plus / Minus
>
>                        
> Query: 74  tgatcagatgata 86
>            |||||||||||||
> Sbjct: 611 tgatcagatgata 599
>
> >gb|AE000241.1|AE000241 Escherichia coli K-12 MG1655
>
> section 131 of 400 of the complete
>             genome
>           Length = 10160
>
>  Score = 26.3 bits (13), Expect = 9.4
>  Identities = 13/13 (100%)
>  Strand = Plus / Minus
>
>                          
> Query: 96   atctgctcgcgta 108
>             |||||||||||||
> Sbjct: 1288 atctgctcgcgta 1276
>
> >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655
>
> section 1 of 400 of the complete genome
>           Length = 10596
>
>  Score = 26.3 bits (13), Expect = 9.4
>  Identities = 13/13 (100%)
>  Strand = Plus / Minus
>
>                          
> Query: 78   cagatgatattct 90
>             |||||||||||||
> Sbjct: 1974 cagatgatattct 1962
>
>
>   Database:
> /home/soumya/Application/BLAST/data/ecoli.nt
>     Posted date:  Oct 21, 2002  3:48 PM
>   Number of letters in database: 4,662,239
>   Number of sequences in database:  400
>  
> Lambda     K      H
>     1.37    0.711     1.31
>
> Gapped
> Lambda     K      H
>     1.37    0.711     1.31
>
>
> Matrix: blastn matrix:1 -3
> Gap Penalties: Existence: 5, Extension: 2
> Number of Hits to DB: 169
> Number of Sequences: 400
> Number of extensions: 169
> Number of successful extensions: 6
> Number of sequences better than 10.0: 6
> length of query: 178
> length of database: 4,662,239
> effective HSP length: 15
> effective length of query: 163
> effective length of database: 4,656,239
> effective search space: 758966957
> effective search space used: 758966957
> T: 0
> A: 40
> X1: 6 (11.9 bits)
> X2: 15 (29.7 bits)
> S1: 12 (24.3 bits)
> S2: 13 (26.3 bits)

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org			                  Cold Spring Harbor, NY
========================================================================