[Bioperl-l] BPpsilite possible bug?

David García Cortés davidg at lsi.upc.edu
Thu Jan 27 09:21:27 EST 2005


Hello.

I'm using BPpsilite to parse a PSI-BLAST results file, and I've noticed something strange that looks like a bug: it doesn't return the correct HSP length in some specific cases, while it works correctly in others.

I obtain the HSP length this way:

# Walk every subject (hit) of the iteration, then every HSP of that hit
while ( my $sbjct = $last_iteration->nextSbjct ) {
    while ( my $hsp = $sbjct->nextHSP ) {
        my $hlength = $hsp->length;
        print "$hlength\n";    # one HSP length per line
    }
}

This works fine in many cases, but not in others. When I parse result files in which more than one sequence produces significant alignments against the query sequence, everything works OK. But when only one sequence produces significant alignments, $hsp->length doesn't return the correct HSP size.

For example, when parsing the results file included at the end of this mail, the HSP lengths are wrong. Is this a bug, or am I doing something wrong?
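
One way to cross-check the parser without BioPerl is to read the alignment length straight from the report text: in BLAST output, the denominator of each "Identities = n/m" line is the number of aligned columns. The sketch below uses core Perl only. The helper name expected_hsp_lengths is hypothetical (it is not part of BioPerl), and it rests on the assumption that $hsp->length is meant to report this same alignment length.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect the alignment length
# from every "Identities = matches/length" line of a BLAST text report.
sub expected_hsp_lengths {
    my ($report_text) = @_;
    my @lengths;
    while ( $report_text =~ m{Identities\s*=\s*(\d+)/(\d+)}g ) {
        push @lengths, $2;    # denominator = number of aligned columns
    }
    return @lengths;
}

# The two score blocks copied from the report at the end of this mail.
my $sample = <<'END';
 Score =  710 bits (1832), Expect = 0.0
 Identities = 359/359 (100%), Positives = 359/359 (100%)
 Score =  758 bits (1956), Expect = 0.0
 Identities = 359/359 (100%), Positives = 359/359 (100%)
END

print join( ",", expected_hsp_lengths($sample) ), "\n";
```

For the report in this mail, that gives 359 for the single HSP in each of the two rounds, which can be compared with what the BPpsilite loop prints.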

Thanks in advance.

--
David García Cortés
Instituto Nacional de Bioinformática (INB)
Nodo Computacional GNHC-2 UPC-CIRI
c/. Jordi Girona 1-3              
Modul C6-E201                   Tel.  : 934 011 650
E-08034 Barcelona               Fax   : 934 017 014
Catalunya (Spain)               e-mail: davidg at lsi.upc.edu





RESULTS FILE: 

***********************

BLASTP 2.2.6 [Apr-09-2003]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= gi|18676612|dbj|BAB84958.1|
         (359 letters)

Database: nr-0.fa 
           175 sequences; 100,812 total letters

Searching.........done


Results from round 1


                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]                       710   0.0  

>dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]
          Length = 359

 Score =  710 bits (1832), Expect = 0.0
 Identities = 359/359 (100%), Positives = 359/359 (100%)

Query: 1   LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60
           LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD
Sbjct: 1   LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60

Query: 61  GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120
           GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL
Sbjct: 61  GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120

Query: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180
           PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS
Sbjct: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180

Query: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240
           DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL
Sbjct: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240

Query: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300
           PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD
Sbjct: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300

Query: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359
           PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS
Sbjct: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359


Searching.........done


Results from round 2


                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value
Sequences used in model and found again:

dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]                       758   0.0  

Sequences not found previously or not previously below threshold:


CONVERGED!
>dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]
          Length = 359

 Score =  758 bits (1956), Expect = 0.0
 Identities = 359/359 (100%), Positives = 359/359 (100%)

Query: 1   LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60
           LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD
Sbjct: 1   LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60

Query: 61  GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120
           GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL
Sbjct: 61  GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120

Query: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180
           PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS
Sbjct: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180

Query: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240
           DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL
Sbjct: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240

Query: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300
           PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD
Sbjct: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300

Query: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359
           PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS
Sbjct: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359


  Database: nr-0.fa
    Posted date:  Jan 13, 2005  6:32 PM
  Number of letters in database: 100,812
  Number of sequences in database:  175
  
Lambda     K      H
   0.320    0.139    0.427 

Lambda     K      H
   0.267   0.0424    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 174,609
Number of Sequences: 175
Number of extensions: 8789
Number of successful extensions: 19
Number of sequences better than  1.0: 1
Number of HSP's better than  1.0 without gapping: 2
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 17
Number of HSP's gapped (non-prelim): 2
length of query: 359
length of database: 100,812
effective HSP length: 69
effective length of query: 290
effective length of database: 88,737
effective search space: 25733730
effective search space used: 25733730
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)

*********************************************

