[Bioperl-l] BPpsilite possible bug?
David García Cortés
davidg at lsi.upc.edu
Thu Jan 27 09:21:27 EST 2005
Hello.
I'm using BPpsilite to parse a PSI-BLAST results file, and I've noticed something that looks like a bug: it fails to report the HSP length in certain cases, while it works correctly in others.
I obtain the HSP length this way:
    while ( my $sbjct = $last_iteration->nextSbjct ) {
        while ( my $hsp = $sbjct->nextHSP ) {
            my $hlength = $hsp->length;    # length reported for this HSP
            print "$hlength\n";            # one length per line
        }
    }
This works fine in many cases, but not in others. When the results file contains more than one sequence producing significant alignments against the query sequence, everything works OK. But when only one sequence produces significant alignments, $hsp->length doesn't return the HSP size correctly.
For example, when parsing the results file I include at the end of this mail, the HSP lengths are wrong. Is this a bug, or am I doing something wrong?
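One way to cross-check the value the parser returns is to count the aligned columns directly from the "Query:" rows of the report. The following is only a minimal sketch in plain Perl, not BioPerl code: hsp_lengths_from_report is a hypothetical helper name, and it assumes that every HSP starts with a "Score = ..." line and that each "Query:" row has the form "Query: start sequence end".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns one alignment length
# per HSP, counted from the "Query:" rows of a BLAST/PSI-BLAST text report.
sub hsp_lengths_from_report {
    my ($text) = @_;
    my @lengths;
    for my $line ( split /\n/, $text ) {
        if ( $line =~ /^\s*Score\s*=/ ) {
            # Assumption: each "Score = ..." line opens a new HSP.
            push @lengths, 0;
        }
        elsif ( @lengths && $line =~ /^\s*Query:\s*\d+\s+(\S+)\s+\d+\s*$/ ) {
            # Count the aligned columns on this row (residues and gap characters).
            $lengths[-1] += length $1;
        }
    }
    return @lengths;
}

# Usage: perl check_hsp_lengths.pl report.txt
if (@ARGV) {
    local $/;                  # slurp the whole file at once
    my $text = <> // '';
    my $n    = 0;
    printf "HSP %d: %d alignment columns\n", ++$n, $_
        for hsp_lengths_from_report($text);
}
```

Run against the results file at the end of this mail, this should count 359 columns for each of the two HSPs (one per round), in agreement with the "Identities = 359/359" lines, which gives a reference value to compare with what $hsp->length prints.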
Thanks in advance.
--
David García Cortés
Instituto Nacional de Bioinformática (INB)
Nodo Computacional GNHC-2 UPC-CIRI
c/. Jordi Girona 1-3
Modul C6-E201 Tel. : 934 011 650
E-08034 Barcelona Fax : 934 017 014
Catalunya (Spain) e-mail: davidg at lsi.upc.edu
RESULTS FILE:
***********************
BLASTP 2.2.6 [Apr-09-2003]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= gi|18676612|dbj|BAB84958.1|
(359 letters)
Database: nr-0.fa
175 sequences; 100,812 total letters
Searching.........done
Results from round 1
                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value
dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]                    710   0.0
>dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]
Length = 359
Score = 710 bits (1832), Expect = 0.0
Identities = 359/359 (100%), Positives = 359/359 (100%)
Query: 1 LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60
LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD
Sbjct: 1 LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60
Query: 61 GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120
GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL
Sbjct: 61 GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120
Query: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180
PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS
Sbjct: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180
Query: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240
DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL
Sbjct: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240
Query: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300
PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD
Sbjct: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300
Query: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359
PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS
Sbjct: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359
Searching.........done
Results from round 2
                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value
Sequences used in model and found again:
dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]                    758   0.0
Sequences not found previously or not previously below threshold:
CONVERGED!
>dbj|BAB84958.1| FLJ00205 protein [Homo sapiens]
Length = 359
Score = 758 bits (1956), Expect = 0.0
Identities = 359/359 (100%), Positives = 359/359 (100%)
Query: 1 LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60
LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD
Sbjct: 1 LLQAVALVLAALVLLPNVGLWALYRERQPDGTPGGSGAAVAPAAGQGSHSRQKKTFFLGD 60
Query: 61 GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120
GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL
Sbjct: 61 GQKLKDWHDKEAIRRDAQRVGNGEQGRPYPMTDAERVDQAYRENGFNIYVSDKISLNRSL 120
Query: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180
PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS
Sbjct: 121 PDIRHPNCNSKRYLETLPNTSIIIPFHNEGWSSLLRTVHSVLNRSPPELVAEIVLVDDFS 180
Query: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240
DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL
Sbjct: 181 DREHLKKPLEDYMALFPSVRILRTKKREGLIRTRMLGASVATGDVITFLDSHCEANVNWL 240
Query: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300
PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD
Sbjct: 241 PPLLDRIARNRKTIVCPMIDVIDHDDFRYETQAGDAMRGAFDWEMYYKRIPIPPELQKAD 300
Query: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359
PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS
Sbjct: 301 PSDPFESPVMAGGLFAVDRKWFWELGGYDPGLEIWGGEQYEISFKVSQLSRRPVLGTAS 359
Database: nr-0.fa
Posted date: Jan 13, 2005 6:32 PM
Number of letters in database: 100,812
Number of sequences in database: 175
Lambda K H
0.320 0.139 0.427
Lambda K H
0.267 0.0424 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 174,609
Number of Sequences: 175
Number of extensions: 8789
Number of successful extensions: 19
Number of sequences better than 1.0: 1
Number of HSP's better than 1.0 without gapping: 2
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 17
Number of HSP's gapped (non-prelim): 2
length of query: 359
length of database: 100,812
effective HSP length: 69
effective length of query: 290
effective length of database: 88,737
effective search space: 25733730
effective search space used: 25733730
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
*********************************************