[Bioperl-l] Possible bug in Bio::SearchIO
Charles Hauser
chauser@duke.edu
16 Sep 2002 13:24:10 -0400
Ewan,
I'm not sure whether this is a bug, intended behavior, or a misunderstanding on my part.
When Bio::SearchIO is used to parse BLAST results and a hit has 2 HSPs
(example below), the calls
$hsp->frac_identical()
$hsp->frac_conserved()
return the values for the 2nd HSP (Iden-> 40, Sim-> 64, in the case below).
>gi|15237321|ref|NP_197133.1| (NM_121634) acetolactate synthase-like protein [Arabidopsis
thaliana]
gi|9759111|dbj|BAB09596.1| (AB005242) acetolactate synthase-like protein [Arabidopsis
thaliana]
Length = 477
Score = 230 bits (586), Expect = 2e-59
Identities = 142/318 (44%), Positives = 195/318 (60%), Gaps = 6/318 (1%)
Frame = +1
Query: 4 RGIISIFVADEPGLINRVAGVFARRGANIESLAVGLTVDKALFTVVVAGKANVVANLVKQ 183
R IS+FV DE G+INR+AGVFARRG NIESLAVGL DKALFT+VV G V+ +V+Q
Sbjct: 76 RHTISVFVGDESGIINRIAGVFARRGYNIESLAVGLNEDKALFTIVVLGTDKVLQQVVEQ 135
Query: 184 LGKLVKVRYVEDITSTNRIEREMLLLKLRVPAGSTRAEVLELAAVFRARVVDVGDETLSL 363
L KLV V VED++ +ERE++L+KL STR+E++ L +FRA++VD +++L++
Sbjct: 136 LNKLVNVIKVEDLSKEPHVERELMLIKLNADP-STRSEIMWLVDIFRAKIVDTSEQSLTI 194
Query: 364 CVTGDPGKLTAMIKVMSKFGIEQLTRTXRICLRRGEALLERSAGIPEQIAVPLPEAAKVK 543
VTGDPGK+ A+ + KFGI+++ RT +I LRR + + +A A P K +
Sbjct: 195 EVTGDPGKMVALTTNLEKFGIKEIARTGKIALRREK--MGETAPFWRFSAASYPHLVK-E 251
Query: 544 AASSNGAPKAAAA-----GEERGADVYVVDD-ADLKGVWDVDNVLSPTYSASGAGALPAD 705
++ A K A G DVY V+ D K V D + +SG
Sbjct: 252 SSHETVAEKTKLALTGNGNASSGGDVYPVEPYNDFKPVLDAHWGMVYDEDSSG------- 304
Query: 706 FKPYTLSIEVQDVPGVLNQVTMVFSRRGYNVQSLAVGPSEREGLSRIVMVVPGKVSSPDG 885
+ +TLS+ V +VPGVLN +T SRRGYN+QSLAVGP+E+EGLSRI V+PG
Sbjct: 305 LRSHTLSLLVANVPGVLNLITGAISRRGYNIQSLAVGPAEKEGLSRITTVIPGT------ 358
Query: 886 SSGISPLLKQLSKLVFVQ 939
I L++QL KL+ +Q
Sbjct: 359 DENIDKLVRQLQKLIDLQ 376
Score = 110 bits (276), Expect = 2e-23
Identities = 61/153 (39%), Positives = 98/153 (63%), Gaps = 2/153 (1%)
Frame = +1
Query: 13 ISIFVADEPGLINRVAGVFARRGANIESLAVGLTVDKAL--FTVVVAGKANVVANLVKQL 186
+S+ VA+ PG++N + G +RRG NI+SLAVG + L T V+ G + LV+QL
Sbjct: 310 LSLLVANVPGVLNLITGAISRRGYNIQSLAVGPAEKEGLSRITTVIPGTDENIDKLVRQL 369
Query: 187 GKLVKVRYVEDITSTNRIEREMLLLKLRVPAGSTRAEVLELAAVFRARVVDVGDETLSLC 366
KL+ ++ +++IT ERE++L+K+ S R +VL++A VFRA+ +DV D T++L
Sbjct: 370 QKLIDLQEIQNITHMPFAERELMLIKVAADT-SARRDVLDIAQVFRAKAIDVSDHTITLE 428
Query: 367 VTGDPGKLTAMIKVMSKFGIEQLTRTXRICLRR 465
VTGD K++A+ + +GI ++ RT R+ L R
Sbjct: 429 VTGDLRKMSALQTQLEAYGICEVARTGRVALVR 461
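
For reference, here is a minimal sketch of the kind of loop that would print these values once per HSP rather than once per hit. It is not the script that produced the result described above. The file name report.blast is a hypothetical placeholder, and the sketch assumes the report is standard BLAST text output readable with -format => 'blast'.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical file name (an assumption); pass the real report as the first argument.
my $report = @ARGV ? $ARGV[0] : 'report.blast';

my $in = Bio::SearchIO->new(-format => 'blast', -file => $report);

while (my $result = $in->next_result) {
    while (my $hit = $result->next_hit) {
        my $n = 0;
        # next_hsp returns the hit's HSPs one at a time, so each HSP's
        # fractions are printed separately instead of once per hit.
        while (my $hsp = $hit->next_hsp) {
            $n++;
            printf "%s HSP %d: frac_identical=%.2f frac_conserved=%.2f\n",
                $hit->name, $n,
                $hsp->frac_identical, $hsp->frac_conserved;
        }
    }
}
```

If only one line is printed for a hit that has two HSPs in the report, that suggests the HSPs are not being kept separate during parsing rather than in the calling code.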