[Bioperl-l] Possible bug in Bio::SearchIO

Charles Hauser chauser@duke.edu
16 Sep 2002 13:24:10 -0400


Ewan,

I don't know if this is a bug, or by design, or my ignorance.  

When using Bio::SearchIO to parse BLAST results, if a hit has 2 HSPs
(example below), then
	$hsp->frac_identical()
	$hsp->frac_conserved()

return the values for the 2nd HSP (identity 40, similarity 64 in the case below).
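
As a cross-check that does not go through Bio::SearchIO at all, the per-HSP fractions can be recomputed directly from the two "Identities = ... Positives = ..." summary lines in the report below. This is only a minimal sketch in plain Perl: hsp_fractions() is a hypothetical helper written for this message, not a BioPerl function, and the two strings are copied by hand from the report.

```perl
use strict;
use warnings;

# Summary lines copied by hand from the two HSPs in the report below.
my @summary_lines = (
    'Identities = 142/318 (44%), Positives = 195/318 (60%), Gaps = 6/318 (1%)',
    'Identities = 61/153 (39%), Positives = 98/153 (63%), Gaps = 2/153 (1%)',
);

# Hypothetical helper (not part of BioPerl): given one summary line,
# return (identical fraction, conserved fraction) as numbers in 0..1.
sub hsp_fractions {
    my ($line) = @_;
    my ($ident, $ident_len) = $line =~ m{Identities\s*=\s*(\d+)/(\d+)}
        or die "no Identities field in: $line";
    my ($pos, $pos_len) = $line =~ m{Positives\s*=\s*(\d+)/(\d+)}
        or die "no Positives field in: $line";
    return ($ident / $ident_len, $pos / $pos_len);
}

# One [identical, conserved] pair per HSP, in report order.
my @fractions = map { [ hsp_fractions($_) ] } @summary_lines;

for my $i (0 .. $#fractions) {
    printf "HSP %d: identical %.1f%%, conserved %.1f%%\n",
        $i + 1, 100 * $fractions[$i][0], 100 * $fractions[$i][1];
}
```

Rounded to whole percentages, the second HSP comes out at 40 and 64, which are the values quoted above, while the first HSP should give roughly 45 and 61.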



>gi|15237321|ref|NP_197133.1| (NM_121634) acetolactate synthase-like protein [Arabidopsis
           thaliana]
 gi|9759111|dbj|BAB09596.1| (AB005242) acetolactate synthase-like protein [Arabidopsis
           thaliana]
          Length = 477

 Score =  230 bits (586), Expect = 2e-59
 Identities = 142/318 (44%), Positives = 195/318 (60%), Gaps = 6/318 (1%)
 Frame = +1

Query: 4   RGIISIFVADEPGLINRVAGVFARRGANIESLAVGLTVDKALFTVVVAGKANVVANLVKQ 183
           R  IS+FV DE G+INR+AGVFARRG NIESLAVGL  DKALFT+VV G   V+  +V+Q
Sbjct: 76  RHTISVFVGDESGIINRIAGVFARRGYNIESLAVGLNEDKALFTIVVLGTDKVLQQVVEQ 135

Query: 184 LGKLVKVRYVEDITSTNRIEREMLLLKLRVPAGSTRAEVLELAAVFRARVVDVGDETLSL 363
           L KLV V  VED++    +ERE++L+KL     STR+E++ L  +FRA++VD  +++L++
Sbjct: 136 LNKLVNVIKVEDLSKEPHVERELMLIKLNADP-STRSEIMWLVDIFRAKIVDTSEQSLTI 194

Query: 364 CVTGDPGKLTAMIKVMSKFGIEQLTRTXRICLRRGEALLERSAGIPEQIAVPLPEAAKVK 543
            VTGDPGK+ A+   + KFGI+++ RT +I LRR +  +  +A      A   P   K +
Sbjct: 195 EVTGDPGKMVALTTNLEKFGIKEIARTGKIALRREK--MGETAPFWRFSAASYPHLVK-E 251

Query: 544 AASSNGAPKAAAA-----GEERGADVYVVDD-ADLKGVWDVDNVLSPTYSASGAGALPAD 705
           ++    A K   A         G DVY V+   D K V D    +     +SG       
Sbjct: 252 SSHETVAEKTKLALTGNGNASSGGDVYPVEPYNDFKPVLDAHWGMVYDEDSSG------- 304

Query: 706 FKPYTLSIEVQDVPGVLNQVTMVFSRRGYNVQSLAVGPSEREGLSRIVMVVPGKVSSPDG 885
            + +TLS+ V +VPGVLN +T   SRRGYN+QSLAVGP+E+EGLSRI  V+PG       
Sbjct: 305 LRSHTLSLLVANVPGVLNLITGAISRRGYNIQSLAVGPAEKEGLSRITTVIPGT------ 358

Query: 886 SSGISPLLKQLSKLVFVQ 939
              I  L++QL KL+ +Q
Sbjct: 359 DENIDKLVRQLQKLIDLQ 376



 Score =  110 bits (276), Expect = 2e-23
 Identities = 61/153 (39%), Positives = 98/153 (63%), Gaps = 2/153 (1%)
 Frame = +1

Query: 13  ISIFVADEPGLINRVAGVFARRGANIESLAVGLTVDKAL--FTVVVAGKANVVANLVKQL 186
           +S+ VA+ PG++N + G  +RRG NI+SLAVG    + L   T V+ G    +  LV+QL
Sbjct: 310 LSLLVANVPGVLNLITGAISRRGYNIQSLAVGPAEKEGLSRITTVIPGTDENIDKLVRQL 369

Query: 187 GKLVKVRYVEDITSTNRIEREMLLLKLRVPAGSTRAEVLELAAVFRARVVDVGDETLSLC 366
            KL+ ++ +++IT     ERE++L+K+     S R +VL++A VFRA+ +DV D T++L 
Sbjct: 370 QKLIDLQEIQNITHMPFAERELMLIKVAADT-SARRDVLDIAQVFRAKAIDVSDHTITLE 428

Query: 367 VTGDPGKLTAMIKVMSKFGIEQLTRTXRICLRR 465
           VTGD  K++A+   +  +GI ++ RT R+ L R
Sbjct: 429 VTGDLRKMSALQTQLEAYGICEVARTGRVALVR 461