[Bioperl-l] bug in Bio::SearchIO?

Stephan Roessner stephan.roessner at gsf.de
Wed Sep 12 08:44:10 UTC 2007


Hi,

I am parsing BLASTN output with Bio::SearchIO and get an error for some 
of the hits when retrieving the start and/or end position with 
$hit->start('sbjct') and $hit->end('sbjct'). I want to keep only hits that 
are roughly as long as the query sequence (length ratio above ~0.9). 
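
A minimal sketch of such a length filter, in plain Perl and independent of BioPerl. The helper name covers_query, the reading of "~ > 0.9" as a ratio of aligned span to query length, and the example numbers are all assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true when the span start..end covers at least
# $min_ratio of the query length. Name and threshold are illustrative.
sub covers_query {
    my ($start, $end, $query_length, $min_ratio) = @_;
    return 0 unless $query_length > 0;
    my $span = abs($end - $start) + 1;    # coordinates are inclusive
    return $span / $query_length >= $min_ratio ? 1 : 0;
}

# Made-up numbers: a span of 950 bases against a 1000-base query
print covers_query(101, 1050, 1000, 0.9) ? "keep\n" : "skip\n";
```

The start and end passed in would come from the hit coordinates, which is why the filter depends on start() and end() working.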

SearchIO retrieves the right results but throws an exception, in this 
case: MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 .....

It seems to me that the valid range is parsed incorrectly. Is this a bug?

Does anybody have a similar problem?

See the code, error message, and BLASTN output below.

Thanks,
Stephan


Stephan Roessner
MIPS/IBI Inst. for Bioinformatics
GSF Research Center for Environment and Health
Ingolstädter Landstr. 1
85764 Neuherberg; Germany
phone: +49 (0)89 3187 3583
fax:       +49 (0)89 3187 3585
email: stephan.roessner at gsf.de


Here is the piece of code I am using:

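use Bio::SearchIO;

# $source (path to the BLAST report) is set earlier in the script
my $blast_report = Bio::SearchIO->new(-format => 'blast',
                                      -file   => $source);

while (my $result = $blast_report->next_result) {
    while (my $hit = $result->next_hit) {
        print "Name: ", $hit->name, "\n";
        # per the stack trace below, start() calls tile_hsps, where the exception is raised
        print "S: ", $hit->start('sbjct'), "\n";
        print "E: ", $hit->end('sbjct'), "\n";
        print "L: ", $hit->length, "\n";
    }
}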


Here's the message: 

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/HSP/HSPI.pm:691
STACK: Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489
STACK: Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:206
STACK: Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/Hit/GenericHit.pm:935
STACK: main::parse /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:82
STACK: /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:51
-----------------------------------------------------------

S: 635
E: 790
L: 2052
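
As a cross-check, the S and E values above agree with the smallest and largest Sbjct coordinates over the four HSPs in the report that follows. A small sketch in plain Perl, with the coordinates copied by hand from that report (the variable names are illustrative and this is not how SearchIO computes the values internally):

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Sbjct coordinates of the four HSPs as printed in the BLASTN report.
# Minus-strand HSPs list the larger coordinate first.
my @sbjct_ranges = ([635, 758], [760, 693], [790, 697], [701, 790]);

# Flatten both ends of every range, so strand orientation does not matter
my $start = min(map { @$_ } @sbjct_ranges);
my $end   = max(map { @$_ } @sbjct_ranges);

print "S: $start\n";    # prints "S: 635"
print "E: $end\n";      # prints "E: 790"
```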

This is the BLASTN output I am parsing:

>LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0
            21623485-21621434 BestGuessTranscript
          Length = 2052

 Score = 95.6 bits (48), Expect = 1e-17
 Identities = 106/124 (85%), Gaps = 1/124 (0%)
 Strand = Plus / Plus

                                                                        
Query: 3191 tattaagcataattaatgtatcattagcacatgtagg-ttactgtagcatttaaggctaa 3249
            |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| ||||||
Sbjct: 635  tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694

                                                                        
Query: 3250 tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309
            |||| || ||| |||||| |||||| || ||||||||||||||  ||||| ||| |||||
Sbjct: 695  tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754

                
Query: 3310 gttt 3313
            ||||
Sbjct: 755  gttt 758



 Score = 48.1 bits (24), Expect = 0.002
 Identities = 57/68 (83%)
 Strand = Plus / Minus

                                                                        
Query: 2253 aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312
            ||||||||||| |||  ||| | || | ||||||||||||||||||| ||| || ||| |
Sbjct: 760  aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701

                    
Query: 2313 ccatgatt 2320
            ||||||||
Sbjct: 700  ccatgatt 693



 Score = 44.1 bits (22), Expect = 0.038
 Identities = 76/94 (80%)
 Strand = Plus / Minus

                                                                        
Query: 1539 atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598
            ||||||| || |||||||||| |  |||  ||||||||||||||| |||||   |||| |
Sbjct: 790  atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731

                                              
Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632
              ||||| |||| ||||||||||| |||| ||||
Sbjct: 730  cgcgagatgaatcttttgagtctatttagtccat 697



 Score = 44.1 bits (22), Expect = 0.038
 Identities = 73/90 (81%)
 Strand = Plus / Plus

                                                                        
Query: 2026 actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085
            ||||| |||| | ||||||||| |||||  |||| ||  ||||| ||| |||||||||||
Sbjct: 701  actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760

                                          
Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115
              |||  | ||||||||||||| |||||||
Sbjct: 761  cattttatttatatttaatgctccatgcat 790







