[Bioperl-l] Bio::SearchIO blast_parsing dot_at_the_end_of query_accession

Ralf Schmid ralf at bch.ed.ac.uk
Wed Mar 17 15:06:58 EST 2004


Hi,

I recently upgraded bioperl from 1.21 to 1.4, and this has broken one of my
blast parsing scripts. Running the following snippet on a blast output (3
input sequences, -b 2 option to retrieve only two alignments, otherwise
standard settings) gives different results under the two versions:

=> code

#!/usr/bin/perl -w
use strict;
use Bio::SearchIO;

# Open the blast report for parsing
my $in = Bio::SearchIO->new( -format => 'blast',
                             -file   => "test.out" );
my $prot = '';
while ( my $result = $in->next_result ) {
  while ( my $hit = $result->next_hit ) {
    # Print the query accession once per hit
    $prot = $result->query_accession;
    print "$prot\n";
  }
}


=> output bioperl 1.21:

ZPP00163
ZPP00163
ZPP00036_1
ZPP00036_1
ZPP00157
ZPP00157

- query_accession is retrieved for each hit that has an alignment

=> output bioperl 1.4:

ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
...

- query_accession is retrieved for each hit, regardless of whether there is an
alignment or not
- each query_accession ends with a "."


Until now I have relied on the blast -b option to limit the number of hits
parsed by bioperl, but I can see the rationale for changing bioperl from
parsing only hits that have an alignment to parsing every hit.

Looking at the diff between blast.pm 1.42.2.8 and blast.pm 1.76, the helpful
comment at line 769 makes me believe that this is where the change in parsing
was introduced, but I couldn't spot any reason for the "." at the end of each
query_accession. I'm not sure whether the two are related at all.

<SNIP>
# This is for the case when we specify -b 0 (or B=0 for WU-BLAST)
# and still want to construct minimal Hit objects
while(my $v = shift @hit_signifs) {
    next unless defined $v;
    $self->start_element({ 'Name' => 'Hit'});
...
<SNIP>

So far I'm working around the "dot" issue with an s/\.$//, but ...
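
A minimal sketch of that workaround, assuming the only problem is a single
trailing "." on the accession. clean_accession is a hypothetical helper name
used here for illustration and is not part of bioperl; the sample strings are
taken from the output above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a bioperl function): remove one trailing "."
# from an accession string and leave everything else untouched.
sub clean_accession {
    my ($acc) = @_;
    return $acc unless defined $acc;
    (my $clean = $acc) =~ s/\.$//;    # same substitution as the workaround
    return $clean;
}

# Accessions with and without the trailing dot
for my $acc ('ZPP00163.', 'ZPP00036_1.', 'ZPP00157') {
    print clean_accession($acc), "\n";
}
```

In the parsing loop this would be applied to the value returned by
$result->query_accession before printing it. It hides the symptom only and
does not explain where the "." comes from.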


Cheers,

Ralf






------------------------------------------------------------------------------ 
Dr. Ralf Schmid
Nematode Bioinformatics
Blaxter Nematode Genomics Group
Institute of Cell, Animal and Population Biology
Ashworth Labs						
University of Edinburgh				
King's Buildings					
Edinburgh			
EH9 3JT	 			
UK					

(+44)(0)131 650 7403






