[Bioperl-l] Bio::SearchIO blast_parsing dot_at_the_end_of query_accession
Ralf Schmid
ralf at bch.ed.ac.uk
Wed Mar 17 15:06:58 EST 2004
Hi,
I have recently updated bioperl from 1.21 to 1.4, and this has broken one of my
blast parsing scripts. Running the following snippet on the same blast output (3
input sequences, -b 2 option to retrieve only two alignments, otherwise
standard) gives different results under the two versions:
=> code
#!/usr/bin/perl -w
use strict;
use Bio::SearchIO;

# Open the BLAST report for parsing
my $in = Bio::SearchIO->new( -format => 'blast',
                             -file   => "test.out" );
my $prot = '';
while ( my $result = $in->next_result ) {
    while ( my $hit = $result->next_hit ) {
        # Print the query accession once per hit
        $prot = $result->query_accession;
        print "$prot\n";
    }
}
=> output bioperl 1.21:
ZPP00163
ZPP00163
ZPP00036_1
ZPP00036_1
ZPP00157
ZPP00157
- query accession is retrieved for each hit where there is an alignment
=> output bioperl 1.4:
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
ZPP00163.
...
- query_accession is retrieved for each hit regardless of whether there is an
alignment or not
- each query_accession ends with a "."
So far I have taken advantage of the blast -b option to set the number of hits
to be parsed by bioperl, but I can see the rationale for changing bioperl from
parsing only hits that have an alignment to parsing every hit.
Looking at the diff between blast.pm 1.42.2.8 and blast.pm 1.76, the helpful
comment at line 769 makes me believe that this is where the change in parsing
is coded, but I couldn't spot any reason for the "." at the end of each
query_accession. I'm not sure whether the two are related anyway.
<SNIP>
# This is for the case when we specify -b 0 (or B=0 for WU-BLAST)
# and still want to construct minimal Hit objects
while(my $v = shift @hit_signifs) {
    next unless defined $v;
    $self->start_element({ 'Name' => 'Hit'});
    ...
<SNIP>
So far I'm working around the "dot" issue with an s/\.$// , but ...
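For reference, the workaround amounts to something like the minimal sketch below. The helper name strip_trailing_dot is only illustrative (it is not part of bioperl), and the sample accessions are taken from the output above; it assumes the accession is a plain string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl): remove a single trailing "."
# from an accession string and leave everything else untouched.
sub strip_trailing_dot {
    my ($accession) = @_;
    return $accession unless defined $accession;
    ( my $clean = $accession ) =~ s/\.$//;
    return $clean;
}

# Sample values taken from the output above
for my $acc ( 'ZPP00163.', 'ZPP00036_1' ) {
    print strip_trailing_dot($acc), "\n";
}
```

In the parsing loop this would be applied to the value returned by $result->query_accession, so accessions without the dot (as in 1.21) pass through unchanged.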
Cheers,
Ralf
------------------------------------------------------------------------------
Dr. Ralf Schmid
Nematode Bioinformatics
Blaxter Nematode Genomics Group
Institute of Cell, Animal and Population Biology
Ashworth Labs
University of Edinburgh
King's Buildings
Edinburgh
EH9 3JT
UK
(+44)(0)131 650 7403