[Bioperl-l] Bio::SearchIO blast_parsing dot_at_the_end_of query_accession

Jason Stajich jason at cgt.duhs.duke.edu
Wed Mar 17 15:36:13 EST 2004


Ralf, can you send an example report? If I use your code with the
t/data/ecolitst.bls report from the bioperl distro, I get this:
jason at jason $ perl ralf_bug.pl
AAC73113.1
AAC73113.1
AAC73113.1
AAC73113.1

Also see my comment inline below.

-j
On Wed, 17 Mar 2004, Ralf Schmid wrote:

>
> Hi,
>
> I have recently updated bioperl from 1.21 to 1.4 and this has broken one of my
> blast parsing scripts. Using the following snippet of code on a blast output (3
> input sequences, -b 2 option for retrieving only two alignments, otherwise
> standard) gives different results:
>
> => code
>
> #!/usr/bin/perl -w
> use strict;
> use Bio::SearchIO;
> my $in = new Bio::SearchIO( -format => 'blast',
>                             -file   => "test.out");
> my $prot = '';
> while( my $result = $in->next_result )  {
>   while (my $hit = $result->next_hit) {
>     $prot=$result->query_accession;
>     print"$prot\n";
>   }
> }
>
>
> => output bioperl 1.21:
>
> ZPP00163
> ZPP00163
> ZPP00036_1
> ZPP00036_1
> ZPP00157
> ZPP00157
>
> - query accession is retrieved for each hit where there is an alignment
>
> => output bioperl 1.4:
>
> ZPP00163.
> ZPP00163.
> ZPP00163.
> ZPP00163.
> ZPP00163.
> ZPP00163.
> ZPP00163.
> ...
>
> - query_accession is retrieved for each hit regardless of whether there is an
> alignment or not
> - each query_accession ends with a "."
>
>
> So far I have taken advantage of the blast -b option to set the number of hits
> to be parsed by bioperl, but I can see the rationale for changing bioperl from
> parsing only hits that have an alignment to parsing every hit.
>
This was a requested feature.  You can add a line of code that exits the
hit loop as soon as a hit doesn't have any HSPs:

  last if $hit->num_hsps == 0;
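
A minimal sketch of how that check could sit in the loop from the original
script. It assumes that hits without alignments are listed after the hits that
have them, so leaving the loop at the first HSP-less hit is safe. The sub name
hits_with_alignments is made up for illustration and is not part of bioperl,
and Bio::SearchIO is only loaded when a report file is given on the command
line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect the query accession
# once for every hit that carries at least one HSP.
# $in is expected to behave like a Bio::SearchIO stream.
sub hits_with_alignments {
    my ($in) = @_;
    my @accessions;
    while ( my $result = $in->next_result ) {
        while ( my $hit = $result->next_hit ) {
            # Assumption: hits without alignments follow the aligned ones,
            # so the first hit with no HSPs ends the useful part of the list.
            last if $hit->num_hsps == 0;
            push @accessions, $result->query_accession;
        }
    }
    return @accessions;
}

# Only touch BioPerl when a report file is supplied, e.g.
#   perl script.pl test.out
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    print "$_\n" for hits_with_alignments($in);
}
```

If hits without alignments could be interleaved with aligned ones in your
reports, using next in place of last would be the more cautious choice.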

> Looking at the diff between blast.pm 1.42.2.8 and blast.pm 1.76, the helpful
> comment at line 769 makes me believe that this is where the change in parsing
> is coded, but I couldn't spot any reason for the "." at the end of each
> query_accession. I am not sure whether the two are related anyway.
>
> <SNIP>
> # This is for the case when we specify -b 0 (or B=0 for WU-BLAST)
> # and still want to construct minimal Hit objects
> while(my $v = shift @hit_signifs) {
>     next unless defined $v;
>     $self->start_element({ 'Name' => 'Hit'});
>     ...
> <SNIP>
>
> So far I'm fixing the "dot" issue by an s/\.$// , but ...
>
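
Until an example report turns up that reproduces the trailing dot, the
s/\.$// workaround could be wrapped in a small helper so that it is applied in
one place only. This is just a sketch; strip_trailing_dot is a made-up name,
not a bioperl method, and it does not address why the dot appears.

```perl
use strict;
use warnings;

# Hypothetical helper: drop a single trailing dot from an accession string.
# Dots inside the string (e.g. a version suffix such as ".1") are kept.
sub strip_trailing_dot {
    my ($acc) = @_;
    return $acc unless defined $acc;
    ( my $clean = $acc ) =~ s/\.$//;
    return $clean;
}

print strip_trailing_dot('ZPP00163.'),  "\n";   # prints ZPP00163
print strip_trailing_dot('AAC73113.1'), "\n";   # prints AAC73113.1
```

In the original loop this would be used as
$prot = strip_trailing_dot($result->query_accession);
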
>
> Cheers,
>
> Ralf
>
> ------------------------------------------------------------------------------
> Dr. Ralf Schmid
> Nematode Bioinformatics
> Blaxter Nematode Genomics Group
> Institute of Cell, Animal and Population Biology
> Ashworth Labs
> University of Edinburgh
> King's Buildings
> Edinburgh
> EH9 3JT
> UK
>
> (+44)(0)131 650 7403
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu