[Bioperl-l] Bio::SearchIO blast_parsing dot_at_the_end_of query_accession

Jason Stajich jason at cgt.duhs.duke.edu
Wed Mar 17 16:28:13 EST 2004


Agreed.  The query_accession is a special method which will report the
accession if it was properly parsed out.

However, I think the problem has been fixed since 1.4: Hilmar made the
change shown below.  The 1.4.0 release ships revision 1.72 of
SearchIO/blast.pm, which has the bug you are reporting, so upgrading your
copy to revision 1.76 from CVS should, I hope, take care of the problem.

On about line 515:

  my ($acc,$version) = &_get_accession_version($nm);
+ $version = defined($version) ? ".$version" : "";
  $acc = '' unless defined($acc);
  $self->element({ 'Name' => 'BlastOutput_query-acc',
                   'Data' => "$acc$version"});
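
To see why the added line matters, here is a minimal standalone sketch of
the same idea.  It assumes the trailing "." came from the dot being added
even when no version was parsed, which is consistent with the output
reported in this thread but is not confirmed here.  The names
split_accession_version and format_query_acc are hypothetical stand-ins
written for illustration; they are not BioPerl functions, and the real
_get_accession_version may split names differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's private _get_accession_version();
# this regex is an assumption made for illustration only.
sub split_accession_version {
    my ($name) = @_;
    return unless defined $name;    # empty list, so both values stay undef
    my ($acc, $version) = $name =~ /^([^.\s]+)(?:\.(\d+))?$/;
    return ($acc, $version);
}

# Mirrors the patched lines: the dot is prefixed only when a version exists.
sub format_query_acc {
    my ($name) = @_;
    my ($acc, $version) = split_accession_version($name);
    $version = defined($version) ? ".$version" : "";
    $acc = '' unless defined($acc);
    return "$acc$version";
}

print format_query_acc($_), "\n" for qw(AAC73113.1 ZPP00163 ZPP00036_1);
```

With this guard, a versioned name such as AAC73113.1 keeps its ".1", while
an unversioned name such as ZPP00163 comes back without a trailing dot.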


-jason
On Wed, 17 Mar 2004, James Wasmuth wrote:

> Just noticed that what was required was $result->query_name rather than
> query_accession.  Still, to me, it's curious that the latter method
> should append the '.'.
>
>
> -james
>
>
> Jason Stajich wrote:
>
> >ralf - can you send an example report? If I use your code and the
> >t/data/ecolitst.bls report in the bioperl distro I get this:
> >jason at jason $ perl ralf_bug.pl
> >AAC73113.1
> >AAC73113.1
> >AAC73113.1
> >AAC73113.1
> >
> >also see below.
> >
> >-j
> >On Wed, 17 Mar 2004, Ralf Schmid wrote:
> >
> >
> >
> >>Hi,
> >>
> >>I have recently updated bioperl from 1.21 to 1.4 and this has broken one of my
> >>blast parsing scripts. Using the following snippet of code on a blast output (3
> >>input sequences, -b 2 option for retrieving only two alignments, otherwise
> >>standard) gives different results:
> >>
> >>=> code
> >>
> >>#!/usr/bin/perl -w
> >>use strict;
> >>use Bio::SearchIO;
> >>my $in = Bio::SearchIO->new( -format => 'blast',
> >>                             -file   => "test.out");
> >>my $prot = '';
> >>while( my $result = $in->next_result )  {
> >>  while (my $hit = $result->next_hit) {
> >>    $prot=$result->query_accession;
> >>    print "$prot\n";
> >>  }
> >>}
> >>
> >>
> >>=> output bioperl 1.21:
> >>
> >>ZPP00163
> >>ZPP00163
> >>ZPP00036_1
> >>ZPP00036_1
> >>ZPP00157
> >>ZPP00157
> >>
> >>- query accession is retrieved for each hit where there is an alignment
> >>
> >>=> output bioperl 1.4:
> >>
> >>ZPP00163.
> >>ZPP00163.
> >>ZPP00163.
> >>ZPP00163.
> >>ZPP00163.
> >>ZPP00163.
> >>ZPP00163.
> >>...
> >>
> >>- query_accession is retrieved for each hit regardless whether there is an
> >>alignment or not
> >>- each query_accession ends with a "."
> >>
> >>
> >>So far I have taken advantage of the blast -b option to set the number of hits
> >>to be parsed by bioperl, but I can see the rationale for changing bioperl from
> >>parsing every hit that has an alignment to parsing every hit.
> >>
> >>
> >>
> >This was a requested feature.  You can add a little code which exits the
> >hit loop if the hit doesn't have any hsps:
> >last if $hit->num_hsps == 0;
> >
> >
> >
> >>Looking at the diff between blast.pm 1.42.2.8 and blast.pm 1.76 and finding the
> >>helpful comment in line 769 makes me believe that this is where the change in
> >>parsing is coded, but I couldn't spot any reason for the "." at the end of each
> >>query_accession. Not sure whether the two are related anyway.
> >>
> >><SNIP>
> >># This is for the case when we specify -b 0 (or B=0 for WU-BLAST)
> >># and still want to construct minimal Hit objects
> >>while(my $v = shift @hit_signifs) {
> >>next unless defined $v;
> >>$self->start_element({ 'Name' => 'Hit'});
> >>...
> >><SNIP>
> >>
> >>So far I'm fixing the "dot" issue by an s/\.$// , but ...
> >>
> >>
> >>Cheers,
> >>
> >>Ralf
> >>
> >>
> >>
> >>
> >>
> >>
> >>------------------------------------------------------------------------------
> >>Dr. Ralf Schmid
> >>Nematode Bioinformatics
> >>Blaxter Nematode Genomics Group
> >>Institute of Cell, Animal and Population Biology
> >>Ashworth Labs
> >>University of Edinburgh
> >>King's Buildings
> >>Edinburgh
> >>EH9 3JT
> >>UK
> >>
> >>(+44)(0)131 650 7403
> >>
> >>
> >>
> >>
> >>
> >>_______________________________________________
> >>Bioperl-l mailing list
> >>Bioperl-l at portal.open-bio.org
> >>http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >>
> >
> >--
> >Jason Stajich
> >Duke University
> >jason at cgt.mc.duke.edu
> >
> >
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

