[Bioperl-l] Bio::SearchIO blast_parsing dot_at_the_end_of query_accession
James Wasmuth
james.wasmuth at ed.ac.uk
Wed Mar 17 16:11:26 EST 2004
Just noticed that what was required was $result->query_name rather than
query_accession. Still, it seems curious to me that the latter method
should append the '.'.
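
For anyone hitting the same thing, here is a minimal sketch of how the query ID could be fetched. It assumes query_name is filled in for the report and only falls back to query_accession (with a trailing '.' stripped) when it is not. The helper name query_id is made up for illustration and is not part of BioPerl; the report file name is taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): prefer query_name, and fall
# back to query_accession with a trailing '.' removed.
sub query_id {
    my ($result) = @_;
    my $id = $result->query_name;
    return $id if defined $id && length $id;

    $id = $result->query_accession;
    return unless defined $id;
    $id =~ s/\.$//;    # same workaround as the s/\.$// quoted below
    return $id;
}

# Only runs when a BLAST report is given on the command line,
# and needs BioPerl installed.
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast',
                                 -file   => $ARGV[0] );
    while ( my $result = $in->next_result ) {
        print query_id($result), "\n";
    }
}
```

The helper only relies on the two accessor methods, so it should work with any result object that provides them.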
-james
Jason Stajich wrote:
>Ralf - can you send an example report? If I use your code and the
>t/data/ecolitst.bls report in the bioperl distro I get this:
>jason at jason $ perl ralf_bug.pl
>AAC73113.1
>AAC73113.1
>AAC73113.1
>AAC73113.1
>
>also see below.
>
>-j
>On Wed, 17 Mar 2004, Ralf Schmid wrote:
>
>
>
>>Hi,
>>
>>I have recently updated bioperl from 1.21 to 1.4 and this has broken one of my
>>blast parsing scripts. Using the following snippet of code on a blast output (3
>>input sequences, -b 2 option for retrieving only two alignments, otherwise
>>standard) gives different results:
>>
>>=> code
>>
>>#!/usr/bin/perl -w
>>use strict;
>>use Bio::SearchIO;
>>my $in = new Bio::SearchIO( -format => 'blast',
>>                            -file   => "test.out");
>>my $prot = '';
>>while( my $result = $in->next_result ) {
>>    while (my $hit = $result->next_hit) {
>>        $prot = $result->query_accession;
>>        print "$prot\n";
>>    }
>>}
>>
>>
>>=> output bioperl 1.21:
>>
>>ZPP00163
>>ZPP00163
>>ZPP00036_1
>>ZPP00036_1
>>ZPP00157
>>ZPP00157
>>
>>- query_accession is retrieved for each hit where there is an alignment
>>
>>=> output bioperl 1.4:
>>
>>ZPP00163.
>>ZPP00163.
>>ZPP00163.
>>ZPP00163.
>>ZPP00163.
>>ZPP00163.
>>ZPP00163.
>>...
>>
>>- query_accession is retrieved for each hit regardless of whether there is an
>>alignment or not
>>- each query_accession ends with a "."
>>
>>
>>So far I have taken advantage of the blast -b option to set the number of hits
>>to be parsed by bioperl, but I can see the rationale for changing bioperl from
>>parsing every hit that has an alignment to parsing every hit.
>>
>>
>>
>This was a requested feature. You can add a little code which exits the
>hit loop if the hit doesn't have any HSPs:
>last if $hit->num_hsps == 0;
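
A rough sketch of how that guard might sit inside the loop from the original script. The helper name hits_with_alignments is hypothetical (not a BioPerl function), and the report file name is taken from the command line. It uses 'next' instead of 'last' so that it does not depend on the alignment-free hits coming at the end of the hit list; if they always do, the one-liner above is enough.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the names of the hits
# that have at least one HSP, i.e. the hits reported with an alignment.
sub hits_with_alignments {
    my ($result) = @_;
    my @names;
    while ( my $hit = $result->next_hit ) {
        # Skip the minimal Hit objects built from the summary table only.
        next if $hit->num_hsps == 0;
        push @names, $hit->name;
    }
    return @names;
}

# Only runs when a BLAST report is given on the command line,
# and needs BioPerl installed.
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new( -format => 'blast',
                                 -file   => $ARGV[0] );
    while ( my $result = $in->next_result ) {
        print join( "\t", $result->query_name, $_ ), "\n"
            for hits_with_alignments($result);
    }
}
```

With the -b 2 report described below, this should bring the output back to the aligned hits for each query, as under the older release.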
>
>
>
>>Looking at the diff between blast.pm 1.42.2.8 and blast.pm 1.76 and finding the
>>helpful comment in line 769 makes me believe that this is where the change in
>>parsing is coded, but I couldn't spot any reason for the "." at the end of each
>>query_accession. Not sure whether the two are related anyway.
>>
>><SNIP>
>># This is for the case when we specify -b 0 (or B=0 for WU-BLAST)
>># and still want to construct minimal Hit objects
>>while(my $v = shift @hit_signifs) {
>>    next unless defined $v;
>>    $self->start_element({ 'Name' => 'Hit'});
>>    ...
>><SNIP>
>>
>>So far I'm fixing the "dot" issue by an s/\.$// , but ...
>>
>>
>>Cheers,
>>
>>Ralf
>>
>>
>>
>>
>>
>>
>>------------------------------------------------------------------------------
>>Dr. Ralf Schmid
>>Nematode Bioinformatics
>>Blaxter Nematode Genomics Group
>>Institute of Cell, Animal and Population Biology
>>Ashworth Labs
>>University of Edinburgh
>>King's Buildings
>>Edinburgh
>>EH9 3JT
>>UK
>>
>>(+44)(0)131 650 7403
>>
>>
>>
>>
>>
>>_______________________________________________
>>Bioperl-l mailing list
>>Bioperl-l at portal.open-bio.org
>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>
>--
>Jason Stajich
>Duke University
>jason at cgt.mc.duke.edu