[Bioperl-l] What does Expect(2) mean in a blast result?

Chris Fields cjfields at uiuc.edu
Tue Nov 13 17:42:07 UTC 2007


Amir,

Can you file this as a bug?  Dave mentioned he would look into it but  
I think it warrants tracking to make sure it gets fixed:

http://www.bioperl.org/wiki/Bugs

Attach the example BLAST report from your last post to the bug report.
BTW, I wonder how this appears in XML output?
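
In the meantime, one possible workaround is to pull the N straight out
of the raw text report. This is only a rough sketch, not how the Bioperl
parser does it: it assumes the HSP header lines look like
"Score = ..., Expect(2) = ...", and expect_n is a hypothetical helper
name, not part of Bioperl.

```perl
use strict;
use warnings;

# expect_n() is a hypothetical helper, not part of Bioperl.
# Assumes text-format HSP header lines such as
#   " Score = 36.2 bits (82), Expect(2) = 2e-05"
# Returns the N from "Expect(N)", or undef for a plain "Expect =".
sub expect_n {
    my ($line) = @_;
    return $line =~ /Expect\((\d+)\)\s*=/ ? $1 : undef;
}

# Made-up sample lines, for illustration only.
for my $line (' Score = 36.2 bits (82), Expect(2) = 2e-05',
              ' Score = 51.2 bits (121), Expect = 3e-10') {
    my $n = expect_n($line);
    print defined $n ? "N=$n\n" : "no N\n";
}
```

Returning undef rather than a default leaves it to the caller to decide
what a missing N should mean.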

chris

On Nov 13, 2007, at 11:30 AM, Amir Karger wrote:

>> From: trutane at gmail.com [mailto:trutane at gmail.com] On Behalf
>> Of Steve Chervitz
>>
>> The Bioperl blast parser should extract that value and you can obtain
>> it from an HSP object, via the HSPI::n() method, documented here:
>>
>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Search/HSP/HSPI.html#POD23
>
> As I mentioned in my email:
>
> And does anyone know off-hand if Bioperl will tell me when situations
> like this happen? I thought the Bio::Search::HSP::BlastHSP::n subroutine
> would help, but I just get a bunch of empty strings for that, whether or
> not there's a (2) in the Expect string. (hsp->n is empty, hsp->{"_n"} is
> undef.)
>
> And the docs for n() actually say, "This value is not defined with NCBI
> Blast2 with gapping" although they don't say why. Which may explain why,
> when I ran the following code on the blast result I included in my last
> email, I got empty values for all of the n's. (Why is n() undefined for
> gapped blast if I'm getting n's in my results from that blast?)
>
> use warnings;
> use strict;
> use Bio::SearchIO;
>
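> # Path to the BLAST report is the first command-line argument.
> my $blast_out = $ARGV[0];
> my $in = Bio::SearchIO->new(-format      => 'blast',
>                             -file        => $blast_out,
>                             -report_type => 'tblastn');
>
> # One tab-separated row per HSP; the N column is the value in question.
> print join("\t", qw(Qname Qstart Qend Strand Sname Sstart Send Frame N Evalue)), "\n";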
> while(my $query = $in->next_result) {
>     while(my $subject = $query->next_hit) {
>         while (my $hsp = $subject->next_hsp) {
>             print join("\t",
>                 $query->query_name,
>                 $hsp->start("query"),
>                 $hsp->end("query"),
>                 $hsp->strand("hit"),
>                 $subject->name,
>                 $hsp->start("hit"),
>                 $hsp->end("hit"),
>                 $subject->frame,
>                 $hsp->n,
>                 $hsp->evalue,
>             ),"\n";
>         }
>     }
> }
>
>> Dave's basically correct in his explanation. It's a result of the
>> application of sum statistics by the blast algorithm. You can read all
>> about it in Korf et al.'s BLAST book. Here's the relevant section:
>
> [snip]
>
> Thanks,
>
> -Amir

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





