[Bioperl-l] What does Expect(2) mean in a blast result?
Amir Karger
akarger at CGR.Harvard.edu
Mon Nov 19 15:38:26 UTC 2007
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Tuesday, November 13, 2007 12:42 PM
> To: Amir Karger
> Cc: Steve Chervitz; Dave Messina; bioperl-l
> Subject: Re: [Bioperl-l] What does Expect(2) mean in a blast result?
>
> Amir,
>
> Can you file this as a bug?
Done.
http://bugzilla.open-bio.org/show_bug.cgi?id=2399
> Dave mentioned he would look into it but I think it warrants tracking
> to make sure it gets fixed:
>
> http://www.bioperl.org/wiki/Bugs
>
> Attach the example BLAST report from your last post to the report.
> BTW, I wonder how this appears in XML output?
>
> chris
>
> On Nov 13, 2007, at 11:30 AM, Amir Karger wrote:
>
> >> From: trutane at gmail.com [mailto:trutane at gmail.com] On Behalf
> >> Of Steve Chervitz
> >>
> >> The Bioperl blast parser should extract that value and you can obtain
> >> it from an HSP object, via the HSPI::n() method, documented here:
> >>
> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Search/HSP/HSPI.html#POD23
> >
> > As I mentioned in my email:
> >
> > And does anyone know off-hand if Bioperl will tell me when situations
> > like this happen? I thought the Bio::Search::HSP::BlastHSP::n subroutine
> > would help, but I just get a bunch of empty strings for that, whether or
> > not there's a (2) in the Expect string. (hsp->n is empty, hsp->{"_n"} is
> > undef.)
> >
> > And the docs for n() actually say, "This value is not defined with NCBI
> > Blast2 with gapping" although they don't say why. Which may explain why,
> > when I ran the following code on the blast result I included in my last
> > email, I got empty values for all of the n's. (Why is n() undefined for
> > gapped blast if I'm getting n's in my results from that blast?)
> >
> > use warnings;
> > use strict;
> > use Bio::SearchIO;
> >
> > my $blast_out = $ARGV[0];
> > my $in = Bio::SearchIO->new(-format      => 'blast',
> >                             -file        => $blast_out,
> >                             -report_type => 'tblastn');
> >
> > print join("\t", qw(Qname Qstart Qend Strand Sname Sstart Send Frame N Evalue)), "\n";
> > while (my $query = $in->next_result) {
> >     while (my $subject = $query->next_hit) {
> >         while (my $hsp = $subject->next_hsp) {
> >             print join("\t",
> >                 $query->query_name,
> >                 $hsp->start("query"),
> >                 $hsp->end("query"),
> >                 $hsp->strand("hit"),
> >                 $subject->name,
> >                 $hsp->start("hit"),
> >                 $hsp->end("hit"),
> >                 $subject->frame,
> >                 $hsp->n,
> >                 $hsp->evalue,
> >             ), "\n";
> >         }
> >     }
> > }
> >
> >> Dave's basically correct in his explanation. It's a result of the
> >> application of sum statistics by the blast algorithm. You can read all
> >> about it in Korf et al's BLAST book. Here's the relevant section:
> >
> > [snip]
> >
> > Thanks,
> >
> > -Amir
> >
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>
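
Until the parser fills in n() for reports like this one, a possible workaround is to read the N straight out of the "Expect(N) = ..." text in the raw report. What follows is only a minimal sketch under stated assumptions, not how Bioperl does it: parse_expect() is a hypothetical helper, the two header lines are made up for illustration, and a missing "(N)" is assumed to mean N = 1.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): pull N and the E-value out of
# an HSP header line from a plain-text BLAST report.
sub parse_expect {
    my ($line) = @_;

    # "Expect = 2e-15" carries no N; "Expect(2) = 3e-14" has N in parentheses.
    return unless $line =~ /Expect(?:\((\d+)\))?\s*=\s*([^\s,]+)/;
    my ($n, $evalue) = ($1, $2);

    # Assumption: no "(N)" means the E-value belongs to a single HSP.
    $n = 1 unless defined $n;

    # BLAST sometimes prints values such as "e-100" without a leading digit.
    $evalue = "1$evalue" if $evalue =~ /^e/i;

    return ($n, $evalue);
}

# Made-up header lines, for illustration only.
my @lines = (
    ' Score = 51.2 bits (121), Expect(2) = 3e-14',
    ' Score = 80.1 bits (196), Expect = 2e-15',
);

for my $line (@lines) {
    my ($n, $evalue) = parse_expect($line);
    print "N=$n\tE=$evalue\n";
}
```

Run against the lines of a real report, this would flag every HSP whose E-value comes from sum statistics (N greater than 1), independently of what $hsp->n returns.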