[Bioperl-l] v1.0.1 BLAST SearchIO woes

JDiggans@genelogic.com
Tue, 25 Jun 2002 14:08:09 -0400


I just upgraded to the 1.0.1 bioperl release while at the same time
reworking a bunch of BLAST parsers and have found several issues w/ using
SearchIO. I'm hoping I'm just missing something obvious but it seems like
Jason and Steve C. are still working towards moving into one unified BLAST
parser and I'm running into issues caused by this temporary middle-ground.

Specifically, I need strand information from each HSP. Jason's blast.pm
returns HSPs as GenericHSP objects. When I ask one for its strand using one
of the two ways HSPI offers, it seems to return the sequence-type name I
passed in instead of the expected -1/1, i.e.:

        print $currHSP->strand('query')."\n";
        print $currHSP->strand('sbjct')."\n";

prints out:

      query
      sbjct

instead of:
      1
      1

Checking the SearchIO.t file there are tests for accessing strand
information through GenericHSP objects via:

      $genericHSP->query->strand();
      $genericHSP->hit->strand();

which do work, but why call it 'hit' instead of 'sbjct'? Was this for
backwards compatibility? It gets confusing w/ the Search/Hit family of
objects, IMHO. Why not offer an hsp->sbjct->strand() accessor method and
alias it to hsp->hit->strand() for backwards compatibility?
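
To make the aliasing idea concrete, here is a minimal sketch using stand-in
classes. My::Feature and My::HSP are hypothetical and are not the real
GenericHSP code; the only point is that a typeglob assignment gives hit() a
second name, so both spellings reach the same sub.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one side (query or hit) of an HSP.
package My::Feature;
sub new {
    my ( $class, %args ) = @_;
    return bless { strand => $args{strand} }, $class;
}
sub strand { return $_[0]{strand} }

# Hypothetical stand-in for an HSP object; not the real GenericHSP.
package My::HSP;
sub new {
    my ( $class, %args ) = @_;
    return bless { query => $args{query}, hit => $args{hit} }, $class;
}
sub query { return $_[0]{query} }
sub hit   { return $_[0]{hit} }

# Install sbjct() as a second name for hit(); both names share one sub.
{
    no warnings 'once';
    *sbjct = \&hit;
}

package main;

my $hsp = My::HSP->new(
    query => My::Feature->new( strand => 1 ),
    hit   => My::Feature->new( strand => -1 ),
);

print $hsp->query->strand, "\n";    # strand of the query side
print $hsp->sbjct->strand, "\n";    # same object that hit() returns
```

Because sbjct and hit are the same code reference, existing callers of
hit() keep working and nothing has to be duplicated.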

Checking GenericHSP and BlastHSP it seems that BlastHSP does indeed return
-1/1 using the first series of method calls. Steve C's psiblast.pm (which
seems to be a general-purpose report parser? did this start out just for
psiblast?) returns BlastHSP objects. This fixes my stranded-ness problem.
However, BlastHSP doesn't seem to have a score() method to return the bit
score from the encapsulated HSP, though GenericHSP does (and HSPI does not,
which is probably why it's not in BlastHSP?).
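
In the meantime, one defensive option is to ask the object which accessors
it supports before calling them. This is only a sketch: first_supported is
a made-up helper, and whether BlastHSP offers some other accessor for the
bit score (bits() is used as a guess below) is an assumption to check
against the module you have installed.

```perl
use strict;
use warnings;

# Return the value of the first accessor in @methods that $obj supports,
# or nothing (undef in scalar context) if it supports none of them.
# (Hypothetical helper, not part of BioPerl.)
sub first_supported {
    my ( $obj, @methods ) = @_;
    for my $name (@methods) {
        if ( my $code = $obj->can($name) ) {
            return scalar $obj->$code();
        }
    }
    return;
}

# Usage, assuming $hsp came from a SearchIO parser and that 'bits' is a
# plausible fallback name (an assumption, not confirmed here):
#   my $score = first_supported( $hsp, qw(score bits) );
```

This keeps the calling code the same whichever HSP class the parser hands
back, at the cost of hiding the inconsistency rather than fixing it.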

So ... what's the plan for SearchIO BLAST stuff? I don't mind submitting
fixes, but with all the whirlwind development going on I'm not sure what's
a problem and what's intentional, or even who does what when it comes to
SearchIO. I don't want to step on any toes, and I don't see any obvious bug
reports for this behaviour.

So ... if someone could fill me in ... :)
-j

-------------------------------------------------
James Diggans
Bioinformatics Programmer
Gene Logic, Inc.
Phone: 301.987.1756
FAX: 301.987.1701