[Bioperl-l] Alternate hit sorting for Bio::Search::Result objects

Cook, Malcolm MEC at Stowers-Institute.org
Wed Jun 29 15:40:38 EDT 2005


Hi,

To re-raise a thread that was under discussion in Dec 2003
http://portal.open-bio.org/pipermail/bioperl-l/2003-December/014416.html
...

In service of a particular analysis, I am trying to programmatically
'edit' a Bio::Search::Result::BlastResult by filtering and sorting the
hits in a variety of ways prior to writing it back out (once for each
way).

I first tried breaking the interface (as suggested by Jason in Dec 03)
by directly setting `$myResult->{'_hits'}`.

However, as Jason further suggested, it _is_ prone to problems.  I don't
know if it would have worked in Dec 2003, but it does not work now (for
blast reports at least).

As I discovered, B::S::R::BlastResult overrides
B::S::R::GenericResult->hits to accommodate tracking iterations such as
reported by psiblast.  It does this for all blast outputs, including
flavors that don't iterate (`blastall -p blastx` in my case).

Thus I found that, to break the interface, I had to do the following:

   $myResult->{_iterations}[0]->{_newhits_below_threshold} = \@sortedFilteredAndOtherwiseMungedHits;

This works but is ugly and probably even more prone to (future?)
problems.
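
A minimal sketch of how the hack could at least be confined to one place. The helper name set_first_iteration_hits is made up for illustration (it is not part of BioPerl), and the result below is a plain hash standing in for a parsed report, not a real Bio::Search::Result::BlastResult. The only thing taken from the real object is the key path shown above.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT a BioPerl method: keeps the internals hack in one spot.
# Assumes hits live under {_iterations}[0]{_newhits_below_threshold}, as observed
# above for a non-iterated blast report.
sub set_first_iteration_hits {
    my ($result, $hits) = @_;
    # Store a copy so later changes to the caller's array don't leak in.
    $result->{_iterations}[0]{_newhits_below_threshold} = [ @$hits ];
    return $result;
}

# Stand-in for a parsed result: a plain hash with strings in place of hit objects.
my $fake_result = {
    _iterations => [ { _newhits_below_threshold => [ 'hitC', 'hitA', 'hitB' ] } ],
};

# "Munge" the hits (here: a plain string sort) and write them back.
my @sorted = sort @{ $fake_result->{_iterations}[0]{_newhits_below_threshold} };
set_first_iteration_hits($fake_result, \@sorted);

print join(',', @{ $fake_result->{_iterations}[0]{_newhits_below_threshold} }), "\n";
```

If the internal layout changes, only the helper would need updating, though it remains a break of the interface.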

My question is: what to do in general?

Options:

	1) use the -filter capability of
	   Bio::SearchIO::Writer::ResultTableWriter in combination with
	   Bio::Search::Result::ResultI->sort_hits
	2) write an API for resetting hits
	3) continue to live dangerously by setting
	   $myResult->{_iterations}[0]->{_newhits_below_threshold}
		
In any case, it _may_ make sense not to track hits by iteration
unless necessary, by having the Result $rc created in
Bio::SearchIO::blast set _no_iterations to 1 unless the report is from
psiblast (or another iteration producer).

I don't really like option 1, given all the objects and levels of
indirection needed just to grep/sort a list of hits, but it might work fine...
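
For concreteness, the grep/sort step itself might look roughly like the sketch below. My::FakeHit is a stand-in class invented for this example; real hits would be Bio::Search::Hit objects, which (as far as I know) offer name and significance accessors. The cutoff value is arbitrary.

```perl
use strict;
use warnings;

# Stand-in hit class for illustration only; not a BioPerl class.
package My::FakeHit;
sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub name         { return $_[0]{name} }
sub significance { return $_[0]{significance} }

package main;

my @hits = (
    My::FakeHit->new(name => 'hitA', significance => 1e-3),
    My::FakeHit->new(name => 'hitB', significance => 1e-50),
    My::FakeHit->new(name => 'hitC', significance => 2e-10),
);

my $max_evalue = 1e-5;   # arbitrary cutoff chosen for the example

# Filter first, then sort the survivors by ascending significance (best hit first).
my @kept = sort { $a->significance <=> $b->significance }
           grep { $_->significance <= $max_evalue } @hits;

print join(',', map { $_->name } @kept), "\n";
```

The open question is only where the resulting list should go afterwards, which is what options 1 to 3 are about.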

Thoughts ...suggestions ...advice ...admonitions all appreciated

Regards,

Malcolm Cook - mec at stowers-institute.org - 816-926-4449
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, MO  USA 





