[Bioperl-l] Alternate hit sorting for Bio::Search::Result objects

Jason Stajich jason at cgt.duhs.duke.edu
Sun Dec 28 16:01:08 EST 2003


Yeah, that would be fine. We made a couple of design decisions so that
the parser wouldn't require all the hits to be read in at once (which is
the only way this sorting could work), hence the next_hit method, which
could be used for stream-based parsing.  In retrospect, streaming has
never been used (all the report parser implementations right now read in
an entire report anyway).

So, long story short - we could add a method that permits post-processing
sorting of the hits, and presumably of the HSPs as well.
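
As a rough illustration of the kind of post-processing sort being
discussed, here is a minimal sketch for Don's case (group by taxid, then
by score).  It does not use BioPerl at all: MockHit is a hypothetical
stand-in that only mimics the name(), description() and raw_score()
accessors of a hit object, and the "taxid=<digits>" header convention and
the helper names are assumptions, since the thread never says how the
taxid was packed into the FASTA header.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a hit object; it only mimics three accessors.
package MockHit;
sub new         { my ($class, %args) = @_; return bless {%args}, $class }
sub name        { $_[0]{name} }
sub description { $_[0]{description} }
sub raw_score   { $_[0]{raw_score} }

package main;

# Assumed header convention: "taxid=<digits>" somewhere in the description.
# Hits without a recognisable taxid get 0 and therefore sort first.
sub taxid_of {
    my ($hit) = @_;
    return $hit->description =~ /taxid=(\d+)/ ? $1 : 0;
}

# Sort by taxid (ascending), then by raw score (descending).
# The map/sort/map form extracts each taxid only once per hit.
sub sort_hits_by_taxid_then_score {
    my @hits = @_;
    return map  { $_->[2] }
           sort { $a->[0] <=> $b->[0] or $b->[1] <=> $a->[1] }
           map  { [ taxid_of($_), $_->raw_score, $_ ] } @hits;
}

my @hits = (
    MockHit->new(name => 'hit1', description => 'protein taxid=9606', raw_score => 120),
    MockHit->new(name => 'hit2', description => 'protein taxid=562',  raw_score => 300),
    MockHit->new(name => 'hit3', description => 'protein taxid=9606', raw_score => 250),
    MockHit->new(name => 'hit4', description => 'protein taxid=562',  raw_score => 90),
);

my @sorted = sort_hits_by_taxid_then_score(@hits);
print join(' ', map { $_->name } @sorted), "\n";
```

With real hit objects the same comparator could be applied to whatever
$result->hits() returns; where the sorted list then gets stored (Result
or ResultWriter) is exactly the open design question.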

But this all really comes up because he is using an object
(XXXResultWriter) which takes ResultI objects rather than Hits directly to
get the info - so I wonder if it makes more sense to add the capability to
do custom sorting in the ResultWriter rather than adding more bloat to the
Result/Hit/HSP storage objects?

-jason

On Sun, 28 Dec 2003, Chris Mungall wrote:

>
> oops, hit send by mistake - just musing out loud... maybe we could just
> alter the order hits() / next_hit() returns things rather than actually
> modifying the objects. doesn't make much difference at the end of the
> day...
>
> On Sat, 27 Dec 2003, Chris Mungall wrote:
>
> >
> > $result->sort_hits_by_func( { ..custom sort.. } )
> >
> > On Tue, 23 Dec 2003, Jason Stajich wrote:
> >
> > > So Don - you want to apply a custom hit sorting routine to the data before
> > > it is output with SearchIO::HTMLResultWriter?
> > >
> > > The simplest way - albeit cheating, and prone to problems if the
> > > module's internals change - is pretty easy to do:
> > >
> > > @{$result->{'_hits'}} = sort { your custom sort here }
> > >                         @{$result->{'_hits'}};
> > >
> > > Perhaps we should add an API method which can get/set the Hits.
> > >
> > > You can also create a new Result object in an albeit tedious manner:
> > >
> > > # e.g. by descending raw score; substitute any custom comparison
> > > my @hits = sort { $b->raw_score <=> $a->raw_score } $result->hits();
> > >
> > > my $rewres = $result->new(-query_name        => $result->query_name,
> > >                           -query_accession   => $result->query_accession,
> > >                           -query_description => $result->query_description,
> > >                           -query_length      => $result->query_length,
> > >                           -database_name     => $result->database_name,
> > >                           ... # lots more things.
> > >                           -hits              => \@hits);
> > >
> > >
> > > So you put all of this into play like this:
> > > my $in = new Bio::SearchIO(-format => 'blast',
> > >                              -file   => shift @ARGV);
> > >
> > > my $writer = new Bio::SearchIO::Writer::HTMLResultWriter();
> > > my $out = new Bio::SearchIO(-writer => $writer);
> > > my $result = $in->next_result;
> > > # apply the sorting on the hits
> > > #
> > >
> > > # now write the result out
> > > $out->write_result($result);
> > >
> > >
> > >
> > > -jason
> > >
> > >
> > >
> > > On Tue, 23 Dec 2003, Donald G. Jackson wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm working on a blast wrapper using Bio::SearchIO.  I'd like to be able
> > > > to sort through the hits by something besides score/input order.  For
> > > > example, we've crammed the taxid into the FASTA header and would like to
> > > > do something like NCBI's taxblast where hits are sorted by source
> > > > organism, then by score.  I'd like to use the
> > > > Bio::SearchIO::HTMLResultWriter to output my hits, so can't just get all
> > > > the hits and sort them myself.
> > > >
> > > > I thought I'd seen mention of how to do this, but looking over the docs
> > > > and (1.2.3) code I can't find it.  Does anyone have thoughts on how to
> > > > do this?
> > > >
> > > > Thanks,
> > > >
> > > > Don Jackson
> > > > BMS Bioinformatics
> > > >
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l at portal.open-bio.org
> > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > >
> > >
> > > --
> > > Jason Stajich
> > > Duke University
> > > jason at cgt.mc.duke.edu
> > >
> >
> >
> >
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

