[Bioperl-l] Alternate hit sorting for Bio::Search::Result objects

Donald Jackson donald.jackson at bms.com
Sun Dec 28 20:04:59 EST 2003


Chris and Jason,

I like Chris's idea of being able to pass in the custom function of one's choice for sorting.  My bias would be to include this in the Result object because it seems like it would be more generally available, both to multiple ResultWriters and to other tools (not that I could name one).

I also like the idea of just changing the return order rather than the $result->{_hits} structure, but agree w/ Chris that it doesn't really matter.
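
To make that concrete, here is a minimal sketch of the idea, not a patch. Everything in it is an assumption for illustration: My::SketchResult is a hypothetical stand-in rather than Bio::Search::Result code, the hits are plain hashes with made-up taxid and score values, and while the method name sort_hits_by_func follows Chris's suggestion, having it take a code ref that receives the two hits as arguments is my own guess at an interface. The sort is applied only when hits() is called, so the stored _hits array is left alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a Result object; NOT actual BioPerl code.
package My::SketchResult;

sub new {
    my ($class, %args) = @_;
    my $self = { _hits => $args{-hits} || [], _hit_sort => undef };
    return bless $self, $class;
}

# Remember a comparator: a code ref taking two hits and returning -1, 0 or 1.
sub sort_hits_by_func {
    my ($self, $func) = @_;
    $self->{_hit_sort} = $func;
    return $self;
}

# Return the hits in the requested order without touching the stored order.
sub hits {
    my ($self) = @_;
    my @hits = @{ $self->{_hits} };
    if (my $func = $self->{_hit_sort}) {
        @hits = sort { $func->($a, $b) } @hits;
    }
    return @hits;
}

package main;

# Plain hashes stand in for hit objects; the values are invented.
my @raw = (
    { name => 'hitA', taxid => 9606,  score => 120 },
    { name => 'hitB', taxid => 10090, score => 300 },
    { name => 'hitC', taxid => 9606,  score => 250 },
);

my $result = My::SketchResult->new(-hits => \@raw);

# Group by taxid (ascending), then by score (descending) within a taxid.
$result->sort_hits_by_func(sub {
    my ($x, $y) = @_;
    return $x->{taxid} <=> $y->{taxid} || $y->{score} <=> $x->{score};
});

my @sorted_names = map { $_->{name} } $result->hits;
print join(',', @sorted_names), "\n";
```

Because any writer would go through hits() (and, presumably, next_hit()), it would pick up the custom order without needing to know how the sort was defined.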

I'm happy to code something up and submit it - I'll try to do so over the next day or so.

Thanks,

Don Jackson
BMS Bioinformatics

----- Original Message -----
From: Jason Stajich <jason at cgt.duhs.duke.edu>
Date: Sunday, December 28, 2003 4:01 pm
Subject: Re: [Bioperl-l] Alternate hit sorting for Bio::Search::Result objects

> Yeah that would be fine - there are a couple of design decisions we made
> so that the parser wouldn't require the hits to all be read in at once
> (the only way this sorting would work), hence the next_hit method which
> could be used for stream based parsing.  In retrospect it has never been
> used (all the report parser implementations right now read in an entire
> report anyways).
> 
> So long story short - we could add a method which permitted a
> post-processing sorting on the hits and presumably on the HSPs as well.
> But this is all really introduced because he is using an object
> (XXXResultWriter) which uses ResultI objects rather than Hits directly to
> get the info - so I wonder if it makes more sense to add the capability to
> do custom sorting in the ResultWriter rather than adding more bloat to the
> Result/Hit/HSP storage objects?
> 
> -jason
> 
> On Sun, 28 Dec 2003, Chris Mungall wrote:
> 
> >
> > oops, hit send by mistake - just musing out loud... maybe we could just
> > alter the order hits() / next_hit() returns things rather than actually
> > modifying the objects. doesn't make much difference at the end of the
> > day...
> >
> > On Sat, 27 Dec 2003, Chris Mungall wrote:
> >
> > >
> > > $result->sort_hits_by_func( { ..custom sort.. } )
> > >
> > > On Tue, 23 Dec 2003, Jason Stajich wrote:
> > >
> > > > So Don - you want to apply a custom hit sorting routine to the data before
> > > > it is output with SearchIO::HTMLResultWriter?
> > > >
> > > > The simplest way - albeit cheating, and prone to problems if there are
> > > > changes in the module - is pretty easy to do:
> > > >
> > > > @{$result->{'_hits'}} = sort { your custom sort here }
> > > >                         @{$result->{'_hits'}};
> > > >
> > > > Perhaps we should add an API method which can get/set the Hits.
> > > >
> > > > You can also create a new Result object in an albeit tedious manner:
> > > >
> > > > my @hits = sort { #custom hits } $result->hits();
> > > >
> > > > my $rewres = $result->new(-query_name        => $result->query_name,
> > > >                        -query_accession   => $result->query_accession,
> > > >                        -query_description => $result->query_description,
> > > >                        -query_length      => $result->query_length,
> > > >                        -database_name     => $result->database_name,
> > > >                        ... # lots more things.
> > > >                        -hits              => \@hits);
> > > >
> > > >
> > > > So you put all of this in to play like this
> > > > my $in = new Bio::SearchIO(-format => 'blast',
> > > >                              -file   => shift @ARGV);
> > > >
> > > > my $writer = new Bio::SearchIO::Writer::HTMLResultWriter();
> > > > my $out = new Bio::SearchIO(-writer => $writer);
> > > > my $result = $in->next_result;
> > > > # apply the sorting on the hits
> > > > #
> > > >
> > > > # now write the result out
> > > > $out->write_result($result);
> > > >
> > > >
> > > >
> > > > -jason
> > > >
> > > >
> > > >
> > > > On Tue, 23 Dec 2003, Donald G. Jackson wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm working on a blast wrapper using Bio::SearchIO.  I'd like to be able
> > > > > to sort through the hits by something besides score/input order.  For
> > > > > example, we've crammed the taxid into the FASTA header and would like to
> > > > > do something like NCBI's taxblast where hits are sorted by source
> > > > > organism, then by score.  I'd like to use the
> > > > > Bio::SearchIO::HTMLResultWriter to output my hits, so can't just get all
> > > > > the hits and sort them myself.
> > > > >
> > > > > I thought I'd seen mention of how to do this, but looking over the docs
> > > > > and (1.2.3) code I can't find it.  Does anyone have thoughts on how to
> > > > > do this?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Don Jackson
> > > > > BMS Bioinformatics
> > > > >
> > > > > _______________________________________________
> > > > > Bioperl-l mailing list
> > > > > Bioperl-l at portal.open-bio.org
> > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > > >
> > > >
> > > > --
> > > > Jason Stajich
> > > > Duke University
> > > > jason at cgt.mc.duke.edu
> > > >
> > >
> > >
> > >
> >
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 


