[Bioperl-l] methods, etc. for Bio::SearchIO on exonerate output

Allen Gathman agathman at semo.edu
Fri Apr 15 16:47:40 EDT 2005


Duh.  When in doubt, read the documentation -- I was so busy trying to
figure out how to do it with Bioperl, I overlooked the GFF output option
from exonerate.  

Thanks -- and yes, if you can send me your GFF-munging script I'd like to
see it -- at least we can wind up doing stuff in a consistent way then.  



Allen Gathman
http://cstl-csm.semo.edu/gathman


> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu]
> Sent: Friday, April 15, 2005 3:33 PM
> To: Gathman, Allen
> Cc: Guy Slater; 'bioperl-l at bioperl.org'
> Subject: Re: [Bioperl-l] methods, etc. for Bio::SearchIO on exonerate
> output
> 
> Allen -
> 
> I gave up to really trying to turn VULGAR into proper HSP objects
> representing exons and just parsing the GFF from Guy's showtargetgff
> output.  His GFF has all of what you want except that reverse strand
> features need to have their coordinates flipped (start=seqlength - end;
> end=seqlength - start + 1).
> 
> Seems like we should be able to do it alone by only interpreting the
> proper intron states and not all the gaps as we do.
> 
> I ended up having to grab the query and hit sequence lengths as well to
> correct for reverse strand sequences so the coordinates can be on the
> forward strand.  I think we could fix it to do this but maybe would
> make the output gene-centric instead of SearchIO form.  Not sure.
> 
> In short I'm not sure the exonerate parser works too well for what you
> want right now - for me, the exonerate generation of GFF which is then
> munged into GFF2 that exonerate will pump out as my pipeline for
> loading into DB::GFF and Gbrowse.
> 
> I can send that script for munging over if it helps.
> 
> -jason
> 
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/
> 
> On Apr 15, 2005, at 3:48 PM, Gathman, Allen wrote:
> 
> > Hi all --
> >
> >
> >
> > I'm using exonerate to align ESTs with a set of genomic contigs, and
> > I'm
> > trying to figure out the best way to pull information out of the
> > output.  I
> > wrote a little test script to see what Bio::SearchIO would get me:
> >
> > use Bio::SearchIO;
> > open (OUT, ">$ARGV[1]");
> >
> > my $searchobj=new Bio::SearchIO ( -format => 'exonerate',
> >                                   -file => $ARGV[0]);
> >
> > while (my $result=$searchobj->next_result() ) {
> >      print OUT "query: " . $result->query_name(). "\n";
> >      my @params=$result->available_parameters;
> >      print OUT "params:@params\n";
> >      my @stats=$result->available_statistics;
> >      print OUT "stats:@stats\n";
> >      while (my $hit=$result->next_hit() ) {
> >           print OUT "hitstart: " . $hit->start('hit') . "\n";
> >           while (my $hsp=$hit->next_hsp() ) {
> >                print OUT "hspsstart: " . $hsp->start('hit') . "\n";
> >           } # end hsp
> >      } # end hit
> > } # end result
> > close OUT;
> >
> >
> > There aren't any parameters or stats returned.  $hit->start works
> > fine, and
> > $hsp->start works, but the hsps are the individual matches; if there's
> > a
> > one-nucleotide gap, that separates two hsps, just as a real intron
> > would.
> >
> >
> >
> > It appears that there should be some methods or arguments applicable
> > here
> > beyond those for the generic Hit and HSP objects, but I don't know
> > what they
> > are.  I've gone through the documentation for
> > Bio::SearchIO::exonerate, but
> > I don't see what I'm looking for.  For instance, the VULGAR line in the
> > exonerate output distinguishes between introns and gaps - is there
> > some way
> > to pull them out separately in Bio::SearchIO?
> >
> >
> >
> > In short, I want to be able to ignore small gaps and define start and
> > end
> > points of exons, marking 5' and 3' splice junctions.  I'd appreciate
> > any
> > help on how to get at these.
> >
> >
> >
> > Thanks --
> >
> >
> >
> > Allen
> >
> >
> >
> > Allen Gathman
> >
> > http://cstl-csm.semo.edu/gathman
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >



More information about the Bioperl-l mailing list