[Bioperl-l] six frame translation to gff?
jimhu at tamu.edu
Wed Oct 24 19:59:09 UTC 2012
Unfortunately, the sixpack GFF output is worthless for this. It just writes each ORF as its own GFF file and concatenates them, and it doesn't preserve the genome coordinates.
However, I came up with two possible solutions:
- use EMBOSS sixpack or getorf to make fasta, and blast it against the original genome. Really ugly, but it should work, and we already have the tools to convert blast output to gff.
- use getorf to make fasta instead of sixpack. getorf writes the coordinates in the fasta header, so I could parse the getorf fasta into gff.
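For the second option, here is a rough sketch of the header parsing, in Python rather than BioPerl just to keep it short. It assumes getorf headers look like ">seqid_N [start - end]", with "(REVERSE SENSE)" and swapped coordinates for minus-strand ORFs; check that against your own getorf output before relying on it. The function names, the "getorf" source column, and the "ORF" feature type are illustrative choices, not anything getorf or BioPerl defines.

```python
import re

# Assumed getorf header layout (verify against real output):
#   >seqid_N [start - end] (REVERSE SENSE) optional description
HEADER_RE = re.compile(
    r"^>(\S+)_(\d+)\s+\[(\d+)\s*-\s*(\d+)\]\s*(\(REVERSE SENSE\))?"
)


def getorf_header_to_gff(header, source="getorf", feature_type="ORF"):
    """Turn one getorf FASTA header into a tab-separated GFF3 line.

    Returns None when the header does not match the assumed layout.
    """
    m = HEADER_RE.match(header)
    if m is None:
        return None
    seqid, orf_number, first, second, reverse = m.groups()
    first, second = int(first), int(second)
    # GFF always wants start <= end; strand carries the direction.
    start, end = min(first, second), max(first, second)
    strand = "-" if (reverse or first > second) else "+"
    attributes = f"ID={seqid}_{orf_number}"
    return "\t".join(
        [seqid, source, feature_type, str(start), str(end),
         ".", strand, ".", attributes]
    )


def getorf_fasta_to_gff(lines):
    """Yield a GFF3 header followed by one record per parsable FASTA header."""
    yield "##gff-version 3"
    for line in lines:
        if line.startswith(">"):
            record = getorf_header_to_gff(line.rstrip("\n"))
            if record is not None:
                yield record
```

To use it, open the getorf FASTA file and print each line yielded by getorf_fasta_to_gff(filehandle). Sequence lines are skipped, so only the headers matter.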
I also had a vague idea that it might be possible to cannibalize Bio::Graphics::Glyph::translation, but I didn't spend much time on that.
I had hoped that this was common enough to have already been done many times, and someone would just say that I had missed an obvious BioPerl HOWTO. But upon reflection, it's kind of a straw man. We're making something to show that it's not the way to do things, so we wouldn't want a HOWTO to show people how to go down the wrong path. I sometimes think that sixpack is pretty close to how some of the prokaryotic or phage annotations were done, especially the older ones.
On Oct 24, 2012, at 2:18 PM, Smithies, Russell wrote:
> Not a BioPerl solution, but sixpack from EMBOSS will output GFF.
> You could probably do something tricky with a regex to chunk the sequence into frames and call $my_seq_object->translate on each one.
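A minimal sketch of the frame-chunking idea, written in Python with plain slicing instead of a regex and without the translation step. It walks each of the six frames codon by codon, treats every stop-free run of codons as an ORF (stop codon excluded), and reports 1-based forward-strand coordinates so the result can go straight into GFF. The function names, the "six_frame" source label, and the min_codons cutoff are made up for illustration, and it assumes the standard stop codons TAA, TAG and TGA.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code stops (assumption)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def six_frame_orfs(seq, min_codons=1):
    """Yield (start, end, strand, frame) for stop-free codon runs.

    Coordinates are 1-based, inclusive, and always on the forward strand.
    min_codons should be at least 1.
    """
    seq = seq.upper()
    length = len(seq)
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            # Offset just past the last complete codon in this frame.
            last_codon_end = frame + ((length - frame) // 3) * 3
            for pos in range(frame, last_codon_end + 1, 3):
                at_end = pos >= last_codon_end
                if at_end or strand_seq[pos:pos + 3] in STOPS:
                    if (pos - run_start) // 3 >= min_codons:
                        if strand == "+":
                            yield run_start + 1, pos, strand, frame + 1
                        else:
                            # Map minus-strand offsets back to forward coords.
                            yield (length - pos + 1, length - run_start,
                                   strand, frame + 1)
                    run_start = pos + 3


def orfs_to_gff(seqid, seq, min_codons=30):
    """Return GFF3 lines for every ORF of at least min_codons codons."""
    lines = ["##gff-version 3"]
    orfs = six_frame_orfs(seq, min_codons)
    for n, (start, end, strand, frame) in enumerate(orfs, 1):
        lines.append("\t".join(
            [seqid, "six_frame", "ORF", str(start), str(end),
             ".", strand, ".", f"ID=orf{n};frame={frame}"]
        ))
    return lines
```

Calling orfs_to_gff("my_genome", sequence_string) returns the lines ready to write to a file. Raising min_codons filters out the short runs, which may itself be a useful knob for showing students how many spurious ORFs a naive scan produces.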
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jim Hu
> Sent: Thursday, 25 October 2012 6:06 a.m.
> To: bioperl-l at bioperl.org; GMOD GBrowse List
> Subject: [Bioperl-l] six frame translation to gff?
> Is there a simple way to do this? For teaching we want to illustrate how all possible reading frames are NOT real CDS features. Most browsers display the 6 frame translation, but I would like to convert this information to GFF so they can be viewed, filtered etc.
> Jim Hu
> Dept. of Biochemistry and Biophysics
> 2128 TAMU
> Texas A&M Univ.
> College Station, TX 77843-2128