[Bioperl-l] Re: [Gmod-gbrowse] GTF-->GFF3 converter

Scott Cain cain at cshl.edu
Mon Aug 29 12:34:58 EDT 2005


Hi Etienne,

Probably the best mailing list to ask this question on is the bioperl
mailing list (cc'ed here). 

As far as I know, there is no script written specifically for that
conversion.  Because GFF3 is stricter than GTF (aka GFF 2.5), it can be
difficult to move from GTF to GFF3.  If the Bio::FeatureIO::gff module
were a little more fleshed out, it would probably be able to do it, but
currently, while it will write GTF, it doesn't yet read it.  If you
wanted to contribute code to do that, that would be great.

The other possibility in the absence of Bio::FeatureIO::gff is
Bio::Tools::GFF, which should be able to parse GTF and then write
something resembling GFF3.  I wrote 'resembling' because you may need to
massage the output to actually get something that is GFF3.
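
To make the kind of massaging concrete, here is a rough sketch, in Python
rather than Perl, of how lines like the ones quoted below could be grouped
by their "name" attribute under a synthesized gene and mRNA.  Everything
in it is an assumption for illustration: the function names, the
gene_/mRNA_/cds_ ID prefixes, the choice of one mRNA per gene, and the
choice to pass transcriptId, proteinId and exonNumber through as custom
attributes.  It is not what Bio::Tools::GFF does.

```python
from collections import OrderedDict
from urllib.parse import quote


def parse_gtf_attributes(text):
    """Turn 'name "x"; transcriptId 58482' into an ordered dict."""
    attrs = OrderedDict()
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def format_gff3_attributes(pairs):
    """Join (key, value) pairs as key=value;... with values URL-escaped."""
    return ";".join("%s=%s" % (k, quote(str(v), safe="")) for k, v in pairs)


def gtf_to_gff3(lines):
    """Group features by (seqid, name) and emit gene > mRNA > children."""
    groups = OrderedDict()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip anything that is not a 9-column record
        attrs = parse_gtf_attributes(cols[8])
        name = attrs.pop("name", None)
        if name is None:
            continue  # assumption: every usable record carries a name
        groups.setdefault((cols[0], name), []).append((cols[:8], attrs))

    out = ["##gff-version 3"]
    for (seqid, name), feats in groups.items():
        source, strand = feats[0][0][1], feats[0][0][6]
        start = str(min(int(c[3]) for c, _ in feats))
        end = str(max(int(c[4]) for c, _ in feats))
        gene_id = "gene_" + name  # hypothetical ID scheme
        mrna_id = "mRNA_" + name  # hypothetical ID scheme
        # Parent features span all of their children.
        out.append("\t".join([
            seqid, source, "gene", start, end, ".", strand, ".",
            format_gff3_attributes([("ID", gene_id), ("Name", name)])]))
        out.append("\t".join([
            seqid, source, "mRNA", start, end, ".", strand, ".",
            format_gff3_attributes([("ID", mrna_id), ("Parent", gene_id)])]))
        for cols, attrs in feats:
            pairs = [("Parent", mrna_id)]
            if cols[2] == "CDS":
                # All CDS segments of one transcript share a single ID.
                pairs.insert(0, ("ID", "cds_" + name))
            pairs.extend(attrs.items())
            out.append("\t".join(cols + [format_gff3_attributes(pairs)]))
    return out
```

Calling gtf_to_gff3(open("input.gtf")) (the file name is a placeholder)
returns the GFF3 lines as a list.  The sketch assumes each record sits on
one tab-separated line, so records wrapped by a mail client need to be
rejoined first, and the result should still be checked against the GFF3
specification before loading it anywhere.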

Scott
 

On Fri, 2005-08-26 at 15:14 -0600, Etienne Noumen wrote:
> Hi,
> In our projects, our data are in GTF format. I wrote a script to
> convert it to GFF3, but there are tags like Feature ID and ProteinID that
> I don't know how to deal with. I am also concerned about grouping
> exons and CDS into mRNA and Genes. Is there any converter that does it
> well?
> 
> This is what my files look like:
> ............
> scaffold_10034	src	exon	7360	8354	.	-	.	name "fgenesh1_pg.C_scaffold_10034000001"; transcriptId 58482
> scaffold_10034	src	CDS	7360	8352	.	-	0	name "fgenesh1_pg.C_scaffold_10034000001"; proteinId 58482; exonNumber 1
> scaffold_10034	src	stop_codon	7360	7362	.	-	0	name "fgenesh1_pg.C_scaffold_10034000001"
> scaffold_10309	src	exon	5822	6042	.	+	.	name "fgenesh1_pg.C_scaffold_10309000001"; transcriptId 58526
> scaffold_10309	src	CDS	5822	6042	.	+	0	name "fgenesh1_pg.C_scaffold_10309000001"; proteinId 58526; exonNumber 1
> scaffold_10309	src	exon	7270	7612	.	+	.	name "fgenesh1_pg.C_scaffold_10309000001"; transcriptId 58526
> scaffold_10309	src	CDS	7270	7612	.	+	2	name "fgenesh1_pg.C_scaffold_10309000001"; proteinId 58526; exonNumber 2
> scaffold_10309	src	stop_codon	7610	7612	.	+	0	name "fgenesh1_pg.C_scaffold_10309000001"
> ...........
> Thank you.
> noumen
> 
> 
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


