[Bioperl-l] Re: [Gmod-gbrowse] GTF-->GFF3 converter
Scott Cain
cain at cshl.edu
Mon Aug 29 12:34:58 EDT 2005
Hi Etienne,
Probably the best mailing list to ask this question on is the bioperl
mailing list (cc'ed here).
As far as I know, there is no script written specifically for that. Because
GFF3 is stricter than GTF (aka GFF 2.5), it can be difficult to move
from GTF to GFF3. If the Bio::FeatureIO::gff module were a little more
fleshed out, it would probably be able to do it, but currently, while it
will write GTF, it doesn't yet read it. If you wanted to contribute
code to do that, that would be great.
The other possibility in the absence of Bio::FeatureIO::gff is
Bio::Tools::GFF, which should be able to parse GTF and then write
something resembling GFF3. I wrote 'resembling' because you may need to
massage the output to actually get something that is GFF3.
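
To make the "massaging" concrete, here is a minimal sketch of the grouping
step, written in plain Python rather than BioPerl. It is not Bio::Tools::GFF
or Bio::FeatureIO, and the function names (parse_gtf_attributes, gff3_line,
gtf_to_gff3) are made up for illustration. It assumes that the `name`
attribute identifies exactly one transcript, that each transcript gets its
own gene, and that no quoted value contains a semicolon. Custom tags such as
transcriptId and proteinId are carried over unchanged as lower-case GFF3
attributes.

```python
from urllib.parse import quote


def parse_gtf_attributes(text):
    """Parse 'name "x"; transcriptId 58482' into a dict of strings.

    Assumes no ';' appears inside a quoted value.
    """
    attrs = {}
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, value = chunk.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def gff3_line(seqid, source, ftype, start, end, score, strand, phase, attrs):
    """Format one GFF3 row; attrs is a list of (tag, value) pairs."""
    # Percent-encode reserved characters (';', '=', '&', ',', ...) in values.
    col9 = ";".join(f"{tag}={quote(str(val), safe=':')}" for tag, val in attrs)
    return "\t".join([seqid, source, ftype, str(start), str(end),
                      score, strand, phase, col9])


def gtf_to_gff3(lines, group_key="name"):
    """Group GTF rows by one attribute; emit gene and mRNA parents plus children."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Prefer tabs; fall back to whitespace for rows mangled by e-mail.
        fields = line.split("\t") if "\t" in line else line.split(None, 8)
        if len(fields) != 9:
            continue  # skip rows without an attribute column
        attrs = parse_gtf_attributes(fields[8])
        if group_key not in attrs:
            continue
        groups.setdefault(attrs[group_key], []).append((fields[:8], attrs))

    out = ["##gff-version 3"]
    for name, rows in groups.items():
        first_cols = rows[0][0]
        seqid, source, strand = first_cols[0], first_cols[1], first_cols[6]
        # The parent features span all of their children.
        start = min(int(cols[3]) for cols, _ in rows)
        end = max(int(cols[4]) for cols, _ in rows)
        gene_id, mrna_id = f"gene:{name}", f"mRNA:{name}"
        out.append(gff3_line(seqid, source, "gene", start, end, ".", strand, ".",
                             [("ID", gene_id), ("Name", name)]))
        out.append(gff3_line(seqid, source, "mRNA", start, end, ".", strand, ".",
                             [("ID", mrna_id), ("Parent", gene_id)]))
        for cols, attrs in rows:
            extra = [(k, v) for k, v in attrs.items() if k != group_key]
            out.append(gff3_line(*cols, [("Parent", mrna_id)] + extra))
    return out


if __name__ == "__main__":
    sample = [
        'scaffold_10034 src exon 7360 8354 . - . '
        'name "fgenesh1_pg.C_scaffold_10034000001"; transcriptId 58482',
        'scaffold_10034 src CDS 7360 8352 . - 0 '
        'name "fgenesh1_pg.C_scaffold_10034000001"; proteinId 58482; exonNumber 1',
    ]
    print("\n".join(gtf_to_gff3(sample)))
```

The output should still be checked against the GFF3 spec before loading it
anywhere, since this sketch does not validate feature types, phases, or
coordinates.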
Scott
On Fri, 2005-08-26 at 15:14 -0600, Etienne Noumen wrote:
> Hi,
> In our projects, our data are in GTF format. I wrote a script to
> convert it to GFF3, but there are tags like Feature ID, ProteinID that
> I don't know how to deal with. I am also concerned about grouping
> exons and CDS into mRNA and Genes. Is there any converter that does it
> well?
>
> This is what my files look like:
> ............
> scaffold_10034 src exon 7360 8354 . - . name "fgenesh1_pg.C_scaffold_10034000001"; transcriptId 58482
> scaffold_10034 src CDS 7360 8352 . - 0 name "fgenesh1_pg.C_scaffold_10034000001"; proteinId 58482; exonNumber 1
> scaffold_10034 src stop_codon 7360 7362 . - 0 name "fgenesh1_pg.C_scaffold_10034000001"
> scaffold_10309 src exon 5822 6042 . + . name "fgenesh1_pg.C_scaffold_10309000001"; transcriptId 58526
> scaffold_10309 src CDS 5822 6042 . + 0 name "fgenesh1_pg.C_scaffold_10309000001"; proteinId 58526; exonNumber 1
> scaffold_10309 src exon 7270 7612 . + . name "fgenesh1_pg.C_scaffold_10309000001"; transcriptId 58526
> scaffold_10309 src CDS 7270 7612 . + 2 name "fgenesh1_pg.C_scaffold_10309000001"; proteinId 58526; exonNumber 2
> scaffold_10309 src stop_codon 7610 7612 . + 0 name "fgenesh1_pg.C_scaffold_10309000001"
> ...........
> Thank you.
> noumen
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory