[Bioperl-l] GFF to GTF
Cook, Malcolm
MEC at stowers-institute.org
Wed Nov 9 13:20:46 EST 2005
So, Projector is supposedly expecting GTF format as documented at
http://www.fruitfly.org/flyannot/format.html#GTF
I know your problem is 'solved' in one form or another for getting
flybase data in/out from UCSC/Ensembl/GBrowse, but the details elude me.
I wonder if you can't simply avail yourself of UCSC's having done this
already?
For instance, the following 'GTF' is extractable from their site
(http://genome.ucsc.edu/cgi-bin/hgTables?db=dm2) at region
chr4:22000-24000
chr4 dm2_flyBaseGene CDS 22338 22528 0.000000 - 2 gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene exon 22335 22528 0.000000 - . gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene CDS 22617 23205 0.000000 - 0 gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene start_codon 23203 23205 0.000000 - . gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene exon 22617 23205 0.000000 - . gene_id "CG32013-RA"; transcript_id "CG32013-RA";
Though it is perhaps not quite 'right' because missing from it is
stop_codon and the exon_ids...
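One way to check which feature types an extract like the one above actually contains is sketched below in Python. This is only an illustration under stated assumptions: the function names are made up for this sketch, the records are split on whitespace because that is how they appear in this message (real GTF files are tab-delimited), and the default list of "required" features is an assumption of the sketch, not something confirmed for Projector.

```python
import re
from collections import defaultdict

# Records copied from the UCSC excerpt above (whitespace-separated here;
# real GTF files are tab-delimited).
SAMPLE = """\
chr4 dm2_flyBaseGene CDS 22338 22528 0.000000 - 2 gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene exon 22335 22528 0.000000 - . gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene CDS 22617 23205 0.000000 - 0 gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene start_codon 23203 23205 0.000000 - . gene_id "CG32013-RA"; transcript_id "CG32013-RA";
chr4 dm2_flyBaseGene exon 22617 23205 0.000000 - . gene_id "CG32013-RA"; transcript_id "CG32013-RA";
"""

# Matches one 'key "value";' pair in column 9.
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')


def parse_gtf_line(line):
    """Split one GTF record into its eight fixed columns plus an attribute dict."""
    # Split on whitespace at most 8 times; the remainder is column 9.
    fields = line.rstrip("\n").split(None, 8)
    if len(fields) < 9:
        raise ValueError("expected 9 columns: %r" % line)
    seqname, source, feature, start, end, score, strand, frame, attrs = fields
    return {
        "seqname": seqname, "source": source, "feature": feature,
        "start": int(start), "end": int(end), "score": score,
        "strand": strand, "frame": frame,
        "attributes": dict(ATTR_RE.findall(attrs)),
    }


def features_by_transcript(text):
    """Map each transcript_id to the set of feature types seen for it."""
    seen = defaultdict(set)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        rec = parse_gtf_line(line)
        seen[rec["attributes"].get("transcript_id", "")].add(rec["feature"])
    return seen


def missing_features(text, required=("CDS", "start_codon", "stop_codon")):
    """Return {transcript_id: [missing feature types]} (the 'required' default is an assumption)."""
    report = {}
    for transcript_id, feats in features_by_transcript(text).items():
        absent = sorted(set(required) - feats)
        if absent:
            report[transcript_id] = absent
    return report


print(missing_features(SAMPLE))
```

For the five records above, the report lists stop_codon as the only missing feature type for CG32013-RA.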
Also, there is some form of GFF conversion included in bioperl's
'process_gadfly'.
good luck
--Malcolm
-----Original Message-----
From: Filipe Garrett [mailto:fgarret at ub.edu]
Sent: Wednesday, November 09, 2005 11:05 AM
To: Cook, Malcolm; Bioperl
Subject: Re: [Bioperl-l] GFF to GTF
Cook, Malcolm wrote:
>GFF as a format has a variety of versions. AFAIK, GTF is GFF 2.1 or
>2.5
>
>The main differences have to do with format and semantics of GFF's
>column 9, the inclusion of sequence data itself in the file, and
>
>version reference
>1 & 2 http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml
>2.1 http://genes.cs.wustl.edu/GTF21.html
>2.5 ref?
>3 http://song.sourceforge.net/gff3.shtml
>
>What is your source of GFF? It is possible that your GFF annotation
>already is GTF. What version is it? Look in the file. There may be a
>'##gff-version' directive in it. Or, if you see that column 9 looks
>like 'gene_id "381.000"; transcript_id "381.000.1";' (c.f. gff 2.1
>docs)
>then it probably already is in GTF.
>
>If it is NOT already GTF (GFF 2.1 or 2.5), then you must provide
>example expected input and output to see if we (I) can help further.
>
>Cheers,
>
>Malcolm Cook - mec at stowers-institute.org - 816-926-4449
>Database Applications Manager - Bioinformatics
>Stowers Institute for Medical Research - Kansas City, MO USA
>
>
>
>-----Original Message-----
>From: bioperl-l-bounces at portal.open-bio.org
>[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Filipe
>Garrett
>Sent: Tuesday, November 08, 2005 8:18 AM
>To: Bioperl
>Subject: [Bioperl-l] GFF to GTF
>
>
>Hi all,
>
>I'm currently trying to use a software called Projector to predict
>genes. As input it needs two sequences and the annotation of one of them
>in GTF. As the info in GTF is included in the GFF I was thinking of
>getting a way to convert the GFF files into GTF. I thought of using the
>Bioperl module for GFF to parse the GFF and write the GTF fields to an
>output.
>
>Does anyone know if there's any script already made? Or of a simple
>way to do it?
>
>Thanks in advance
>
>FG
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l at portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
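The check suggested in the quoted message (look for a '##gff-version' directive, or see whether column 9 looks like 'gene_id "..."; transcript_id "...";') might be sketched in Python roughly as follows. The function name and the returned labels are made up for this illustration, the input is assumed to be tab-delimited, and the first data line is allowed to decide the answer, which is a simplification.

```python
import re

# Column 9 in the GTF style quoted above: gene_id "..."; transcript_id "...";
GTF_ATTRS_RE = re.compile(r'gene_id\s+"[^"]*";\s*transcript_id\s+"[^"]*";')


def guess_annotation_format(lines):
    """Return a best-guess label: 'gff-version N', 'gtf-like', or 'unknown'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("##gff-version"):
            # The directive carries the version number as its second token.
            parts = line.split()
            return "gff-version " + parts[1] if len(parts) > 1 else "unknown"
        if line.startswith("#"):
            continue  # other comments and directives
        columns = line.split("\t")
        if len(columns) == 9 and GTF_ATTRS_RE.search(columns[8]):
            return "gtf-like"
        return "unknown"  # simplification: the first data line decides
    return "unknown"


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python guess_format.py annotation.gff
    with open(sys.argv[1]) as handle:
        print(guess_annotation_format(handle))
```

A 'gtf-like' answer only says that column 9 has the expected shape; it does not validate the rest of the file.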
Hi all,
I'm using GFF v3 from FlyBase.
I attached an example of input and how the output should come out.
Thanks in adv.
FG
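Since the attached example is not reproduced here, the following Python sketch of a GFF3-to-GTF conversion is only a guess at the general shape of the task. It assumes the usual GFF3 hierarchy (gene, then mRNA with Parent pointing at the gene, then exon/CDS with Parent pointing at the mRNA), tab-delimited input, and a made-up function name and 'keep' list; nothing here is confirmed against FlyBase files or against what Projector accepts. Coordinates and column 8 are copied unchanged, which may not match the GTF conventions Projector expects.

```python
def parse_gff3_attributes(column9):
    """Parse 'key=value;key=value' into a dict (values are left URL-encoded)."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff3_to_gtf(lines, keep=("exon", "CDS", "start_codon", "stop_codon")):
    """Return GTF-style lines for the kept feature types (hypothetical helper)."""
    records = []
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence data follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed lines rather than guess
        records.append((cols[:8], parse_gff3_attributes(cols[8])))

    # First pass: map each transcript ID to its parent gene ID.
    gene_of = {}
    for cols, attrs in records:
        if cols[2] == "mRNA" and "ID" in attrs:
            gene_of[attrs["ID"]] = attrs.get("Parent", attrs["ID"])

    # Second pass: emit one GTF line per (feature, parent transcript) pair.
    out = []
    for cols, attrs in records:
        if cols[2] not in keep:
            continue
        for transcript_id in attrs.get("Parent", "").split(","):
            if transcript_id not in gene_of:
                continue  # parent is not a known mRNA
            column9 = 'gene_id "%s"; transcript_id "%s";' % (
                gene_of[transcript_id], transcript_id)
            out.append("\t".join(cols + [column9]))
    return out


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python gff3_to_gtf.py in.gff > out.gtf
    with open(sys.argv[1]) as handle:
        for gtf_line in gff3_to_gtf(handle):
            print(gtf_line)
```

The two-pass design keeps all records in memory so that a child feature may appear before its parent mRNA in the file; for very large files a streaming approach would need the input sorted so that parents come first.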