[Bioperl-l] Parsing CDS info from GFF file
Gowthaman Ramasamy
gowthaman.ramasamy at sbri.org
Fri Mar 2 21:08:27 UTC 2007
Hi List,
I am trying to find a way to get the coordinates of the CDS (start codon to stop codon) from a GFF file.
However, the GFF file only has the coordinates of the individual exons (CDS segments).
I am wondering whether there is any tool/module/script available for this.
It should handle multi-exon genes as well as both the + and - strands.
A set of example GFF file entries is below.
Many thanks in advance,
gowtham
SBRI, Seattle.
1400 TIGR gene 127456 128386 . + . ID=1400.t00213;Name=hypothetical protein
1400 TIGR mRNA 127456 128386 . + . ID=1400.m02493;Parent=1400.t00213
1400 TIGR five_prime_utr 127456 127993 . + . ID=utr5p_of_1400.m02493;Parent=1400.m02493
1400 TIGR exon 127456 128386 . + . ID=1400.e05831;Parent=1400.m02493
1400 TIGR CDS 127994 128314 . + 0 ID=cds_of_1400.m02493;Parent=1400.m02493
1400 TIGR three_prime_utr 128315 128386 . + . ID=utr3p_of_1400.m02493;Parent=1400.m02493
1400 TIGR gene 232655 233965 . - . ID=1400.t00271;Name=pleckstrin homology domain protein, putative
1400 TIGR mRNA 232655 233965 . - . ID=1400.m02876;Parent=1400.t00271
1400 TIGR five_prime_utr 233477 233965 . - . ID=utr5p_of_1400.m02876;Parent=1400.m02876
1400 TIGR exon 233339 233965 . - . ID=1400.e05827;Parent=1400.m02876
1400 TIGR CDS 233339 233476 . - 0 ID=cds_of_1400.m02876;Parent=1400.m02876
1400 TIGR exon 233011 233182 . - . ID=1400.e05826;Parent=1400.m02876
1400 TIGR CDS 233011 233182 . - 0 ID=cds_of_1400.m02876;Parent=1400.m02876
1400 TIGR exon 232655 232781 . - . ID=1400.e05825;Parent=1400.m02876
1400 TIGR CDS 232729 232781 . - 1 ID=cds_of_1400.m02876;Parent=1400.m02876
1400 TIGR three_prime_utr 232655 232728 . - . ID=utr3p_of_1400.m02876;Parent=1400.m02876
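
One possible approach, needing no special module, is to group the CDS rows by their Parent attribute and take the lowest start and highest end of each group, swapping the two on the - strand. The following is only a minimal sketch in plain Python (not BioPerl). The function name `cds_spans` and the dictionary keys are made up for illustration, and it assumes that the columns are whitespace-separated, that every CDS row carries a Parent attribute, and that the CDS rows already include the stop codon, none of which the file guarantees.

```python
from collections import defaultdict


def cds_spans(gff_lines):
    """Group CDS rows by Parent and return the overall coding span per mRNA.

    Hypothetical helper, not part of any library.
    """
    segments = defaultdict(list)  # parent id -> [(start, end), ...]
    info = {}                     # parent id -> (seqid, strand)
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # maxsplit=8 keeps attribute values that contain spaces in one field
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        seqid, _, _, start, end, _, strand, _, attrs = fields
        tags = dict(p.split("=", 1) for p in attrs.split(";") if "=" in p)
        parent = tags.get("Parent")
        if parent is None:
            continue
        segments[parent].append((int(start), int(end)))
        info[parent] = (seqid, strand)

    result = {}
    for parent, segs in segments.items():
        seqid, strand = info[parent]
        # list the segments in transcription order (5' to 3')
        segs.sort(reverse=(strand == "-"))
        low = min(s for s, _ in segs)
        high = max(e for _, e in segs)
        # on the - strand the coding region begins at the high coordinate
        begin, end = (high, low) if strand == "-" else (low, high)
        result[parent] = {
            "seqid": seqid,
            "strand": strand,
            "cds_begin": begin,  # first base of the coding region
            "cds_end": end,      # last base of the coding region
            "segments": segs,
        }
    return result


# A few rows taken from the entries above
example = [
    "1400 TIGR exon 127456 128386 . + . ID=1400.e05831;Parent=1400.m02493",
    "1400 TIGR CDS 127994 128314 . + 0 ID=cds_of_1400.m02493;Parent=1400.m02493",
    "1400 TIGR CDS 233339 233476 . - 0 ID=cds_of_1400.m02876;Parent=1400.m02876",
    "1400 TIGR CDS 233011 233182 . - 0 ID=cds_of_1400.m02876;Parent=1400.m02876",
    "1400 TIGR CDS 232729 232781 . - 1 ID=cds_of_1400.m02876;Parent=1400.m02876",
]
spans = cds_spans(example)
print(spans["1400.m02493"]["cds_begin"], spans["1400.m02493"]["cds_end"])
# prints: 127994 128314
print(spans["1400.m02876"]["cds_begin"], spans["1400.m02876"]["cds_end"])
# prints: 233476 232729
```

For the - strand gene, `cds_begin` is larger than `cds_end` because the values are reported in the direction of transcription. The `segments` list keeps the individual pieces in that same order, in case the spliced coding sequence is needed later.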