[Bioperl-l] locate introns in a protein sequence
Tao Zhu
tzhu at mail.bnu.edu.cn
Tue Mar 8 06:45:11 UTC 2011
Hello, everyone. I have a GTF file with annotation like this, for example:
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 start_codon 25009 25011 . + 0 gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 stop_codon 26003 26005 . + 0 gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 exon 24828 25172 . + . gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 CDS 25009 25172 . + 0 gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 exon 25245 25364 . + . gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 CDS 25245 25364 . + 1 gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 exon 25414 26178 . + . gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Supercontig_3.3 S_cryophilus_OY26_V3_CALLGENES_FINAL_2 CDS 25414 26002 . + 1 gene_id "SPOG_00008"; transcript_id "SPOG_00008T0";
Obviously this transcript "SPOG_00008T0" has two introns.
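To make the two introns explicit, here is a minimal Python sketch that derives intron intervals from the exon coordinates above. The function name introns_from_exons is only an illustration (it is not part of BioPerl or any other library), and it assumes all exons belong to one transcript and use 1-based inclusive coordinates, as GTF does.

```python
def introns_from_exons(exons):
    """Return intron intervals (start, end), 1-based inclusive, for one transcript.

    exons: list of (start, end) tuples taken from the GTF "exon" lines.
    """
    ordered = sorted(exons)
    introns = []
    # An intron is the gap between the end of one exon and the start of the next.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Exon coordinates of transcript SPOG_00008T0 from the GTF lines above.
exons = [(24828, 25172), (25245, 25364), (25414, 26178)]
print(introns_from_exons(exons))  # [(25173, 25244), (25365, 25413)]
```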
I also have the corresponding protein sequence file, like this (in FASTA
format):
>SPOG_00008T0 | SPOG_00008 | Schizosaccharomyces cryophilus OY26 exosome subunit Rrp45 (292 aa)
MSKSLEPSANNKGFIVNALKKELRLDGRSLTDFRDLKIEFGEDYGQVDISLGSTRVMARI
SAEITKPYSDRPFDGIFAITTELTPLASPAFETGRVSEQEVIISRLIEQAIRRSNALDTE
SLCIISGQKCWSVRASVHFINHDGNLVDAACIAVITGLCHFRRPEITVLGDEVTVHSIEE
RVPVPLSVLHTPICVTFSFFEDGSLSAIDASLEEEELRTGSMTVTLNKNREICQIFKAGG
VTIEASSVVACAHTAFQKTTSIISEIQRALDEDLSKKETQFFGGSAENQRS*
I would like to locate these two introns precisely within the protein
sequence (that is, find where they fall among the amino acids). Could you
recommend a relatively convenient method? Thank you!
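To illustrate the kind of mapping I mean, here is a rough Python sketch, not a definitive method. It assumes the CDS segments of one transcript are given as 1-based inclusive (start, end) pairs, that the first segment begins at the start codon, and that every intron of interest lies between two CDS segments. The function name intron_positions_in_protein is hypothetical. It adds up the coding bases before each intron: the quotient by 3 gives the number of complete codons before the intron, and the remainder gives how many bases of the next codon precede it.

```python
def intron_positions_in_protein(cds_segments, strand="+"):
    """Locate each intron relative to the amino acids of the translated CDS.

    cds_segments: list of (start, end), 1-based inclusive genomic coordinates
                  taken from the GTF "CDS" lines of a single transcript.
    Returns a list of (complete_codons_before, bases_into_next_codon).
    A remainder of 0 means the intron lies between two codons; 1 or 2 means
    it interrupts codon number complete_codons_before + 1.
    """
    # Order segments in the direction of translation.
    ordered = sorted(cds_segments, reverse=(strand == "-"))
    positions = []
    coding_bases = 0
    # Each segment except the last is followed by an intron.
    for start, end in ordered[:-1]:
        coding_bases += end - start + 1
        positions.append((coding_bases // 3, coding_bases % 3))
    return positions


# CDS coordinates of transcript SPOG_00008T0 from the GTF lines above.
cds = [(25009, 25172), (25245, 25364), (25414, 26002)]
for n, (codons, bases) in enumerate(intron_positions_in_protein(cds), 1):
    if bases == 0:
        print(f"intron {n}: between amino acids {codons} and {codons + 1}")
    else:
        print(f"intron {n}: inside codon {codons + 1}, after base {bases}")
```

For this transcript the sketch places the first intron inside codon 55 and the second inside codon 95, each after the second base of the codon, which agrees with the frame value of 1 on the second and third CDS lines.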
--
Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing
100875, China
Email: tzhu at mail.bnu.edu.cn
Website: http://bnuzt.org (mainly written in Chinese)