[Biojava-l] ??? extracting introns sequences for transcripts using java API for ensembl ???

dmitriy dms700 at gmail.com
Wed Aug 16 13:04:13 UTC 2006


Hi


I'm trying to use the Ensembl Java API to extract the five prime UTR, exons,
introns, and three prime UTR for transcripts corresponding to a particular
NM_xxxxx RefSeq accession.
Unfortunately, it looks like I am using the API incorrectly to get the intron sequence.
The following is the kind of code I use to get an intron:


----------------------------
// Offsets of the exon boundaries flanking the first intron, relative to the gene start.
int exon1EndOffsetRelativeToGeneStart =
    ((Exon) transcript.getExons().get(0)).getLocation().getEnd()
        - gene.getLocation().getStart();
int exon2StartOffsetRelativeToGeneStart =
    ((Exon) transcript.getExons().get(1)).getLocation().getStart()
        - gene.getLocation().getStart();

// Intron 1 is the part of the gene sequence strictly between exon 1 and exon 2.
String intron1 = gene.getSequence().getString().substring(
    exon1EndOffsetRelativeToGeneStart + 1,
    exon2StartOffsetRelativeToGeneStart);
------------------------------
This code works for "ENST00000275493" (EGFR, NM_005228.3), but
does not work for many, if not the vast majority, of genes.
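
To make the offset arithmetic concrete, here is a minimal sketch in plain Java
with no Ensembl classes. It is not the API's own method: IntronSketch, Interval
and introns are hypothetical names, coordinates are assumed to be 1-based and
inclusive, exons are assumed to arrive in transcript order, and the gene
sequence string is assumed to be in transcript orientation. It also encodes one
unconfirmed guess about the failures: for a gene on the reverse strand, exons in
transcript order have decreasing genomic coordinates, so the forward-strand
subtraction gives a start offset larger than the end offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IntronSketch {

    /** Hypothetical stand-in for an exon location: 1-based, inclusive genomic coordinates. */
    public static final class Interval {
        final int start;
        final int end;

        public Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns the sequences between consecutive exons.
     * Assumptions: geneSeq is in transcript orientation, exons are in
     * transcript order, and strand is +1 (forward) or -1 (reverse).
     */
    public static List<String> introns(String geneSeq, int geneStart, int geneEnd,
                                       int strand, List<Interval> exons) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < exons.size(); i++) {
            Interval upstream = exons.get(i);
            Interval downstream = exons.get(i + 1);
            int from; // index of the first intron base in geneSeq
            int to;   // index just past the last intron base (exclusive)
            if (strand >= 0) {
                // Forward strand: offsets grow with genomic position.
                from = upstream.end - geneStart + 1;
                to = downstream.start - geneStart;
            } else {
                // Reverse strand: offset 0 corresponds to the gene's genomic end.
                from = geneEnd - upstream.start + 1;
                to = geneEnd - downstream.end;
            }
            result.add(geneSeq.substring(from, to));
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy gene at 101..120: exons in upper case, introns in lower case.
        String seq = "AAAAccccGGGGttttAAAA";
        List<Interval> forwardExons = Arrays.asList(
            new Interval(101, 104), new Interval(109, 112), new Interval(117, 120));
        List<Interval> reverseExons = Arrays.asList(
            new Interval(117, 120), new Interval(109, 112), new Interval(101, 104));
        System.out.println(introns(seq, 101, 120, 1, forwardExons));
        System.out.println(introns(seq, 101, 120, -1, reverseExons));
    }
}
```

In this toy example both calls return the two lower-case stretches. Whether the
Ensembl gene sequence is actually returned in transcript orientation, and
whether strand is the real cause of the failures, would need to be checked
against the API itself.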


I would greatly appreciate any information on the correct way to get intron
sequences for a transcript.


Thank you
Dmitriy


