[Bioperl-l] Genbank parsing using Bioperl

Fri Apr 21 13:21:20 UTC 2006

On 4/21/06 6:45 AM, "Prabu R" <prabubio at gmail.com> wrote:

> Dear all!
> 
> I am a novice BioPerl user, trying to parse GenBank files with BioPerl
> modules to get some specific features and details.
> 
> Could anyone please tell me whether we can retrieve a gene, its
> transcript ID and its protein ID from a GenBank file?
> 
> I mainly need to extract the one-to-one relationship between transcript ID
> and protein ID.
> 
> I have been trying this, and I am able to get these details if the gene is
> not alternatively spliced.
> 
> If a gene contains multiple mRNA/CDS features, I am not able to build the
> relationship between each transcript and its protein.
> 
> Kindly help me to find out whether this is possible in Bioperl.

See here:

http://www.bioperl.org/wiki/HOWTO:Feature-Annotation

However, GenBank is only a repository, so I don't think every transcript is
necessarily going to have a protein annotation.  You might want to look into
using something like the RefSeq set (from NCBI) or Ensembl, both of which
have very rich annotation associated with their transcripts/proteins.
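
As a rough sketch of one way to attempt the pairing (not taken from the
HOWTO above), the script below reads a GenBank file with Bio::SeqIO, groups
mRNA and CDS features by their /gene qualifier, and pairs the Nth mRNA with
the Nth CDS of each gene.  It rests on several assumptions: that mRNA
features carry /transcript_id and CDS features carry /protein_id (as in
RefSeq genomic records), and that the features of a gene are listed in
matching order.  That ordering is only a heuristic, so genes whose mRNA and
CDS counts differ are skipped with a warning.  The helper names
pair_by_order and pairs_from_genbank are made up for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Feature type => qualifier assumed to hold its ID (true for RefSeq
# genomic records; other GenBank entries may lack these qualifiers).
my %id_tag = (mRNA => 'transcript_id', CDS => 'protein_id');

# Hypothetical helper: pair transcript and protein IDs gene by gene.
# Takes hashrefs { type => 'mRNA'|'CDS', gene => ..., id => ... } in
# record order; returns a list of [ gene, transcript_id, protein_id ].
sub pair_by_order {
    my @items = @_;
    my (%by_gene, @gene_order);
    for my $it (@items) {
        my $g = $it->{gene};
        push @gene_order, $g unless exists $by_gene{$g};
        push @{ $by_gene{$g}{ $it->{type} } }, $it->{id};
    }
    my @pairs;
    for my $g (@gene_order) {
        my @t = @{ $by_gene{$g}{mRNA} || [] };
        my @p = @{ $by_gene{$g}{CDS}  || [] };
        if (@t != @p) {
            # Heuristic cannot decide; leave this gene out.
            warn "gene $g: ", scalar(@t), " mRNA vs ", scalar(@p),
                 " CDS, skipping\n";
            next;
        }
        push @pairs, map { [ $g, $t[$_], $p[$_] ] } 0 .. $#t;
    }
    return @pairs;
}

# Hypothetical helper: collect mRNA/CDS features from each record in a
# GenBank file and pair them one record at a time.
sub pairs_from_genbank {
    my ($file) = @_;
    require Bio::SeqIO;    # loaded lazily; only needed when parsing
    my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    my @pairs;
    while (my $seq = $in->next_seq) {
        my @items;
        for my $feat ($seq->get_SeqFeatures) {
            my $tag = $id_tag{ $feat->primary_tag } or next;
            next unless $feat->has_tag('gene') && $feat->has_tag($tag);
            my ($gene) = $feat->get_tag_values('gene');
            my ($id)   = $feat->get_tag_values($tag);
            push @items,
                { type => $feat->primary_tag, gene => $gene, id => $id };
        }
        push @pairs, pair_by_order(@items);
    }
    return @pairs;
}

# Usage: perl pair_ids.pl record.gb
if (@ARGV) {
    print join("\t", @$_), "\n" for pairs_from_genbank($ARGV[0]);
}
```

Run against a file, this prints one tab-separated line per pair (gene,
transcript ID, protein ID).  A stricter approach would check that each CDS
location actually fits within the exons of the mRNA it is paired with,
rather than trusting feature order.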

Sean