[Bioperl-l] genbank alternate-splicing representation

Mark Wilkinson mwilkinson@gene.pbi.nrc.ca
Tue, 09 Oct 2001 13:44:23 -0600

Hi all,

I have started working on modifying the Genbank parser to create
GeneStructureI-compliant features...
<ASIDE TYPE = "rant" FRUSTRATION_LEVEL = "high">Good Lord I wish I had never
stuck my hand up to do that one!  I had forgotten just how much I *hated* the
Genbank format for Seq/Feature representation... not to mention the fact that
it is user-curated... but one day ask me how I *really* feel about

Anyway, it is unsurprisingly troublesome to parse correctly given that it is
possible to represent transcripts in several different ways in the Genbank
format. Moreover, it appears to be possible to describe an exon before
describing the mRNA, and the mRNA before describing the gene which contains
it. What worries me even more at the moment is the way that differential
splicing events are represented given the situation described above.  (e.g. are exons
common to multiple transcripts represented multiple times?  I am guessing

Can anyone point me to a Genbank accession which displays alternate splicing
so that I can see how it looks and write/test my parser on it?
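
To make the shared-exon question concrete, here is a minimal sketch in Python. It assumes each alternative transcript appears as its own mRNA feature with a simple join() location, which is only one of the several representations mentioned above. The transcript names and coordinates are hypothetical, made up for illustration and not taken from any real accession, and the helper names (exon_spans, exon_usage) are likewise invented here. It handles only plain start..end ranges, not complement() or fuzzy boundaries.

```python
import re
from collections import defaultdict

# Hypothetical mRNA feature locations in GenBank feature-table syntax.
# These values are invented for illustration; they are NOT from a real entry.
MRNA_LOCATIONS = {
    "transcript_a": "join(1..100,201..300,401..500)",
    "transcript_b": "join(1..100,401..500)",
}

# Matches a plain "start..end" range; complement() and fuzzy ends are ignored.
SPAN = re.compile(r"(\d+)\.\.(\d+)")


def exon_spans(location):
    """Return the (start, end) pairs found in a simple join() location."""
    return [(int(start), int(end)) for start, end in SPAN.findall(location)]


def exon_usage(locations):
    """Map each exon span to the sorted list of transcripts that include it."""
    usage = defaultdict(list)
    for name, location in locations.items():
        for span in exon_spans(location):
            usage[span].append(name)
    return {span: sorted(names) for span, names in usage.items()}


usage = exon_usage(MRNA_LOCATIONS)

# Exons that occur in more than one transcript, i.e. the coordinates that
# would be written out more than once under this representation.
shared = {span: names for span, names in usage.items() if len(names) > 1}

for span, names in sorted(shared.items()):
    print(span, "used by", ", ".join(names))
```

Under this assumption, an exon common to two transcripts shows up once in each join(), so a parser would have to decide whether to merge identical spans into one exon object or keep one per transcript.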

Rather than break everyone's scripts when the time comes, I will commit the new
parser under a different name until I am sure that it is functional.  It
would be great if a few of you tested it on your GB entries to be sure that
it is functioning correctly.

ugh....  I'd rather be poking my eyes out with barbed wire...


"Speed is subsittute fo accurancy."

Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK