[Bioperl-l] genbank alternate-splicing representation

Chris Mungall cjm@fruitfly.bdgp.berkeley.edu
Tue, 9 Oct 2001 14:28:16 -0700 (PDT)

On Tue, 9 Oct 2001, Mark Wilkinson wrote:

> Hi all,
> I have started working on modifying the Genbank parser to create
> GeneStructureI-compliant features...
> <ASIDE TYPE = "rant" FRUSTRATION_LEVEL = "high">Good Lord I wish I had never
> stuck my hand up to do that one!  I had forgotten just how much I *hated* the
> Genbank format for Seq/Feature representation... not to mention the fact that
> it is user-curated... but one day ask me how I *really* feel about
> it!</ASIDE>
> Anyway, it is unsurprisingly troublesome to parse correctly given that it is
> possible to represent transcripts in several different ways in the Genbank
> format, and moreover it appears to be possible to describe an exon, prior to
> describing the mRNA prior to describing the gene which contains it... but
> what worries me even more at the moment is the way that differential splicing
> events are represented given the situation described above.  (eg. are exons
> common to multiple transcripts represented multiple times?  I am guessing
> so...)
> Can anyone point me to a Genbank accession which displays alternate splicing
> so that I can see how it looks and write/test my parser on it?

AE003822 - the first gene in this piece of genomic sequence is calmodulin,
which has lots of alternate splicing.
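
On the question of whether shared exons show up more than once: when each
mRNA feature carries its own join(...) location, an exon used by several
transcripts is simply repeated in each join. Here is a rough sketch (Python
for brevity, not BioPerl) of collapsing those repeats. The location strings
and transcript names are invented for illustration and are NOT taken from
AE003822; parse_location and exon_usage are hypothetical helpers, and only
plain spans, join(...) and an outer complement(...) are handled.

```python
import re
from collections import defaultdict

# Hypothetical mRNA feature locations, invented for illustration only.
# They are NOT copied from AE003822 or any other real record.
MRNA_LOCATIONS = {
    "transcript_A": "join(100..250,400..520,900..1100)",
    "transcript_B": "join(100..250,900..1100)",
    "transcript_C": "complement(join(2000..2100,2300..2450))",
}

# Matches "start..end", tolerating the partial-end markers "<" and ">".
SPAN = re.compile(r"<?(\d+)\.\.>?(\d+)")


def parse_location(location):
    """Return (strand, [(start, end), ...]) for a simple location string.

    Assumption: strand is decided by an outer complement(...) only.
    Remote references, order(...), single-base positions and mixed-strand
    joins are out of scope for this sketch.
    """
    strand = -1 if location.startswith("complement(") else 1
    spans = [(int(start), int(end)) for start, end in SPAN.findall(location)]
    return strand, spans


def exon_usage(locations):
    """Map each distinct exon (start, end, strand) to the transcripts using it."""
    usage = defaultdict(list)
    for name, location in locations.items():
        strand, spans = parse_location(location)
        for start, end in spans:
            usage[(start, end, strand)].append(name)
    return dict(usage)


usage = exon_usage(MRNA_LOCATIONS)
for (start, end, strand), names in sorted(usage.items()):
    print(f"{start}..{end} strand {strand:+d}: {', '.join(names)}")
```

Keying on (start, end, strand) means an exon repeated in several joins is
stored once, with a list of the transcripts that use it. A real parser
would also have to tie each mRNA back to its gene, which this sketch does
not attempt.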

> When the time comes to break everyone's scripts I will, instead, commit the new
> parser under a different name until I am sure that it is functional.  It
> would be great if a few of you tested it on your GB entries to be sure that
> it is functioning correctly.
> ugh....  I'd rather be poking my eyes out with barbed wire...

In that case, please don't look at X98338 (dicistronic mRNA for two
genes)..... <evil cackle>

NCBI format may be overly lax, but at least it does allow for this sort of

> M
> --
> --------------------------------
> "Speed is subsittute fo accurancy."
> ________________________________
> Dr. Mark Wilkinson
> Bioinformatics Group
> National Research Council of Canada
> Plant Biotechnology Institute
> 110 Gymnasium Place
> Saskatoon, SK
> Canada