[Bioperl-l] Re: *major* error in genbank parser or am i just insane?
Lin, Xiaoying J.
Xiaoying.Lin@celera.com
Fri, 9 Aug 2002 11:38:08 -0700
Lincoln,
I agree that the code should not play a guessing game with human
mistakes such as out-of-sync mRNA and CDS joins.
But for records with CDS features and no exon features, I am not sure I
understand you correctly. There are lots of submissions in GenBank that
come only with CDS (join) features and no separate exon features. If that
is a mistake, then it is a systematic one. How does the current parser
handle a record like this one?
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=1458097&dopt=GenBank
I have not finished the older e-mails on this subject, so I may have
missed something here. I thought everyone was busy having fun in
Edmonton; when did you guys find time to flood everyone's inbox ;-)?
BTW, I enjoyed your talk and the others' talks at BOSC.
Thanks.
Xiaoying
> -----Original Message-----
> From: Lincoln Stein [mailto:lstein@cshl.org]
> Sent: Friday, August 09, 2002 1:28 PM
> To: brian.king@animorphics.net; Brian King; Ewan Birney
> Cc: bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Re: *major* error in genbank parser or am i
> just insane?
>
>
> Here's my 2c:
>
> If the genbank entry has CDS features but no exons, or an mRNA join
> operator which is out of sync with the CDS join, then in my opinion the
> quality of the annotation is so questionable that BioSQL should throw up
> its hands and seek human assistance in interpretation. Asking the import
> software to read the minds of the submitters is beyond what can be
> reasonably expected, and only ends up propagating errors.
>
> Lincoln
>
> On Friday 09 August 2002 04:49 am, Brian King wrote:
> > > This is very hard to do because you have to handle:
> > >
> > >
> > > (a) CDS with no Exons
> > >
> > > and, my particular favourite
> > >
> > > (b) a mRNA join operator which is out of sync
> > > with the CDS join operator (!)
> >
> > For (a) I'd put generic sub-features in the CDS to
> > hold the places of the presumed exons, and for (b) use
> > generic sub-features for the CDS and the mRNA joins
> > and just let them be out of sync. I surrender on
> > remote joins! I'd keep the location string in
> > documentation in the data, but not try to interpret
> > it. Ideally the parser would download the remote
> > record, but...
> >
> > Regards,
> > Brian
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
>
> --
> ========================================================================
> Lincoln D. Stein                    Cold Spring Harbor Laboratory
> lstein@cshl.org                     Cold Spring Harbor, NY
> ========================================================================
>
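
To make the placeholder idea in Brian's quoted message concrete, here is a
minimal sketch of it, written in Python rather than against the BioPerl or
BioSQL code. Everything in it is an assumption for illustration: the
Feature and SubFeature types and the build_feature function are
hypothetical names, and only simple start..end spans inside join() and
complement() are handled. Each span of a join becomes a generic
sub-feature, CDS and mRNA features are built independently (so they are
free to be out of sync), and a remote join keeps its raw location string
without being interpreted.

```python
import re
from dataclasses import dataclass, field
from typing import List


# Hypothetical types for illustration; not BioPerl or BioSQL classes.
@dataclass
class SubFeature:
    kind: str   # "generic" marks a placeholder, not an annotated exon
    start: int
    end: int


@dataclass
class Feature:
    key: str                  # e.g. "CDS" or "mRNA"
    location: str             # raw location string, always kept as-is
    strand: int = 1
    parts: List[SubFeature] = field(default_factory=list)
    interpreted: bool = True  # False when the location was left alone


# A simple span such as 10..50, tolerating the partial markers < and >.
_SPAN = re.compile(r"^<?(\d+)\.\.>?(\d+)$")


def build_feature(key: str, location: str) -> Feature:
    """Turn each span of a join into a generic placeholder sub-feature."""
    feat = Feature(key=key, location=location)
    inner = location.strip()

    # Unwrap an outer complement(...), then an outer join(...).
    m = re.fullmatch(r"complement\((.*)\)", inner)
    if m:
        feat.strand = -1
        inner = m.group(1)
    m = re.fullmatch(r"join\((.*)\)", inner)
    if m:
        inner = m.group(1)

    # Remote spans look like ACCESSION:start..end. Keep the raw string
    # and do not try to interpret it.
    if ":" in inner:
        feat.interpreted = False
        return feat

    for piece in inner.split(","):
        span = _SPAN.match(piece.strip())
        if span is None:
            # Anything this sketch does not understand is left alone.
            feat.interpreted = False
            feat.parts.clear()
            return feat
        feat.parts.append(
            SubFeature("generic", int(span.group(1)), int(span.group(2)))
        )
    return feat


cds = build_feature("CDS", "join(10..50,80..120)")
mrna = build_feature("mRNA", "join(1..50,80..150)")
remote = build_feature("CDS", "join(AB000001.1:1..100,200..300)")
```

Calling the sub-features "generic" instead of "exon" keeps the stored data
honest: the record never said they were exons, so nothing downstream can
mistake a placeholder for submitted annotation. The accession in the remote
example is made up.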