[Biojava-l] Genbank parsing problem
Cox, Greg
gcox@netgenics.com
Wed, 1 May 2002 10:34:06 -0400
For our purposes, it's important to be able to reconstruct the Genbank
record from a BioJava sequence. I wish Genbank didn't allow this
construction, but since it does, we have to deal with it. Even though this
isn't a BioJava-type feature, I'd prefer to see BioJava's definition changed
to fit Genbank/EMBL rather than vice versa.
Looking at the docs, I'd rather have this mapped to a fuzzy point location,
and relax the restriction on where features can be constructed.
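
To make the fuzzy-point idea concrete, here is a minimal stand-alone sketch.
It assumes that `<1` denotes a single point whose upper bound is 1 and whose
lower bound is unknown. The class FuzzyPoint and all of its methods are
hypothetical names for illustration only; this is not BioJava's API.

```java
// Hypothetical sketch: NOT part of BioJava. Models a Genbank point location
// such as "<1" and shows that it can be written back out unchanged.
final class FuzzyPoint {
    // Sentinel meaning "lower bound unknown" (an assumption of this sketch).
    static final int UNBOUNDED = Integer.MIN_VALUE;

    final int min; // lowest possible coordinate, or UNBOUNDED
    final int max; // highest possible coordinate

    FuzzyPoint(int min, int max) {
        this.min = min;
        this.max = max;
    }

    // Parses either a plain point ("42") or a left-unbounded point ("<1").
    static FuzzyPoint parse(String text) {
        String t = text.trim();
        if (t.startsWith("<")) {
            return new FuzzyPoint(UNBOUNDED, Integer.parseInt(t.substring(1)));
        }
        int p = Integer.parseInt(t);
        return new FuzzyPoint(p, p);
    }

    boolean hasBoundedMin() {
        return min != UNBOUNDED;
    }

    // Strict containment test of the kind a feature constructor might apply.
    // A point with an unknown lower bound can never pass it.
    boolean isContainedIn(int seqLength) {
        return hasBoundedMin() && min >= 1 && max <= seqLength;
    }

    // Reproduces the original Genbank text for the location.
    String toGenbank() {
        return hasBoundedMin() ? Integer.toString(max) : "<" + max;
    }

    public static void main(String[] args) {
        FuzzyPoint signal = FuzzyPoint.parse("<1");
        System.out.println(signal.toGenbank());         // prints "<1"
        System.out.println(signal.isContainedIn(1000)); // prints "false"
    }
}
```

The point of the sketch is that the location round-trips to its Genbank text
even though it fails a strict containment check, which is why the check
would need to be relaxed for features like this one.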
Greg Cox
> -----Original Message-----
> From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
> Sent: Tuesday, April 30, 2002 5:33 PM
> To: Thomas Down; Simon Foote
> Cc: biojava-l@biojava.org
> Subject: RE: [Biojava-l] Genbank parsing problem
>
>
> To my mind a wholly remote feature is not really a Feature in the
> biojava sense and might be best handled as an Annotation. Perhaps a
> special kind of value (with a nice toString() method) could be
> constructed for it.
>
> - Mark
>
>
> > -----Original Message-----
> > From: Thomas Down [mailto:td2@sanger.ac.uk]
> > Sent: Wednesday, 1 May 2002 3:26 a.m.
> > To: Simon Foote
> > Cc: biojava-l@biojava.org
> > Subject: Re: [Biojava-l] Genbank parsing problem
> >
> >
> > On Tue, Apr 30, 2002 at 09:12:59AM -0400, Simon Foote wrote:
> > > I've recently run across a problem with parsing of Genbank files
> > > containing unbounded locations.
> > > Does anyone have any idea what's causing it? I tried to trace it back
> > > through but got lost. But I think it has to do with the single <1 for
> > > the -35_signal as shown in the example.
> > >
> > > -35_signal <1
> > > /gene="entD"
> >
> > The default Feature implementations in the BioJava
> > development tree explicitly forbid construction of Features
> > with locations which aren't contained by the sequence to
> > which they're attached. As a quick fix, you can just remove
> > the check from the constructor of
> > org.biojava.bio.seq.impl.SimpleFeature (lines 281--283 in my copy).
> >
> > I'm not sure what the proper solution for this problem is.
> > Normally, features which extend beyond the sequence can be
> > transformed into RemoteFeatures. However, this particular
> > feature is nasty in that it doesn't even partially overlap
> > the sequence. To my mind, it's actually pretty much
> > meaningless, and the best thing to do would be to drop it.
> > But some people like to be able to represent the whole of Genbank.
> >
> > Does anyone know how many more `wholly remote' features there
> > are in the databases? And any great ideas about how they
> > could be usefully represented?
> >
> > Thomas.
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> >