[Bioperl-l] parsing clone/contig coordinates from genbank

Chris Mungall cjm@fruitfly.bdgp.berkeley.edu
Mon, 5 Nov 2001 08:31:48 -0800 (PST)


On Sat, 3 Nov 2001, Hilmar Lapp wrote:

> Ewan Birney wrote:
> > 
> > On Thu, 1 Nov 2001, Chris Mungall wrote:
> > 
> > >
> > > I'm trying to parse this:
> > > ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/C_elegans/CHR_I/worm_I.gbs
> > >
> > > But I get this:
> > > Can't call method "_generic_seqfeature" on an undefined value at
> > > /users/cjm/cvs/bioperl-live/Bio/SeqIO/genbank.pm line 277
> > >
> > > is it pushing the genbank parser too far to get contig coordinates?
> > 
> > Which version? line 227 doesn't call _generic_seqfeature for me.
> > 
> 
> genbank.pm seqIO format cannot handle CONTIGs. We had this
> discussion some time ago, but no-one actually wrote the code yet
> to recursively fetch the pieces.
> 
> CONTIG is not recognized presently; what do you suggest to do with
> this section in absence of the recursively fetching code? Ignore
> and return a seq object with empty seq? Or die with "not
> implemented yet"?

What would be ideal for me would be a SeqFeature for every contig with
each one having a single location.

There would be no recursive fetching, and the Seq would have an empty
Seq->seq string. (I'd load the individual clones into my own relational
database, so I wouldn't need any recursive fetching; I could build the
contig sequence on the fly from the clone sequences.)
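
As a rough illustration of the one-feature-per-piece idea above, here is a
minimal sketch in Python (not BioPerl code, and not how genbank.pm works). It
assumes the CONTIG section has the usual join(ACC.VER:start..end,gap(N),
complement(...)) shape; the ContigPiece class, the parse_contig function, and
the accessions in the usage note are all hypothetical names made up for the
example. No sequence is fetched; each piece only records where it sits on the
clone and on the assembled contig.

```python
import re
from dataclasses import dataclass


@dataclass
class ContigPiece:
    """One piece of a CONTIG join(): a single location, no sequence."""
    accession: str     # accession.version of the clone (hypothetical in the demo)
    clone_start: int   # 1-based start on the clone
    clone_end: int     # 1-based inclusive end on the clone
    strand: int        # +1, or -1 when wrapped in complement(...)
    contig_start: int  # 1-based start on the assembled contig
    contig_end: int    # 1-based inclusive end on the assembled contig


def parse_contig(text):
    """Turn the text of a CONTIG section into a list of ContigPiece objects."""
    # Remove all whitespace so continuation lines are joined, then drop
    # the CONTIG keyword and the outer join( ... ) wrapper if present.
    body = "".join(text.split())
    body = re.sub(r"^CONTIG", "", body)
    wrapped = re.fullmatch(r"join\((.*)\)", body)
    if wrapped:
        body = wrapped.group(1)

    pieces = []
    offset = 0  # number of contig bases consumed so far
    for token in body.split(","):
        gap = re.fullmatch(r"gap\((\d+)\)", token)
        if gap:
            # A sized gap only advances the contig coordinate.
            offset += int(gap.group(1))
            continue

        strand = 1
        inner = token
        comp = re.fullmatch(r"complement\((.+)\)", token)
        if comp:
            strand = -1
            inner = comp.group(1)

        loc = re.fullmatch(r"([\w.]+):(\d+)\.\.(\d+)", inner)
        if not loc:
            # Anything else (unsized gaps, fuzzy locations) is out of scope here.
            raise ValueError("unsupported CONTIG element: " + token)

        start, end = int(loc.group(2)), int(loc.group(3))
        length = end - start + 1
        pieces.append(ContigPiece(loc.group(1), start, end, strand,
                                  offset + 1, offset + length))
        offset += length
    return pieces
```

For example, parse_contig("CONTIG join(AE000001.1:1..1000,gap(100),
complement(AE000002.1:201..700))") yields two pieces, the second on the
reverse strand. The Seq itself would stay empty, and the sequence for any
region could be assembled later from the stored clone sequences using these
coordinates.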

Of course this behaviour wouldn't be right for everyone. If I were to
implement it this way, would it be acceptable to commit it, or would
others want different behaviour?
 