[Bioperl-l] Re: [Bioperl-announce-l] an extension to Bio::SeqIO

Jason Stajich jason at cgt.duhs.duke.edu
Wed Jun 18 13:21:56 EDT 2003


On Wed, 18 Jun 2003, Chris Mungall wrote:

>
>
> On Wed, 18 Jun 2003, Peili Zhang wrote:
>
> > >>
> > >> sequence data in any (rich) formats
> > >> 	 |
> > >> 	 | via Bio::SeqIO
> > >> 	 v
> > >     Bio::Seq->get_SeqFeatures()
> > >         OR
> > >      Collection of Bio::SeqFeatureI
> > >
> > >[Actually what's cooler I think is that you don't need Bio::Seq objects or
> > >anything, just a set of Bio::SeqFeatureI objects. This would mean that
> > >people could take their GFF files and turn them into chado IFF they are
> > >rich enough.]
> > >
> >
> > we do want the Bio::Seq objects. for instance, if the Bio::Seq object is a gene,
> > we'll want to create a top-level feature of type 'gene' for it in chado, as well
> > as loading in its references as feature_pubs, accessions as feature_dbxrefs,
> > comments/descriptions/others as featureprops. its gene model features
> > (transcripts, exons, CDS's) will hang off of the top-level feature.
>
> Hi Peili
>
> Can you give an example of where a Bio::Seq object is created for a gene?
> If these are coming from genbank, the Bio::Seq corresponds to either the
> srcfeature (if it is a genomic DNA record) or to a transcript (if it is an
> mRNA record)
>
> I agree with Jason that we will mostly be populating from Bio::SeqFeatureI
> objects; GFF3 is actually quite a nice match for chado.
>
> By the way, where does GFF fit into the *IO framework? Right now it's a
> Bio::Tools thing. Will there be a Bio::SeqFeatureIO?

I think if we are going to have more than one feature input format,
then a general FeatureIO class would be a good thing.

NCBI has their .ptt format for CDS features, which is pretty simple to
write a parser for as well.  So I would like to see a
Bio::(Feature/SeqFeature)IO (not sure what the best name would be
here?).  I could also imagine us writing lightweight parsers for
genbank/embl/bsml files, for people who just want to pull out
seqfeatures for the common case (at least for me) of converting
to GFF for loading into Gbrowse.
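
To give a rough idea of how little a lightweight feature-only parser
needs, here is a minimal sketch of the .ptt-to-GFF case. It is written in
Python purely to keep it self-contained, and it is not a proposal for the
actual Bio::* interface. Assumptions: the .ptt file is tab-delimited with
a column header line starting with "Location" (Location, Strand, Length,
PID, Gene, Synonym, Code, COG, Product) and locations written as
"start..end". The names parse_ptt and to_gff, the "ptt" source tag, and
the sample rows are all invented for illustration.

```python
def parse_ptt(text):
    """Yield one feature dict per data row of a .ptt-style table.

    Assumes a tab-delimited table whose column header line starts with
    "Location"; anything before that line is treated as free-text preamble.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("Location\t"):
            header = line.split("\t")
            break
    else:
        raise ValueError("no 'Location' column header found")

    for line in lines[i + 1:]:
        if not line.strip():
            continue  # tolerate blank lines
        row = dict(zip(header, line.split("\t")))
        start, end = row["Location"].split("..")
        yield {
            "start": int(start),
            "end": int(end),
            "strand": row["Strand"],
            "id": row["PID"],
            "name": row["Gene"],
            "product": row["Product"],
        }


def to_gff(feature, seqid, source="ptt"):
    """Format one feature dict as a nine-column GFF line of type CDS.

    The phase column is set to 0 on the assumption that each row is a
    complete CDS; the attribute keys (ID, Name) follow GFF3 conventions.
    """
    attrs = "ID={};Name={}".format(feature["id"], feature["name"])
    cols = [
        seqid, source, "CDS",
        str(feature["start"]), str(feature["end"]),
        ".", feature["strand"], "0", attrs,
    ]
    return "\t".join(cols)


# Invented sample data, not taken from any real NCBI record.
SAMPLE = "\n".join([
    "Demo genome, complete sequence - 1..1000",
    "2 proteins",
    "\t".join(["Location", "Strand", "Length", "PID", "Gene",
               "Synonym", "Code", "COG", "Product"]),
    "\t".join(["10..99", "+", "29", "1001", "abcA",
               "x0001", "-", "-", "hypothetical protein"]),
    "\t".join(["200..499", "-", "99", "1002", "abcB",
               "x0002", "-", "-", "made-up transporter"]),
])

if __name__ == "__main__":
    for feat in parse_ptt(SAMPLE):
        print(to_gff(feat, seqid="demo"))
```

The point is only that the parser hands back plain features with no
sequence object involved, which is the shape a general FeatureIO-style
class would presumably expose as well.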




>
> > -peili
> >
> >
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

