[Biojava-l] Re: Proposed addition to the SequenceDB interface

Marc Colosimo MEColosimo@alumni.carnegiemellon.edu
Tue, 19 Mar 2002 17:21:11 -0500


Thomas Down wrote:

> [snip]
> > Would this return the feature as some sort of generic gene.id feature? My
> > growing concern is that for each file/db/SQL format we are adding features with
> > their original names rather than some defined BioJava enforced named feature. I
> > noticed a dtd for features. Unfortunately, I don't know much about XML besides
> > the simple things. Could we make something like gene_id, accession_no, etc...
> > ? By using these set names, you don't have to know what a gene_id tag is for
> > EMBL, genbank, SQL,.......
> >
> > Or have I missed this ability in BioJava somehow?
>
> No, your concern is quite justified.  It is, indeed, necessary to
> have some specialized knowledge about a particular data source before
> you can really make use of the tag-value data present in the
> Annotation bundles.
>
> I think a set of `common' key names would be a big help, and I'd welcome
> any proposals for what should be in here (the standard set of feature
> types and qualifiers from EMBL might be a good starting point, but
> probably not a complete solution).  I'd also like to be able to introspect,
> for a given database, what properties I should expect to find on features.
> The AnnotationType objects, written recently by Matthew, ought to be one
> part of the puzzle.
>
> Even before this problem is solved, the filter-all-features-in-a-database
> operator still seems to me to be useful -- and I can't see any way in
> which it should make improved standardization and `introspectability' harder
> in the future.  Or am I missing something?
>
>     Thomas.

Well, after looking around, I did find something that would make me happy, and
possibly others too. A common interface has already been worked out for the
EMBL/GenBank/DDBJ flat file formats, for use with CORBA servers
<http://corba.ebi.ac.uk/EMBL_embl.html>.

I didn't find anything in the biocorba or Biocorba folders, and I don't know what
BioFetch is for, so my guess is that this hasn't been touched here (in BioJava). I
think it would be a good option. If I ever have time away from the bench or my class
homework, I could work on it.
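
To make the common-names idea concrete, here is a rough sketch of what a key
translation layer might look like. It is plain Java and does not use any BioJava
classes. The class name CommonKeys and the common name entry_name are made up for
illustration; accession_no is the name suggested above. The only format facts it
relies on are that EMBL entries carry ID and AC lines and GenBank entries carry
LOCUS and ACCESSION lines. Treat it as a starting point for discussion, not a
proposal for the actual API.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: maps format-specific tag names to shared key names. */
public class CommonKeys {
    // Lookup table keyed by "<format>:<native tag>".
    private static final Map<String, String> TABLE = new HashMap<>();

    static {
        // Illustrative entries only; a real table would need to be agreed on.
        register("EMBL", "AC", "accession_no");
        register("GenBank", "ACCESSION", "accession_no");
        register("EMBL", "ID", "entry_name");      // entry_name is an assumed name
        register("GenBank", "LOCUS", "entry_name");
    }

    private static String key(String format, String nativeKey) {
        // Format names are matched case-insensitively, tags exactly.
        return format.toLowerCase(Locale.ROOT) + ":" + nativeKey;
    }

    public static void register(String format, String nativeKey, String commonKey) {
        TABLE.put(key(format, nativeKey), commonKey);
    }

    /** Returns the common name for a tag, or the tag itself if none is registered. */
    public static String normalize(String format, String nativeKey) {
        String common = TABLE.get(key(format, nativeKey));
        return common != null ? common : nativeKey;
    }

    /** Returns a copy of the tag-value pairs with every known tag renamed. */
    public static Map<String, Object> normalizeAll(String format, Map<String, ?> props) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, ?> e : props.entrySet()) {
            out.put(normalize(format, e.getKey()), e.getValue());
        }
        return out;
    }
}
```

With something like this, client code could ask for accession_no without knowing
whether the entry came from EMBL or GenBank. Unknown tags pass through unchanged,
so nothing is lost for sources that have not been mapped yet.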

Marc