[Biojava-l] sequence dbs

Jason Stajich jason@chg.mc.duke.edu
Tue, 15 May 2001 17:44:40 -0400 (EDT)


On Wed, 16 May 2001, Schreiber, Mark wrote:

> Hi -
> 
> I very much favour the idea of having a remote SequenceDB rather than
> breaking the substantial amount of code that uses SequenceDB. I use
> SequenceDB all the time in my programs so I guess I am keen to not have to
> recode it all.
> 
Agreed.  Would really not want to do that.  I think we can make this work
with the current interface plus a RemoteSequenceDB that throws appropriate
exceptions; it just won't be quite as clean as the OO junkies may like...
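
A rough sketch of what that could look like.  Everything in it is
hypothetical: SimpleSequenceDB is a cut-down stand-in and not the real
seq.db.SequenceDB (sequences are plain Strings and the checked exceptions
are replaced by unchecked ones), and RemoteFetcher stands in for whatever
HTTP lookup ends up being used.

```java
import java.util.Set;

// Hypothetical, simplified stand-in for a SequenceDB-style interface.
// NOT the real BioJava interface: sequences are Strings, exceptions unchecked.
interface SimpleSequenceDB {
    String getSequence(String id);
    Set<String> ids();
    void addSequence(String id, String seq);
    void removeSequence(String id);
}

// Hypothetical hook for the remote lookup (e.g. an HTTP query by accession).
// Returns null when the remote side has no record for the id.
interface RemoteFetcher {
    String fetch(String id);
}

// A read-only remote database: lookup by id works, everything that needs
// the full contents or write access fails loudly.
class RemoteSequenceDBSketch implements SimpleSequenceDB {
    private final RemoteFetcher fetcher;

    RemoteSequenceDBSketch(RemoteFetcher fetcher) {
        this.fetcher = fetcher;
    }

    public String getSequence(String id) {
        String seq = fetcher.fetch(id);
        if (seq == null) {
            throw new IllegalArgumentException("No sequence for id: " + id);
        }
        return seq;
    }

    public Set<String> ids() {
        // A remote database cannot enumerate its contents.
        throw new UnsupportedOperationException("ids() not available remotely");
    }

    public void addSequence(String id, String seq) {
        // Stands in for vetoing the change.
        throw new UnsupportedOperationException("remote database is read-only");
    }

    public void removeSequence(String id) {
        throw new UnsupportedOperationException("remote database is read-only");
    }
}
```

Existing callers that only fetch by id would keep working unchanged; the
cost is that the unsupported calls only fail at run time.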

> As for parsing an unknown sequence type, a simple (and inefficient) way to
> do it would be to read the record once as a text (or XML) file to determine
> the correct alphabet, then parse it for real. Don't know if this can be done
> dynamically with the current biojava parsers. Maybe parsers based on a SAX
> event model would be the way to go??
> 
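
The first pass of that two-pass idea could be sketched roughly as below.
This is only an illustration: SeqType, AlphabetGuesser and the 90%
threshold are made up for the example, the heuristic is a guess based on
residue composition, and none of it is tied to the existing biojava
parsers.

```java
// Hypothetical labels for the kinds of sequence a record might hold.
enum SeqType { DNA, RNA, PROTEIN }

class AlphabetGuesser {
    // First pass: look at the raw residue text and guess the alphabet.
    // Assumption: if at least 90% of the letters are a, c, g, t, u or n,
    // treat the record as nucleotide; otherwise call it protein.
    static SeqType guess(String residues) {
        int letters = 0;
        int nucleotideLike = 0;
        boolean sawU = false;
        boolean sawT = false;
        for (char c : residues.toLowerCase().toCharArray()) {
            if (!Character.isLetter(c)) {
                continue; // skip whitespace, digits, gap symbols
            }
            letters++;
            if ("acgtun".indexOf(c) >= 0) {
                nucleotideLike++;
            }
            if (c == 'u') sawU = true;
            if (c == 't') sawT = true;
        }
        if (letters == 0) {
            throw new IllegalArgumentException("no residues to inspect");
        }
        if (nucleotideLike < 0.9 * letters) {
            return SeqType.PROTEIN;
        }
        // u without any t suggests RNA; otherwise default to DNA.
        return (sawU && !sawT) ? SeqType.RNA : SeqType.DNA;
    }
}
```

The second pass would then hand the record to whichever parser matches
the guessed type.  Short or low-complexity protein records could be
misclassified, so a real version would want a fallback.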

> Mark
> 
> Mark Schreiber
> Bioinformatics
> AgResearch Invermay
> PO Box 50034
> Mosgiel
> New Zealand
> 
> PH: +64 3 489 9175
> 
>  
> 
> > -----Original Message-----
> > From: Jason Stajich [mailto:jason@chg.mc.duke.edu]
> > Sent: Wednesday, May 16, 2001 9:19 AM
> > To: BioJava List
> > Subject: [Biojava-l] sequence dbs
> > 
> > 
> > I started to work on this at biojava bootcamp, didn't get very far
> > because of the following:
> > seq.db.SequenceDB currently has the following methods that one cannot
> > implement for 'remote' databases.
> > 
> > <   Set ids();
> > <   SequenceIterator sequenceIterator();
> > 
> > <   void addSequence(Sequence seq)
> > <   throws IllegalIDException, BioException, ChangeVetoException;
> > <   void removeSequence(String id)
> > <   throws IllegalIDException, BioException, ChangeVetoException;
> > 
> > I started to split these methods into separate interfaces:
> > LocalSequenceDB for ids() and sequenceIterator(), and
> > UpdateableSequenceDB for add/remove.  This of course breaks all classes
> > which depend on SequenceDB.  The other option is to create a
> > RemoteSequenceDB which throws ChangeVetoExceptions for add/remove calls
> > and some other exception for ids()/sequenceIterator().
> > 
> > BTW: An example of a RemoteDB is web EMBL queries, which we will patch
> > through HTTP to extract a sequence from this database (will be talking
> > to Heikki's web script).  Similarly, if the GenBank parsing works we can
> > pass queries to NCBI GenBank to query on an accession number.
> > 
> > One other major issue is: what if we do not know what type of sequence
> > we are obtaining (prot or [dr]na)?  Biojava likes to have these things
> > established in the parser - but I won't really be able to divine
> > anything from an accession number.  Ideas?
> > 
> > -jason
> > 
> > Jason Stajich
> > jason@chg.mc.duke.edu
> > Center for Human Genetics
> > Duke University Medical Center 
> > http://www.chg.duke.edu/ 
> > 
> 

Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center 
http://www.chg.duke.edu/