[Bioperl-l] final proposal: Bio::DB::WebSeqDBI
Ewan Birney
birney@ebi.ac.uk
Tue, 12 Dec 2000 10:21:16 +0000 (GMT)
On Mon, 11 Dec 2000, Jason Stajich wrote:
> The final proposal before I commit the code (all tests pass on my
> machine).
>
> 2 new modules
> Bio::DB::WebSeqDBI - ISA Bio::DB::RandomAccessI
> Bio::DB::NCBIHelper ISA Bio::DB::WebSeqDBI
>
> rewrites of Bio::DB::GenBank, Bio::DB::GenPept, Bio::DB::SwissProt.
>
> Bio::DB::WebSeqDBI -
>
> This interface encapsulates the standard data retrieval methods for a
> web sequence database. Implementing classes must implement the method
> get_request, which takes as arguments a hash of qualifiers (uids,
> format, etc.) with which to query the database and returns an
> HTTP::Request object. The WebSeqDBI class manages an LWP::UserAgent
> for obtaining data from the web dbs and turning that data stream into
> a Bio::SeqIO.
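
For illustration, the interface pattern described above could look roughly like the sketch below. The package names, the describe_query method, and the example.org URL are all hypothetical, and a plain hash stands in for the HTTP::Request object so the sketch runs without LWP installed. It is not the code being proposed for commit.

```perl
use strict;
use warnings;

# Hypothetical interface class; not the proposed Bio::DB module.
package My::WebDBSketchI;

sub new { my ($class) = @_; return bless {}, $class; }

# Interface method: implementing classes must override this.
sub get_request {
    my ($self) = @_;
    die ref($self) . " must implement get_request\n";
}

# Shared logic lives in the interface class and calls the override.
sub describe_query {
    my ($self, %qualifiers) = @_;
    my $req = $self->get_request(%qualifiers);
    return "$req->{method} $req->{url}";
}

# Hypothetical implementing class for one web database.
package My::ExampleDB;
our @ISA = ('My::WebDBSketchI');

sub get_request {
    my ($self, %q) = @_;
    my $uids   = join ',', @{ $q{-uids} || [] };
    my $format = $q{-format} || 'fasta';
    # A plain hash stands in for an HTTP::Request object here.
    return {
        method => 'GET',
        url    => "http://example.org/fetch?id=$uids&format=$format",
    };
}

package main;

my $db      = My::ExampleDB->new;
my $summary = $db->describe_query(-uids => ['X1', 'X2'], -format => 'fasta');
print "$summary\n";
```

The point of the split is that only get_request differs per database, while everything that consumes the request stays in one place.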
>
> Because of the way LWP works right now, it is not possible to take a data
> stream from a web server and transform it directly into a Bio::SeqIO.
> Instead, one must read all the data from the server and then either store
> it in a tempfile or put it into an IO::String, which can be treated as a
> filehandle. Another pain: the retrieval method from NCBI has some HTML
> 'contamination', which needs to be screened out through a method call to
> postprocess_data.
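
A rough sketch of the read-everything-then-wrap approach described above. The tag-stripping regex and the sample response body are assumptions made up for illustration, not the actual NCBI cleanup, and the in-memory filehandle (available in Perl 5.8 and later) is swapped in for IO::String so that no extra module is needed.

```perl
use strict;
use warnings;

# Hypothetical cleanup step: strip anything that looks like an HTML tag.
# Real server output would likely need more careful handling than this.
sub postprocess_data {
    my ($raw) = @_;
    (my $clean = $raw) =~ s/<[^>]+>//g;
    return $clean;
}

# Made-up response body with HTML wrapped around a FASTA record.
my $response_body = "<pre>>seq1 example\nACGT\n</pre>\n";
my $clean         = postprocess_data($response_body);

# Treat the cleaned string as a filehandle, as IO::String would allow.
open my $fh, '<', \$clean or die "cannot open in-memory handle: $!";
my @lines = <$fh>;
close $fh;

print $lines[0];
```

Anything that accepts a filehandle could then read from $fh as if it were a file on disk.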
>
> One issue I am not sure how best to deal with is the temporary file
> removal at the end of the life of the Bio::DB::WebSeqDBI object. The
> following code illustrates a case where this will remove files too soon.
>
> my $seqdb = new Bio::DB::GenBank(-retrievaltype=>'tempfile');
> my $seqio = $seqdb->get_Stream_by_id($accession);
> undef $seqdb; # this will remove the seqdb object and cleanup the
> # tempfile that was created
> my $seq = $seqio->next_seq(); # bomb because no file exists now.
>
> Anyone with better ideas on this feel free to let me know.
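
One possible direction, sketched under assumptions: let the stream object own the tempfile, so that cleanup follows the stream's lifetime and not the database object's. TempBackedStream and its methods are hypothetical stand-ins, not Bio::SeqIO or anything in the proposed modules.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical stand-in for a SeqIO-like stream that owns its tempfile.
package TempBackedStream;

sub new {
    my ($class, $data) = @_;
    my ($out, $path) = File::Temp::tempfile();
    print {$out} $data;
    close $out or die "close failed: $!";
    open my $in, '<', $path or die "cannot reopen $path: $!";
    return bless { fh => $in, path => $path }, $class;
}

sub next_line {
    my ($self) = @_;
    return scalar readline $self->{fh};
}

sub path { return $_[0]{path} }

# The file is removed when the stream goes away, not when the db does.
sub DESTROY {
    my ($self) = @_;
    close $self->{fh} if $self->{fh};
    unlink $self->{path} if defined $self->{path};
}

package main;

my $db     = {};    # stand-in for the database object
my $stream = TempBackedStream->new(">seq1\nACGT\n");
undef $db;          # the db is gone, but the stream still has its file
my $line = $stream->next_line;
print $line;
```

With this arrangement the undef of the database object in the example above would no longer pull the file out from under the stream. The file is unlinked once the last reference to the stream is dropped.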
>
> Bio::DB::NCBIHelper -
>
> Since Bio::DB::GenBank and Bio::DB::GenPept are so similar, I wrote a
> class that encapsulates all of the common functionality for retrieving
> sequence data from these databases.
>
> I'm sure it will all make much more sense once I check the code in. I just
> wanted to see if anyone has comments or wants clarification before I
> check in major reworks to the current modules.
>
> Is the name WebSeqDBI misleading (i.e. does it look like it would be a
> DBI module)? We like to use 'I' at the end of a module name to denote
> interfaces.
I know where you are coming from, but I do think we have to do something
different here in the naming. WebDBSeqI ?
>
> -Jason
> Jason Stajich
> jason@chg.mc.duke.edu
> Center for Human Genetics
> Duke University Medical Center
> http://www.chg.duke.edu/
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------