[Bioperl-l] remote ncbi blast

Jason Stajich jason@chg.mc.duke.edu
Wed, 4 Apr 2001 16:07:33 -0400 (EDT)

Remote BLAST (Bio::Tools::Blast) has been broken for about 9 months, ever
since NCBI changed its BLAST service.  I propose we still try to support
it, at least initially through NCBI's blastcl3 tool, which is available
for almost all platforms, including Macintosh.  It can be used just like
blastall, except that it connects to the remote BLAST server queue and
submits the job there.  Supporting this would require only minor changes
to StandAloneBlast so that it calls the blastcl3 application.  Would this
be an acceptable solution for people?
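
To make the "just like blastall" point concrete, here is a minimal sketch
of how a wrapper might assemble the command line.  It is written in Python
purely as an illustration, not as a proposed StandAloneBlast patch.  The
helper name build_blastcl3_command and the file names are made up; the
-p/-d/-i/-o options are the usual blastall ones, which blastcl3 accepts as
well.

```python
import shlex


def build_blastcl3_command(program, database, query_file, output_file,
                           executable="blastcl3"):
    """Return the argument list for a blastcl3 run.

    Hypothetical helper: the options mirror blastall
    (-p program, -d database, -i query file, -o report file).
    """
    return [executable,
            "-p", program,
            "-d", database,
            "-i", query_file,
            "-o", output_file]


# Example invocation with placeholder file names.
cmd = build_blastcl3_command("blastn", "nr", "query.fasta", "query.blast")

# Show the command as it would be typed in a shell; actually running it
# (e.g. with subprocess.run(cmd)) requires blastcl3 to be installed.
print(" ".join(shlex.quote(part) for part in cmd))
```

Swapping the executable argument between "blastall" and "blastcl3" is the
only difference in this sketch, which is why the change to StandAloneBlast
should be small.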

A second approach is to submit a job via HTTP, let it sit in the queueing
system, and check back periodically to see whether it has finished.  This
will require some reverse engineering of the NCBI BLAST request forms, and
I suspect it would not be the easiest thing in the world, so I would leave
it as a second-tier approach.  It is more attractive, though, because it
would let us implement a pure Perl solution.
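
The submit-then-poll flow could look roughly like the sketch below.  It is
in Python only to show the control flow; the real thing would be Perl.
Everything NCBI-specific is deliberately left out: submit, is_finished and
fetch_result are hypothetical callbacks standing in for the HTTP requests
we would still have to reverse engineer, and the interval and poll limit
are arbitrary.

```python
import time


def run_remote_job(submit, is_finished, fetch_result,
                   poll_interval=10.0, max_polls=60, sleep=time.sleep):
    """Submit a job, then poll until it is done or we give up.

    submit()            -> request id (hypothetical callback)
    is_finished(req_id) -> True once the server reports completion
    fetch_result(req_id)-> the finished report
    """
    request_id = submit()
    for _ in range(max_polls):
        if is_finished(request_id):
            return fetch_result(request_id)
        # Not ready yet: wait before asking the server again.
        sleep(poll_interval)
    raise TimeoutError(
        f"job {request_id} not finished after {max_polls} polls")
```

Keeping the three network steps as separate callbacks means the polling
logic can be tested without touching the network, and only the callbacks
need to change if the request forms change again.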

While we're on the topic of remote NCBI resources, and to toast Heikki's
Bio::DB::EMBL, I'd like to think about the relatively unreliable nature of
Bio::DB::GenBank much of the time.  When running the tests, at least, I
often get failures in the DB.t tests during the GenBank/GenPept queries.
I'm not sure whether this is caused by the queries arriving too close
together so that NCBI rejects them, by bad network connections, or by
something else.  I've been playing with NCBI's Network Entrez, which is
available as Mac/Windows/Unix executables and has been reliable every time
I've used it.  Is there any easy way to take advantage of their TCP/IP
connection approach?  I know there is probably some wicked ASN.1 going on
underneath, but it is reliable and fast...
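
If the failures really do come from queries arriving too close together,
one cheap experiment is to retry each failed query after a growing pause.
The sketch below (Python, for illustration only) assumes a hypothetical
fetch callback that raises OSError on a network-level failure; the attempt
count and delay are arbitrary, and none of this is existing
Bio::DB::GenBank behaviour.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    Waits delay, 2*delay, 3*delay, ... seconds between attempts so that
    repeated queries are spaced further and further apart.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as err:
            last_error = err
            # Pause only if another attempt will follow.
            if attempt < attempts - 1:
                sleep(delay * (attempt + 1))
    raise last_error
```

If the tests pass consistently with spacing like this, that would point to
query proximity rather than the network as the culprit.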

Jason Stajich
Center for Human Genetics
Duke University Medical Center