[Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast)
Will Spooner
whs at sanger.ac.uk
Tue Nov 4 07:34:01 EST 2003
Hi Richard,
I recently implemented a BioPerl-based generic sequence search API
for Ensembl (http://www.ensembl.org/Multi/blastview). This seems very
similar to what you propose below. The approach I used, however, was to
abstract the differences between search methods (wu-blast, ncbi-blast,
fasta, etc.) into separate Perl modules. This is similar to the way that
SeqIO and SearchIO handle different formats. For example:
# This lazy-loads Bio/Tools/Run/Search/wu_blastn.pm
my $wu_search = Bio::Tools::Run::Search->new( -method => 'wu_blastn' );
# This lazy-loads Bio/Tools/Run/Search/fasta.pm
my $fasta_search = Bio::Tools::Run::Search->new( -method => 'fasta' );
Bio::Tools::Run::Search has the following methods:
'seq' - adds the query sequence
'database' - configures the database location
'command' - generates the command to run
'option' - configures command options
'dispatch' - launches the command
'environment_variables' - configures environment variables
'run' - combines 'command' + 'dispatch'
'status' - reports job status (PENDING, RUNNING, COMPLETED etc)
'report' - returns the raw search report
'next_result' - returns a Bio::Search::Result object
(N.B. Bio::Tools::Run::Search ISA Bio::Tools::Run::WrapperBase)
This approach is pretty nice because you can easily subclass the '-method'
modules to change the search behaviour. For example,
-method=>'wu_blastn_bsub' is the same as -method=>'wu_blastn', except
that the 'dispatch' method has been overridden to use the bsub job
submission system. In addition, new '-methods' can be added without
editing existing code.
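To make the idea concrete, here is a minimal, self-contained sketch of the
pattern. It is not the Ensembl code: the My::Search namespace, the method
bodies, the command strings and the database path are all invented for
illustration, 'seq' is simplified to take a query file name, and 'dispatch'
only returns the command it would launch rather than running anything.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the factory described above; My::Search and all
# names below are illustrative, not the real Bio::Tools::Run::Search code.
package My::Search;

sub new {
    my ($class, %args) = @_;
    my $method = $args{-method} // die "-method is required\n";
    $method =~ /\A\w+\z/ or die "invalid -method '$method'\n";
    my $impl = "My::Search::$method";

    # Lazy-load My/Search/<method>.pm unless the package is already compiled
    # (here the implementations live in this file, so nothing is loaded).
    unless ($impl->can('command')) {
        (my $file = "$impl.pm") =~ s{::}{/}g;
        require $file;
    }
    return bless { status => 'PENDING' }, $impl;
}

sub database { my ($self, $db)   = @_; $self->{database} = $db;   return $self }
sub seq      { my ($self, $file) = @_; $self->{query}    = $file; return $self }
sub status   { return $_[0]{status} }

# Default dispatch: a dry run that returns the command instead of executing it.
sub dispatch {
    my ($self, $command) = @_;
    $self->{status} = 'RUNNING';
    return $command;
}

# 'run' combines 'command' + 'dispatch'.
sub run {
    my ($self) = @_;
    return $self->dispatch($self->command);
}

package My::Search::wu_blastn;
use parent -norequire, 'My::Search';

# Illustrative WU-BLAST style invocation: program, database, query file.
sub command {
    my ($self) = @_;
    return "blastn $self->{database} $self->{query}";
}

package My::Search::wu_blastn_bsub;
use parent -norequire, 'My::Search::wu_blastn';

# Same search, but 'dispatch' is overridden to submit through bsub.
sub dispatch {
    my ($self, $command) = @_;
    $self->{status} = 'PENDING';
    return "bsub $command";
}

package main;

# The database path and query file name are made up for this example.
for my $method (qw(wu_blastn wu_blastn_bsub)) {
    my $search = My::Search->new(-method => $method);
    $search->database('/data/blast/human_cdna')->seq('query.fa');
    print "$method: ", $search->run, "\n";
}
```

The calling code is identical for both methods; only the string passed as
-method changes, and a further method could be added by dropping another
module into the same namespace.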
Whilst I'm still developing the core of the system, I have functioning
modules for wu-blast (inline, offline, bsub), ncbi-blast (inline,
offline, bsub), blat (gfClient) and ssaha (ssahaClient).
A lot more detail can be found at:
http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/EnsemblBlastView.html
If this style of approach is of interest to you moving forward, then I
would be very interested in contributing.
Kind regards,
Will
On Tue, 4 Nov 2003, Richard Adams wrote:
> I'm not sure - here are some random musings
> It seems easier to organize the modules by program package rather than
> by program function - for example,
> Smith-Waterman modules are distinct from Blast modules even though the
> programs have similar aims. If we're going to have
> uniform access to Remote Blast and standalone blast, then one way might
> be to have a BlastQuery class with common parameter
> setting methods, and methods such as
> run_remote_blast
> run_local_blast
> which access the implementing code as appropriate. But this might be
> a pain to implement without breaking everyone's existing code.
>
> Or, since standaloneblast uses AUTOLOAD,
> we could just add alternative allowable names for methods so that
> $factory->p('blastn') and $factory->program('blastn') are treated the
> same. Having method names the same as the header names in the blast URI
> documentation might be best as I would suspect that everyone has used
> the web interface but not everyone uses standalone blast.
>
> Richard
>
>
>
> --
> Dr Richard Adams
> Bioinformatician,
> Psychiatric Genetics Group,
> Medical Genetics,
> Molecular Medicine Centre,
> Western General Hospital,
> Crewe Rd West,
> Edinburgh UK
> EH4 2XU
>
> Tel: 44 131 651 1084
> richard.adams at ed.ac.uk
>
---
Dr William Spooner whs at sanger.ac.uk
Ensembl Web Developer http://www.ensembl.org