[Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast)

Will Spooner whs at sanger.ac.uk
Tue Nov 4 07:34:01 EST 2003


Hi Richard,

I recently implemented a BioPerl-based generic sequence search API
for Ensembl (http://www.ensembl.org/Multi/blastview). This seems very
similar to what you propose below. The approach I used, however, was to
abstract the differences between search methods (wu-blast, ncbi-blast,
fasta, etc.) into separate Perl modules. This is similar to the way that
SeqIO and SearchIO handle different formats. For example:

  # This lazy-loads Bio/Tools/Run/Search/wu_blastn.pm
  my $wu_search = Bio::Tools::Run::Search->new( -method=>'wu_blastn' );

  # This lazy-loads Bio/Tools/Run/Search/fasta.pm
  my $fasta_search = Bio::Tools::Run::Search->new( -method=>'fasta' );
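
For readers unfamiliar with the pattern, here is a minimal sketch of how a
factory constructor can lazy-load a per-method module. It is not the
Ensembl code: the My::Search namespace and the inline stand-in module are
hypothetical, chosen so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a per-method module such as
# Bio/Tools/Run/Search/wu_blastn.pm, defined inline so the sketch runs alone.
package My::Search::wu_blastn;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package My::Search;

# Factory: map -method => 'name' onto the class My::Search::name and load
# that class only when it is first requested.
sub new {
    my ($class, %args) = @_;
    my $method = $args{-method} or die "-method is required\n";
    $method =~ /\A\w+\z/ or die "Invalid method name '$method'\n";
    my $impl = "${class}::${method}";
    unless ($impl->can('new')) {
        # Not defined yet: try to load My/Search/<method>.pm from @INC.
        (my $file = "$impl.pm") =~ s{::}{/}g;
        eval { require $file; 1 }
            or die "Cannot load search method '$method'\n";
    }
    return $impl->new(%args);
}

package main;

my $search = My::Search->new(-method => 'wu_blastn');
print ref($search), "\n";
```

Because the module name is derived from the '-method' string, supporting a
new search program only requires dropping a new module into the directory.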

Bio::Tools::Run::Search has the following methods:

  'seq'      - adds the query sequence
  'database' - configures the database location
  'command'  - generates the command to run
  'option'   - configures command options
  'dispatch' - launches the command
  'environment_variables' - configures environment variables
  'run'      - combines 'command' + 'dispatch'
  'status'   - reports job status (PENDING, RUNNING, COMPLETED etc)
  'report'   - returns the raw search report
  'next_result' - returns a Bio::Search::Result object
  (N.B. Bio::Tools::Run::Search ISA Bio::Tools::Run::WrapperBase)
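
To show how these methods might fit together, here is a toy class that
mimics part of the interface. The method names come from the list above,
but the signatures, the 'name=value' option format, the 'blastn'
executable name and the 'query.fa' placeholder are all assumptions, and
nothing is actually executed.

```perl
use strict;
use warnings;

package My::Search::Toy;   # hypothetical; not the real Bio::Tools::Run::Search

sub new {
    my ($class) = @_;
    return bless { options => {}, status => 'PENDING' }, $class;
}

# Simple accessors: set when given an argument, always return the value.
sub seq      { my $s = shift; $s->{seq}      = shift if @_; return $s->{seq} }
sub database { my $s = shift; $s->{database} = shift if @_; return $s->{database} }

sub option {
    my ($self, $name, $value) = @_;
    $self->{options}{$name} = $value if defined $value;
    return $self->{options}{$name};
}

# Build the command line from the configured state. A real implementation
# would write the query sequence to a file; 'query.fa' is a placeholder.
sub command {
    my ($self) = @_;
    my $opts = $self->{options};
    my @args = map { "$_=$opts->{$_}" } sort keys %$opts;
    return join ' ', 'blastn', $self->database, 'query.fa', @args;
}

# A real dispatch would launch the command; this one only records it.
sub dispatch {
    my ($self, $command) = @_;
    $self->{dispatched} = $command;
    $self->{status}     = 'COMPLETED';
    return 1;
}

# 'run' combines 'command' + 'dispatch'.
sub run {
    my ($self) = @_;
    return $self->dispatch($self->command);
}

sub status { return $_[0]{status} }

package main;

my $search = My::Search::Toy->new;
$search->seq('ACGTACGT');
$search->database('/data/blast/human_cdna');
$search->option(E => '1e-5');
$search->run;
print $search->status, "\n";
```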

This approach is pretty nice because you can easily subclass the '-method'
modules to change the search behaviour. For example,
-method=>'wu_blastn_bsub' is the same as -method=>'wu_blastn', except
that the 'dispatch' method has been overridden to use the bsub job
submission system. In addition, new '-methods' can be added without
editing existing code.
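
The subclassing idea can be sketched as follows. The class names, the
fixed command string and the bsub output file are illustrative
assumptions, and 'dispatch' here returns the command line it would run
rather than executing anything.

```perl
use strict;
use warnings;

package My::Search::wu_blastn;   # hypothetical base method module

sub new     { my ($class) = @_; return bless {}, $class }
sub command { return 'blastn mydb query.fa' }

# Inline dispatch: the command would be run directly on this machine.
sub dispatch {
    my ($self, $command) = @_;
    return $command;
}

sub run { my ($self) = @_; return $self->dispatch($self->command) }

package My::Search::wu_blastn_bsub;
our @ISA = ('My::Search::wu_blastn');

# Only 'dispatch' changes: hand the same command to the bsub queue instead.
sub dispatch {
    my ($self, $command) = @_;
    return "bsub -o search.out '$command'";
}

package main;

print My::Search::wu_blastn->new->run, "\n";
print My::Search::wu_blastn_bsub->new->run, "\n";
```

Everything except 'dispatch' is inherited, so the two methods build the
same command and differ only in how it is launched.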

Whilst I'm still developing the core of the system, I have functioning
modules for wu-blast (inline, offline, bsub), ncbi-blast (inline,
offline, bsub), blat (gfClient) and ssaha (ssahaClient).

A lot more detail can be found at:
  http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/EnsemblBlastView.html

If this style of approach is of interest to you moving forward, then I
would be very interested in contributing.

Kind regards,

Will


On Tue, 4 Nov 2003, Richard Adams wrote:

> I'm not sure - here are some random musings
> It seems easier to organize the modules by program package rather than
> by program function - for example,
> Smith-Waterman modules are distinct from Blast modules even though the
> programs have similar aims. If we're going to have
> a uniform access to Remote Blast and standalone blast then one way might
> be to have a BlastQuery class with common parameter-setting
> methods, and methods such as
>    run_remote_blast
>    run_local_blast
> which access the implementing code as appropriate. But this might be
> a pain to implement without breaking everyone's existing code.
>
> Or, since StandAloneBlast uses AUTOLOAD,
> we could just add alternative allowable names for methods so that
> $factory->p('blastn') and $factory->program('blastn') are treated the
> same. Having method names the same as the header names in the blast URI
> documentation might be best as I would suspect that everyone has used
> the web interface but not everyone uses standalone blast.
>
> Richard
>
>
>
> --
> Dr Richard Adams
> Bioinformatician,
> Psychiatric Genetics Group,
> Medical Genetics,
> Molecular Medicine Centre,
> Western General Hospital,
> Crewe Rd West,
> Edinburgh UK
> EH4 2XU
>
> Tel: 44 131 651 1084
> richard.adams at ed.ac.uk
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

---
Dr William Spooner                          whs at sanger.ac.uk
Ensembl Web Developer                 http://www.ensembl.org


