[Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast)
Richard Adams
Richard.Adams at ed.ac.uk
Tue Nov 4 13:16:07 EST 2003
Will,
That's great. Just looking at your docs, having separate modules for the
specific parts of each blast program definitely sounds like the best
option. You have obviously spent a lot of time working on this, so I'd
imagine that most of your methods for setting up the query could be put
straight in with little change. The base class would contain
remote_blast() and local_blast() methods instead of your run() method,
and these would send the query to RemoteBlast / StandAloneBlast to
actually run it.
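
As a rough sketch of what I mean (not working BioPerl code): the package
names My::BlastQuery and My::StubBackend, the submit() method and the
-remote_backend / -local_backend arguments are all made up for
illustration, and the stub backends only stand in for RemoteBlast and
StandAloneBlast.

```perl
use strict;
use warnings;

# Hypothetical base class: holds the shared query set-up and hands the
# actual run to a backend object.
package My::BlastQuery;

sub new {
    my ($class, %args) = @_;
    my $self = {
        program        => $args{-program},
        database       => $args{-database},
        remote_backend => $args{-remote_backend},  # stands in for RemoteBlast
        local_backend  => $args{-local_backend},   # stands in for StandAloneBlast
    };
    return bless $self, $class;
}

# Parameters common to both ways of running the search.
sub query_params {
    my $self = shift;
    return (program => $self->{program}, database => $self->{database});
}

sub remote_blast {
    my ($self, $seq) = @_;
    return $self->{remote_backend}->submit($seq, $self->query_params);
}

sub local_blast {
    my ($self, $seq) = @_;
    return $self->{local_backend}->submit($seq, $self->query_params);
}

# Stub backend so the sketch runs without BioPerl or BLAST installed.
# It only reports what it was asked to do.
package My::StubBackend;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub submit {
    my ($self, $seq, %params) = @_;
    return sprintf('%s: %s against %s, query length %d',
        $self->{name}, $params{program}, $params{database}, length $seq);
}

package main;

my $query = My::BlastQuery->new(
    -program        => 'blastn',
    -database       => 'nt',
    -remote_backend => My::StubBackend->new('remote'),
    -local_backend  => My::StubBackend->new('local'),
);

print $query->remote_blast('ACGTACGT'), "\n";
print $query->local_blast('ACGTACGT'),  "\n";
```

The point is only that the parameter handling lives in one place and the
two run methods differ in nothing but the backend they hand the query to.
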
Maybe, to answer Donald's point, a module for running hmmer could also
be included?
If you'd be willing to send your code for the ncbi-blast modules,
I'll try to put together a draft plan of module organisation and
methods for discussion.
Cheers
Richard
Will Spooner wrote:
>Hi Richard,
>
>I recently implemented a BioPerl-based generic sequence search API
>for Ensembl (http://www.ensembl.org/Multi/blastview). This seems very
>similar to what you propose below. The approach I used, however, was to
>abstract the differences between search methods (wu-blast, ncbi-blast,
>fasta etc) into different perl modules. This is similar to the way that
>SeqIO and SearchIO handle different formats. For example:
>
> # This lazy-loads Bio/Tools/Run/Search/wu_blastn.pm
> my $search = Bio::Tools::Run::Search->new( -method=>'wu_blastn' );
>
> # This lazy-loads Bio/Tools/Run/Search/fasta.pm
> my $search = Bio::Tools::Run::Search->new( -method=>'fasta' );
>
>Bio::Tools::Run::Search has the following methods:
>
> 'seq' - adds the query sequence
> 'database' - configures the database location
> 'command' - generates the command to run
> 'option' - configures command options
> 'dispatch' - launches the command
> 'environment_variables' - configures environment variables
> 'run' - combines 'command' + 'dispatch'
> 'status' - reports job status (PENDING, RUNNING, COMPLETED etc)
> 'report' - returns the raw search report
> 'next_result' - returns a Bio::Search::Result object
> (N.b. Bio::Tools::Run::Search ISA Bio::Tools::Run::WrapperBase)
>
>This approach is pretty nice because you can easily subclass the '-method'
>modules to change the search behaviour. For example,
>-method=>'wu_blastn_bsub' is the same as -method=>'wu_blastn', except
>that the 'dispatch' method has been overridden to use the bsub job
>submission system. In addition, new '-methods' can be added without
>editing existing code.
>
>Whilst I'm still developing the core of the system, I have functioning
>modules for wu-blast (inline, offline, bsub), ncbi-blast (inline,
>offline, bsub), blat (gfClient) and ssaha (ssahaClient).
>
>A lot more detail can be found at:
> http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/EnsemblBlastView.html
>
>If this style of approach is of interest to you moving forward, then I
>would be very interested in contributing.
>
>Kind regards,
>
>Will
>
>
>On Tue, 4 Nov 2003, Richard Adams wrote:
>
>
>
>>I'm not sure - here are some random musings
>>It seems easier to organize the modules by program package rather than
>>by program function - for example,
>>Smith-Waterman modules are distinct from Blast modules even though the
>>programs have similar aims. If we're going to have
>>a uniform access to Remote Blast and standalone blast then one way might
>>be to have BlastQuery class with common parameter
>>setting methods, and methods such as
>> run_remote_blast
>> run_local_blast
>>which access the implementing code as appropriate. But this might be
>>a pain to implement without breaking everyone's existing code.
>>
>>Or, since StandAloneBlast uses AUTOLOAD,
>>we could just add alternative allowable names for methods so that
>>$factory->p('blastn') and $factory->program('blastn') are treated the
>>same. Having method names the same as the header names in the blast URI
>>documentation might be best as I would suspect that everyone has used
>>the web interface but not everyone uses standalone blast.
>>
>>Richard
>>
>>
>>
>>--
>>Dr Richard Adams
>>Bioinformatician,
>>Psychiatric Genetics Group,
>>Medical Genetics,
>>Molecular Medicine Centre,
>>Western General Hospital,
>>Crewe Rd West,
>>Edinburgh UK
>>EH4 2XU
>>
>>Tel: 44 131 651 1084
>>richard.adams at ed.ac.uk
>>
>
>---
>Dr William Spooner whs at sanger.ac.uk
>Ensembl Web Developer http://www.ensembl.org
>
>
>
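
To make the '-method' subclassing idea above concrete for discussion,
here is a rough sketch of how I read it. This is not Will's code: the
My::Search package names, the query_file() accessor and the command
strings are assumptions for illustration, and dispatch() only returns a
string describing what it would launch rather than running anything.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a factory like Bio::Tools::Run::Search.
package My::Search;

sub new {
    my ($class, %args) = @_;
    my $method = $args{-method} or die "No -method given\n";
    die "Bad -method '$method'\n" unless $method =~ /\A\w+\z/;

    my $impl = "My::Search::$method";
    # Load the implementing module on first use. In this single-file
    # sketch the packages are already defined, so require is skipped.
    unless ($impl->can('dispatch')) {
        (my $file = "$impl.pm") =~ s{::}{/}g;
        require $file;
    }
    return bless { database => undef, query_file => undef }, $impl;
}

sub database {
    my ($self, $db) = @_;
    $self->{database} = $db if defined $db;
    return $self->{database};
}

sub query_file {
    my ($self, $file) = @_;
    $self->{query_file} = $file if defined $file;
    return $self->{query_file};
}

# 'run' combines 'command' + 'dispatch', as in the method list above.
sub run {
    my $self = shift;
    return $self->dispatch($self->command);
}

# One module per search method.
package My::Search::wu_blastn;
our @ISA = ('My::Search');

sub command {
    my $self = shift;
    return join ' ', 'blastn', $self->database, $self->query_file;
}

sub dispatch {
    my ($self, $command) = @_;
    return "inline: $command";   # a real module would execute the command
}

# Same search, but only 'dispatch' is overridden to go through bsub.
package My::Search::wu_blastn_bsub;
our @ISA = ('My::Search::wu_blastn');

sub dispatch {
    my ($self, $command) = @_;
    return "bsub $command";      # a real module would submit the job
}

package main;

my $inline = My::Search->new(-method => 'wu_blastn');
my $queued = My::Search->new(-method => 'wu_blastn_bsub');
for my $search ($inline, $queued) {
    $search->database('/data/est_db');
    $search->query_file('query.fa');
    print ref($search), ': ', $search->run, "\n";
}
```

Adding a new search method would then mean adding one more package (or
file) under My::Search, with no change to the factory itself.
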
--
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU
Tel: 44 131 651 1084
richard.adams at ed.ac.uk