[Bioperl-l] 1.01

Catherine Letondal letondal@pasteur.fr
Mon, 13 May 2002 19:19:02 +0200


Jason Stajich wrote:
>Some projects on the table that one might hope would be part of 1.2:
>[...]
> * Design the interface based on the Bioperl/PISE to describe
>   remote analysis queues and add those classes to the main trunk.  Use
>   this interface for local execution as well as remote.

Hi,

Is it time to start the discussion? I don't know yet exactly which questions need to be
discussed. Anyway, here are my questions and suggestions:

1) creating the factory and running:

  # a) analysis queue (returns a Bio::Factory::EMBOSS or a Bio::Factory::Pise)
  $factory = new Bio::Factory::EMBOSS;
  # or:
  $factory = new Bio::Factory::Pise;

  # b) analysis application object (returns a Bio::Tools::Run::PiseApplication or
  # Bio::Tools::Run::EMBOSSApplication)
  $mfold = $factory->program('mfold');	

  # c) analysis results 
  $result = $mfold->run(); 		


  ... is that OK for EMBOSS and OpenBSA?
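
To make the three steps above concrete, here is a minimal, self-contained sketch of the
factory / application / run chain. The Sketch::Factory and Sketch::Application packages
and the "queue" argument are hypothetical stand-ins invented for this illustration; they
are not existing Bioperl classes, and run() only reports what it would execute.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the proposed classes; none of these
# packages exist in Bioperl under these names.
package Sketch::Application;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, queue => $args{queue} }, $class;
}

# c) run the analysis; this sketch only reports what would be executed
sub run {
    my ($self) = @_;
    return "ran $self->{name} via $self->{queue}";
}

package Sketch::Factory;

sub new {
    my ($class, %args) = @_;
    return bless { queue => $args{queue} || 'Pise' }, $class;
}

# b) build an application object for a named program
sub program {
    my ($self, $name) = @_;
    return Sketch::Application->new(name => $name, queue => $self->{queue});
}

package main;

my $factory = Sketch::Factory->new(queue => 'EMBOSS');   # a) analysis queue
my $mfold   = $factory->program('mfold');                # b) application object
my $result  = $mfold->run();                             # c) analysis result
print "$result\n";
```

The point of the shape is that only step a) names the queue; steps b) and c) look the
same whichever factory was created.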

 
2) general execution parameters:

 a) local or remote execution
	- default could be local for EMBOSS and remote for Pise?
	- in Pise, the default remote server could be different for different programs (I
	mean, not only at Pasteur...:-) )

	- so one should be able to choose between local and remote execution and, if remote,
	to choose a non-default server location; this choice could be made at factory
	creation, at application creation, or at the run step:
	# a) at factory creation
	$factory = new Bio::Factory::Pise(-remote => 'http://somewhere/cgi-bin/Pise');

	# b) at application creation - take the default remote server
	$needle = $factory->program('needle', -remote => 1);	

	# c) at run time
	$result = $mfold->run(-remote => 'http://bioweb.pasteur.fr/cgi-bin/seqanal/mfold.pl'); 


 b) email could be specified once at factory creation (for Pise)
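
One possible way to combine the three places where -remote can be given in 2a is "the
most specific level wins". The sketch below assumes that rule, assumes local execution
when nothing is specified, and reads a plain true value as "remote, on whatever server
is already known". The resolve_location function and the %DEFAULT_SERVER table (with a
placeholder URL) are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical table of per-program default servers; the URL is a placeholder.
my %DEFAULT_SERVER = (
    needle => 'http://example.org/cgi-bin/Pise/needle.pl',
);

# Return 'local' or a server URL for one program.
# @settings holds the -remote values from the most general level (factory)
# to the most specific one (run): undef = not specified at this level,
# 0 = local, 1 = remote on the server known so far, URL = remote on that server.
sub resolve_location {
    my ($program, @settings) = @_;
    my ($remote, $server) = (0, $DEFAULT_SERVER{$program});
    for my $setting (@settings) {
        next unless defined $setting;          # nothing said at this level
        if ($setting =~ m{^https?://}) {       # explicit server location
            ($remote, $server) = (1, $setting);
        }
        else {                                 # plain boolean
            $remote = $setting ? 1 : 0;
        }
    }
    return 'local' unless $remote;
    die "no server known for $program\n" unless defined $server;
    return $server;
}

# factory: nothing, application: -remote => 1, run: nothing
print resolve_location('needle', undef, 1, undef), "\n";
# factory gives a server, run forces local execution
print resolve_location('mfold', 'http://somewhere/cgi-bin/Pise', undef, 0), "\n";
```

Other rules are just as defensible (for instance a Pise factory defaulting to remote);
the sketch only shows that the three levels can be reconciled by one small function.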


3) parameters specification

   a) when?  
	# at application creation?
	$water = $factory->program('water', sequencea => $seqa,  seqall => $seqb);
	$result1 = $water->run();

	# before running?
	$water->sequencea($seqc);
	$result2 = $water->run();

	# when running?
	$result3 = $water->run(sequencea => $seqd);

   b) how?  -name or name
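
Both questions in 3) can be answered with "all of them": the sketch below accepts
parameters at creation, through a setter, and at run time, and treats -name and name as
the same key. Sketch::Program is a hypothetical class written for this illustration, the
sequences are plain strings instead of sequence objects, and the rule that run-time
parameters apply to that run only is an assumption.

```perl
use strict;
use warnings;

package Sketch::Program;

our $AUTOLOAD;

# Strip an optional leading dash so that -sequencea and sequencea are one key.
sub _normalize {
    my (%args) = @_;
    my %clean;
    while (my ($name, $value) = each %args) {
        (my $key = $name) =~ s/^-//;
        $clean{$key} = $value;
    }
    return %clean;
}

sub new {
    my ($class, %args) = @_;
    return bless { params => { _normalize(%args) } }, $class;
}

# Generic accessor: $prog->sequencea($seq) sets, $prog->sequencea() gets.
sub AUTOLOAD {
    my ($self, @value) = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    $self->{params}{$name} = $value[0] if @value;
    return $self->{params}{$name};
}

sub DESTROY { }   # defined so that object destruction does not reach AUTOLOAD

# Parameters passed to run() override the stored ones for this run only.
sub run {
    my ($self, %args) = @_;
    my %effective = (%{ $self->{params} }, _normalize(%args));
    return join ' ', map { "$_=$effective{$_}" } sort keys %effective;
}

package main;

my $water   = Sketch::Program->new(sequencea => 'seqa', -seqall => 'seqb');
my $result1 = $water->run();                        # parameters from creation
$water->sequencea('seqc');
my $result2 = $water->run();                        # parameter set before running
my $result3 = $water->run(-sequencea => 'seqd');    # parameter given when running
print "$_\n" for $result1, $result2, $result3;
```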


4) analysis results: what is it, a string, an object, ...?

  $result = $fasta->run(); 
		
	- in Pise/bioperl, $result is an instance of PiseJob, i.e. a kind of "handle" from
	which you can fetch results (image files, treefile, ...):
	print $result->content("treefile");
	print $result->stdout;
	$result->save("blast2.txt");
	etc...

	- in Bio::Tools::Run::EMBOSSApplication, it's a string (the actual result): don't
	you think it's more general to have an object?
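
As an illustration of the "result as an object" option, here is a small sketch of a job
handle holding several named result files. Sketch::Job is a hypothetical class, not the
actual PiseJob implementation: it keeps the results in memory, and its save() takes a
result name plus a local path, which is an assumption made for this example.

```perl
use strict;
use warnings;

package Sketch::Job;

# The constructor takes result-file names mapped to their content.
sub new {
    my ($class, %files) = @_;
    return bless { files => {%files} }, $class;
}

# Content of one named result file, or undef if the job did not produce it.
sub content {
    my ($self, $name) = @_;
    return $self->{files}{$name};
}

# Standard output is treated as one more named result.
sub stdout { return $_[0]->content('stdout') }

# Write one named result file to a local path.
sub save {
    my ($self, $name, $path) = @_;
    my $data = $self->content($name);
    die "no result file named '$name'\n" unless defined $data;
    open my $out, '>', $path or die "cannot write $path: $!\n";
    print {$out} $data;
    close $out or die "cannot close $path: $!\n";
    return $path;
}

package main;

# Made-up results standing in for what a finished job would return.
my $job = Sketch::Job->new(
    stdout   => "neighbor finished\n",
    treefile => "(A,(B,C));\n",
);
print $job->content('treefile');
print $job->stdout;
```

A plain string can only carry one of these results; an object can carry all of them and
still hand back the string when that is all the caller wants.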

5) use of analysis result:

   - it's convenient to be able to build a handle from a result, in order
   to feed it to bioperl parsers or to other programs

    $aln = Bio::AlignIO->newFh (-fh => $needle_result->fh("outfile.align"), 
                                -format => "fasta");
    $neighbor = $factory->program('neighbor', infile => $protdist_job->fh('outfile'));

   - construct an analysis result from an ID:
    $neighbor = $factory->result('http://bioweb.pasteur.fr/seqanal/tmp/blast2/A12465102130064/');
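
Building a handle from a result is cheap in Perl when the content is already in memory,
because open() accepts a reference to a scalar. The sketch below shows that mechanism
only: fh_from_content and read_fasta are hypothetical helpers, read_fasta is a toy
reader standing in for a Bioperl parser, and the alignment text is made up.

```perl
use strict;
use warnings;

# Open a read handle on result content held in memory.
sub fh_from_content {
    my ($content) = @_;
    open my $fh, '<', \$content or die "cannot open in-memory handle: $!\n";
    return $fh;
}

# Toy FASTA reader standing in for a real parser; returns { id => sequence }.
sub read_fasta {
    my ($fh) = @_;
    my (%seq, $id);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>(\S+)/) { $id = $1; $seq{$id} = '' }
        elsif (defined $id)     { $seq{$id} .= $line }
    }
    return \%seq;
}

my $outfile = ">s1\nACGT-A\n>s2\nAC-TTA\n";   # made-up alignment output
my $aligned = read_fasta(fh_from_content($outfile));
print "$_ $aligned->{$_}\n" for sort keys %$aligned;
```

Anything that reads from a filehandle can consume the result this way without the
caller first saving it to disk.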

6) misc:

 - It should be possible to issue an asynchronous run request (to enable parallel
   execution for long jobs)
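
A rough sketch of what an asynchronous request could look like on a Unix-like system:
submit() returns at once with a job record, and wait_for() collects the result later.
Both function names are hypothetical, the jobs are local child processes created with
fork() rather than remote requests, and each result travels through a temporary file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Start a job in a child process and return immediately with a job record.
# $code stands in for the real analysis; its return value is the result text.
sub submit {
    my ($code) = @_;
    my ($tmp, $path) = tempfile();
    close $tmp;
    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;
    if ($pid == 0) {                      # child: run the job, store the result
        open my $out, '>', $path or exit 1;
        print {$out} $code->();
        close $out or exit 1;
        exit 0;
    }
    return { pid => $pid, path => $path };   # parent: do not wait
}

# Block until the job has finished, then return its result text.
sub wait_for {
    my ($job) = @_;
    waitpid $job->{pid}, 0;
    die "job $job->{pid} failed\n" if $? != 0;
    open my $in, '<', $job->{path} or die "cannot read result: $!\n";
    my $text = do { local $/; <$in> };
    close $in;
    unlink $job->{path};
    return $text;
}

# Three jobs are started before any of them is waited for, so they overlap.
my @jobs    = map { my $n = $_; submit(sub { "result of job $n" }) } 1 .. 3;
my @results = map { wait_for($_) } @jobs;
print "$_\n" for @results;
```

For a remote queue the same two calls would map to "send the request" and "poll or
fetch the job by its ID".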


How is all that compatible with OpenBSA?

--
Catherine Letondal -- Pasteur Institute Computing Center