[Bioperl-pipeline] Re: biopipe questions

Elia Stupka elia at fugu-sg.org
Wed Feb 19 18:34:30 EST 2003


> 1. Why the "Bio" in the name?  Is BioPipe specifically tied in to 
> BioPerl and Ensembl?

It is tied to bioperl, not to ensembl. It uses bioperl runners (in
bioperl-run) and parsers (in bioperl-live) to run programs and parse their
output. Also, all objects inherit from Bio::Root::Root for exception
handling, etc.
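
For example, the kind of bioperl pieces BioPipe glues together look
roughly like this (a minimal sketch, not BioPipe code itself; the file
names, database name and BLAST parameters are made up for illustration):

  #!/usr/bin/perl -w
  use strict;

  # Illustrative only: a runner from bioperl-run wrapping blastall,
  # and a parser from bioperl-live reading the report back in.
  # 'query.fa', 'mydb' and 'report.bls' are made-up names.
  use Bio::Tools::Run::StandAloneBlast;   # bioperl-run
  use Bio::SearchIO;                      # bioperl-live

  my $factory = Bio::Tools::Run::StandAloneBlast->new(
      'program'  => 'blastn',
      'database' => 'mydb',
      'outfile'  => 'report.bls',
  );
  $factory->blastall('query.fa');         # run the program

  my $searchio = Bio::SearchIO->new(-file   => 'report.bls',
                                    -format => 'blast');
  while (my $result = $searchio->next_result) {
      while (my $hit = $result->next_hit) {
          print join("\t", $hit->name, $hit->significance), "\n";
      }
  }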

> 2. It appears that workflow is initiated by "pull" rather than "push". 
> That is, the endpoint of the workflow (the last node) requests input 
> from the previous node, and this process repeats until the whole network 
> is recursively parsed.  Is this true?

No, I'm not sure where you saw this happening... the pipeline simply looks
into the rule tables to see which analysis follows the one that just
finished, not the other way around...
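
To make that concrete, the idea looks roughly like this (a sketch only;
the column names are approximate and from memory, check the actual
biopipe schema): a rule row maps a finished analysis to the one that
should run next, and the manager simply chains forward.

  -- Illustrative sketch of a forward-chaining rule table; column
  -- names are approximate, not the literal biopipe schema.
  CREATE TABLE rule (
      rule_id  INT NOT NULL AUTO_INCREMENT,
      current  INT,           -- analysis that just finished
      next     INT,           -- analysis to run after it
      action   VARCHAR(40),   -- e.g. NOTHING, COPY_ID, WAITFORALL
      PRIMARY KEY (rule_id)
  );

  -- "What follows analysis 1?" -- a pure forward lookup:
  SELECT next, action FROM rule WHERE current = 1;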

> 3. Can BioPipe be provided the XML (as a flat file or from stdin) 
> describing the pipeline by an external program (see below)? 

Yes, absolutely, that was the idea behind abstracting the whole workflow
into XML.
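
So an external program could generate something along these lines and
feed it in (a skeleton only; the element names here are illustrative,
the XML templates shipped with biopipe are the reference):

  <!-- Illustrative skeleton, not the literal biopipe DTD -->
  <pipeline_setup>
    <analysis id="1">
      <program>blastall</program>
    </analysis>
    <analysis id="2">
      <program>genscan</program>
    </analysis>
    <rule>
      <current_analysis_id>1</current_analysis_id>
      <next_analysis_id>2</next_analysis_id>
      <action>NOTHING</action>
    </rule>
  </pipeline_setup>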

> Or can a program work with the pipeline description directly from
> MySQL?

Both can be done. The "heart" is in SQL, but it can all be
exported/imported with XML. The SQL contains more data than the XML
only once the pipeline is running (jobs, status, etc.)
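
In other words, once a run has started you will see runtime state in the
SQL that has no counterpart in the XML, roughly like this (again an
approximate sketch of the job table, not the literal schema):

  -- Runtime-only state that never appears in the XML (approximate):
  SELECT job_id, analysis_id, status
  FROM   job
  WHERE  status IN ('SUBMITTED', 'FAILED');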

> 4. I may have asked this before: has BioPipe been tested over a network, 
> such as a grid or cluster?

BioPipe was primarily written to work on large clusters, and it now runs
routinely on ours. It relies on an external load scheduler and has
plugins for LSF and PBS; at our site it is normally run via LSF.
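
Under a scheduler plugin, dispatch boils down to wrapping each job in a
submission command, roughly like this (a sketch only: the runner script
name and its arguments are placeholders, though -q/-o/-e are standard
bsub flags):

  #!/usr/bin/perl -w
  use strict;

  # Sketch of what an LSF batch-submission plugin does underneath:
  # wrap the per-job runner invocation in a bsub call.
  # 'runner.pl' and its arguments are placeholders.
  my $job_id   = 42;
  my $bsub_cmd = join(' ',
      'bsub',
      '-q normal',                    # LSF queue
      "-o /tmp/job$job_id.out",       # job stdout
      "-e /tmp/job$job_id.err",       # job stderr
      "\"runner.pl -job $job_id\"",   # placeholder runner invocation
  );
  system($bsub_cmd) == 0 or die "bsub submission failed: $?";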

Great to hear you are interested in working on a web front-end! Comments:

> 6. Execute the pipeline (send the pipeline XML to the program that runs it)

In a real-life scenario this part gets hairy, because the pipeline server
often resides on a different, more secure network segment that does not
share any NFS space with the web servers. For this we have been thinking
that it would be best to have a simple client-server system that would
send the XML and tell the server to start the pipeline.
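
Nothing of the sort exists yet, but the client half could be as small as
this (a sketch; the host, port and one-line handshake are invented for
illustration):

  #!/usr/bin/perl -w
  use strict;

  # Sketch of the proposed client: ship the pipeline XML across the
  # network boundary and ask the server to start the pipeline.
  # Host, port and the "RUN" framing are invented for illustration.
  use IO::Socket::INET;

  my $xml = do { local $/; open(XML, 'pipeline.xml') or die $!; <XML> };

  my $sock = IO::Socket::INET->new(
      PeerAddr => 'pipeline-server.internal',
      PeerPort => 8000,
      Proto    => 'tcp',
  ) or die "connect failed: $!";

  print $sock "RUN ", length($xml), "\n", $xml;
  my $reply = <$sock>;
  print "server said: $reply";
  close($sock);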

> 7. Save the results of the execution (in their account online)

Neat

> I am leaning toward doing this in PHP, as the web part is more important 
> than tying it in to BioPerl (if I can do #6 above) or making it 
> heavily OO (as was the case with the Piper front-end).

In theory (see above) it can be done in any technology as long as the XML
is created.

> In any case, it appears that BioPipe has a great deal of momentum behind 
> it, more than there was behind Piper.  And the fact that someone else is 
> working on the infrastructure (back-end) issues, is very appealing (I am 
> not an expert on networking protocols, nor do I plan to make that my 
> focus in life ;-)).  Perhaps we can work together (I make the front-end 
> for BioPipe, and you make BioPipe usable on BiO Grid).

Absolutely, I'd be very excited about this!

We are actually in the middle of the BioHackathon, a meeting of all
OpenBio core developers, and BioPipe is getting revamped too...

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6779 1117        *
********************************



