[Bioperl-pipeline] web, restart

Shawn Hoon shawnh at fugu-sg.org
Fri Jun 27 17:36:55 EDT 2003


On Friday, June 27, 2003, at 08:47  AM, <aaron at tll.org.sg> wrote:

> Dear Shawn,
>
> Hey, hope your network problems have cleared up, I can't detect 
> anything
> wrong from this end (even ssh connections from home).. was it a 
> particular
> machine or leo?
>

Ai yah, it's the Fugu network.. no problems when I go through the IMCB 
network.. sorry!

> As for the web pipeline... it certainly sounds like an interesting 
> option.
> Was working on web-based monitoring software earlier that went thru 
> three
> development revamps over 2 years. 1st version perl.cgi. 2nd pure java 
> (app)
> thru proprietary tcp-based protocol, and we had to finally settle on
> jsp/tomcat to avoid firewall/port problems, etc.
>

Ah you certainly have the right experience for it.

There are two areas where we want biopipe on the web: monitoring and 
management. They are pretty much decoupled in terms of the pipeline 
backend, but they will probably be unified through the web interface.

Elia had some good ideas for this and below are his abbreviated points:

Monitoring--

- PipelineMonitor graphics (legacy work from Juguang)
- web pages that will show a live monitor of a running pipeline
- retrieve details of failing jobs, etc.


By monitoring, I mean we do not need to look at the process calls like 
you mentioned below, because we rely on the job scheduler (LSF in this 
case). Thus we only need to expose the information in the pipeline job 
table, as Juguang said. We may want to be smart about the monitoring 
and put in some heuristics to have the pipeline stop if all the jobs 
sent to the nodes come back failed very quickly, which may signal that 
something is not set up properly and has escaped the initial setup 
checks.
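
Roughly, such a check could just be a periodic query against the job 
table. A minimal sketch in Perl/DBI (the submitted_at/finished_at 
columns, the COMPLETED status and the thresholds are placeholders for 
illustration, not the actual biopipe schema):

  use strict;
  use DBI;

  # Connect to the pipeline database (name/credentials are placeholders).
  my $dbh = DBI->connect("dbi:mysql:database=my_pipeline;host=localhost",
                         "pipeline_user", "", { RaiseError => 1 });

  # Jobs that came back FAILED less than 60 seconds after submission
  # (assumes hypothetical submitted_at/finished_at timestamp columns).
  my ($quick_failures) = $dbh->selectrow_array(q{
      SELECT COUNT(*) FROM job
      WHERE status = 'FAILED'
        AND UNIX_TIMESTAMP(finished_at) - UNIX_TIMESTAMP(submitted_at) < 60
  });

  # Everything that has come back at all, failed or not.
  my ($returned) = $dbh->selectrow_array(q{
      SELECT COUNT(*) FROM job WHERE status IN ('FAILED', 'COMPLETED')
  });

  # If every returned job failed almost immediately, flag it so the
  # manager can stop submitting instead of burning through the batch.
  if ($returned >= 10 && $quick_failures == $returned) {
      warn "All $returned returned jobs failed within 60s; ",
           "suspect a setup problem, stopping submission\n";
  }

  $dbh->disconnect;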

For the web end, monitoring should not be too hard. A simple web-based 
version may just be a CGI script that queries the pipeline database and 
refreshes every so often. Of course, we usually have the pipeline 
database sitting behind a firewall, so we would need some daemon... 
Something more fanciful would be an applet...
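
As a concrete starting point, the simple version could be little more 
than the following: a CGI script that counts jobs by status and asks 
the browser to reload every 30 seconds (database name and credentials 
are placeholders):

  #!/usr/bin/perl -w
  use strict;
  use DBI;
  use CGI qw(header);

  # Serve a self-refreshing status page straight out of the job table.
  print header(-type => 'text/html');
  print "<html><head><meta http-equiv=\"refresh\" content=\"30\">",
        "<title>Pipeline monitor</title></head><body>",
        "<h2>Jobs by status</h2><ul>";

  my $dbh = DBI->connect("dbi:mysql:database=my_pipeline;host=localhost",
                         "pipeline_user", "", { RaiseError => 1 });

  # One line per status (NEW, SUBMITTED, FAILED, ...) with a count.
  my $sth = $dbh->prepare("SELECT status, COUNT(*) FROM job GROUP BY status");
  $sth->execute;
  while (my ($status, $count) = $sth->fetchrow_array) {
      print "<li>$status: $count</li>";
  }

  print "</ul></body></html>";
  $dbh->disconnect;

Behind a firewall the same query logic would just move into the daemon, 
and the page would talk to that instead of the database directly.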

Management --

- write a client to run on the web server that can send small pipelines 
  from the web to the pipeline server
- manage multiple pipelines concurrently
- start, suspend, resume, stop pipelines
- clean up files and databases associated with finished pipelines

This work was started by Kiran and Yujin but it never took off. Time to 
revisit. The idea is to have some interface, a web or Java app, that 
allows one to launch pipelines. We wanted to have a daemon sitting on 
the server that manages multiple pipelines. When a user submits a 
pipeline, the daemon runs the pipeline setup script, launches the 
pipeline, and returns a pipeline_id. In this way the user can query his 
jobs using this pipeline_id.

This is quite doable once we are able to bridge the data flow from the 
web end, putting the data in the appropriate places, and have scripts 
that automate the pipeline setup (which are mostly there already).
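
To make the daemon idea concrete, here is a very rough sketch of the 
loop it might run. The script names (pipeline_setup.pl, 
PipelineManager.pl), their flags and the spool directory layout are 
assumptions for illustration only; the real entry points would come 
from the existing biopipe scripts:

  use strict;
  use File::Basename;

  my $spool   = "/data/pipeline_spool";   # where the web end drops configs
  my $next_id = 1;

  while (1) {
      # Each submitted pipeline arrives as an XML config in incoming/.
      for my $config (glob("$spool/incoming/*.xml")) {
          my $pipeline_id = $next_id++;
          my $dbname      = "pipeline_$pipeline_id";

          # Load the config into its own pipeline database
          # (script name and flags are hypothetical).
          if (system("perl", "pipeline_setup.pl",
                     "-dbname", $dbname, "-xml", $config) != 0) {
              warn "setup failed for $config\n";
              next;
          }

          # Launch the pipeline in the background (hypothetical invocation).
          my $pid = fork();
          if (!defined $pid) { warn "fork failed: $!\n"; next; }
          if ($pid == 0) {
              exec("perl", "PipelineManager.pl", "-dbname", $dbname)
                  or die "exec failed: $!\n";
          }

          # Hand the id back to the web end and park the config.
          print "launched ", basename($config),
                " as pipeline_id $pipeline_id (pid $pid)\n";
          rename($config, "$spool/running/" . basename($config));
      }
      sleep 60;    # poll once a minute
  }

The missing piece is exactly the bridging mentioned above: the daemon 
would also need to record the pipeline_id-to-database mapping somewhere 
the monitoring pages can query.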

> Let me read thru the biopipe architecture to see how the processes are
> handled (forked/threaded), managed by PID/lockfiles, and I'll get back 
> to
> the list on how best to begin the design of the web monitor/launcher.
>

Welcome aboard!


Shawn


> Cheers,
> aaron
>
>>
>> On Thursday, June 26, 2003, at 07:17  PM, jeremyp at sgx3.bmb.uga.edu
>> wrote:
>>
>>> Hi,
>>>
>>> I remember there was some talk about including software to allow for
>>> running pipelines through a web interface. Has this been done?
>>>
>>
>> Uhm, I personally have not been working on this, but I think some folks
>> at TLL might be working on this. Aaron, you wanna chime in? Should we
>> start some discussion on the issues that need to be addressed to get
>> this developing seriously?
>>
>>> Also, is there a way to restart the PipelineManager? That is, if the
>>> PipelineManager were killed but I wanted the pipeline it was running
>>> when
>>> killed to continue running at a later date, is there a way to do 
>>> this?
>>>
>>
>> Oh definitely, we do it all the time. When you kill the
>> PipelineManager, all the job states are
>> stored in the database. Normally what happens in this scenario is that
>> the pipeline user
>>
>> 1) first kills the PipelineManager script
>> 2) does a bkill 0 (for LSF) to kill his jobs
>> 3) does whatever fixing is needed and makes sure that the input and
>>    output databases are cleaned up appropriately
>> 4) The jobs in the job table will now be in a mix of states: Failed,
>>    New and Submitted. You will need to set the jobs that are in the
>>    Submitted state back to New or Failed. This is because we killed
>>    the jobs with bkill before they could write their status back to
>>    the table, so on restart the PipelineManager will think they are
>>    still running (it is not so clever as to check with LSF yet) and
>>    will only fetch the New|Failed jobs.
>>
>>    So execute: update job set status="NEW" where status="SUBMITTED"
>>    in your pipeline database
>> 5) Remove the Pipeline lock file and run PipelineManager again, OR
>>    run PipelineManager with the -f option and it should remove it for
>>    you.
>>
>> hope that is clear
>>
>> cheers,
>>
>> shawn
>>
>>
>>
>>
>>> Thanks,
>>> Jeremy


