From jeremyp at sgx3.bmb.uga.edu Mon Aug 4 17:43:40 2003
From: jeremyp at sgx3.bmb.uga.edu (jeremyp@sgx3.bmb.uga.edu)
Date: Mon Aug 4 17:43:21 2003
Subject: [Bioperl-pipeline] Tags
Message-ID: <4893.128.192.15.158.1060033420.squirrel@sgx3.bmb.uga.edu>

Hi,

I have a question concerning "tags". setup_file.pm stores a tag if one is
provided in the XML configuration file. Typically this is set to infile,
since an infile method is already provided by RunnableI.pm. Using the
COPY_ID_FILE action, the tag can easily be propagated to later analyses.
However, using the id as a filename sidesteps iohandlers. Is there a way to
use a tag with general input that must go through an iohandler, without
using an input create? Is there a way to specify this in the XML file,
perhaps by writing the dbadaptor/iohandler/runnable db entries a certain
way? I'm looking to avoid the datatypes subroutine used by many other
runnables.

Thanks,
Jeremy
From shawnh at fugu-sg.org Sat Aug 16 01:06:06 2003
From: shawnh at fugu-sg.org (Shawn Hoon)
Date: Sat Aug 16 01:04:08 2003
Subject: [Bioperl-pipeline] Changes to come (long long mail)
Message-ID: <57661840-CFA7-11D7-976C-000A95783436@fugu-sg.org>

I'm halfway through adding more functionality to biopipe. I've been mulling
over the idea of allowing analyses to be chained in memory, and I hope this
doesn't go against any biopipe philosophy, if there is one. These changes
will require modifications to the XML and the schema.

Motivation
---------------

During the execution of a series of analyses, the system requires that each
analysis has some place to store (in a db) or dump (to file) its results,
in order to pass them between analyses. This means that one:

1) will store all intermediate results, so that if an analysis fails, you
   can rerun from the last failed analysis.
2) will need to design a dumper/schema in which to hold the intermediate
   results.

1) saves compute time, while 2) requires the programmer to do work:
designing temporary databases, dbadaptors, etc. The alternative is to write
a combo runnable, for example BlastEst2Genome, which is not very modular or
extensible. Sometimes the cost of 2) is greater than the benefit of 1),
especially if the analyses are mini jobs that run quickly. So for the
scenario where we have a series of analyses that run fast, and we are only
interested in storing the result of the last analysis, it makes sense to
allow chaining of jobs in memory.

My current use case: running a targeted est2genome/genewise to map
cDNAs/proteins to a genome. The strategy is to blast the sequence with high
cutoffs against the genome to find the approximate location, then run a
sensitive est2genome or genewise against the smaller region. In my case, I
only want to run the alignments on the top 2 blast hits (2 haplotypes).

So rather than doing the following:

est->
  Analysis: Run Blast against genome -> Output (store blast hit)
  Analysis: setup_est2genome -> Input (fetch top 2 blast hits)
  Analysis: Est2Genome -> Output (store gene)

I now do the following:

est->
  Analysis: Run Blast against genome
    -> Chain_Output (with filter attached) && (Output (store blast hit) {Optional})
    -> Analysis (setup_est2genome)
  Analysis: Est2Genome -> Output (store gene)

We do not need a temporary blast hit database, but we can still have the
hits stored if we want to, by attaching an additional output iohandler.

The Guts
---------------

What I'm proposing is to have a grouping of rules. A rule group means that
I will chain a group of analyses into a single job.
Sample rule table:

+---------+---------------+---------+------+---------+
| rule_id | rule_group_id | current | next | action  |
+---------+---------------+---------+------+---------+
|       1 |             1 |       1 |    2 | NOTHING |
|       2 |             2 |       2 |    3 | CHAIN   |
|       3 |             3 |       3 |    4 | NOTHING |
+---------+---------------+---------+------+---------+

Analysis 1: InputCreate
Analysis 2: Blast
Analysis 3: SetupEst2Genome
Analysis 4: Est2Genome

So here we have 3 rule groups, and each job will have its own rule group.
For a single est input, 3 jobs will be created during the course of the
pipeline execution:

Job 1: InputCreate (fetch all ests and create blast jobs)
Job 2: Blast (blast est against database); output is chained to Analysis 3
       (setup est2genome) using an IOHandler of type chain, with a blast
       filter attached
Job 3: Run Analysis 4 (est2genome) on the jobs created by Analysis 3

Chaining occurs only between analyses 2 and 3. If Job 2 fails, both the
blast and the setup_est2genome analyses will have to be rerun. You could
imagine having multiple analyses chained within a rule group.

I have working code for this. The next thing I'm still thinking about is a
stronger form of datatype definition between the runnables, which is
currently not strongly enforced. It will probably be based on Martin's (or
Pise's, or EMBOSS's) analysis data definition interface. We can have this
information at the runnable layer, at the bioperl-run wrappers layer, or
both.

Once this is done, we can have a hierarchical organization of the
pipelines:

- chaining analyses within rule groups
- chaining rule groups (add a rule_group relationship table; defined
  within one xml)
- chaining pipelines (add a meta_pipeline table), which means re-using
  different xmls as long as the inputs and outputs of the first and last
  analyses of the pipelines match

I would like some help with this application definition interface, if
people are interested or have comments... Sorry for the long mail, if you
got to reading this point.
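The rule table above can be turned into job groupings mechanically: analyses joined by a CHAIN rule share one job and pass results in memory, while NOTHING starts a fresh job. A minimal sketch of that grouping logic (illustrative only, not biopipe code; the function name is made up):

```python
# Derive job groupings from a rule table like the one above.
# Each rule is (current_analysis, next_analysis, action).

def group_into_jobs(rules):
    """Return analyses grouped into jobs, chaining on CHAIN actions."""
    jobs = [[rules[0][0]]]            # the first analysis opens the first job
    for current, nxt, action in rules:
        if action == "CHAIN":
            jobs[-1].append(nxt)      # chained analysis joins the running job
        else:
            jobs.append([nxt])        # otherwise the next analysis starts its own job
    return jobs

# The sample table: 1->2 NOTHING, 2->3 CHAIN, 3->4 NOTHING
rules = [(1, 2, "NOTHING"), (2, 3, "CHAIN"), (3, 4, "NOTHING")]
print(group_into_jobs(rules))   # [[1], [2, 3], [4]] -- three jobs, as described
```

This reproduces the three jobs in the mail: InputCreate alone, Blast chained with SetupEst2Genome, and Est2Genome alone.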
shawn

From bala at tll.org.sg Sun Aug 17 16:50:57 2003
From: bala at tll.org.sg (bala@tll.org.sg)
Date: Sun Aug 17 16:53:03 2003
Subject: [Bioperl-pipeline] Re: Changes to come (long long mail)
In-Reply-To: <57661840-CFA7-11D7-976C-000A95783436@fugu-sg.org>
References: <57661840-CFA7-11D7-976C-000A95783436@fugu-sg.org>
Message-ID: <1061153457.897d5f9739c74@webmail.tll.org.sg>

Hi Shawn,

> est->
>   Analysis: Run Blast against genome
>     -> Chain_Output (with filter attached) && (Output (store blast hit) {Optional})
>     -> Analysis (setup_est2genome)
>   Analysis: Est2Genome -> Output (store gene)
>
> We do not need to have some temporary blast hit database but we can
> still have it stored if we want to by attaching an additional output
> iohandler.

I think this approach will be very helpful, as our analyses are getting
more and more focused.
> I would like some help with regards to this application definition
> interface if people are interested or have comments...

I would like to chip in... and maybe after these changes we will have a
very up-to-date version of biopipe, with all the things we have done in
the past months.

bala

--------------------------------------------------------
This mail was sent through Intouch: http://www.techworx.net/
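The in-memory chaining discussed in this thread, with a filter attached between Blast and Est2Genome, can be illustrated with a toy script. All function names and data below are invented for illustration; this is not the biopipe API:

```python
# Toy illustration of chaining analyses in memory: blast hits are filtered
# and handed straight to est2genome, and only the final genes are kept.
# No temporary blast-hit database is involved.

def run_blast(est, genome):
    # stand-in for a real BLAST run: (region, score) pairs
    return [("chr1:100-900", 95), ("chr2:400-1200", 90), ("chr3:50-300", 40)]

def top_two_filter(hits):
    # the attached filter: keep only the top 2 hits (the 2 haplotypes)
    return sorted(hits, key=lambda h: h[1], reverse=True)[:2]

def est2genome(est, region):
    # stand-in for the sensitive alignment against the smaller region
    return "gene(%s@%s)" % (est, region)

def chained_job(est, genome):
    """One job: Blast -> filter -> Est2Genome, passing results in memory."""
    genes = []
    for region, _score in top_two_filter(run_blast(est, genome)):
        genes.append(est2genome(est, region))   # only final genes persist
    return genes

print(chained_job("est1", "genome"))
```

The point of the sketch is that the intermediate blast hits live only inside the job; storing them becomes optional rather than a requirement of the pipeline schema.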