[BioRuby] BioRuby Digest, Vol 110, Issue 1

Michael Barton mail at michaelbarton.me.uk
Thu Mar 5 21:06:21 UTC 2015


If you're considering Docker support, you might be interested in this
project I've been working on recently: github.com/bioboxes/rfc. We're
aiming to standardise many of the common bioinformatics tools inside Docker
containers so they can be used interchangeably.

On 4 March 2015 at 02:38, Yannick Wurm <y.wurm at qmul.ac.uk> wrote:

> Hey Francesco,
>
> that's very cool. I like the fact that it abstracts away all the
> complication of the queuing system. Can you use pipengine without a queuing
> system/scheduler (i.e. on a single 48-core fat node)?
>
> Is there an easily searchable bioinfo-core mailing list archive? I am a
> member but cannot easily find the discussion you mention.
>
> I agree that it's challenging to find/create one-size-fits-all solutions.
> However I do think that there is a need for a "pipelining" solution that is
> sufficiently biologist-friendly to get them to immediately see the value
> (saving them time AND improving
> agility/reproducibility/maintainability/shareability). Ad-hoc solutions
> produced by biologists tend to do everything badly...
>
> Cheers,
> Yannick
>
> p.s.: Sorry about the GSoC & thanks for your efforts in putting it
> together...
> p.p.s.: Docker is amazeballs :)
>         Have a look at (WIP) https://github.com/yeban/oswitch
>         We're facilitating transparent switching (files/paths/ids
>         conserved) back and forth between different OSes.
>
>
>
> > On 3 Mar 2015, at 12:00, bioruby-request at mailman.open-bio.org wrote:
> >
> > Hi Yannick,
> > that's an interesting topic.
> > I have been working for a while on a Ruby package to handle pipelines and
> > distributed analyses in our Bioinformatics core: the code is here
> > https://github.com/fstrozzi/bioruby-pipengine .
> >
> > With this solution we have decided to stick to a simple approach, i.e.
> > pipeline templates written in YAML where you can put raw command lines
> > with simple placeholders that get substituted at run time according to
> > your project and samples. So the DSL is reduced to a minimum and the tool
> > then creates runnable scripts that can be sent through a queuing system.
> > There is also simple error control for jobs, plus checkpoints to skip
> > already completed steps for a given pipeline.
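
The placeholder approach described above can be sketched in a few lines of
Ruby. This is only an illustration under assumptions: the template layout
(a "steps" map with a "run" entry), the <name> placeholder syntax and the
helper names fill and build_script are all hypothetical, and are not taken
from PipEngine's actual template format or API. The commands in the template
are just example command lines.

```ruby
require "yaml"

# Hypothetical template layout, NOT PipEngine's real schema.
TEMPLATE = <<~YAML
  steps:
    align:
      run: "bwa mem <genome> <sample_path>/<sample>.fastq > <output>/<sample>.sam"
    sort:
      run: "samtools sort -o <output>/<sample>.bam <output>/<sample>.sam"
YAML

# Replace every <name> placeholder with its value.
# Raises KeyError if the template uses a name we have no value for.
def fill(command, values)
  command.gsub(/<(\w+)>/) do
    values.fetch(Regexp.last_match(1)) { |key| raise KeyError, "no value for <#{key}>" }
  end
end

# Build the text of a shell script for one sample from the chosen steps.
def build_script(template_yaml, step_names, values)
  steps = YAML.safe_load(template_yaml).fetch("steps")
  lines = ["#!/bin/bash", "set -e"] # stop at the first failing command
  step_names.each { |name| lines << fill(steps.fetch(name).fetch("run"), values) }
  lines.join("\n") + "\n"
end

values = {
  "genome"      => "ref.fa",
  "sample"      => "s1",
  "sample_path" => "/data",
  "output"      => "/out"
}
puts build_script(TEMPLATE, %w[align sort], values)
```

The resulting text could then be written to a file and handed to whatever
scheduler is available, or run directly on a single machine. Error control
and checkpoints are left out of the sketch.
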
> > This is *very* Illumina-centric and so far it works only through a
> > Torque/PBS scheduler (this is what we have in-house). It is a bit rough
> > but we have been using it for more than 2 years now and we are quite
> > happy. I know it has also been used in other places. I've recently
> > started a Scala implementation of this code
> > (https://github.com/fstrozzi/PipEngine), to make it more portable and
> > also to introduce a number of improvements. It's still very much a work
> > in progress, but among other things we want to add support for multiple
> > queuing systems, step dependencies and Docker.
> >
> > Anyway, the point with these solutions, in my opinion, is that I do not
> > think there could be a perfect tool that can fit every purpose or
> > scenario or environment. There was a similar discussion also on the
> > biocore mailing list some time ago and it turned out that many centres
> > either use their own systems or take existing solutions, such as for
> > instance Bpipe, and modify them to fit their needs. Nextflow is also a
> > very nice tool.
> >
> > In the end we have done the same and developed a solution that, even
> > with its own limitations, fits our needs and our way of structuring and
> > organising the data analyses.
> >
> > Cheers
> > Francesco
>
>
>
> -------------------------------------------------------
> Yannick Wurm - http://wurmlab.github.io
> Ants, Genomes & Evolution ⋅ y.wurm at qmul.ac.uk ⋅ skype:yannickwurm ⋅ +44
> 207 882 3049
> 5.03A Fogg ⋅ School of Biological & Chemical Sciences ⋅ Queen Mary,
> University of London ⋅ Mile End Road ⋅ E1 4NS London ⋅ UK
>
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/bioruby

