[GSoC] Bionode project idea

Bruno Vieira mail at bmpvieira.com
Mon Feb 15 14:35:44 UTC 2016


Here's the proposal:

*Bionode workflow engine for streamed data analysis*

Researchers should be able to:
  * Perform analyses while data are generated (i.e., with “data streams”);
  * Easily and rapidly update results if input data or analysis
approaches/parameters change (with minimal recomputation);
  * Effortlessly change and scale underlying computing platforms while a
pipeline is running;
  * Easily visualise results.

This is currently largely impossible because existing approaches were
developed when datasets were smaller and simpler. The student will take
advantage of recent improvements in generic analysis tools (Node.js Streams
and asynchronous concurrency) to attain the above objectives.
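
As a rough illustration of the streaming style meant here, the sketch below
uses only Node.js core streams to process records one at a time as they
arrive. The record shape ({ id, seq }), the helper names (gcContent,
createGcAnnotator, createCollector) and the GC-content calculation are
assumptions made for this example, not part of any existing Bionode API.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Hypothetical helper: fraction of G/C bases in a sequence string.
function gcContent(seq) {
  if (seq.length === 0) return 0;
  let gc = 0;
  for (const base of seq.toUpperCase()) {
    if (base === 'G' || base === 'C') gc += 1;
  }
  return gc / seq.length;
}

// Object-mode Transform: annotates each record as soon as it arrives,
// without waiting for the rest of the input.
function createGcAnnotator() {
  return new Transform({
    objectMode: true,
    transform(record, _encoding, callback) {
      callback(null, Object.assign({}, record, { gc: gcContent(record.seq) }));
    }
  });
}

// Object-mode Writable: hands each annotated record to a callback.
function createCollector(onRecord) {
  return new Writable({
    objectMode: true,
    write(record, _encoding, callback) {
      onRecord(record);
      callback();
    }
  });
}

// Usage: the array stands in for a live data source (assumption).
const records = [
  { id: 'read1', seq: 'ATGC' },
  { id: 'read2', seq: 'GGCC' }
];

pipeline(
  Readable.from(records),
  createGcAnnotator(),
  createCollector((r) => console.log(r.id, r.gc)),
  (err) => {
    if (err) console.error('pipeline failed:', err);
  }
);
```

Because every stage handles one record at a time, results become available
while input is still being produced, and backpressure is handled by the
stream machinery rather than by the analysis code.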

The student will create a workflow engine for streamed data analysis with
concurrent pipelining. The main mentors will be Max Ogden and Mathias Buus,
top Node.js contributors and founders of Dat-data.com, chosen for their
experience with streaming interfaces. Bruno Vieira (founder of Bionode.io)
and Yannick Wurm (lecturer in Bioinformatics at QMUL) will co-supervise.

Some work on the data structures and programming interfaces for commonly
used data sources (e.g., NCBI, Uniprot, Ensembl/Biomart) and data types
(e.g., VCF, BAM, FASTQ) will be required.
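
To give an idea of the kind of data structure involved, here is a minimal
sketch that turns FASTQ text into plain objects. It assumes the common
four-line layout (header, sequence, "+" separator, qualities) and does not
handle multi-line records. The function name parseFastq and the field names
id, seq and qual are hypothetical choices for this example.

```javascript
// Parse four-line FASTQ text into an array of { id, seq, qual } objects.
// Assumption: each record occupies exactly four non-empty lines.
function parseFastq(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length % 4 !== 0) {
    throw new Error('FASTQ input must contain a multiple of four lines');
  }

  const records = [];
  for (let i = 0; i < lines.length; i += 4) {
    const [header, seq, separator, qual] = lines.slice(i, i + 4);
    const recordNumber = i / 4 + 1;

    if (!header.startsWith('@') || !separator.startsWith('+')) {
      throw new Error(`Malformed FASTQ record ${recordNumber}`);
    }
    if (seq.length !== qual.length) {
      throw new Error(`Sequence/quality length mismatch in record ${recordNumber}`);
    }

    // The identifier is the header text up to the first whitespace.
    records.push({ id: header.slice(1).split(/\s/)[0], seq, qual });
  }
  return records;
}

// Usage with a small in-memory example.
const example = '@r1 sample read\nACGT\n+\nIIII\n';
console.log(parseFastq(example));
```

A real interface would more likely expose this as a stream so that large
files never need to fit in memory; the whole-string version is used here
only to keep the record structure easy to see.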

The underlying computational architecture should be abstracted. This means
that analysis code will run identically on traditional high-performance
computing systems (e.g., Torque, SGE) and on modern systems (e.g., Hadoop
MapReduce).
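
One possible way to picture this abstraction: the pipeline describes a task
once, and a small backend table decides how it is submitted. The sketch
below only renders the command line and does not execute anything. The task
fields (name, cores, script), the function name renderSubmission and the SGE
parallel environment name "smp" are assumptions for illustration; parallel
environment names are site-specific, and Hadoop is left out.

```javascript
// Each backend turns the same task description into an argv array.
// Assumption: a task is { name, cores, script }.
const backends = {
  // Run the script directly on the local machine.
  local: (task) => ['sh', task.script],
  // Torque: request cores through the nodes/ppn resource specification.
  torque: (task) => [
    'qsub', '-N', task.name, '-l', `nodes=1:ppn=${task.cores}`, task.script
  ],
  // SGE: request slots through a parallel environment ("smp" is assumed).
  sge: (task) => [
    'qsub', '-N', task.name, '-pe', 'smp', String(task.cores), task.script
  ]
};

// Look up the backend by name and render the submission command.
function renderSubmission(backendName, task) {
  if (!Object.prototype.hasOwnProperty.call(backends, backendName)) {
    throw new Error(`Unknown backend: ${backendName}`);
  }
  return backends[backendName](task);
}

// Usage: the same task rendered for every backend.
const task = { name: 'align', cores: 4, script: 'align.sh' };
for (const name of Object.keys(backends)) {
  console.log(name, renderSubmission(name, task).join(' '));
}
```

The analysis code only ever sees the task object, so moving to another
platform would mean adding one entry to the table rather than rewriting the
pipeline.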

Some components and proofs of concept required for this project are
available at http://github.com/bionode

JavaScript skills are required. Node.js and some biology knowledge are a
plus. Difficulty is medium.

Cheers,
Bruno

On Mon, Feb 15, 2016 at 1:35 PM Bruno Vieira <mail at bmpvieira.com> wrote:

> Hi all,
>
> Would it be possible to propose an idea for the Bionode.io project through
> OBF?
> If so, please let me know the proper process to submit, since I saw in
> another thread that you're having issues with the wiki.
>
> Cheers,
> Bruno
> bmpvieira.com    bionode.io    wurmlab.github.io
>