[BioRuby] gsoc suggestion: microframework for simple scientific web wrappers

Pjotr Prins pjotr.public14 at thebird.nl
Thu Feb 6 06:34:24 UTC 2014


This is a very good idea, and ties in with earlier bio-ngs work and
our future plans in pipeline software management. 

GSoC also likes 'infrastructure' type projects - we found that out at
the last summit.

Do add it to the OBF project proposal list. Also mention bio-ngs and
your project.

Pj.

On Thu, Feb 06, 2014 at 12:09:58AM +0000, Yannick Wurm wrote:
> Dear all,
> 
> a small thought about a potential GSoC project. 
> 
> Many bioinformatics tools consist of a binary that you run on the command line with one or a few input files and some parameters, and that generates some output files. Let's consider only software that generates potentially human-readable output. 
> 
> Most of us on this mailing list have no problem running that kind of software on the command-line. But for the majority of biologists that's still impossible: they need a point and click interface instead. 
> 
> So if you're the person who needs to implement that point and click interface, how do you do it? 
>  1. create a wrapper for galaxy [1]. This has become easy, but it puts the burden on your end user to have or set up a galaxy installation (not trivial), and the galaxy user experience is debatable.
>  2. use sinatra.rb (we did this for our sequenceserver wrapper for blast) - it worked but involved way too much manual labor.
>  3. be old-skool (build your own from php/etc).
> 
> 
> Clearly 1. isn't always appropriate & locks you into a weird framework, and 2. is still too much work. Padrino & rails are overkill for the simplest apps. With Ruby providing such great web development frameworks, why isn't there an easier/faster way to generate a web wrapper around a piece of scientific software? 
> 
> Perhaps I'm missing something. 
> 
> Alternatively, creating a "wrapping scientific software" framework could be a viable GSoC project. 
> 
> Build it upon Sinatra: create a rigid framework where the basic locations of the files that the developer needs to edit are predetermined (similarly to rails). A single page/web form for the user to enter data; a single output/download page after a successful run. No need to store any user data on the server. The framework should include the following features: 
>  * easy way to verify presence, executability and version of binary (or script) that is being wrapped
>  * easy way to specify number of input files, and potential constraints on them  [this stuff should be specified once; appropriate HTML should be auto-generated (bootstrap)]. 
>     * most basic constraints: size and/or extension
>     * more advanced constraints: user-extensible function that verifies the format
>  * easy way to specify possible parameters and constraints on their types 
>  * easy way to show/include local data (HMM models, sequence databases etc...)
>  * easy way to make text-output look good
>     * eg. inserting specific headers or indexing at specific regexps (for table of contents)
>     * eg. csv output should be shown as a table
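
To make the first items in the list above a bit more concrete, here is a rough sketch of how the binary check and the input-file constraints might be declared. Everything in it is an assumption made for illustration: `find_executable`, `binary_version`, `InputSpec` and `FASTA_SPEC` are hypothetical names, not part of Sinatra, BioRuby or any existing gem, and the `--version` flag is only a guess at what a wrapped tool accepts. It uses the Ruby standard library only.

```ruby
require "open3"

# Hypothetical helper: locate an executable, either given as a path or
# searched for on PATH. Returns the full path, or nil if nothing usable exists.
def find_executable(name)
  if name.include?(File::SEPARATOR)
    return File.file?(name) && File.executable?(name) ? name : nil
  end
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Hypothetical helper: run the binary with a version flag (assumed to be
# "--version") and pull the first x.y or x.y.z string out of its output.
def binary_version(path, flag = "--version")
  output, _status = Open3.capture2e(path, flag)
  output[/\d+\.\d+(\.\d+)?/]
end

# Declarative constraints on one uploaded input file.
InputSpec = Struct.new(:extensions, :max_bytes, :validator) do
  # Returns a list of human-readable problems; an empty list means "accept".
  def errors_for(path)
    return ["file not found"] unless File.file?(path)
    errors = []
    errors << "unexpected extension" unless extensions.include?(File.extname(path).downcase)
    errors << "file too large" if File.size(path) > max_bytes
    errors << "format check failed" if validator && !validator.call(path)
    errors
  end
end

# Example spec: a FASTA file of at most 1 MB whose first line starts with ">".
# The lambda plays the role of the "user-extensible function" for format checks.
FASTA_SPEC = InputSpec.new(
  [".fa", ".fasta"],
  1_000_000,
  ->(path) { File.foreach(path).first.to_s.start_with?(">") }
)
```

A Sinatra route could then call `FASTA_SPEC.errors_for(upload_path)` before running the wrapped tool and render any returned messages back into the form. Keeping the spec as plain data is what would let the framework auto-generate the matching HTML.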
> 
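
The "csv output should be shown as a table" item could be handled with very little code. The following is only a sketch under assumptions: `csv_to_html_table` is a hypothetical helper name, the first CSV row is assumed to be a header, and `class="table"` is used on the assumption that Bootstrap styling is loaded on the page. It relies on Ruby's `csv` and `cgi` libraries.

```ruby
require "csv"
require "cgi"

# Hypothetical helper: turn CSV text into an HTML table.
# Assumes the first row is the header; all cell values are HTML-escaped.
def csv_to_html_table(csv_text)
  rows = CSV.parse(csv_text)
  return "<table></table>" if rows.empty?

  header, *body = rows
  cell = ->(tag, value) { "<#{tag}>#{CGI.escapeHTML(value.to_s)}</#{tag}>" }

  html = +"<table class=\"table\">\n"
  html << "<thead><tr>" << header.map { |v| cell.call("th", v) }.join << "</tr></thead>\n"
  html << "<tbody>\n"
  body.each do |row|
    html << "<tr>" << row.map { |v| cell.call("td", v) }.join << "</tr>\n"
  end
  html << "</tbody>\n</table>"
end
```

Escaping every cell matters here because the output of the wrapped tool ends up in a web page; sequence identifiers and similar fields can easily contain characters such as `<` or `&`.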
> I'm not the best qualified person to consider exact implementation details, but if someone wants to go ahead with it I'm happy to provide more general thoughts. 
> 
> Cheers,
> 
> Yannick
> 
> [1]: http://galaxyproject.org
> 
> 
> -------------------------------------------------------
> Yannick Wurm - http://yannick.poulet.org
> Ants, Genomes & Evolution - y.wurm at qmul.ac.uk - skype:yannickwurm - +44 207 882 3049
> 5.03A Fogg - School of Biological & Chemical Sciences - Queen Mary, University of London - Mile End Road - E1 4NS London - UK
> 
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


