[GSoC] GSoC 2014 queries and inputs

Francesco Strozzi francesco.strozzi at gmail.com
Mon Mar 17 20:18:47 UTC 2014


I don't think it's necessary.  If you would like to use JRuby, there is the
Picard API ( http://picard.sourceforge.net ) which you can reuse right
away. It's fast and well tested.

All the best.
Francesco
On 17 Mar 2014 20:38, "Ujjwal Thaakar" <ujjwalthaakar at gmail.com> wrote:

> Would we have to write a new VCF parser in Ruby?
>
>
> On 15 March 2014 17:33, Ujjwal Thaakar <ujjwalthaakar at gmail.com> wrote:
>
> > Hi,
> > My name is Ujjwal. I'm a 21-year-old student from India and interested in
> > contributing to Bioruby this year. I have certain queries regarding the
> > project idea listed.
> >
> >    1. Can you give me some more use cases for this tool, and some
> >    specific functional requirements we'd like to see? What we need to
> >    mine determines the data structure of our persistence layer and
> >    therefore which database engine to use.
> >    2. When you say a RESTful API, we want to deploy this on a server with
> >    a backing database, together with a Ruby gem that communicates with
> >    the API, right? And I presume we also want people to be able to
> >    compare our hosted VCF files with their local VCF files?
> >    3. Although this is a *Bioruby* project, the server doesn't
> >    necessarily need to be written in Ruby, I presume? As is mentioned,
> >    Scala or JRuby could be used. I would suggest we have a look at Go too.
> >
> > To give you some background about me: I was a GSoC intern last year for
> > Ruby on Rails, where I implemented a RESTful collection routing API. I am
> > an intermediate Ruby programmer. I have also been interested in synthetic
> > biology for about a year now and have some lab experience too, so I
> > understand the basics of biology and specifically genetic engineering. I
> > am a computer science undergrad and have taken a course on data
> > engineering too. I also have experience working with REST APIs and am
> > building one right now for my startup.
> >
> > I have been wondering about the database. I think Neo4j will be a great
> > fit. It's not heavy like Oracle and does not need installation. It's
> > portable and can be started and stopped easily on the machine. It has a
> > low memory footprint and support for SPARQL too, although its native query
> > language, Cypher, will do the trick for us right now. We can run embedded
> > instances too using JRuby, which are super fast. I'm the maintainer of the
> > most popular Neo4j Ruby bindings and also in the process of rewriting the
> > next version of neo4j-core. It will allow us to make all sorts of queries
> > and do data mining at an incredible speed while being incredibly portable
> > and light. All logic can then reside within the gem itself and we do not
> > need any backend. It should be fast enough since we'll be dealing directly
> > with Java objects made available through JRuby. I have a fair idea of how
> > fast this is, and it's really fast, although working with such huge files
> > will have different challenges. We don't need a database for the embedded
> > version. All we need is the jars, which fortunately are available as a gem,
> > so all we have to do is include them as dependencies and our database is
> > ready! I don't think it will be this easy for any other db while giving us
> > the same speed, power and capabilities!
> >
> > I've started working on the proposal and will upload it in a couple of
> > days for your feedback. This is going to be incredibly fun :)
> >
> > BTW, what is the user base of Bioruby like? What does it lack compared
> > with other bio libraries like Biopython?
> >
> > How much biology do I need to understand for this project or will I learn
> > as we go along?
> >
> > --
> > Thanks
> > Ujjwal
> >
>
>
>
> --
> Thanks
> Ujjwal
> _______________________________________________
> GSoC mailing list
> GSoC at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/gsoc
>


