[GSoC] GSoC 2014 queries and inputs

Ujjwal Thaakar ujjwalthaakar at gmail.com
Mon Mar 17 20:37:12 UTC 2014


If it's fine to have a JRuby-only implementation, then we could definitely
write a thin wrapper over Picard.


On 18 March 2014 01:56, Ujjwal Thaakar <ujjwalthaakar at gmail.com> wrote:

> When we say BioRuby I think it should work with Ruby - CRuby, JRuby,
> Rubinius etc. I'm not sure it's a good idea to constrain people to JRuby!
>
>
> On 18 March 2014 01:48, Francesco Strozzi <francesco.strozzi at gmail.com> wrote:
>
>> I don't think it's necessary.  If you would like to use JRuby, there is
>> the Picard API ( http://picard.sourceforge.net ) which you can reuse
>> right away. It's fast and well tested.
>>
>> All the best.
>> Francesco
>> On 17 Mar 2014 at 20:38, "Ujjwal Thaakar" <ujjwalthaakar at gmail.com> wrote:
>>
>>> Would we have to write a new VCF parser in Ruby?
>>>
>>>
>>>
>>> On 15 March 2014 17:33, Ujjwal Thaakar <ujjwalthaakar at gmail.com> wrote:
>>>
>>> > Hi,
>>> > My name is Ujjwal. I'm a 21-year-old student from India and am
>>> > interested in contributing to BioRuby this year. I have some questions
>>> > about the project idea listed.
>>> >
>>> >    1. Can you give me some more use cases for this tool, and some
>>> >    specific functional requirements we'd like to see? What we need to
>>> >    mine determines the data structure of our persistence layer and
>>> >    therefore which database engine to use.
>>> >    2. When you say a RESTful API, we want to deploy this on a server
>>> >    with a backing database, together with a Ruby gem that communicates
>>> >    with the API, right? And I presume we also want people to be able to
>>> >    compare our hosted VCF files with their local VCF files.
>>> >    3. Although this is a *BioRuby* project, the server doesn't
>>> >    necessarily need to be written in Ruby, I presume? As is mentioned,
>>> >    Scala or JRuby could be used. I would suggest we have a look at Go
>>> >    too.
>>> >
>>> > To give you some background about me: I was a GSoC intern last year
>>> > for Ruby on Rails, where I implemented a RESTful collection routing
>>> > API. I am an intermediate Ruby programmer. I have also been interested
>>> > in synthetic biology for about a year now and have some lab experience
>>> > too, so I understand the basics of biology and specifically genetic
>>> > engineering. I am a computer science undergrad and have taken a course
>>> > on data engineering too. I also have experience working with REST APIs
>>> > and am building one right now for my startup.
>>> >
>>> > I have been thinking about the database. I think Neo4j will be a great
>>> > fit. It's not heavy like Oracle and does not need installation. It's
>>> > portable and can be started and stopped easily on the machine. It has a
>>> > low memory footprint and support for SPARQL too, although its native
>>> > query language, Cypher, will do the trick for us right now. We can run
>>> > embedded instances too using JRuby, which are super fast. I'm the
>>> > maintainer of the most popular Neo4j Ruby bindings and am also in the
>>> > process of rewriting the next version of neo4j-core. It will allow us
>>> > to make all sorts of queries and do data mining at an incredible speed
>>> > while being incredibly portable and light. All logic can then reside
>>> > within the gem itself and we do not need any backend. It should be fast
>>> > enough since we'll be dealing directly with Java objects made available
>>> > through JRuby. I have a fair idea of how fast this is, and it's really
>>> > fast, although working with such huge files will have different
>>> > challenges. We don't need a separate database installation for the
>>> > embedded version. All we need is the jars, which fortunately are
>>> > available as a gem, so all we have to do is include them as
>>> > dependencies and our database is ready! I don't think it will be this
>>> > easy for any other DB while giving us the same speed, power and
>>> > capabilities!
>>> >
>>> > I've started working on the proposal and will upload it in a couple of
>>> > days for your feedback. This is going to be incredibly fun :)
>>> >
>>> > BTW, what is the user base of BioRuby like? What does it lack compared
>>> > with other bio libraries like Biopython?
>>> >
>>> > How much biology do I need to understand for this project, or will I
>>> > learn as we go along?
>>> >
>>> > --
>>> > Thanks
>>> > Ujjwal
>>> >
>>>
>>>
>>>
>>> --
>>> Thanks
>>> Ujjwal
>>> _______________________________________________
>>> GSoC mailing list
>>> GSoC at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/gsoc
>>>
>>
>
>
> --
> Thanks
> Ujjwal
>



-- 
Thanks
Ujjwal


