[BioRuby] R: Fwd: Re: BioSQL development

Raoul Bonnal bonnalraoul at ingm.it
Wed Sep 1 13:39:44 UTC 2010


Hello,
I haven't started mapping GFF3 to BioSQL in BioRuby yet, sorry about that.
In theory, if a format can be read in BioRuby as a BioSequence object, then it is possible to start mapping it to the BioSQL schema.
If you have written some code, please share it via git or a gist (http://gist.github.com/) and point out which GFF3 "features" it covers, so that we can start GFF3 support in BioRuby/BioSQL.
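
To make "which GFF3 features" concrete, here is a minimal sketch in plain Ruby that splits one GFF3 feature line into its nine columns and decodes the attribute column. The names `GFF3_COLUMNS`, `gff3_unescape` and `parse_gff3_line` are hypothetical, not part of BioRuby, and the sketch says nothing about which BioSQL table each field should go to; that is the open mapping question.

```ruby
# The nine tab-separated GFF3 columns, in order.
GFF3_COLUMNS = %i[seqid source type start end score strand phase attributes].freeze

# Hypothetical helper: decode %XX escapes (assumes ASCII escapes only).
def gff3_unescape(text)
  text.gsub(/%([0-9A-Fa-f]{2})/) { Regexp.last_match(1).hex.chr }
end

# Hypothetical helper: turn one feature line into a Hash.
# The attribute column becomes { tag => [value, ...] }.
def parse_gff3_line(line)
  fields = line.chomp.split("\t", -1)
  raise ArgumentError, "expected 9 columns, got #{fields.size}" unless fields.size == 9

  record = GFF3_COLUMNS.zip(fields).to_h
  attributes = {}
  record[:attributes].split(';').each do |pair|
    tag, value = pair.split('=', 2)
    # Skip empty pairs and the "." placeholder for "no attributes".
    next if tag.nil? || tag.empty? || value.nil?
    attributes[gff3_unescape(tag)] = value.split(',').map { |v| gff3_unescape(v) }
  end
  record[:attributes] = attributes
  record
end

line = "ctg123\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA00001;Parent=gene00001"
feature = parse_gff3_line(line)
puts feature[:type]
puts feature[:attributes].keys.join(', ')
```

A list of the attribute tags a given file actually uses, collected this way, would be a good starting point for deciding where each one belongs in the schema.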


About CHADO: I think adding another schema is a good thing. Supporting more schemas gives us the flexibility to satisfy more users.

Loading huge datasets is a big problem in BioRuby/BioSQL too: creating the BioSequence object and then moving it down to the low-level BioSQL layer is very CPU- and memory-intensive. My brain is still on vacation, so any ideas on how to speed up the process without relying on a better implementation of the Ruby VM are appreciated.
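
One idea, only as a sketch: read the input as a stream and hand it on in fixed-size batches, so that only one batch is in memory at a time and the database can be hit once per batch instead of once per record. The helper `each_feature_batch` and the batch size are hypothetical; the actual bulk insert is left to the caller, and I have not measured whether this helps with the BioSequence object cost itself.

```ruby
# Hypothetical helper: yield non-comment, non-empty lines from any
# IO-like object in arrays of at most batch_size lines.
def each_feature_batch(io, batch_size = 1000)
  return enum_for(:each_feature_batch, io, batch_size) unless block_given?

  batch = []
  io.each_line do |line|
    line = line.chomp
    # Skip blank lines and "#" comment / directive lines.
    next if line.empty? || line.start_with?('#')

    batch << line
    if batch.size >= batch_size
      yield batch
      batch = []
    end
  end
  # Flush the final, possibly smaller, batch.
  yield batch unless batch.empty?
end
```

Usage would look like `File.open(path) { |f| each_feature_batch(f, 5000) { |lines| ... } }`, with one transaction or one multi-row insert inside the block.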

BioSQL is currently distributed with BioRuby. One idea is to move BioSQL into an optional plugin, and I think the CHADO implementation should follow the same path. BUT we need to discuss the plugin system in BioRuby in more depth first.

About which ORM is better to use: at the beginning I was very comfortable with DataMapper because it had some nice features such as composite primary keys and lazy loading. Unfortunately, the majority of Rails applications use ActiveRecord, and maintaining different ORMs is messy, so I had to choose the most widely supported and best-known ORM. For now I will not convert BioSQL to DataMapper (I do have an original implementation), but I am very open to discussing alternative ORMs; it is also a fascinating topic, at least for me.


Day 1, hell is knocking at my door.

--
Raoul J.P. Bonnal
Life Science Informatics
Integrative Biology Program
Fondazione INGM
Via F. Sforza 28
20122 Milano, IT
phone: +39 02 006 623  26
fax: +39 02 006 623 46
http://www.ingm.it


> -----Original Message-----
> From: bioruby-bounces at lists.open-bio.org
> [mailto:bioruby-bounces at lists.open-bio.org] On behalf of Julian Nordt
> Sent: Sunday, 22 August 2010 17:18
> To: Hilmar Lapp; Rob Syme
> Cc: bioruby at lists.open-bio.org
> Subject: Re: [BioRuby] Fwd: Re: BioSQL development
> 
> One more thing in regard to the mapping between BioSQL and GFF3:
> 
> I tried to follow the mapping given by the BioSQL wiki and Blue Collar
> Bioinformatics. The mapping is acceptable in the sense that you can store
> *most* or even all (?) of the features that GFF3 offers. The further I got
> into the development, though, the less clear things became to me,
> especially in terms of the "attribute" column.
> 
> If you compare the table on the BioSQL wiki (for the attribute column)
> with the one at Blue Collar Bioinformatics, you will notice that there
> are keywords that occur in one table but not in the other. That is not to
> mention the todos on the wiki regarding the "standard" columns. I haven't
> looked through Blue Collar's code in that much detail, though; maybe the
> answer is given there.
> 
> However, I wrote a small library that managed to store most, but not all,
> of the information in the GFF3 files correctly in BioSQL. There were
> some points where the mapping was unclear to me, and there I stored the
> given information where I thought it would fit best.
> 
> Considering that I chose a standard db schema precisely to avoid any
> ambiguity, this, together with the performance issues I experienced with
> MySQL+Rails (not related to BioSQL) in the project, was enough for me to
> switch to CHADO backed by PostgreSQL.
> 
> The documentation regarding CHADO is, in my opinion, richer, and most
> importantly one can follow gmod_bulk_load_gff3.pl for the mapping
> relatively easily, since it is well documented.
> 
> I would very much welcome other opinions on the topic, especially in
> combination with the use of web applications.
> 
> -- Julian
> 
> 
> 
> 
> 
> On Sun, 22 Aug 2010 16:17:45 +0200, Rob Syme <rob.syme at gmail.com>
> wrote:
> 
> > I've had a look around and a pretty solid mapping seems to be available:
> > http://www.biosql.org/wiki/Annotation_Mapping#GFF3
> >
> > Blue collar bioinformatics gave it a shot here:
> > http://bcbio.wordpress.com/2009/02/22/exploring-bioperl-genbank-to-gff-mapping/
> >
> > -r
> >
> > On 22 Aug 2010 22:02, "Hilmar Lapp" <hlapp at drycafe.net> wrote:
> > Is the issue with GFF3 in the BioRuby to BioSQL mapping, or is it somehow
> > in the BioSQL schema?
> >
> > I recall there was a thread on GFF recently which I wasn't able to follow,
> > so if the answer is in that thread and isn't easy to sum up here, just
> > point me there.
> >
> >        -hilmar
> >
> >
> >
> > On Aug 22, 2010, at 6:30 AM, Julian Nordt wrote:
> >
> >> Hi Rob,
> >>
> >> I just wanted to point that there ...
> 
> 




