[Open-bio-l] OBDA redux?

Mon Nov 14 17:59:35 UTC 2011

On Nov 13, 2011, at 6:24 AM, Peter Cock wrote:

> So, Chris and I seem in general agreement that an OBDA v2
> using SQLite but based on essentially the same approach as
> the BDB or flat file based OBDA v1 is a good idea. i.e. Tables
> mapping record identifiers to file offsets in the original sequence
> files.

The worry I have is adhering to a specific backend (e.g. SQLite). I say this because BDB in its time was the go-to way of storing simple index data, but it is no longer feasible for very large data sets. Who's to say something similar won't happen to SQLite, or that it is the best option available?

Maybe we should focus on the data storage schema, as simple as it may be, then state that the default backend must be SQLite but that others are allowed (maybe with a mention that SQLite can be replaced by alternatives in the future if needed).
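
To make the schema-first idea concrete, here is a rough Python sketch. The table and column names (file_data, offset_data, record_key, byte_offset, byte_length) and the helpers build_index and lookup are hypothetical, made up for illustration rather than taken from any OBDA specification. SQLite is just the default backend here; the same two tables could live in another store.

```python
import sqlite3

# Hypothetical schema: all names are illustrative, not from an OBDA spec.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file_data (
    file_number INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS offset_data (
    record_key  TEXT PRIMARY KEY,
    file_number INTEGER NOT NULL,
    byte_offset INTEGER NOT NULL,
    byte_length INTEGER NOT NULL
);
"""


def build_index(conn, filename, records):
    """Store (record_key, byte_offset, byte_length) tuples for one file."""
    conn.executescript(SCHEMA)
    cur = conn.execute("INSERT INTO file_data (name) VALUES (?)", (filename,))
    file_number = cur.lastrowid
    conn.executemany(
        "INSERT INTO offset_data "
        "(record_key, file_number, byte_offset, byte_length) "
        "VALUES (?, ?, ?, ?)",
        [(key, file_number, offset, length) for key, offset, length in records],
    )
    conn.commit()
    return file_number


def lookup(conn, key):
    """Return (filename, byte_offset, byte_length) for a key, or None."""
    return conn.execute(
        "SELECT f.name, o.byte_offset, o.byte_length "
        "FROM offset_data AS o "
        "JOIN file_data AS f ON f.file_number = o.file_number "
        "WHERE o.record_key = ?",
        (key,),
    ).fetchone()
```

Only build_index and lookup touch SQLite directly, so swapping the backend would mean reimplementing those two functions against the same logical schema.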

> I hope to get BioRuby on board, they already have OBDA
> v1 support so that shouldn't be too hard.
> 
> Right now I don't recall if BioJava has/had OBDA v1 support,
> and if they did if it was affected in their recent move to BioJava
> v3 (I understand from their mailing list that some older lower
> priority functionality has not all been ported yet).

I wouldn't be surprised at that. OBDA kind of lingered for a while, and I'm not sure how widely adopted it became (maybe others can shed light on that?).

> Also EMBOSS are likely to be interested, certainly Peter Rice
> was interested in the SQLite indexing we're already using in
> Biopython for sequence files (i.e. what is effectively the
> prototype for OBDA v2).
> 
> Note that in addition to simple indexing of text files, we are
> already using the same simple offset + length approach for
> indexing binary files (e.g. SFF).

I think that's the general idea; that is how all BioPerl data was indexed, previously with the Bio::Index modules and with the OBDA implementations as well.
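
For anyone who hasn't looked at this style of index, a minimal Python sketch of the offset + length approach follows. The function names index_fasta and fetch_raw are hypothetical, and it assumes a FASTA file where every record starts with a '>' line; it is not the Bio::Index or Biopython code. A plain dict stands in for whatever backend holds the index.

```python
def index_fasta(path):
    """Map each record ID to (offset, length), both measured in bytes.

    Assumes a FASTA file in which every record starts with a '>' line.
    """
    index = {}
    current_id, start = None, 0
    with open(path, "rb") as handle:  # binary mode keeps offsets exact
        while True:
            pos = handle.tell()
            line = handle.readline()
            if not line or line.startswith(b">"):
                # A new header or end of file closes the previous record.
                if current_id is not None:
                    index[current_id] = (start, pos - start)
                if not line:
                    break
                fields = line[1:].split(None, 1)
                current_id = fields[0].decode("ascii") if fields else ""
                start = pos
    return index


def fetch_raw(path, offset, length):
    """Return the raw bytes of one record without parsing the whole file."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)
```

Because fetch_raw only seeks and reads bytes, the same lookup works for binary formats too; only the indexing pass needs to understand the file format.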

> On the immediate practical side, I think I can edit the
> current OBDA website of http://obda.open-bio.org/
> via /home/websites/obda.open-bio.org/html on the
> server.

See below for my thoughts on the wiki.

> We need to work out where the current OBDA indexing
> specification lives (CVS or SVN?) and perhaps move
> that to GitHub. We may need a general OBF organisation
> account on GitHub for this and any other cross-project
> repositories.

+1 to a move to GitHub, but maybe this belongs in an OBF-specific organization. And maybe we should take advantage of the simple wiki or project homepage that GitHub offers and move everything (docs) there.

> I see there is already an OBDA project on RedMine,
> (Chris can you add me to that please?)
> https://redmine.open-bio.org/projects/obda
> 
> Peter

Done (last night actually, but I didn't have time to respond immediately).

chris