[Open-bio-l] Re: project update for newsletter
Jason Stajich
jason@cgt.mc.duke.edu
Fri, 19 Jul 2002 13:25:19 -0400 (EDT)
OBDA is more than just flat-file indexing, though; we need to spend time on
the remote data and how BioSQL fits in. This is just a newsletter, so it
only needs to be a blurb, and I'm not sure we even need to discuss how
it is implemented. I just want a description of the major components of
OBDA (flat file, Web/remote DBs over HTTP and SOAP, CORBA, and BioSQL) and
what support they have in the various projects.
The website is http://obda.open-bio.org as I have posted before (and it
needs a volunteer to put content and interpretations together for the user
who wants to know what the heck is going on with this).
This is an important part of doing a project where we want to establish a
standard: communicating what it accomplishes in a quick description. I'm not
ripping on your stuff, Andrew, just frustrated because I think we've lost
steam to get any sort of documentation together justifying the work we've
put into things and demonstrating how it is useful.
On Fri, 19 Jul 2002, Andrew Dalke wrote:
> Jason Stajich:
> > Is there any chance I can squeeze 2 paragraphs out of you guys on your
> > OBF projects? I need it by this weekend. I also need someone to write an
> > OBDA and BioSQL description if any of you are interested....
>
> Something like this? This is based on the text from the lightning
> talk I'm giving in a couple weeks.
>
> OBF Flatfile Indexing
>
> Most bioinformatics data is available as flat files. Many labs only
> need simple retrieval of the text of a record in the file given some
> identifier, like a record name, but don't need the extra overhead of
> setting up a database management system. As part of the Open
> Bioinformatics Foundation biohackathon, the Biopython, Bioperl,
> BioJava, and C coders got together to define and implement for those
> platforms a standard, cross-platform, interoperable indexing scheme.
>
> Given an identifier and optional namespace (needed to distinguish
> between, e.g., the entry id and the accession name) the index works
> like a lookup table to return the list of matching locations, where
> each location is the filename and byte range of the record in the file.
>
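
For readers who want something concrete, the lookup described above can be
sketched roughly as follows. This is only an illustration of the idea, not
the spec's on-disk format: the names Location, LookupTable, and fetch, and
the default namespace "ID", are made up for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Location(NamedTuple):
    """Where one record lives (hypothetical type, not from the spec)."""
    filename: str
    start: int   # byte offset of the record's first byte
    length: int  # number of bytes in the record


class LookupTable:
    """Maps (namespace, identifier) to a list of record locations."""

    def __init__(self):
        self._table = defaultdict(list)

    def add(self, namespace, identifier, location):
        self._table[(namespace, identifier)].append(location)

    def lookup(self, identifier, namespace="ID"):
        # "ID" as the default namespace is an assumption for this sketch.
        # Unknown identifiers give an empty list rather than an error.
        return list(self._table.get((namespace, identifier), []))


def fetch(location):
    """Return the raw bytes of the record at the given location."""
    with open(location.filename, "rb") as handle:
        handle.seek(location.start)
        return handle.read(location.length)
```

A caller would do something like fetch(table.lookup("P12345")[0]) to get the
text of the first matching record; the same identifier can map to different
records depending on the namespace it is looked up in.
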
> The spec defines two types of indexing schemes. The simplest uses
> flat files to store the index and is for sites which don't want to
> install any extra software and are willing to trade off some space and
> ease of modification. The second fixes those limitations but requires
> that Sleepycat's BerkeleyDB be installed on the local machine. The
> choice is up to the user, and the indexing type can be determined
> automatically so the choice is transparent when doing lookups.
>
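
The "determined automatically" part could work along these lines. This is a
sketch under stated assumptions: it supposes the index directory holds a
small config file whose first line is "index<TAB>type/version". The file name
config.dat, the marker strings, and the function names are assumptions for
illustration, not quoted from the spec.

```python
import os

# Assumed marker strings mapped to backend names (illustrative only).
KNOWN_INDEX_TYPES = {"flat": "flat", "berkeleydb": "bdb"}


def detect_index_type(index_dir, config_name="config.dat"):
    """Return 'flat' or 'bdb' based on the first line of the config file."""
    config_path = os.path.join(index_dir, config_name)
    with open(config_path, encoding="ascii") as handle:
        first_line = handle.readline().rstrip("\r\n")
    key, _, value = first_line.partition("\t")
    if key != "index":
        raise ValueError(f"unexpected first line: {first_line!r}")
    # Drop the "/version" suffix and compare case-insensitively.
    kind = value.split("/", 1)[0].lower()
    try:
        return KNOWN_INDEX_TYPES[kind]
    except KeyError:
        raise ValueError(f"unknown index type: {value!r}") from None


def open_index(index_dir, openers):
    """Call whichever opener matches the detected index type.

    `openers` maps 'flat' / 'bdb' to callables taking the index directory.
    """
    return openers[detect_index_type(index_dir)](index_dir)
```

The point of the dispatch is that lookup code only ever calls open_index and
never needs to know which scheme the site chose when it built the index.
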
> The full specification is freely available for anyone to use and
> support. For more information see ...
>
> ... Hmm, where do people go for more information?
>
> Andrew
> dalke@dalkescientific.com
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu