[BioSQL-l] questions
Hilmar Lapp
hlapp at gnf.org
Tue Feb 1 12:52:13 EST 2005
On Thursday, January 27, 2005, at 01:25 PM, Tamas Hegedus wrote:
> -----------------------------
> DOCUMENTATION; WHYs; packing
> -----------------------------
> I know that BioSQL is under development, but it is not a
> 'theoretical' project; it is intended to be used by users.
> More users, more feedback, more development, happier programmers.
> But at this moment it is very difficult to recognize its advantages:
> => Why is BioSQL (RDBMS) better than other solutions (e.g. flat files);
> why should I use it for my project?
I summarized the most popular use cases at BOSC03. You may want to
check out
http://www.open-bio.org/bosc2003/slides/Persistent_Bioperl_BOSC03.pdf
> => What to download, from where to download? (In my opinion CVS is
> definitely for programmers, not for biologists.)
> => What programs can I use to access the data? Only scripting? No! I
> can use it e.g. with GBrowse, exactly what I need...
There is an adaptor that bridges biosql so that it can be used by
gbrowse; there has just been a thread on this on the gbrowse mailing
list. There are possibly some wrinkles, though, that need to be worked
out so that Gbrowse finds the features it is supposed to find. Check
out the last week in the gbrowse mailing list archive.
> => Is there any convenient way to query the database? I do not really
> want to learn SQL. How can I perform queries and link the returned
> entries to 'conventional' analysis tools (like pattern search)?
Bioperl-db provides you with an interface (object-relational mapper)
that lets you interact with the database through bioperl and query
objects, not SQL. When you run a search ($adaptor->find_by_XXX) you'll
get bioperl objects returned.
Note, though, that ultimately SQL is always going to be much more
powerful than the object interface.
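
To make the contrast concrete, here is a minimal Python sketch. It is
not bioperl-db and not the real BioSQL schema: the two tables are
cut-down stand-ins, and Entry, make_demo_db, find_by_accession and
mean_length are hypothetical names invented for the illustration. The
finder hands back an object, the way a find_by_XXX call would; the
aggregate at the end is the kind of question that is a one-liner in SQL
but awkward through objects.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a sequence object returned by a finder."""
    accession: str
    name: str
    length: int


def make_demo_db():
    """Build an in-memory database with two simplified, BioSQL-like tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bioentry ("
        "bioentry_id INTEGER PRIMARY KEY, accession TEXT, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE biosequence ("
        "bioentry_id INTEGER REFERENCES bioentry, length INTEGER, seq TEXT)"
    )
    # Made-up demo rows, not real database entries.
    demo = [
        (1, "DEMO0001", "DEMO1", "MKTAY"),
        (2, "DEMO0002", "DEMO2", "MKTAYIAKQR"),
        (3, "DEMO0003", "DEMO3", "MKTAYIAKQRQISFV"),
    ]
    for entry_id, accession, name, seq in demo:
        conn.execute(
            "INSERT INTO bioentry VALUES (?, ?, ?)", (entry_id, accession, name)
        )
        conn.execute(
            "INSERT INTO biosequence VALUES (?, ?, ?)", (entry_id, len(seq), seq)
        )
    return conn


def find_by_accession(conn, accession):
    """Object-style lookup: the caller never sees the SQL."""
    row = conn.execute(
        "SELECT e.accession, e.name, s.length "
        "FROM bioentry e JOIN biosequence s ON s.bioentry_id = e.bioentry_id "
        "WHERE e.accession = ?",
        (accession,),
    ).fetchone()
    return Entry(*row) if row else None


def mean_length(conn):
    """Plain SQL: an aggregate over every sequence in one statement."""
    (average,) = conn.execute("SELECT AVG(length) FROM biosequence").fetchone()
    return average
```

The finder covers the common "give me this entry" case without any SQL
on the caller's side; anything that joins, filters or aggregates across
many entries is where writing the SQL yourself pays off.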
> ----
> For developers: if you have worked on BioSQL constantly (from the
> beginning), you will know what each column is for (like 'Rank') and
> what its role is in a specific relation. Others can find these things
> out if they populate the database and dig into it, but so much energy
> is needed that the developer finds an easier way to solve his problem.
> ----
Have you checked out the doc directory in the repository? There is a
schema overview and an ERD. Those two will still leave many questions
open, like 'rank', but it could be a start nonetheless.
> I know that it is a huge amount of work to create (and keep up to
> date) a website. Personally I really do not like (hate) creating web
> pages. But I think a website for BioSQL would greatly accelerate the
> BioSQL project.
Unfortunately, almost all developers have the same enthusiasm for
creating web pages as you have. Volunteer web-page authors are what
the OBF needs most desperately. I have no doubt that informative,
well-organized, and most importantly regularly updated web pages would
help the biosql project, but this is also the area where people could
volunteer most easily.
Biosql, like all other OBF (and generally open-source) projects, is a
project built by people who volunteer their time ...
>
> -----------------------------
> SCHEMA; RDBMSs
> -----------------------------
> I may be raising the next question only because I do not see the
> depths of BioSQL; I know only PgSQL and MySQL.
>
> Why do you have different schemas for different database servers
> (PgSQL, hsqldb, Oracle)? I can guess why for MySQL...
> Would it not be possible to maintain only one schema? It could free
> energy/time for other things.
A single schema is how it worked initially. When the schema evolved it
became a pain for two reasons. First, the schema translator wasn't
capable of the full SQL standard, and bringing it up to requirements
was more work than maintaining multiple versions of the schema. Second,
the MySQL version used to be the reference, which is bad because in
MySQL you can express only a small subset of the DDL capabilities that
you have in other RDBMSs.
Note that biosql is meant to be a very stable schema. There may be
changes in the future, but not at a rapid pace. It's not been a big
time sink to maintain 3 versions since the Singapore changes. This is
not a programming library - it's a schema.
> [...]
>
> ------------------------------
> PYTHON
> ------------------------------
> I prefer python over perl (e.g. because of this I had extra struggles
> to install BioSQL with SwissProt).
>
> If I knew the object mapping I think I could write a python
> script to load SwissProt into BioSQL (it should be easy
> and straightforward).
You may want to talk to the guys on the biopython list on how they
stored sequences in biosql.
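
As a rough idea of what such a loader boils down to, here is a Python
sketch using only the standard library. Everything in it is an
assumption made for illustration: the tables are a cut-down stand-in
for the real schema, the record is made up, and create_tables,
parse_records and load are hypothetical names. The parser reads only
the ID, AC, DE and SQ lines of a Swiss-Prot-style record; a real
loader has to handle far more (features, references, taxa, updates).

```python
import sqlite3

# Made-up record in a simplified Swiss-Prot-like layout.
DEMO_RECORD = """\
ID   DEMO1_TEST     Reviewed;    10 AA.
AC   DEMO0001;
DE   Hypothetical demo protein.
SQ   SEQUENCE   10 AA;
     MKTAYIAKQR
//
"""


def create_tables(conn):
    """Create two simplified, BioSQL-like tables (not the real schema)."""
    conn.execute(
        "CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, "
        "accession TEXT, name TEXT, description TEXT)"
    )
    conn.execute(
        "CREATE TABLE biosequence ("
        "bioentry_id INTEGER REFERENCES bioentry, length INTEGER, seq TEXT)"
    )


def parse_records(text):
    """Yield one dict per record, reading only ID, AC, DE and SQ lines."""
    record, in_seq = {}, False
    for line in text.splitlines():
        if line.startswith("//"):  # record terminator
            if record:
                yield record
            record, in_seq = {}, False
        elif in_seq:  # sequence lines follow the SQ header
            record["seq"] += line.replace(" ", "")
        else:
            code, value = line[:2], line[5:].strip()
            if code == "ID":
                record["name"] = value.split()[0]
            elif code == "AC":
                # Keep the first (primary) accession only.
                record.setdefault("accession", value.split(";")[0].strip())
            elif code == "DE":
                record["description"] = value
            elif code == "SQ":
                record["seq"], in_seq = "", True


def load(conn, records):
    """Map each parsed record onto one bioentry row and one biosequence row."""
    count = 0
    for rec in records:
        cur = conn.execute(
            "INSERT INTO bioentry (accession, name, description) VALUES (?, ?, ?)",
            (rec["accession"], rec["name"], rec.get("description")),
        )
        conn.execute(
            "INSERT INTO biosequence (bioentry_id, length, seq) VALUES (?, ?, ?)",
            (cur.lastrowid, len(rec["seq"]), rec["seq"]),
        )
        count += 1
    conn.commit()
    return count
```

The mapping itself is the easy part; the work is in everything the
sketch leaves out, which is why it is worth asking what the biopython
people already have.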
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------