[BioSQL-l] Plone4bio 1.0 and BioSQL

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Fri Oct 2 10:15:32 UTC 2009


Hi Jim

Thanks for that.  I think this has real potential, but I am lukewarm about the sequence images - I am not sure I would need them in this context.

Could you expand on this bit?

* issue #3: The search box doesn't search BioSQL datasources. No idea 
how hard this would be to fix, but a little plone knowledge probably 
required.

So the search box doesn't do the equivalent of a full text search of the BioSQL database?

Mick

-----Original Message-----
From: biosql-l-bounces at lists.open-bio.org [mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of James Procter
Sent: 01 October 2009 14:20
To: BioSQL-l at lists.open-bio.org
Cc: Plone4Bio mailing list
Subject: [BioSQL-l] Plone4bio 1.0 and BioSQL


Hello all.

Here's my review of plone4bio+Biosql. Thanks to Peter and Michael who 
sent me encouraging emails - sorry it took so long to post! Finally, 
please accept my apologies in advance for any unnecessary rambling... 
and for my cross-posting to p4bio and biosql-l.

Installing Plone4Bio
--------------------
This basically went according to the instructions, except for two issues:
  1. I experienced some problems accessing some python egg repositories,
and had to manually download and build one module before adding it to
the buildout (python build system) configuration. This was possibly
related to our local network config, since Ivan Rossi couldn't reproduce
the problem.

  2. Once the download/build/plone-instance generation steps were
finished the plone server instance that had been built took way too long
to launch. The installation was running off a directory hosted on our
SAN, and I decided the delay was probably due to the large number of
files needed by plone. I ended up moving the whole install onto a
locally attached disk to minimise the time spent statting the files on a
network. In that config, the server comes up after around 40-60 secs on
a lightly loaded Opteron.


Adding a biosql database and browsing
-------------------------------------
It was easy to add connections to a local biosql database - even for a
plone admin novice like myself. All you need is to know how to form the
appropriate python database connector URI - however, a minor patch to
the site's help text is needed to remind certain forgetful users (me)
how to put the database user's password in the ODBC (?) string.
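
For readers unsure where the password goes, here is a minimal sketch of
a connector URI in the common driver://user:password@host/dbname shape.
That shape, the helper name build_db_uri and all the values are
assumptions for illustration; the form plone4bio actually expects
should be checked against its help text.

```python
from urllib.parse import quote, unquote, urlsplit


def build_db_uri(driver, user, password, host, dbname, port=None):
    """Assemble driver://user:password@host[:port]/dbname.

    The user and password are percent-encoded so that characters such
    as '/' or '@' in a password cannot break the URI.
    """
    netloc = f"{quote(user, safe='')}:{quote(password, safe='')}@{host}"
    if port is not None:
        netloc += f":{port}"
    return f"{driver}://{netloc}/{dbname}"


# Hypothetical credentials, for illustration only.
uri = build_db_uri("postgres", "biosql_user", "s3cret/pw", "localhost", "biosql")
print(uri)

# Splitting the URI again shows which part is the password.
parts = urlsplit(uri)
print(parts.username, unquote(parts.password), parts.hostname)
```

The password sits between the ':' after the user name and the '@'
before the host.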

Once added, I could access the source and browse through my bioentry
sequences via the same list interface as shown in the demo. Clicking on
a sequence link gave me the same five tabs (annotation, features,
dbxrefs, sequence, references) as in the demo. However, here is where I
noticed some issues which I've logged on the plone4bio trac:

* issue #1: Plone4bio uses the bioentry_id primary key as the main
identifier for the bioentry, rather than its accession. E.g. a
sequence's plone4bio record has the URL
http://myploneserver/plone4bio/mybiosqlsource/bioentry.database/bioentry.id

As people on the list will know, the bioentry ID primary key is
autogenerated and only really for internal consumption. Using it as the
primary identifier means it's not possible to link directly to a
sequence's page if you only know its bioentry database and accession.
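
A rough sketch of the lookup that accession-based URLs would need:
resolving a (database name, accession) pair to the internal
bioentry_id. It uses an in-memory SQLite database with a cut-down
subset of the biodatabase and bioentry tables. The sample rows and the
helper name resolve_bioentry_id are made up for illustration, and this
is not plone4bio code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified subset of the BioSQL biodatabase/bioentry tables.
conn.executescript("""
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name TEXT UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    biodatabase_id INTEGER REFERENCES biodatabase(biodatabase_id),
    name TEXT,
    accession TEXT,
    version INTEGER,
    description TEXT
);
-- Hypothetical sample data.
INSERT INTO biodatabase VALUES (1, 'mydb');
INSERT INTO bioentry VALUES (4711, 1, 'EXAMPLE_1', 'X00001', 1, 'example protein');
""")


def resolve_bioentry_id(conn, db_name, accession):
    """Return the bioentry_id for (db_name, accession), or None if absent.

    If several versions share an accession, the highest version wins.
    """
    row = conn.execute(
        "SELECT e.bioentry_id FROM bioentry e "
        "JOIN biodatabase d ON d.biodatabase_id = e.biodatabase_id "
        "WHERE d.name = ? AND e.accession = ? "
        "ORDER BY e.version DESC LIMIT 1",
        (db_name, accession),
    ).fetchone()
    return row[0] if row else None


print(resolve_bioentry_id(conn, "mydb", "X00001"))
```

A URL handler could run a lookup like this first and then reuse the
existing bioentry_id-based page.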

* issue #2: The imagemap shown under the 'Features' tab is generated 
using bioperl from a genbank file emitted by biopython. This is a flaw, 
and means lots of info is lost (my biosql db is used to serve protein
sequence DAS annotation, so it has URLs, scores, and lots of notes).

I had to hack this script to cope with feature labels that contain
spaces in order for the intervals to display correctly (otherwise they
get a start of '-1'). I'd recommend that the image generator is modified
to use a less restrictive format, and/or made easily pluggable to allow
other feature renderers to be used (perhaps even something like dasty).
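
One way a pluggable renderer could look is a small registry that maps
names to rendering functions. This is only a sketch: the names
RENDERERS, register_renderer and render_features are hypothetical, and
features are reduced to (label, start, end) tuples rather than real
BioSQL feature records.

```python
from html import escape

# Registry of renderer name -> function taking a list of features.
RENDERERS = {}


def register_renderer(name):
    """Decorator that adds a rendering function to the registry."""
    def decorator(func):
        RENDERERS[name] = func
        return func
    return decorator


@register_renderer("text")
def render_text(features):
    """One tab-separated line per feature; labels may contain spaces."""
    return "\n".join(f"{start}..{end}\t{label}" for label, start, end in features)


@register_renderer("html")
def render_html(features):
    """A simple HTML list with escaped labels."""
    items = "".join(
        f"<li>{escape(label)}: {start}-{end}</li>" for label, start, end in features
    )
    return f"<ul>{items}</ul>"


def render_features(features, renderer="text"):
    """Dispatch to the named renderer, failing clearly if it is unknown."""
    try:
        func = RENDERERS[renderer]
    except KeyError:
        raise ValueError(f"unknown renderer: {renderer!r}") from None
    return func(features)


features = [("signal peptide", 1, 22), ("binding site <A>", 40, 45)]
print(render_features(features))
print(render_features(features, renderer="html"))
```

Because features are passed as structured values rather than parsed
from a text format, labels containing spaces need no special handling.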

* issue #3: The search box doesn't search BioSQL datasources. No idea 
how hard this would be to fix, but a little plone knowledge probably 
required.

This was a bit of a killer for me - I was hoping for a basic search
interface that worked out of the box, allowing me to focus on providing
more advanced queries. As it is, I don't have the time at this moment to
fix this issue myself.
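
For a sense of what a basic search might involve at the database
level, here is a minimal sketch: a substring match over the accession,
name and description columns of bioentry. It runs against an in-memory
SQLite table with made-up rows, says nothing about how plone's own
search box would be wired to it, and the helper name search_bioentries
is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified subset of the BioSQL bioentry table with hypothetical rows.
conn.executescript("""
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession TEXT,
    name TEXT,
    description TEXT
);
INSERT INTO bioentry VALUES (1, 'P00001', 'KIN1_EXAMPLE', 'hypothetical protein kinase');
INSERT INTO bioentry VALUES (2, 'P00002', 'TRN1_EXAMPLE', 'hypothetical transporter');
""")


def search_bioentries(conn, term):
    """Return (accession, name, description) rows containing term.

    Uses LIKE, which in SQLite is case-insensitive for ASCII text.
    '%' and '_' inside term act as wildcards in this simple version.
    """
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT accession, name, description FROM bioentry "
        "WHERE accession LIKE ? OR name LIKE ? OR description LIKE ? "
        "ORDER BY accession",
        (pattern, pattern, pattern),
    ).fetchall()


print(search_bioentries(conn, "kinase"))
```

A real full-text search would probably also need to cover annotation
and feature qualifier tables, and would want an index rather than a
LIKE scan.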

Suggested Enhancements
----------------------
The Biosql/GenBank data format transformation is an easily fixed bug in
the current plone4bio version, but it stopped me exploring the
das/biojava/bioperl/biopython interoperation issues any further.
However, it also revealed a few aspects of the plone4bio architecture
that might need thinking about:

  1. pluggable feature rendering tools - potentially use the biosql
connection directly (already said)
  2. easily configured database cross-reference linkout URLs. Typically,
it's bad form to hard-code URLs within a biosql database, and plone4bio 
has its own set of URLs that it decorates dbxrefs with. However, these 
are currently buried inside the plone4bio python code, but they could be 
configured via a flatfile or even via the web interface.
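
A flatfile for linkout URLs could be as simple as an INI file of URL
templates keyed by database name. The sketch below assumes a
[linkouts] section and an {accession} placeholder. Both are
illustrative choices, the example.org URLs are placeholders, and none
of this reflects plone4bio's current code.

```python
import configparser
from urllib.parse import quote

# Hypothetical flatfile contents; the URLs are placeholders.
EXAMPLE_CONFIG = """
[linkouts]
UniProt = https://example.org/uniprot/{accession}
PDB = https://example.org/pdb/{accession}
"""


def load_linkouts(text):
    """Parse the flatfile text into a {dbname: url_template} dict.

    configparser lower-cases option names, so keys come back lower-case.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["linkouts"])


def linkout_url(linkouts, dbname, accession):
    """Return the linkout URL for a dbxref, or None if dbname is unknown."""
    template = linkouts.get(dbname.lower())
    if template is None:
        return None
    return template.format(accession=quote(accession, safe=""))


linkouts = load_linkouts(EXAMPLE_CONFIG)
print(linkout_url(linkouts, "UniProt", "P12345"))
```

Keeping the templates outside the code means a site admin can add or
correct a linkout without touching the python source.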


In summary...
-------------
This process took far longer than I'd expected, and the slow install and
startup time gave me the impression that plone is a heavyweight solution
that may not have sufficient performance for high-volume situations (I'm 
sure I'm wrong here).

The functionality available at the time of writing is not enough for my 
purposes - but it is a good starting point (particularly if you know how 
to develop in plone). However - if issues 1, 2 and 3 were resolved, and 
the default .cfg scripts were made more robust and slightly better 
commented for python-n00bs like myself, then plone4bio would certainly 
be worth installing to provide basic biosql datasource browsing for your 
lab or institute.

That's all folks!
Jim.

-- 
-------------------------------------------------------------------
J. B. Procter  (Jalview/ENFIN)  Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.
