[Bioperl-l] bioperl based database infrastructure for directed graphs
Sendu Bala
bix at sendu.me.uk
Wed Jan 9 13:59:08 UTC 2008
Robson Francisco de Souza wrote:
> Before starting, I would like to know if the BioSQL and Chado schemata
> do have accelerators for querying intervals among billions of features
> and feature relationships (some examples using these databases would
> also help, if they show that these databases are efficient for such tasks).
> If these or other databases are not as suitable as Bio::DB::SeqFeature
> for feature retrieval based on interval overlap and attributes,
I'm using Bio::DB::SeqFeature for that purpose, but just a warning: I
found that with millions of features it made a db that was too large in
terms of disc space and too slow in terms of query time. I had to hack
out its storage of feature objects in the db, instead generating feature
objects on request from the stored attributes. Doing this turned out to
be faster than simply unfreezing certain kinds of feature objects!
(I also had to hack in support for retrieval by source, a patch that
Lincoln hasn't gotten back to me about yet.)
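
To make the idea concrete, here is a rough sketch of the approach. It is in Python with SQLite rather than Perl, it is not the actual patch, and it does not reflect the real Bio::DB::SeqFeature schema: the table, the column names and the Feature class are all made up for illustration. The point is only that each feature is stored as plain indexed columns and the object is rebuilt when a query returns it, instead of a frozen object being stored and thawed. The optional source filter stands in for the retrieval-by-source support mentioned above.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical lightweight feature, rebuilt from stored columns."""
    seq_id: str
    start: int
    end: int
    strand: int
    primary_tag: str
    source: str


def make_db():
    """Create an in-memory database holding attributes only, no serialized objects."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature ("
        " id INTEGER PRIMARY KEY,"
        " seq_id TEXT, feat_start INTEGER, feat_end INTEGER,"
        " strand INTEGER, primary_tag TEXT, source TEXT)"
    )
    # Indexes to support lookups by location and by source.
    conn.execute("CREATE INDEX idx_loc ON feature (seq_id, feat_start, feat_end)")
    conn.execute("CREATE INDEX idx_source ON feature (source)")
    return conn


def store(conn, features):
    """Write each feature as a plain row of attributes."""
    conn.executemany(
        "INSERT INTO feature"
        " (seq_id, feat_start, feat_end, strand, primary_tag, source)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        [(f.seq_id, f.start, f.end, f.strand, f.primary_tag, f.source)
         for f in features],
    )


def overlapping(conn, seq_id, start, end, source=None):
    """Return features overlapping [start, end], built on request from the rows."""
    sql = (
        "SELECT seq_id, feat_start, feat_end, strand, primary_tag, source"
        " FROM feature"
        " WHERE seq_id = ? AND feat_start <= ? AND feat_end >= ?"
    )
    params = [seq_id, end, start]
    if source is not None:
        sql += " AND source = ?"
        params.append(source)
    sql += " ORDER BY feat_start"
    return [Feature(*row) for row in conn.execute(sql, params)]
```

Calling overlapping(conn, "chr1", 180, 250) returns every stored feature on chr1 that overlaps that range, and passing source="havana" narrows the result to that source. A plain composite index like this is only a starting point; it says nothing about how well the scheme holds up at billions of features.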
While I can't answer your main questions, I wish you good luck with your
project and request that you keep us posted with what you achieve.