[Bioperl-l] location binner object
Jason Stajich
jason@cgt.mc.duke.edu
Wed, 8 May 2002 10:46:20 -0400 (EDT)
I'm generating a bunch of Bio::LocationI objects and would like to test
whether some are within a specified distance of each other. I want to
be able to do fast lookups to see if a location is within some range.
This seems pretty similar to Lincoln's binning in GFF for locations, but
would it be possible to do this in memory or in a BDB file, as I am
generating a lot of these and don't need to keep them once I've processed
them and identified the best choices?
Essentially I want to be able to take a location from list A and see if
any locations in list B fall in the range of X bp downstream of A.
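A minimal sketch of that lookup, written in Python rather than Perl purely for illustration. It assumes forward-strand locations reduced to (start, end) tuples, and it reads "downstream" as "starts after A ends, within X bp"; the names build_index and downstream_hits are hypothetical and not part of BioPerl. List B is sorted once, then each query is two binary searches.

```python
from bisect import bisect_right

def build_index(b_locs):
    """Sort list B by start and keep a parallel list of start coordinates."""
    locs = sorted(b_locs)
    starts = [start for start, _ in locs]
    return starts, locs

def downstream_hits(a, index, max_dist):
    """Return B locations whose start lies in (a_end, a_end + max_dist]."""
    starts, locs = index
    _, a_end = a
    lo = bisect_right(starts, a_end)             # first B start beyond A's end
    hi = bisect_right(starts, a_end + max_dist)  # first B start beyond the window
    return locs[lo:hi]

# Example: which B locations start within 700 bp downstream of A?
list_b = [(100, 200), (250, 300), (900, 950), (1200, 1300)]
index = build_index(list_b)
hits = downstream_hits((150, 240), index, 700)
```

Building the index costs one sort of list B; each lookup after that is logarithmic in the size of B plus the number of hits returned.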
There are plenty of implementation options. Probably the fastest and
easiest to implement would be a single vector spanning the full range
covered by all the locations, with pointers to the objects in the slots
where each location overlaps, but this is a memory hog. I could map it to
a BDB file and just eat the disk space, since this is essentially
generating a tempfile.
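One way to keep the spirit of the per-position vector without the memory cost is to coarsen the slots into fixed-size bins. The sketch below is an assumption-laden illustration in Python, not the Bio::DB::GFF binning scheme: the class name LocationBinner, its methods, and the 1000 bp bin size are all hypothetical. Each location is registered in every bin it touches, and a range query only inspects the bins the query range touches.

```python
from collections import defaultdict

class LocationBinner:
    """Flat, in-memory bin index over (start, end, payload) locations."""

    def __init__(self, bin_size=1000):  # bin size is an arbitrary choice
        self.bin_size = bin_size
        self.bins = defaultdict(list)

    def _bin_range(self, start, end):
        # Indices of every bin touched by the closed interval [start, end].
        return range(start // self.bin_size, end // self.bin_size + 1)

    def add(self, start, end, payload=None):
        loc = (start, end, payload)  # one shared tuple across all its bins
        for b in self._bin_range(start, end):
            self.bins[b].append(loc)

    def overlapping(self, start, end):
        """Return each stored location that overlaps [start, end], once."""
        seen = set()
        hits = []
        for b in self._bin_range(start, end):
            for loc in self.bins.get(b, ()):
                s, e, _ = loc
                if s <= end and e >= start and id(loc) not in seen:
                    seen.add(id(loc))
                    hits.append(loc)
        return hits

# Example: locations from list B, queried with the window downstream of A.
binner = LocationBinner(bin_size=1000)
binner.add(100, 200, "b1")
binner.add(950, 2100, "b2")
binner.add(5000, 5100, "b3")
window_hits = binner.overlapping(241, 1240)
```

Memory grows with the number of locations and the bins each one spans rather than with the total coordinate range. The dictionary of bins could in principle be swapped for an on-disk key/value store if the data outgrows memory, though that is not shown here.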
Anyway, Lincoln, do you have any input here? Is it just going to be
easier to slap everything into an SQL backend with Bio::DB::GFF rather
than reinventing the wheel, or can I basically just run everything through
the Bio::DB::GFF binning but store it in a BDB file? I'm happy to adapt
this into some sort of location aggregator/binner object for everyone's use.
Thanks.
-jason
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu