[Bioperl-l] Article

Keith Player keithplayer at hotmail.com
Thu Nov 2 00:22:07 UTC 2006


Sorry, I didn't include the article link originally.  The full text is free 
to view:

http://www.genome.org/cgi/content/abstract/12/10/1599

When I mentioned the R-tree, I was referring to the current 
implementation.  

I should point out that I didn't try the Perl module directly; I 
implemented the binning schema straight in MySQL.  The SQL I mentioned 
previously performed better than the binning schema, I assume because of 
fewer disk seeks.  I tested on one dataset of around 30k records and another 
the same size as the one in the paper.  The binning schema outperformed the 
queries as described in the paper, but the SQL from my first post 
outperformed the binning schema by around a factor of 3.  
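
For readers unfamiliar with the idea, here is a minimal sketch of hierarchical 
binning in general.  It is not the Perl module's actual code or the schema from 
the paper; the tier widths and the function names assign_bin and candidate_bins 
are assumptions made up for illustration.  Each feature is stored in the 
smallest bin that wholly contains it, and a range query only has to look at the 
bins that overlap the query range.

```python
# Assumed bin widths, smallest tier first (illustrative values only).
TIERS = [1_000, 10_000, 100_000, 1_000_000, 10_000_000, 100_000_000]


def assign_bin(start, stop):
    """Return (width, index) of the smallest bin that wholly contains the feature."""
    for width in TIERS:
        if start // width == stop // width:
            return (width, start // width)
    raise ValueError("feature crosses a boundary of the largest bin")


def candidate_bins(qstart, qend):
    """Return every (width, index) bin that overlaps the query range."""
    return [
        (width, index)
        for width in TIERS
        for index in range(qstart // width, qend // width + 1)
    ]
```

A feature can only overlap the query if its bin is among the candidates, so the 
bin test narrows the search and a final comparison on start and stop removes 
the false positives.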

The new binning schema might make all this moot, especially if it removes 
layers so that groups/features next to each other are saved on the 
same/adjacent pages.  The only question then would be whether database 
optimization is affected by the binning.

Also, does needing to know the largest length of a group/feature make the SQL 
statement I created impractical?  
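
To make the question concrete, here is a minimal sketch of a range query 
bounded by the largest feature length.  The statement from the first post is 
not reproduced here, so the query shape is an assumption, as are the table name 
feature, the columns start and stop, and the function name overlapping.  It 
uses SQLite through Python rather than MySQL purely to keep the example 
self-contained.

```python
import sqlite3


def overlapping(conn, qstart, qend):
    """Return ids of features overlapping [qstart, qend] (inclusive coordinates)."""
    # Look up the longest feature; 0 if the table is empty.
    (max_len,) = conn.execute(
        "SELECT COALESCE(MAX(stop - start + 1), 0) FROM feature"
    ).fetchone()
    # An overlapping feature must start no later than qend and no earlier than
    # qstart - max_len + 1, so an index on start can serve the BETWEEN range.
    rows = conn.execute(
        "SELECT id FROM feature "
        "WHERE start BETWEEN ? AND ? AND stop >= ? ORDER BY id",
        (qstart - max_len + 1, qend, qstart),
    )
    return [row[0] for row in rows]


# Hypothetical toy data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature (id INTEGER PRIMARY KEY, start INTEGER, stop INTEGER)")
conn.execute("CREATE INDEX feature_start ON feature (start)")
conn.executemany(
    "INSERT INTO feature VALUES (?, ?, ?)",
    [(1, 100, 200), (2, 150, 1200), (3, 900, 950), (4, 2000, 2100)],
)
```

In this sketch the maximum is recomputed on every call; it could instead be 
stored and updated on insert.  The likely practical cost is that a single very 
long feature widens the scanned range of start values for every query.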




