[BioSQL-l] Indexing of (seqfeature) locations?

Hilmar Lapp hlapp at gmx.net
Wed Aug 26 12:07:08 UTC 2009


On Aug 26, 2009, at 6:53 AM, Peter wrote:

> The BioSQL schema includes a few indexes on the location table
> (e.g. quoting the MySQL schema, but it looks the same on pg too):
>
> CREATE INDEX seqfeatureloc_start ON location(start_pos, end_pos);
> [...]
> Will these facilitate searches like this?:
>
> "SELECT ... WHERE 2000 <= location.start_pos
> AND location.end_pos <= 5000 AND ..."
>
> Or, for this would it help to include:
>
> CREATE INDEX seqfeatureloc_start ON location(start_pos);
> CREATE INDEX seqfeatureloc_start ON location(end_pos);

With a decent RDBMS, having two indexes instead of a compound one will  
slow this query down. What the compound one won't help you with is if  
your query doesn't constrain the leading columns. For example, a  
compound index on (start_pos,end_pos) won't be used if you only  
constrain end_pos. If you want to do that, you need on index on  
(end_pos) too.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================






More information about the BioSQL-l mailing list