[BioSQL-l] [Bioperl-l] Added 'Installing bioperl-db in Windows' to wiki, problems with bioperl-db
Hilmar Lapp
hlapp at gmx.net
Thu Feb 16 01:54:01 UTC 2006
On Feb 14, 2006, at 12:32 PM, Chris Fields wrote:
> Hilmar,
>
> Good News: I've added a section to the bioperl wiki on installing
> bioperl-db
> in Windows:
>
> http://www.bioperl.org/wiki/
> Installing_Bioperl_on_Windows#Installing_bioperl
> -db
>
> Bad News: There's a new problem now. I updated from CVS yesterday; I
> walked
> through the steps and ran 'nmake test', with everything passing fine.
> However, load_seqdatabase.pl is extremely slow; it's loading a sequence
> every 5 minutes or so. I noticed (when using '-debug') that it is
> hanging
> up in Bio::DB::BioSQL::SpeciesAdaptor each time. If I create a
> database,
> load the biosql schema, and load sequences w/o loading taxonomy, the
> problem
> goes away.
>
> Here's the debugging output (I cut it off at the point it hangs up):
> [...]
> preparing UK select statement: SELECT taxon_name.taxon_id, NULL, NULL,
> taxon.ncbi_taxon_id, taxon_name.name, NULL FROM taxon, taxon_name WHERE
> taxon.taxon_id = taxon_name.taxon_id AND name_class = ? AND
> ncbi_taxon_id =
> ?
> SpeciesAdaptor: binding UK column 1 to "scientific name" (name_class)
> SpeciesAdaptor: binding UK column 2 to "208964" (ncbi_taxid)
I'm a bit surprised if this is the query where it hangs. Are the
indexes all there? There should be a primary key index on
taxon.taxon_id, unique indexes on taxon.ncbi_taxon_id and on taxon_name
over (taxon_id,name,name_class). Also, there should be separate indexes
on taxon_name.taxon_id and taxon_name.name. Are they all there? If you
reinstantiated the schema from the DDL then it seems unlikely that
somehow the indexes have vanished except if you messed with the schema
or the DDL.
Putting an index on taxon_name.name_class really can't make sense, so
let's assume it can't be that.
So really I suspect this has something to do with the state of the
database and the version of MySQL. In particular, from some 4.x version
of MySQL under certain circumstances you have to analyze the statistics
of the tables in order to get the optimizer pick up the indexes
properly. Are you on MySQL 4.x and if so, have you done that?
There's the ANALYZE TABLE command:
http://dev.mysql.com/doc/refman/4.1/en/analyze-table.html
Note the comment: "This statement works with MyISAM, BDB, and (as of
MySQL 4.0.13) InnoDB tables." Is your MySQL version 4.0.13 or higher?
Also, you can check the execution plan for the query using EXPLAIN.
http://dev.mysql.com/doc/refman/4.1/en/explain.html
This should show you whether the index would be picked up for the query
or not. EXPLAIN as well as ANALYZE TABLE will need you to connect to
the db using the mysql shell (mysql).
I believe something similarly strange was encountered by someone using
DB::GFF (or Chado) under MySQL, and if I recall correctly the solution
was to optimize (analyze) the tables. Maybe someone who was in that
thread reads this and can comment?
-hilmar
>
> -----------------------------------------------------------------------
> -----
> -------------------------
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------
More information about the BioSQL-l
mailing list