[Bioperl-l] Added 'Installing bioperl-db in Windows' to wiki, problems with bioperl-db

Chris Fields cjfields at uiuc.edu
Thu Feb 16 03:56:14 UTC 2006



On Feb 15, 2006, at 7:54 PM, Hilmar Lapp wrote:

>
> On Feb 14, 2006, at 12:32 PM, Chris Fields wrote:
>
>> Hilmar,
>>
>> Good News: I've added a section to the bioperl wiki on installing
>> bioperl-db
>> in Windows:
>>
>> http://www.bioperl.org/wiki/
>> Installing_Bioperl_on_Windows#Installing_bioperl
>> -db
>>
>> Bad News:  There's a new problem now. I updated from CVS yesterday; I
>> walked
>> through the steps and ran 'nmake test', with everything passing fine.
>> However, load_seqdatabase.pl is extremely slow; it's loading a  
>> sequence
>> every 5 minutes or so.  I noticed (when using '-debug') that it is
>> hanging
>> up in Bio::DB::BioSQL::SpeciesAdaptor each time.  If I create a
>> database,
>> load the biosql schema, and load sequences w/o loading taxonomy, the
>> problem
>> goes away.
>>
>> Here's the debugging output (I cut it off at the point it hangs up):
>> [...]
>
>> preparing UK select statement: SELECT taxon_name.taxon_id, NULL,  
>> NULL,
>> taxon.ncbi_taxon_id, taxon_name.name, NULL FROM taxon, taxon_name  
>> WHERE
>> taxon.taxon_id = taxon_name.taxon_id AND name_class = ? AND
>> ncbi_taxon_id =
>> ?
>> SpeciesAdaptor: binding UK column 1 to "scientific name" (name_class)
>> SpeciesAdaptor: binding UK column 2 to "208964" (ncbi_taxid)
>
> I'm a bit surprised if this is the query where it hangs. Are the
> indexes all there? There should be a primary key index on
> taxon.taxon_id, unique indexes on taxon.ncbi_taxon_id and on  
> taxon_name
> over (taxon_id,name,name_class). Also, there should be separate  
> indexes
> on taxon_name.taxon_id and taxon_name.name. Are they all there? If you
> reinstantiated the schema from the DDL then it seems unlikely that
> somehow the indexes have vanished except if you messed with the schema
> or the DDL.

I looked in the mailing list archives and Barry mentions something here:

http://bioperl.org/pipermail/bioperl-l/2005-January/018093.html

He rebuilt the database from scratch and got it working; no reason  
was given.  I wouldn't be surprised if it is something Mysql-related  
that pops up.  The strange thing is that only a few months ago  
everything ran well with this version of MySQL (v.5); this was with  
the first test database I installed on it.  Another strange thing (I  
think I mentioned it) is that NOT loading the taxonomy with  
load_ncbi_taxonomy.pl worked (everything was entered).  I'll try  
rebuilding the database from scratch to see what happens.  I am  
running this on Windows, so this is new territory...

> Putting an index on taxon_name.name_class really can't make sense, so
> let's assume it can't be that.
>
> So really I suspect this has something to do with the state of the
> database and the version of MySQL. In particular, from some 4.x  
> version
> of MySQL under certain circumstances you have to analyze the  
> statistics
> of the tables in order to get the optimizer pick up the indexes
> properly. Are you on MySQL 4.x and if so, have you done that?
>
> There's the ANALYZE TABLE command:
> http://dev.mysql.com/doc/refman/4.1/en/analyze-table.html
>
> Note the comment: "This statement works with MyISAM, BDB, and (as of
> MySQL 4.0.13) InnoDB tables." Is your MySQL version 4.0.13 or higher?
>
> Also, you can check the execution plan for the query using EXPLAIN.
> http://dev.mysql.com/doc/refman/4.1/en/explain.html
>
> This should show you whether the index would be picked up for the  
> query
> or not. EXPLAIN as well as ANALYZE TABLE will need you to connect to
> the db using the mysql shell (mysql).

I'll give these a shot and post what I find in the next few days.

> I believe something similarly strange was encountered by someone using
> DB::GFF (or Chado) under MySQL, and if I recall correctly the solution
> was to optimize (analyze) the tables. Maybe someone who was in that
> thread reads this and can comment?
>
> 	-hilmar

I wanted to also mention that we shouldn't check in the modifications  
to Bio::Root:Root until I confirm something (I'm at home and  
currently can't).  I tried running a script on an unrelated module  
using the modified Bio::Root::Roo (with the commas added after the  
'throw $class' statements.  Everything worked for $self->throw(),  
except the thrown message wasn't displayed.  I'll dig into it a bit  
more to see what happens.

>
>
>>
>> --------------------------------------------------------------------- 
>> --
>> -----
>> -------------------------
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
> -- 
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign






More information about the Bioperl-l mailing list