[Bioperl-l] time-consuming bp_load_seqdatabase.pl

Hilmar Lapp hlapp at gmx.net
Tue Mar 11 21:34:30 UTC 2008


It won't be fast, as it will create roughly 6 million bioentries in your
database. However, running since Friday does sound like the high end.

The first step I recommend in this kind of situation is to compare
the CPU load that the script generates with the load generated by the
database server. If the script's CPU load is significantly less than
~10%, then the script is spending most of its time waiting, and it is
likely that your database is too slow.
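
A minimal sketch of this check in Python, assuming a Unix-like host where `ps` is available. The helper names `cpu_percent` and `database_suspected` and the 10% default are illustrative and not part of bioperl-db. On Linux, `ps` reports CPU use averaged over the lifetime of the process, which suits a long-running load.

```python
import subprocess


def cpu_percent(pid):
    """Return the %CPU figure that `ps` reports for one process ID."""
    out = subprocess.run(
        ["ps", "-o", "pcpu=", "-p", str(pid)],  # "pcpu=" prints the value without a header
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return float(out.strip())


def database_suspected(script_cpu, threshold=10.0):
    """Rule of thumb from the text: a loader well below ~10% CPU is mostly
    waiting, so the database is the likely bottleneck. The threshold is a
    rough guide, not a hard limit."""
    return script_cpu < threshold
```

For example, `database_suspected(cpu_percent(12345))` would apply the rule to the loader script, where 12345 is a placeholder for its process ID. Running `cpu_percent` on the database host for the PostgreSQL backend serving the connection gives the other side of the comparison.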

There are various possible reasons why it may be too slow, ranging
from limited resources to grossly suboptimal configuration. If your
database is running on the same 15GB server then resources should not
be an issue (assuming that you don't have a totally antiquated CPU
there). You might still want to check the PostgreSQL config file.
What I would suspect, though, is that you didn't VACUUM the database
before and/or during the load. Without it, the indexes used for lookup
become increasingly slow as a large amount of data accumulates.
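
One way to run the VACUUM periodically during a long load is to call `psql` from a small script. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the connection values are the placeholders from the quoted command line, and adding ANALYZE (which also refreshes the planner statistics) goes beyond the plain VACUUM suggested above.

```python
import subprocess


def vacuum_command(host, port, dbname, user, analyze=True):
    """Build the psql argument list for a one-off VACUUM of the whole database."""
    sql = "VACUUM ANALYZE" if analyze else "VACUUM"
    return [
        "psql",
        "-h", host,
        "-p", str(port),
        "-d", dbname,
        "-U", user,
        "-c", sql,  # a single -c command runs outside an explicit transaction block
    ]


def run_vacuum(host, port, dbname, user, analyze=True):
    """Run the command; raises CalledProcessError if psql exits with an error."""
    subprocess.run(vacuum_command(host, port, dbname, user, analyze), check=True)
```

Calling `run_vacuum("foo", 5435, "bioseqdb", "foo")` from a second terminal while the load is running would issue the statement. The password has to be supplied separately, for instance through a `~/.pgpass` file or the `PGPASSWORD` environment variable.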

Does this ring a bell?

	-hilmar

On Mar 11, 2008, at 7:08 AM, stephan.rosecker wrote:

> Dear list,
>
> I have started the "bp_load_seqdatabase.pl" script from the
> "bioperl-db-1.5.2_100" package with the UniGene
> "Hs.data" file. It runs on a 7-processor machine with 15GB RAM. The DBMS
> is PostgreSQL on a similar machine.
> The BioSQL core schema is v1.0.0.
>
> The job has been running since Friday.
>
> ./bp_load_seqdatabase.pl --host foo --port 5435 --dbname bioseqdb \
>     --dbuser foo --dbpass bar --driver Pg --format ClusterIO::unigene \
>     ../ncbi/Hs.data
>
> Is it normal that it takes so long?
> What are your experiences?
>
> best regards
> stephan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================





