[Bioperl-l] time-consuming bp_load_seqdatabase.pl
Hilmar Lapp
hlapp at gmx.net
Tue Mar 11 21:34:30 UTC 2008
It won't be fast, as it will create about 6 million bioentries in your
database. However, running since Friday does sound on the high end.
The first step I recommend when running into this kind of situation
is to compare the CPU load that the script generates with the load
generated by the database server. If the script's CPU load is well
below about 10%, then the script is spending most of its time waiting
for the database, and it is likely that your database is too slow.
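
One way to make that comparison, as a rough sketch: sum the %CPU that ps
reports for the loader and for the PostgreSQL backends. The helper name
sum_cpu and the two process-name patterns are assumptions for
illustration. Note that on Linux the pcpu value is an average over the
lifetime of each process, so top is the better tool for a live view.

```shell
# Sum the %CPU column of ps-style lines ("<pcpu> <command line>") that
# match the given pattern. Reads the lines from stdin.
sum_cpu() {
    awk -v pat="$1" '$0 ~ pat { sum += $1 } END { printf "%.1f\n", sum + 0 }'
}

# Assumed process names: the loader script and the PostgreSQL backends.
# The awk helper itself may show up in the listing, but it adds roughly 0%.
# Run the second command on the database host if that is a separate machine.
printf 'loader:   %s%%\n' "$(ps -A -o pcpu= -o args= | sum_cpu 'bp_load_seqdatabase')"
printf 'postgres: %s%%\n' "$(ps -A -o pcpu= -o args= | sum_cpu 'postgres')"
```

If the loader line stays in the single digits while the postgres line is
high (or both are low and the disks are busy), the database side is the
place to look.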
There are various possible reasons why it may be too slow, ranging
from limited resources to grossly suboptimal configuration. If your
database is running on a similar 15GB server then resources should not
be an issue (assuming that you don't have a totally antiquated CPU
there). You might still want to check the PostgreSQL config file.
What I suspect, though, is that you didn't VACUUM the database before
and/or during the load. That will make the index lookups increasingly
slow as a large amount of data accumulates.
Does this ring a bell?
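
For reference, a minimal sketch of vacuuming before and during the load
with psql. The connection values are copied from the command in the
original post; the function names and the 30-minute interval are
assumptions for illustration. psql will prompt for the password unless
it is stored in ~/.pgpass.

```shell
# Connection settings taken from the command in the original post; psql
# reads these standard libpq environment variables.
export PGHOST=foo PGPORT=5435 PGUSER=foo PGDATABASE=bioseqdb

# Reclaim dead rows and refresh planner statistics for all tables.
vacuum_analyze() {
    psql -X -c 'VACUUM ANALYZE;'
}

# Hypothetical helper: repeat the vacuum for as long as the loader runs.
# The default interval of 1800 seconds is an arbitrary assumption.
vacuum_while_loading() {
    while pgrep -f bp_load_seqdatabase.pl >/dev/null; do
        vacuum_analyze
        sleep "${1:-1800}"
    done
}
```

Calling vacuum_analyze once before starting the load, and
vacuum_while_loading from a second terminal once the load is underway,
would cover both cases.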
-hilmar
On Mar 11, 2008, at 7:08 AM, stephan.rosecker wrote:
> Dear list,
>
> I have started the "bp_load_seqdatabase.pl" script from the
> "bioperl-db-1.5.2_100" package with the unigene
> "Hs.data". It runs on a 7 processor machine with 15GB RAM. The DBMS
> is postgres on a similar machine.
> BioSQL core schema is v1.0.0.
>
> The job has been running since Friday.
>
> ./bp_load_seqdatabase.pl --host foo --port 5435 --dbname bioseqdb \
>   --dbuser foo --dbpass bar --driver Pg --format ClusterIO::unigene \
>   ../ncbi/Hs.data
>
> Is it normal that it takes so long?
> What are your experiences?
>
> best regards
> stephan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================