[Bioperl-l] tuning load_seqdatabase.db script in bioperl-db

Ewan Birney birney at ebi.ac.uk
Mon May 26 16:08:11 EDT 2003



On 26 May 2003, Nicolas Rueff wrote:

> I'm using bioperl-db/script/biosql/load_seqdatabase.pl to fill the
> biosql schema. The big issue with this script is that the time it takes
> is exponential, since for every new sequence it has to search the
> database to check whether the entry already exists. Useful for updates,
> but not for a first-time fill.
>
> For example, I used it with the latest full Swiss-Prot release
> (sprot41.dat) to populate a fresh database, and while the computer could
> handle 100 inserts/sec at the start, it drops to 2/sec near the end of
> the file.
>
> I think it could be a good idea to add an option like "--forceinsert" to
> avoid this problem.

That is a great idea. Don't forget: bioperl is open source, and we welcome
diffs and improvements. <<hint, hint>>
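
To make the suggestion concrete, here is a rough sketch of the two loading
modes, written in Python against an in-memory SQLite table rather than the
real biosql schema. Everything in it is an assumption for illustration: the
"entry" table, the load_entries function and the force_insert flag are
hypothetical names, not part of bioperl-db, and an actual patch would have
to go into the Perl script itself.

```python
import sqlite3


def load_entries(conn, entries, force_insert=False):
    """Insert (accession, sequence) pairs and return how many rows were added.

    Hypothetical helper for illustration only; not the bioperl-db API.
    """
    inserted = 0
    for accession, sequence in entries:
        if not force_insert:
            # Update-safe path: one lookup per record before inserting.
            row = conn.execute(
                "SELECT 1 FROM entry WHERE accession = ?", (accession,)
            ).fetchone()
            if row is not None:
                continue  # already present, skip it
        # With force_insert=True we come straight here, with no lookup.
        conn.execute(
            "INSERT INTO entry (accession, sequence) VALUES (?, ?)",
            (accession, sequence),
        )
        inserted += 1
    conn.commit()
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (accession TEXT, sequence TEXT)")
records = [("P00001", "MKV"), ("P00002", "MST")]

first = load_entries(conn, records, force_insert=True)  # first-time fill
second = load_entries(conn, records)                    # re-run with lookups
```

The trade-off is that the forced mode assumes the database holds none of the
incoming entries; running it twice over the same file would create
duplicates unless the schema rejects them.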



