[BioSQL-l] load_ncbi_taxonomy.pl

Hilmar Lapp hlapp at gmx.net
Fri Aug 1 21:04:37 UTC 2008


Sounds like I at least managed to silence all of the script's complaints
;-) How long did it run? Was it similar to what you've seen earlier, or
outrageously longer?

	-hilmar

On Aug 1, 2008, at 4:58 PM, Peter wrote:

>>> By testing I meant primarily whether people who use platforms other than
>>> mine (PostgreSQL on MacOSX), such as MySQL or Oracle on Linux, can give
>>> this a whirl: load the NCBI taxonomy into a scratch database (using the
>>> script), then load it again (simulating an update), and see whether there
>>> are any error or warning messages. That'd be great.
>>
>> OK, as a very cursory check I did a quick test on a Linux machine
>> using MySQL.  I just grabbed the latest script via the SVN webpage,
>> then using an existing (partly populated) database:
>>
>> $ perl ./load_ncbi_taxonomy.pl --dbname bioseqdb --driver mysql
>> --dbuser root --download true
>> Downloading NCBI taxon database to taxdata
>> Unable to close datastream at ./load_ncbi_taxonomy.pl line 726
>>
>> This may be a network issue... the taxdata/taxdump.tar.gz file had
>> downloaded OK, so I manually unzipped it, and then:
>>
>> $ perl ./load_ncbi_taxonomy.pl --dbname bioseqdb --driver mysql
>> --dbuser root
>> Loading NCBI taxon database in taxdata:
>>       ... retrieving all taxon nodes in the database
>>       ... reading in taxon nodes from nodes.dmp
>>       ... insert / update / delete taxon nodes
>>       ... updating new parent IDs
>>       ... (committing nodes)
>>       ... rebuilding nested set left/right values
>>       ... reading in taxon names from names.dmp
>>       ... deleting old taxon names
>>       ... inserting new taxon names
>>       ... cleaning up
>> Done.
>>
>> So no further error messages - however, I have not actually checked to
>> see what exactly this did to my database ;)
>
> I then simulated an update by deleting the downloaded taxdata, and
> rerunning the script:
>
> $ perl ./load_ncbi_taxonomy.pl --dbname bioseqdb --driver mysql
> --dbuser root --download true
> Downloading NCBI taxon database to taxdata
> Unable to close datastream at ./load_ncbi_taxonomy.pl line 726
> Loading NCBI taxon database in taxdata:
>        ... retrieving all taxon nodes in the database
>        ... reading in taxon nodes from nodes.dmp
>        ... insert / update / delete taxon nodes
>        ... updating new parent IDs
>        ... (committing nodes)
>        ... rebuilding nested set left/right values
>        ... reading in taxon names from names.dmp
>        ... deleting old taxon names
>        ... inserting new taxon names
>        ... cleaning up
> Done.
>
> [Note that after the "unable to close" message I just left the script
> running this time, and it continued fine]
>
> Again, I haven't checked the database.
>
> Peter

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================





