[BioSQL-l] load_ncbi_taxonomy.pl
Hilmar Lapp
hlapp at gmx.net
Fri Aug 1 21:04:37 UTC 2008
Sounds like I at least managed to silence all the complaining of the
script ;-) How long did it run? Was it similar to what you've seen
earlier or outrageously longer?
-hilmar
On Aug 1, 2008, at 4:58 PM, Peter wrote:
>>> By testing I meant primarily this: if people who use other platforms
>>> than I do (PostgreSQL on MacOSX), such as MySQL or Oracle on Linux,
>>> can give this a whirl, that'd be great. That is, load the NCBI
>>> taxonomy into a scratch database (using the script), then load it
>>> again (simulating an update), and see whether there are any error or
>>> warning messages.
>>
>> OK, as a very cursory check I did a quick test on a Linux machine
>> using MySQL. I just grabbed the latest script via the SVN webpage,
>> then using an existing (partly populated) database:
>>
>> $ perl ./load_ncbi_taxonomy.pl --dbname bioseqdb --driver mysql
>> --dbuser root --download true
>> Downloading NCBI taxon database to taxdata
>> Unable to close datastream at ./load_ncbi_taxonomy.pl line 726
>>
>> This may be a network issue... the taxdata/taxdump.tar.gz file had
>> downloaded OK, so I manually unzipped it, and then:
>>
>> $ perl ./load_ncbi_taxonomy.pl --dbname bioseqdb --driver mysql
>> --dbuser root
>> Loading NCBI taxon database in taxdata:
>> ... retrieving all taxon nodes in the database
>> ... reading in taxon nodes from nodes.dmp
>> ... insert / update / delete taxon nodes
>> ... updating new parent IDs
>> ... (committing nodes)
>> ... rebuilding nested set left/right values
>> ... reading in taxon names from names.dmp
>> ... deleting old taxon names
>> ... inserting new taxon names
>> ... cleaning up
>> Done.
>>
>> So no further error messages - however, I have not actually checked
>> to see what exactly this did to my database ;)
>
> I then simulated an update by deleting the downloaded taxdata, and
> rerunning the script:
>
> $ perl ./load_ncbi_taxonomy.pl --dbname bioseqdb --driver mysql
> --dbuser root --download true
> Downloading NCBI taxon database to taxdata
> Unable to close datastream at ./load_ncbi_taxonomy.pl line 726
> Loading NCBI taxon database in taxdata:
> ... retrieving all taxon nodes in the database
> ... reading in taxon nodes from nodes.dmp
> ... insert / update / delete taxon nodes
> ... updating new parent IDs
> ... (committing nodes)
> ... rebuilding nested set left/right values
> ... reading in taxon names from names.dmp
> ... deleting old taxon names
> ... inserting new taxon names
> ... cleaning up
> Done.
>
> [Note that after the "unable to close" message I just left the script
> running this time, and it continued fine]
>
> Again, I haven't checked the database.
>
> Peter
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================