[Bioperl-l] Loading taxonomy data into BioSQL

SG Edwards s0460205 at sms.ed.ac.uk
Fri Mar 18 09:25:24 EST 2005


Thanks Hilmar,

Yes, I am using PostgreSQL. Should I take it out of auto-commit mode? I thought
the script dealt with this, but maybe not?

If I run it with:

perl load_ncbi_taxonomy.pl -dbname milk -driver Pg -dbuser s0460205 \
    -dbpass password -directory /home/s0460205/

I get the error message:

loading NCBI taxon database in /home/s0460205:
... retrieving all taxon nodes in the database
    ... reading in taxon nodes from nodes.dmp
    ... insert/update/delete taxon nodes
failed to insert node (1;1;1;no rank;1;0): ERROR: column "taxon_id" is of type
integer but expression is of type character varying
HINT: You will need to rewrite or cast the expression
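
For anyone following along, the manual unpacking step from my earlier message
(quoted below) can be sketched roughly as follows. This is only a sketch: the
helper name unpack_taxdump is made up for illustration and is not part of
load_ncbi_taxonomy.pl, and it assumes taxdump.tar.gz has already been
downloaded from the NCBI FTP site.

```shell
#!/bin/sh
# Hypothetical helper (not part of load_ncbi_taxonomy.pl): unpack an
# already-downloaded taxdump.tar.gz into a directory and confirm that
# nodes.dmp and names.dmp are present before the loader is run.
unpack_taxdump() {
    tarball=$1
    destdir=$2

    # Fail early if the archive is not where we expect it.
    [ -f "$tarball" ] || { echo "no such file: $tarball" >&2; return 1; }
    mkdir -p "$destdir" || return 1

    # gunzip -c writes to stdout, so the original archive is left untouched.
    gunzip -c "$tarball" | tar -xf - -C "$destdir" || return 1

    # The loader reports "Couldn't open data file" if these are missing.
    for f in nodes.dmp names.dmp; do
        [ -f "$destdir/$f" ] || { echo "missing $destdir/$f" >&2; return 1; }
    done
}
```

With the archive saved as, say, /home/s0460205/taxdump.tar.gz (an assumed
location), calling "unpack_taxdump /home/s0460205/taxdump.tar.gz /home/s0460205"
and then running the script with -directory /home/s0460205/ should get it past
the nodes.dmp error, though not past the taxon_id type error above.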


Quoting Hilmar Lapp <hlapp at gmx.net>:

> Why do you believe the script thinks that taxon_id is a varchar? It
> doesn't AFAIK.
>
> Also, not sure why your Pg (you are using PostgreSQL, right?) is in
> auto-commit mode. That doesn't sound right.
>
> 	-hilmar
>
> On Friday, March 18, 2005, at 06:05  AM, SG Edwards wrote:
>
> > I find that if I manually gunzip and untar the download from NCBI, then the
> > script finds the file nodes.dmp (N.B. not sure if this is a fault with
> > load_ncbi_taxonomy.pl or something with my system?!)
> >
> > The script then tries to load the data into the taxon table, but the column
> > "taxon_id" is of type INTEGER while the script thinks it is varchar. So I
> > either need to change the database column to varchar or change the Perl
> > script to use INTEGER.
> >
> > Has anyone had this problem?!
> >
> >
> > Quoting s0460205 at sms.ed.ac.uk:
> >
> >> I have been trying:
> >>
> >> perl load_ncbi_taxonomy.pl -dbname milk -driver Pg -dbuser s0460205 \
> >>     -dbpass password -download
> >>
> >> and this gave me the error message below.
> >> If I download the ncbi_taxonomy data manually and direct the Perl script
> >> to it using:
> >>
> >> perl load_ncbi_taxonomy.pl -dbname milk -driver Pg -dbuser s0460205 \
> >>     -dbpass password -directory /home/s0460205/
> >>
> >> This seems to get a bit further but still results in an error:
> >>
> >> loading NCBI taxon database in /home/s0460205:
> >>    ... retrieving all taxon nodes in the database
> >>    ... reading in taxon nodes from nodes.dmp
> >> Couldn't open data file taxdata/nodes.dmp: No such file or directory
> >> rollback ineffective with AutoCommit enabled at load_ncbi_taxonomy.pl line 818.
> >> Use of uninitialized value in concatenation (.) or string at load_ncbi_taxonomy.pl line 820.
> >> rollback failed
> >>
> >> It seems to be choking on finding the nodes.dmp but I'm not sure why?!
> >>
> >>
> >> Quoting Brian Osborne <brian_osborne at cognia.com>:
> >>
> >>> SG,
> >>>
> >>> =head1 DESCRIPTION
> >>>
> >>> This script loads or updates a biosql schema with the NCBI Taxon
> >>> Database. There are a number of options to do with where the biosql
> >>> database is (i.e., database name, hostname, user for database,
> >>> password, database name).
> >>>
> >>> This script may download the NCBI Taxon Database from the NCBI FTP
> >>> server on-the-fly (ftp://ftp.ncbi.nih.gov/pub/taxonomy/). Otherwise
> >>> it
> >>> expects the files to be downloaded already.
> >>>
> >>>
> >>>
> >>> Brian O.
> >>>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at portal.open-bio.org
> >>> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of SG Edwards
> >>> Sent: Friday, March 18, 2005 6:45 AM
> >>> To: bioperl-l at portal.open-bio.org
> >>> Subject: [Bioperl-l] Loading taxonomy data into BioSQL
> >>>
> >>>
> >>>
> >>>
> >>> Hi,
> >>>
> >>> Can you please help me with an error message? I have just installed a
> >>> BioSQL database and am trying to run the load_ncbi_taxonomy.pl script to
> >>> get taxonomy data into my database before I start to load sequences in.
> >>> The database has been created and is empty; however, I get the following
> >>> error message:
> >>>
> >>>
> >>> Cannot open Local file taxdata/taxdump.tar.gz: No such file or directory at load_ncbi_taxonomy.pl line 628
> >>> gunzip: taxdata/taxdump.tar.gz: No such file or directory
> >>> sh: line 1: cd: taxdata: No such file or directory
> >>> tar: taxdump.tar: cannot open: No such file or directory
> >>> tar: error is not recoverable: exiting now
> >>> loading NCBI taxon database in taxdata:
> >>>        ... retrieving all taxon nodes in the database
> >>>        ... reading in taxon nodes from nodes.dmp
> >>> Couldn't open data file taxdata/nodes.dmp: No such file or directory
> >>> rollback ineffective with AutoCommit enabled at load_ncbi_taxonomy.pl line 818.
> >>> Use of uninitialized value in concatenation (.) or string at
> >>> load_ncbi_taxonomy.pl line 820.
> >>> rollback failed
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at portal.open-bio.org
> >>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> >
> >
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
>
>
>



