[Bioperl-l] New GO Parser and errors loading biosql database
Hilmar Lapp
hlapp at gmx.net
Fri Feb 20 03:37:17 EST 2004
On Thursday, February 19, 2004, at 12:50 PM, Law, Annie wrote:
> However, many of the entries are not able to be inserted (roughly 200).
> Mostly complaining about how the column name cannot be null. However,
> I'm not sure if it is related to the make test errors I am having with
> bioperl-db that I have listed below or if this is an acceptable result.
> In general how should a user gauge how successful a load of the database
> was? I guess you can sort of look at the total number of entries expected.
It's always a good idea to look over the errors and check whether there
are any that just don't make sense. The one below is an example:
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
> ("BBD_pathwayID:C1cyc","","","") FKs (2)
> Column 'name' cannot be null
'BBD_pathwayID:C1cyc' is *not* a GO term (all GO terms have identifiers
that start with GO:). It is in fact a dbxref of a term, and it erroneously
ends up as a term itself because a bug was introduced into the dagflat
parser (to which the GO parser is basically identical) in the 1.4 release
of bioperl. I strongly recommend that you upgrade at least the module
Bio/OntologyIO/dagflat.pm to the version in cvs (tag branch-1-4).
Alternatively, update the entire bioperl distribution from cvs (again,
use branch-1-4).
Doing so will get rid of most if not all of the errors.
Generally speaking, there should be no or only a few terms that fail to
load, and if any fail then they should only fail because of column
width constraints or something similar.
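As a rough way to triage a long list of failed inserts, the sketch below pulls the first value out of each failed-insert line and keeps only the identifiers that do not start with GO:. It assumes the loader's warnings were saved to a file (load.log is a made-up name) and that each failure prints its values on one line in the ("...","...") form shown in the warning above. The function name non_go_ids is purely illustrative.

```shell
# Print the first quoted value of every line that contains a ("...") tuple,
# dropping identifiers that look like real GO terms (prefix "GO:").
# Assumption: each failed insert logs its values tuple on a single line.
non_go_ids() {
    sed -n 's/.*("\([^"]*\)".*/\1/p' | { grep -v '^GO:' || true; }
}

# Hypothetical usage: count the distinct non-GO identifiers in a saved log.
if [ -f load.log ]; then
    non_go_ids < load.log | sort -u | wc -l
fi
```

Any identifier this prints is a candidate for the parser bug described above rather than a genuine GO term that failed to load.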
>
> 2) I have a question about the make test bioperl-db results which may be
> related to the results that I am getting. I seem to be having problems
> with the make test for bioperl-db. I downloaded the tarball from the CVS
> website and installed it.
> I looked at the documentation and I created User biosql which has been
> given all the permissions it needs. I also renamed the files as stated in
> the steps below. In the t directory of bioperl-db
> $ cd t
> $ cp DBHarness.conf.example DBHarness.biosql.conf
> $ cp DBHarness.conf.example DBHarness.markerdb.conf
You do not need to create DBHarness.markerdb.conf anymore. It's not
used.
>
> I also put a copy of those files in the bioperl-db in the home directory
> since that was documented for the newest version of bioperl-db.
Not sure where you found that. The only place where this file needs to
reside is in the t/ directory.
> I did a make test in the bioperl-db directory and got the following
> results.
> Most of the tests seem to fail. I am not sure why.
Generally speaking, just read the error message. It often says why, and
it does so here.
>
> [root at microarray bioperl-db]# make test
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
> t/cluster.......install_driver(mysql) failed: Can't load
> '/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so'
> for module DBD::mysql:
> /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so:
> undefined symbol: mysql_ssl_set at
> /usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229.
> at (eval 4) line 3 Compilation failed in require at (eval 4) line 3.
> Perhaps a required shared library or dll isn't installed where expected at
> t/DBTestHarness.pm line 211
This says that your DBI driver could not be loaded. It has nothing to
do with bioperl-db. Either you have not installed the mysql DBI driver
(or not installed it successfully), or you have installed it at a
non-standard location, or you have installed it under another version
of perl.
Make sure the tests for the DBD::mysql module pass before trying to use
the driver.
Obviously, if the DBI driver can't be loaded, none of the tests will
succeed, as then no database connection can be opened.
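A quick sanity check along these lines can be sketched as follows. It only asks whether a module loads under a particular perl binary. The helper name perl_module_ok is made up for illustration, and /usr/bin/perl is used because that is the interpreter shown in the make test output above.

```shell
# Return success if MODULE loads under the given perl (default: perl on PATH).
# Usage: perl_module_ok MODULE [PERL]
perl_module_ok() {
    "${2:-perl}" -M"$1" -e 1 >/dev/null 2>&1
}

# Check the driver with the same interpreter that make test invokes.
if perl_module_ok DBD::mysql /usr/bin/perl; then
    echo "DBD::mysql loads under /usr/bin/perl"
else
    echo "DBD::mysql does not load under /usr/bin/perl; fix the driver first"
fi
```

A successful load is necessary but not sufficient: it does not replace running the DBD::mysql test suite itself.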
>
> 3) Previously when I did a make test for the Bioperl 1.4 installation most
> of the tests passed (97%). I'm not sure whether the errors are expected or
> not.
>
Generally, *all* tests of a stable bioperl distribution (which 1.4 is)
are supposed to pass. If one or more don't, then chances are high that
something is wrong.
> Here are the results of the make test. I only cut out the beginning
> of the
> test and the summary at the end. Installation of bioperl
>
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::SeqIO::game. Can't locate IO/String.pm
The message pretty much says it all. Bioperl depends on IO::String in a
lot of places, so I'd strongly recommend you go ahead and install it.
>
> 4) Also, hopefully when I get this all running I would like to know what is
> the best order for loading the database. I know you mentioned that the GO
> database information should be loaded before the locuslink information. Here
> is the list of proposed order of entering information into the database.
> Can you use load_seqdatabase.pl for loading unigene information?
Yes you can. Make sure you read the POD of load_seqdatabase.pl to see
how.
> 1. load NCBI taxonomy database with load_ncbi_taxonomy.pl
> 2. GO information
The only things for which order matters are those which are referenced,
but provided only in an incomplete manner, by annotated data sources.
Hence, species information and any ontology that your data source uses
for annotation should be loaded in advance so that upon loading of the
annotated sequences the referenced entities are found by look-up.
> 3. load locuslink database information
> 4. unigene information which I also had problems with loading
> information
> in
> [root@ bioperl-1.4]# perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl
> --dbuser=root --dbpass=ms22 --dbname bioseqdb
> --namespace "Unigene" -format unigene
> /root/bioperl-1.4/unigenedata/Hs.data
> Loading /root/bioperl-1.4/unigenedata/Hs.data ...
> Bio::SeqIO: unigene cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::SeqIO::unigene. Can't locate
> Bio/SeqIO/unigene.pm in @INC (@INC contains:
The message pretty much says it. The indicated module, which is the
bioperl unigene parser, fails to load. The reason is most likely that
you didn't install bioperl, or installed it in a location that is not in
Perl's default search path. If the latter is the case, you need to
set up the PERL5LIB environment variable prior to running any code that
uses those modules.
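For a Bourne-style shell, setting PERL5LIB might look like the sketch below. The helper name prepend_perl5lib is invented for illustration, and /root/bioperl-1.4 is only an assumed location for the unpacked bioperl tree (the directory that contains Bio/); substitute wherever your copy actually lives.

```shell
# Prepend a directory to PERL5LIB without leaving a stray colon
# when the variable was previously empty or unset.
prepend_perl5lib() {
    if [ -n "${PERL5LIB:-}" ]; then
        PERL5LIB="$1:$PERL5LIB"
    else
        PERL5LIB="$1"
    fi
    export PERL5LIB
}

# Assumed location of the bioperl modules; adjust to your system.
prepend_perl5lib /root/bioperl-1.4
```

Afterwards, perl -MBio::SeqIO -e 1 should exit silently if the directory is the right one. The setting only lasts for the current shell session unless you add it to your shell start-up file.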
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------