[Bioperl-l] New to BioPerl ... part II
Fields, Christopher J
cjfields at illinois.edu
Fri Mar 21 04:59:44 UTC 2014
Agree with Francisco, Strawberry Perl is definitely the way to go. You can do it with ActivePerl but IIRC the PPM installs were always a pain.
chris
On Mar 20, 2014, at 10:54 PM, Francisco J. Ossandón <fossandonc at hotmail.com> wrote:
> Hi Olivier,
> Currently there is no PPM package for the latest version (the last one is
> about 4 years old), so it is better to download the latest version from CPAN
> (1.6.923).
> I have one recommendation though... unless you are very attached to
> ActivePerl, uninstall it and install Strawberry Perl instead
> (http://strawberryperl.com/).
>
> When I started learning Bioperl several years ago I used ActivePerl too, but
> it always gave me trouble when installing new modules from CPAN. I switched to
> Strawberry a few years ago and it is much better, because it includes by
> default additional libraries and compiling tools (gcc, mingw, dmake, etc.)
> that make it easier to install CPAN modules (including some Bioperl
> dependencies)... Since the installation instructions on the wiki only
> covered ActivePerl, I updated the wiki a few days ago to include
> Strawberry Perl as an additional option
> (http://www.bioperl.org/wiki/Special:RecentChanges).
>
> By the way... something odd among the errors you are getting is that it says
> that "perl514.dll cannot be found", but if you installed 5.18 you should
> have perl518.dll instead... did you install 5.18 over 5.14? I can tell you
> that overwriting an old install can produce obscure and mysterious bugs; it
> is better to uninstall the old one first, then delete the whole folder, and
> then make a clean install of the new one.
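
A quick way to see which Perl installation is actually being picked up, and where its modules come from, is sketched below. This is only an illustration: the helper name module_location is made up for this example, and DBI and DBD::mysql are checked simply because they are the modules mentioned in this thread.

```perl
#!perl
use strict;
use warnings;

# Path of the perl binary that is running this script.
print "Interpreter: $^X\n";

# Version of that interpreter.
printf "Version:     v%vd\n", $^V;

# Directories searched for modules; leftovers from an older install show up here.
print "Module path: $_\n" for @INC;

# Hypothetical helper: return the file a module was loaded from,
# or undef if it cannot be loaded at all.
sub module_location {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module (qw(DBI DBD::mysql)) {
    my $where = module_location($module);
    print "$module: ", (defined $where ? $where : 'not loadable'), "\n";
}
```

If the interpreter path or the module directories point at a folder from a previous installation, that would be consistent with the mixed 5.14/5.18 setup suspected above.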
>
> Cheers and good luck,
>
> Francisco J. Ossandon
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Thursday, 20 March 2014 20:49
> To: Olivier BUHARD
> CC: BioPerl List
> Subject: Re: [Bioperl-l] New to BioPerl ... part II
>
> I think the exception message appears due to the actual problem you're
> already describing. Bioperl-db will catch the failure and then print the
> error message. Since it doesn't seem to say anything about failing to
> connect to the database, it's either failing before or past that point for
> an unexpected reason. My suspicion is that it fails to load the DBD driver
> for Perl DBI.
>
> You can test that by writing a small script (not using Bioperl or
> Bioperl-DB) that simply opens a connection to the database. If that fails,
> that's where the problem is.
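
A minimal sketch of such a standalone test, using only DBI and DBD::mysql. The database name, user, and password placeholder are copied from the script quoted further down in this thread; the helper name build_dsn and the choice to load DBI lazily are assumptions made for this example.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: build a DBI data source name for DBD::mysql.
sub build_dsn {
    my ($dbname, $host) = @_;
    return "DBI:mysql:database=$dbname;host=$host";
}

sub main {
    # Loaded at run time so a broken DBI install fails here, visibly.
    require DBI;

    my $dsn = build_dsn('biosql_hs', 'localhost');
    my $dbh = DBI->connect($dsn, 'biodb_user', 'THE_PASSWORD_HERE',
                           { RaiseError => 0, PrintError => 0 });

    if ($dbh) {
        print "Connected to $dsn\n";
        $dbh->disconnect;
    }
    else {
        # $DBI::errstr holds the driver's own error message.
        print "Connection failed: $DBI::errstr\n";
    }
}

# Run only when executed as a script, not when loaded from another file.
main() unless caller;
```

If this script also stops with the missing-DLL message, the problem lies in the Perl/DBD installation rather than in Bioperl-db.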
>
> You can also try that with Bioperl-db:
>
> $conn = $dbadp->dbcontext()->dbi()->new_connection();
>
> This should give you an open DBI-compliant connection.
>
> If that part works, then the problem is somewhere with the dynamic
> auto-loading code.
>
> -hilmar
>
>
>
> On Wed, Mar 19, 2014 at 6:46 PM, Olivier BUHARD
> <Olivier.Buhard at inserm.fr> wrote:
>
>> Hi,
>>
>> thank you all for your answers. I had the (wrong) notion that Windows
>> ignored the shebang...
>>
>> I am using Bioperl 1.6.1... I'll try to install 1.6.923 but I can't
>> find it with ActivePerl ppm (I switched to 5.18), so I'll have to try
>> with command line ppm. I'm running Windows XP... perhaps I'll also have
>> to try with Linux.
>>
>> I'm working through the docs and tutorials I can find about Bioperl-db and
>> I'm beginning to understand how it deals with parsing and organizing the
>> data. However, my first attempts to load sequences into my BioSQL db are
>> unsuccessful. With the gbpri1.seq I downloaded from the NCBI FTP,
>> load_seqdatabase.pl crashes at the moment it tries to INSERT into the
>> db, telling me perl514.dll cannot be found, and sending a bunch of
>> error messages with its last breath:
>>
>> C:\tmp>perl load_seqdatabase.pl -dbname biosql_hs -dbuser biodb_user -dbpass ******** gbpri1.seq
>> Loading gbpri1.seq ...
>> UNIVERSAL->import is deprecated and will be removed in a future perl at C:/Perl/site/lib/Bio/Tree/TreeFunctionsI.pm line 94.
>> UNIVERSAL->import is deprecated and will be removed in a future perl at C:/Perl/site/lib/Bio\Tree\TreeFunctionsI.pm line 94.
>>
>> ------------- EXCEPTION -------------
>> MSG: failed to open connection:
>> STACK Bio::DB::DBI::base::new_connection C:/Perl/site/lib/Bio/DB/DBI/base.pm:267
>> STACK Bio::DB::DBI::base::get_connection C:/Perl/site/lib/Bio/DB/DBI/base.pm:227
>> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::dbh C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1498
>> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::rollback C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1417
>> STACK toplevel load_seqdatabase.pl:636
>> -------------------------------------
>>
>> I think I'm doing something wrong, but I can't find what. The MySQL
>> server (version 4.1.9) is running, and I've installed DBI and DBD::mysql.
>> I checked the user's privileges and password, so what can be wrong?
>>
>> I've tried writing a shorter script to isolate the call that breaks
>> the process (below), and I found it happens when it calls ->create():
>>
>> #!perl
>>
>> use strict;
>> use Bio::DB::BioDB;
>> use Bio::SeqIO;
>>
>> my $seq_file = shift
>>     or die("GB_crawler.pl - Usage : perl GB_crawler.pl <SEQ_FILE>\n\n");
>>
>> my $dbadp = Bio::DB::BioDB->new(
>>     -database => 'biosql',
>>     -host     => 'localhost',
>>     -user     => 'biodb_user',
>>     -pass     => 'THE_PASSWORD_HERE',
>>     -dbname   => 'biosql_hs',
>>     -driver   => 'mysql',
>> );
>> $dbadp->verbose(1);
>>
>> my $seqio_obj = Bio::SeqIO->new(-file => "<$seq_file", -format => 'genbank');
>> while (my $seq_obj = $seqio_obj->next_seq()) {
>>     print $seq_obj->display_id(), "\n";
>>     my $species  = $seq_obj->species();
>>     my $seq_spec = $species->binomial();
>>     if ($seq_spec eq 'Homo sapiens') {    # I'm just interested in Hs seq
>>         my $p_seq = $dbadp->create_persistent($seq_obj);
>>         $p_seq->create();
>>     }
>> }
>>
>> The output is (I can't put it all here, it's too long...):
>>
>> C:\tmp>perl GB_crawler.pl gbpri1.seq
>>
>> UNIVERSAL->import is deprecated and will be removed in a future perl at C:/Perl/site/lib/Bio/Tree/TreeFunctionsI.pm line 94.
>> AB000095
>> attempting to load adaptor class for Bio::Seq::RichSeq
>> attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
>> attempting to load adaptor class for Bio::Seq
>> attempting to load module Bio::DB::BioSQL::SeqAdaptor
>> instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
>>
>> .../...
>>
>> attempting to load adaptor class for Bio::Tree::TreeFunctionsI
>> attempting to load module Bio::DB::BioSQL::TreeFunctionsIAdaptor
>> attempting to load module Bio::DB::BioSQL::TreeFunctionsAdaptor
>> UNIVERSAL->import is deprecated and will be removed in a future perl at C:/Perl/site/lib/Bio\Tree\TreeFunctionsI.pm line 94.
>> no adaptor found for class Bio::Tree::Tree
>> no adaptor found for class Bio::Annotation::TypeManager
>> no adaptor found for class Bio::DB::Taxonomy::list
>> no adaptor found for class Bio::Tree::Tree
>> attempting to load adaptor class for BioNamespace
>> attempting to load module Bio::DB::BioSQL::BioNamespaceAdaptor
>> instantiating adaptor class Bio::DB::BioSQL::BioNamespaceAdaptor
>> no adaptor found for class Bio::Annotation::TypeManager
>> no adaptor found for class Bio::DB::Taxonomy::list
>> no adaptor found for class Bio::Tree::Tree
>> attempting to load driver for adaptor class Bio::DB::BioSQL::BioNamespaceAdaptor
>>
>> attempting to load driver for adaptor class Bio::DB::BioSQL::BasePersistenceAdaptor
>> Using Bio::DB::BioSQL::mysql::BasePersistenceAdaptorDriver as driver peer for Bio::DB::BioSQL::BioNamespaceAdaptor
>>
>> ------------- EXCEPTION -------------
>> MSG: failed to open connection:
>> STACK Bio::DB::DBI::base::new_connection C:/Perl/site/lib/Bio/DB/DBI/base.pm:267
>> STACK Bio::DB::DBI::base::get_connection C:/Perl/site/lib/Bio/DB/DBI/base.pm:227
>> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::dbh C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1498
>> STACK Bio::DB::BioSQL::BaseDriver::insert_object C:/Perl/site/lib/Bio/DB/BioSQL/BaseDriver.pm:970
>> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:212
>> STACK Bio::DB::Persistent::PersistentObject::create C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:257
>> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
>> STACK Bio::DB::Persistent::PersistentObject::create C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:257
>> STACK toplevel GB_crawler.pl:52
>> -------------------------------------
>>
>> Again, the script halts asking for perl514.dll, then the EXCEPTION MSG
>> appears...
>> Is there a chance that ActivePerl 5.18 can't work with Bioperl-DB (I
>> have version 1.006000)?
>>
>> Thanks for any answer !
>>
>> Best regards
>>
>> Olivier
>>
>> On 06/03/2014 21:04, Smithies, Russell wrote:
>>
>>> BioPerl-1.6.923.tar.gz installed OK for me and I can run your script
>>> on that gbk file from Windows with ActivePerl 5.16.1 and I get no
>>> warnings at all.
>>>
>>> --Russell
>>>
>>> -----Original Message-----
>>> From:bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces@
>>> lists.open-bio.org] On Behalf Of Smithies, Russell
>>> Sent: Friday, 7 March 2014 8:03 a.m.
>>> To: Olivier BUHARD;bioperl-l at lists.open-bio.org
>>> Subject: Re: [Bioperl-l] New to BioPerl - A little presentation...
>>> and a question about GenBank and Bioperl
>>>
>>> Hi,
>>> -w on the shebang line is for displaying warnings, so naturally if
>>> you leave it off you won't get warnings.
>>> A slightly better method is to 'use warnings' instead, but be aware
>>> that it gives you slightly different results. E.g.:
>>>
>>> #!perl
>>>
>>> use strict;
>>> use warnings;
>>> use Bio::SeqIO;
>>>
>>> And on Windows systems you can shorten the shebang line to #!perl as
>>> obviously the usual path of #!/usr/bin/perl isn't relevant.
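
The "slightly different results" come down to scope: -w sets the global $^W flag, so it also turns warnings on inside every module that gets loaded, while 'use warnings' is lexically scoped and only affects the file or block it appears in. A small sketch of that lexical scoping follows; the subroutine names are made up for this example, and it says nothing about why the BioPerl modules themselves emit those warnings.

```perl
#!perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @caught;
$SIG{__WARN__} = sub { push @caught, $_[0] };

sub concat_with_warnings {
    my $undef;
    return "value: " . $undef;      # warns: use of uninitialized value
}

sub concat_without_warnings {
    no warnings 'uninitialized';    # lexically scoped to this block only
    my $undef;
    return "value: " . $undef;      # silent
}

concat_with_warnings();
concat_without_warnings();

print "warnings caught: ", scalar(@caught), "\n";
print "-w (global \$^W) is ", ($^W ? "on" : "off"), "\n";
```

A module that does not enable warnings itself stays silent under 'use warnings' in the calling script, but not under -w, which would be consistent with the warnings disappearing once -w is dropped.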
>>>
>>> When I run your code on the exact same file I get the required output
>>> and no warnings - though admittedly that's on a Linux system.
>>> Is it possible you're running an older version of BioPerl?
>>> I'll update my Windows BioPerl (to CJFIELDS/BioPerl-1.6.923.tar.gz)
>>> install and give it a go.
>>>
>>> --Russell
>>>
>>>
>>>
>>> -----Original Message-----
>>> From:bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces@
>>> lists.open-bio.org] On Behalf Of Olivier BUHARD
>>> Sent: Friday, 7 March 2014 12:06 a.m.
>>> To:bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] New to BioPerl - A little presentation... and a
>>> question about GenBank and Bioperl
>>>
>>> Hello,
>>>
>>>
>>> I'm new to BioPerl and would like to ask you for some advice about
>>> the use of Bioperl.
>>>
>>> I am a molecular biologist and I frequently use Perl to write scripts
>>> to prepare or analyse files I get from various databases, so I'm
>>> familiar enough with Perl.
>>> We work in my lab on a particular type of tumorigenic process called
>>> MSI, for MicroSatellite Instability. I won't go through the whole
>>> story, but a hallmark of the associated cancers is that the size of
>>> the repeated DNA sequences spread throughout their genome is altered.
>>> Up to now, we got a list of those sequences from collaborators who
>>> could produce it for us. But now the list we have is old and we have
>>> to get this information by our own means, so naturally I started
>>> looking at Bioperl.
>>> Before I go through learning all I need (which, I guess, will take
>>> some time), I would really appreciate it if someone could tell me
>>> whether Bioperl can help from start to end.
>>>
>>> In summary, I plan to search for all the short repetitive sequences (I'm
>>> just interested in the human genome at the moment) I can find in the
>>> GenBank flat files provided by the NCBI FTP site. The idea is to
>>> create a BioSQL database (I have already installed one using a schema
>>> for MySQL) that I could query using an appropriate algorithm.
>>> I saw that Bioperl is made to read those files with multiple entries, so
>>> building the BioSQL database should not be a problem. My first
>>> question is about how I will crawl through the genomic sequences to
>>> detect short tandem repeat sequences of defined size and pattern
>>> (some are mononucleotide repeats, like (A)27, others could be
>>> dinucleotide repeats like (CA)12, etc.). BLAST is not designed for
>>> such a job... Are there some tools already available in Bioperl to
>>> deal with low complexity DNA in general and short tandem repeats in
>>> particular, something like repeatmasker or windowmasker but with a
>>> different kind of output? I'm interested in retrieving some of the
>>> features provided with the GenBank format (find repeats in coding or
>>> non-coding regions, get their position in the genes or the transcripts
>>> with respect to exon position, intron-exon proximity...).
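
For the simplest cases described above (exact mono- and dinucleotide runs such as (A)27 or (CA)12), a plain Perl regular expression may already be enough. The sketch below is not a BioPerl API: find_tandem_repeats is a hypothetical helper, it only finds perfect repeats, and the demo sequence is made up. With a Bio::Seq object the string would come from $seq_obj->seq.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: report runs of a $unit_len-base unit repeated at
# least $min_copies times in a row. Start positions are 1-based.
sub find_tandem_repeats {
    my ($seq, $unit_len, $min_copies) = @_;
    $seq = uc $seq;
    my $more = $min_copies - 1;    # copies required after the first unit
    my @hits;
    while ($seq =~ /(([ACGT]{$unit_len})\2{$more,})/g) {
        my ($run, $unit, $start) = ($1, $2, $-[1] + 1);
        # Skip units such as "AA", which are really mononucleotide runs.
        next if $unit_len > 1 && $unit eq substr($unit, 0, 1) x $unit_len;
        push @hits, {
            unit   => $unit,
            copies => length($run) / $unit_len,
            start  => $start,
        };
    }
    return @hits;
}

# Made-up demo sequence containing one (CA)n run and one (A)n run.
my $demo = 'GGCACACACACATTAAAAAAAGC';
for my $hit (find_tandem_repeats($demo, 1, 5), find_tandem_repeats($demo, 2, 3)) {
    printf "(%s)%d at position %d\n", @{$hit}{qw(unit copies start)};
}
```

The start positions could then be compared against feature coordinates from the GenBank record to decide whether a repeat falls in a coding or non-coding region.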
>>>
>>> I also have a more direct and "practical" question. I just tried a
>>> few sample codes provided in the beginners' tutorials on the Bioperl
>>> site. I just ran the following on the gbpri1.seq provided on the NCBI
>>> FTP but I got some errors and warnings for many (but not all) sequences.
>>>
>>> #!/usr/bin/perl -w
>>>
>>> use strict;
>>> use Bio::SeqIO;
>>>
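>>> # $seq_file was not declared in the posted snippet; taking it from the
>>> # command line is an assumption.
>>> my $seq_file = shift or die "Usage: perl $0 <SEQ_FILE>\n";
>>>
>>> my $seqio_obj = Bio::SeqIO->new(-file => "<$seq_file", -format => 'genbank');
>>> while (my $seq_obj = $seqio_obj->next_seq()) {
>>>     print $seq_obj->display_id, "\n";
>>> }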
>>>
>>> This is what I get for AB000095 locus:
>>>
>>> Replacement list is longer than search list at C:/Perl/site/lib/Bio/Range.pm line 251.
>>> UNIVERSAL->import is deprecated and will be removed in a future perl at C:/Perl/site/lib/Bio/Tree/TreeFunctionsI.pm line 94
>>> Subroutine new redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 93, <GEN0> line 41.
>>> Subroutine start redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 115, <GEN0> line 41.
>>> Subroutine end redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 144, <GEN0> line 41.
>>> Subroutine length redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 190, <GEN0> line 41.
>>> Subroutine location_type redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 281, <GEN0> line 41.
>>> Subroutine to_FTstring redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 328, <GEN0> line 41.
>>> Subroutine trunc redefined at C:/Perl/site/lib/Bio\Location\Simple.pm line 370, <GEN0> line 41.
>>> AB000095
>>>
>>> But when I remove the shebang option -w... the warnings disappear.
>>> (I use ActivePerl 5.14.2 on a Windows XP computer. I had the idea
>>> that the shebang was not used under Windows, but it seems that's wrong
>>> here...) Is that due to some problem with my Perl installation, or is
>>> it related to the Bio::SeqIO code?
>>>
>>> Thanks in advance for any answer.
>>>
>>> Kind regards
>>>
>>>
>> --
>>
>> --------------------
>>
>> BUHARD Olivier
>>
>> "Instabilité de microsatellites et cancer"
>> Centre de Recherche Saint Antoine
>> équipe 11/INSERM UMRS 938
>> Bâtiment Kourilsky,
>> Hôpital Saint Antoine
>> 34 rue Crozatier
>> 75012 PARIS
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>
>
> --
> Hilmar Lapp -:- lappland.io
>