[Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw
Lorenzo Carretero
locarpau at upvnet.upv.es
Wed May 18 22:41:26 UTC 2011
Hi all,
I have a few questions regarding the package
Bio::Tools::Run::Alignment::Clustalw. The following script:
#!/usr/local/bin/perl -w
use 5.010;
use strict;
use lib "/Library/Perl/";
use Bio::Perl;
use Bio::Seq;
use Bio::SeqIO;
# definition of the environmental variable CLUSTALDIR
BEGIN {$ENV{CLUSTALDIR} =
'/Applications/Bioinformatics/clustalw-2.0.10-macosx/ '}
use Bio::Tools::Run::Alignment::Clustalw;
my $sequencesfilename =
"/Users/Lorenzo/Documents/SequencesDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.besth.pep1.fas";
my $format = 'fasta';
#my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilename",
# -format => $format );
my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); # use default parameters
#my @seq_object_array = read_all_sequences( -file => "<$sequencesfilename",
# -format => $format );
#my $seq_array_ref = \@seq_object_array;
#my $aln = $factory->align($seq_array_ref);
my $aln = $factory->align($sequencesfilename);
my $avgpercentid = $aln->percentage_identity;
my $alnlength = $aln->length();
my $numberalnresidues = $aln->no_residues;
print "$avgpercentid and $alnlength and $numberalnresidues\n";
returns the following error messages:
Use of uninitialized value in concatenation (.) or string at
/Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 753.
Use of uninitialized value in concatenation (.) or string at
/Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 754.
sh: align: command not found
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ClustalW call ( align
-infile="/Users/Lorenzo/Desktop/test_vs_test.besth.pep1.fas"
-output=gcg
-outfile="/var/folders/rA/rApd7cXoFyWK-Yhn66cxZk+++TI/-Tmp-/O3Was62L0X/exicCvJnrF"
2>&1) crashed: 32512
STACK: Error::throw
STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:368
STACK: Bio::Tools::Run::Alignment::Clustalw::_run
/Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:768
STACK: Bio::Tools::Run::Alignment::Clustalw::align
/Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:515
STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:22
-----------------------------------------------------------
What would be more efficient in terms of memory usage:
i. performing the alignment directly on a FASTA sequence file, or
ii. performing the alignment on a reference to an array of Seq objects:
my @seq_object_array = read_all_sequences( -file =>
"<$sequencesfilename",
-format => $format );
my $seq_array_ref = \@seq_object_array;
my $aln = $factory->align($seq_array_ref);
Unfortunately, my script does not run in this form either. I checked,
and clustalw is properly installed in the given directory. It appears that
the script is not reading my file properly (see attached document). Should I
move the sequence files to the clustalw directory?
Finally, is there an existing method for getting the number of amino acids
in the aligned region of, e.g., the longer or the shorter sequence, or
should I loop over the sequences in the $aln Bio::SimpleAlign object myself?
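To make the last question concrete, the loop I have in mind would look roughly like the sketch below. It is only an illustration, not something taken from the BioPerl documentation: count_residues and the two example strings are made up, and I am assuming gaps are written as '-' or '.'. In the real script the strings would presumably come from calling seq on each object returned by $aln->each_seq.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the non-gap characters in one aligned sequence string.
# Assumption: gaps are written as '-' or '.'.
sub count_residues {
    my ($aligned) = @_;
    my $count = () = $aligned =~ /[^\-.]/g;
    return $count;
}

# Hypothetical aligned strings, standing in for $seq->seq of each
# sequence object returned by $aln->each_seq.
my %aligned = (
    seqA => 'MKT-LLV--A',
    seqB => 'MKTALLVGGA',
);

# Report the residue count per sequence.
for my $id (sort keys %aligned) {
    printf "%s: %d residues\n", $id, count_residues($aligned{$id});
}
```

The longest and shortest sequences could then be picked by comparing these counts, if nothing equivalent is already implemented.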
Thanks for your help.
Greetings from Spain,
Lorenzo
--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain
Phone: +34 963879934
Fax: +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_vs_test.besth.pep1.fas
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20110519/b2ef38bb/attachment.ksh>