[Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw

Wed May 18 22:41:26 UTC 2011

Hi all,
I have a few questions regarding the package 
Bio::Tools::Run::Alignment::Clustalw.  The following script:

    #!/usr/local/bin/perl -w
    use 5.010;
    use strict;

    use lib "/Library/Perl/";
    use Bio::Perl;
    use Bio::Seq;
    use Bio::SeqIO;
    # definition of the environmental variable CLUSTALDIR
    BEGIN {$ENV{CLUSTALDIR} =
    '/Applications/Bioinformatics/clustalw-2.0.10-macosx/ '}
    use Bio::Tools::Run::Alignment::Clustalw;

    my $sequencesfilename =
    "/Users/Lorenzo/Documents/SequencesDatabase/plaza_public_02_Apr27/plaza_public_02/BLAST_Parsed_results/PerSpecies/test_vs_test.besth.pep1.fas";
    my $format = 'fasta';
    #my $inseq = Bio::SeqIO->new(-file => "<$sequencesfilename",
    #                            -format => $format );
    my $factory = Bio::Tools::Run::Alignment::Clustalw->new (); # use default parameters
    #my @seq_object_array = read_all_sequences( -file => "<$sequencesfilename",
    #                                           -format => $format );
    #my $seq_array_ref = \@seq_object_array;
    #my $aln = $factory->align($seq_array_ref);
    my $aln = $factory->align($sequencesfilename);
    my $avgpercentid = $aln->percentage_identity;
    my $alnlength = $aln->length();
    my $numberalnresidues = $aln->no_residues;
    print "$avgpercentid and $alnlength and $numberalnresidues\n";

is returning the following error message:

    Use of uninitialized value in concatenation (.) or string at
    /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 753.
    Use of uninitialized value in concatenation (.) or string at
    /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm line 754.
    sh: align: command not found

    ------------- EXCEPTION: Bio::Root::Exception -------------
    MSG: ClustalW call ( align 
    -infile="/Users/Lorenzo/Desktop/test_vs_test.besth.pep1.fas"
    -output=gcg  
    -outfile="/var/folders/rA/rApd7cXoFyWK-Yhn66cxZk+++TI/-Tmp-/O3Was62L0X/exicCvJnrF"
    2>&1) crashed: 32512
    STACK: Error::throw
    STACK: Bio::Root::Root::throw /Library/Perl//5.10.0/Bio/Root/Root.pm:368
    STACK: Bio::Tools::Run::Alignment::Clustalw::_run
    /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:768
    STACK: Bio::Tools::Run::Alignment::Clustalw::align
    /Library/Perl//5.10.0/Bio/Tools/Run/Alignment/Clustalw.pm:515
    STACK: /Users/Lorenzo/Documents/workspace/PlantEvolGen/test.pl:22
    -----------------------------------------------------------

Which would be more efficient in terms of memory usage:
i. performing the alignment directly on a FASTA sequence file, or
ii. performing the alignment on a reference to an array of sequence objects:

    my @seq_object_array = read_all_sequences(    -file =>
    "<$sequencesfilename",
                                                 -format => $format );
    my $seq_array_ref = \@seq_object_array;
    my $aln = $factory->align($seq_array_ref);

Unfortunately, my script does not run in this form either. I checked, 
and clustalw is properly installed in the given directory. It appears 
that the script is not reading my file properly (see attached document). 
Should I move the sequence files to the clustalw directory?
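
To rule out a problem with the directory setting itself, a check along the following lines might help. It is only a sketch: the helper name check_program_dir is made up for illustration, and the executable name 'clustalw2' is an assumption about what the clustalw 2.x distribution installs, so it should be adjusted to whatever file is actually in the directory.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return a list of problems found with a program
# directory setting (empty list if nothing suspicious is found).
sub check_program_dir {
    my ($dir, $program) = @_;
    my @problems;

    # Stray whitespace in the value makes the path point somewhere else.
    push @problems, "directory value has leading or trailing whitespace"
        if $dir =~ /^\s|\s$/;

    if (!-d $dir) {
        push @problems, "directory does not exist: '$dir'";
    }
    else {
        my $exe = File::Spec->catfile($dir, $program);
        push @problems, "no executable file at '$exe'"
            unless -f $exe && -x _;
    }
    return @problems;
}

# 'clustalw2' is an assumed executable name, not taken from the error log.
my $dir      = defined $ENV{CLUSTALDIR} ? $ENV{CLUSTALDIR} : '';
my @problems = check_program_dir($dir, 'clustalw2');
print @problems ? join("\n", @problems) . "\n" : "directory setting looks fine\n";
```

If this reports nothing, the directory value is probably not the issue and the problem lies elsewhere.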

Finally, is there an implemented way of getting the number of amino acids 
in the aligned region of, e.g., the longer or the shorter sequence, or 
should I loop over the sequences in the $aln Bio::SimpleAlign object?
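
The loop I have in mind would look roughly like the sketch below. To keep it independent of BioPerl it works on plain gapped strings; the sequence names and strings are invented for illustration. With a real Bio::SimpleAlign object, I assume the strings would come from calling seq() on each object returned by each_seq(). It also assumes that '-' and '.' are the only gap characters.

```perl
use strict;
use warnings;

# Count residues (non-gap characters) in one aligned sequence string.
# Assumes gaps are written as '-' or '.'.
sub count_aligned_residues {
    my ($aligned) = @_;
    my $gaps = ($aligned =~ tr/-.//);   # tr with an empty replacement only counts
    return length($aligned) - $gaps;
}

# Made-up aligned sequences standing in for the rows of an alignment.
my %aligned = (
    seqA => 'MKT-LLVA--G',
    seqB => 'MKTALL-AQWG',
);

for my $id (sort keys %aligned) {
    printf "%s: %d residues\n", $id, count_aligned_residues($aligned{$id});
}
```

The shorter and longer sequences could then be picked by comparing these counts.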

Thanks for your help.
Greetings from Spain,
Lorenzo

-- 
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Lorenzo Carretero Paulet
Institute for Plant Molecular and Cell Biology - IBMCP (CSIC-UPV)
Integrative Systems Biology Group
C/ Ingeniero Fausto Elio s/n.
46022 Valencia, Spain

Phone:  +34 963879934
Fax:    +34 963877859
e-mail: locarpau at upvnet.upv.es
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_vs_test.besth.pep1.fas
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20110519/b2ef38bb/attachment.ksh>