[Bioperl-l] clustalw.pm: could not open sequence file error

Barry Moore bmoore at genetics.utah.edu
Thu Dec 1 08:14:44 EST 2005


Olena,

Does the path to the file in question contain any spaces?  I know
clustalx won't open files with a space in the path even though Windows
allows that.  I don't know for sure about clustalw, but it seems likely
to behave the same way.
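
A quick way to rule this out before calling align() is to test the path from Perl. This is only a minimal sketch: path_problems is a hypothetical helper, not part of BioPerl or clustalw, and the path is the one from the original post. It reports whitespace in the path and whether Perl itself can see and read the file.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns a list of reasons
# why an external program such as clustalw might fail to open $path.
sub path_problems {
    my ($path) = @_;
    my @problems;
    push @problems, 'path contains whitespace' if $path =~ /\s/;
    push @problems, 'file does not exist'      unless -e $path;
    push @problems, 'file is not readable'     if -e $path && !-r $path;
    return @problems;
}

my $inputfilename = 'c:/perl/mouse_unique.txt';   # path from the original post
my @problems = path_problems($inputfilename);
print "$inputfilename: $_\n" for @problems;
print "$inputfilename: no obvious path problems\n" unless @problems;
```

If this reports nothing, the path itself is probably not the cause and the problem lies elsewhere.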

Barry
-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Olena
Morozova
Sent: Tuesday, November 29, 2005 3:34 PM
To: bioperl-ml List
Subject: [Bioperl-l] clustalw.pm: could not open sequence file error

Hi all,

I am trying to use this script

use Bio::Tools::Run::Alignment::Clustalw;

$ENV{CLUSTALDIR} = 'C:/perl/clustalw1.8/';
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM',
              'outfile' => 'al_mouse.txt');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$inputfilename = 'c:/perl/mouse_unique.txt';
my $aln = $factory->align($inputfilename);

to do a MSA, and it works for a test file with 2 or 3 sequences.
However, when I try to run it on my actual file (has 97 sequences)
which is in exactly the same format as the test file (fasta), I get a
"could not open the sequence file" error.
Is this because the file is too big and is there a way to fix this?
Thanks a lot for your help!

Olena

On 11/29/05, Jason Stajich <jason.stajich at duke.edu> wrote:
>
>
> Begin forwarded message:
>
> > From: neeti somaiya <neetisomaiya at gmail.com>
> > Date: November 29, 2005 1:27:27 AM EST
> > To: Jason Stajich <jason.stajich at duke.edu>
> > Subject: Re: [Bioperl-l] need BLAT parse code
> >
> > I use the following code :
> >
> > open(FH,"output.psl");
> > while(<FH>)
> > {
> >     if( /^psLayout/ )
> >     {
> >           for( 1..4 ) { <> }
> >       }
> >      my @line = split;
> >      my ( $matches,$mismatches,$rep_matches,$n_count,
> >             $q_num_insert,$q_base_insert,
> >             $t_num_insert, $t_base_insert,
> >             $strand, $q_name, $q_length, $q_start,
> >             $q_end, $t_name, $t_length,$t_start, $t_end,
> >             $block_count,
> >             $block_sizes,  $q_starts,      $t_starts
> >             ) = split;
> >
> >
> >       print $t_start;
> >       print "\n";
> >       print $t_end;
> >
> > }
> >
> > for output.psl file :
> >
> > match mis-  rep.  N's Q gap Q gap T gap T gap strand Q               Q     Q     Q     T                  T        T      T      block blockSizes qStarts tStarts
> >       match match     count bases count bases        name            size  start end   name               size     start  end    count
> > ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > 27025 0     0     0   0     0     0     0     +      query_sequence3 27025 0     27025 database_sequence3 57701691 132995 160020 1     27025,     0,      132995,
> > ~
> >
> >
> > It gave me output :
> >
> > Q
> > Q
> >
> > 132995
> > 160020
> >
> > What is the Q? Can't I obtain the coordinates (132995, 160020) alone?
> >
> > Please let me know.
> > Thanks.
> >
> > On 11/28/05, Jason Stajich <jason.stajich at duke.edu> wrote:
> > Bio::SearchIO::psl can parse psl output.
> >
> > or more simply:
> >
> > while(<>) {
> >   if( /^psLayout/ ) {     # if there is a header
> >     for( 1..4 ) { <> }    # take next 4 lines to skip the header
> >     next;                 # the psLayout line itself is not an alignment row
> >   }
> >   my ( $matches,$mismatches,$rep_matches,$n_count,
> >        $q_num_insert,$q_base_insert,
> >        $t_num_insert, $t_base_insert,
> >        $strand, $q_name, $q_length, $q_start,
> >        $q_end, $t_name, $t_length,$t_start, $t_end,
> >        $block_count,
> >        $block_sizes,  $q_starts,      $t_starts
> >        ) = split;
> >
> >   # query aln vals are $q_start and $q_end
> >   # hit aln vals are $t_start and $t_end
> > }
> >
> > On Nov 28, 2005, at 8:06 AM, neeti somaiya wrote:
> >
> > > Hi,
> > >
> > > I am using BLAT in a project. I have simple .psl output files
> > > from running BLAT with gene sequences against full chromosomal
> > > sequences. Does anyone have simple BLAT parsing code? I am only
> > > interested in obtaining the alignment start and end positions
> > > on the target.
> > > --
> > > -Neeti
> > > Even my blood says, B positive
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> > --
> > Jason Stajich
> > Duke University
> > http://www.duke.edu/~jes12
> >
> >
> >
> >
> >
> > --
> > -Neeti
> > Even my blood says, B positive
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
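
On the quoted BLAT question: the output.psl shown has no psLayout line, so the /^psLayout/ test never fires and the column-header rows get split like data rows. In the first header row the 16th and 17th whitespace-separated tokens are both "Q", which appears to be where the "Q" lines come from. A minimal sketch that keeps only alignment rows, assuming every data row starts with a digit (the match count); psl_target_coords is a hypothetical helper name, and output.psl is the file name from the quoted post:

```perl
use strict;
use warnings;

# Hypothetical helper: returns [tStart, tEnd] for each alignment row in
# a list of PSL lines. Assumes data rows start with a digit, so header,
# rule and blank lines are skipped.
sub psl_target_coords {
    my @coords;
    for my $line (@_) {
        next unless $line =~ /^\d/;
        my @f = split ' ', $line;
        next unless @f >= 17;            # too short to be a PSL row
        push @coords, [ @f[15, 16] ];    # tStart, tEnd
    }
    return @coords;
}

# Read output.psl (file name from the quoted post) if it can be opened.
my $file = 'output.psl';
if (open my $fh, '<', $file) {
    print "$_->[0]\t$_->[1]\n" for psl_target_coords(<$fh>);
    close $fh;
}
else {
    warn "cannot open $file: $!\n";
}
```

Filtering on a leading digit works whether or not the file carries the psLayout header, and reading from the opened handle avoids the bare <> in the quoted script, which reads from ARGV or STDIN rather than from FH.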

