[Bioperl-l] Creating FASTA library.

Ewan Birney birney@ebi.ac.uk
Fri, 30 Aug 2002 04:12:02 -0400 (EDT)


On Thu, 29 Aug 2002, Martin Hirst wrote:

> Hi,
>
> Sorry if this is a really naive question but is there a simple way of
> generating a single file from a directory of peptides suitable to input into
> clustalw?  I have only been able to find a third party application that runs
> on windows to do it (I am running the bioperl package in OSX).
>

I am writing this code without testing it; perhaps other people on the
list can improve on it:


Easy unix way (will work on MacOS X), assuming all the file names end in
.pep and they are all fasta files:


cat *.pep > file_for_clustalw
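
One caveat with plain cat: if a .pep file does not end in a newline, the
next file's ">" header gets glued onto the end of the previous sequence.
Below is a rough sketch of a more defensive version, untested against real
data. The function name merge_fasta is made up for illustration, and it
still assumes every input is already a fasta file.

```shell
# merge_fasta OUT FILE...
# Hypothetical helper (not part of any package): concatenates the given
# files into OUT, adding a newline after any file that lacks a final one.
merge_fasta() {
  out=$1
  shift
  : > "$out"                      # start with an empty output file
  for f in "$@"; do
    [ -s "$f" ] || continue       # skip missing or empty files
    cat "$f" >> "$out"
    # Command substitution strips a trailing newline, so this string is
    # non-empty only when the file's last byte is NOT a newline.
    if [ -n "$(tail -c 1 "$f")" ]; then
      echo >> "$out"
    fi
  done
}
```

You would call it as "merge_fasta file_for_clustalw *.pep", and then
"grep -c '^>' file_for_clustalw" gives the number of header lines, which
should equal the number of sequences you expected to merge.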



Harder Bioperl way, but it can cope with more complex data formatting:



use strict;
use warnings;
use Bio::SeqIO;

# output stream; the .fa name is not matched by the .pep filter below
my $seqout = Bio::SeqIO->new( -file => '>file_for_clustalw.fa',
                              -format => 'fasta' );

opendir(my $dir, ".") or die "Cannot open current directory: $!";
my @files = sort readdir($dir); # reads all the filenames, in a stable order
closedir($dir);


foreach my $filename ( @files ) {
  $filename =~ /\.pep$/ || next; # only open .pep files

  # assume each .pep file is a fasta file; SeqIO opens the file itself
  my $seqin = Bio::SeqIO->new( -file => $filename, -format => 'fasta' );

  # assume there is only one sequence in each file
  my $seq = $seqin->next_seq();
  defined $seq || next; # skip files with no readable sequence

  # write sequence out
  $seqout->write_seq($seq);

}



The script is of course much more flexible, but the command-line one-liner
will work, assuming you have just .pep files in fasta format.





>
> Cheers
>
> Martin