[Bioperl-l] Creating FASTA library.

Ewan Birney birney@ebi.ac.uk
Fri, 30 Aug 2002 04:12:02 -0400 (EDT)

On Thu, 29 Aug 2002, Martin Hirst wrote:

> Hi,
> Sorry if this is a really naive question but is there a simple way of
> generating a single file from a directory of peptides suitable to input into
> clustalw?  I have only been able to find a third party application that runs
> on windows to do it (I am running the bioperl package in OSX).

I am writing this code without testing it - perhaps some other people on
the list can improve on it etc:

Easy unix way (will work on Mac OS X), assuming all the files end in
.pep and they are all fasta files:

cat *.pep > file_for_clustalw
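
One caveat with plain cat: if a .pep file does not end with a newline, the
next file's '>' header is glued onto the last sequence line. A minimal
sketch of a more defensive version follows; combine_fasta is a hypothetical
helper name made up for this example, and it assumes a POSIX shell with the
standard cat and tail utilities.

```shell
#!/bin/sh
# Concatenate fasta files, adding a newline after any file that lacks one,
# so every '>' header starts on its own line.
# combine_fasta is a hypothetical helper name, not part of any package.
combine_fasta() {
  out=$1
  shift
  : > "$out"                       # create or truncate the output file
  for f in "$@"; do
    [ -f "$f" ] || continue        # skip a glob pattern that matched nothing
    cat "$f" >> "$out"
    # $(...) strips a trailing newline, so a non-empty result means the
    # last byte of the file is not a newline
    if [ -n "$(tail -c 1 "$f")" ]; then
      printf '\n' >> "$out"
    fi
  done
}
```

With the function defined, the equivalent of the one-liner would be:
combine_fasta file_for_clustalw *.pep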

Harder Bioperl way, but it could cope with more complex data formatting:

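use strict;
use warnings;
use Bio::SeqIO;

# single output stream for the combined file, written as fasta
my $seqout = Bio::SeqIO->new( -file => '>file_for_clustalw.fa',
                              -format => 'fasta' );

opendir(D, ".") || die "cannot open current directory: $!";
my @files = readdir(D); # reads all the filenames
closedir(D);

foreach my $filename ( @files ) {
  $filename =~ /\.pep$/ || next; # only open .pep files

  open(F, '<', $filename) || die "cannot open $filename: $!";

  # assume each .pep file is a fasta file
  my $seqin = Bio::SeqIO->new( -fh => \*F , -format => 'fasta');

  # assume there is only one sequence in each file
  my $seq = $seqin->next_seq();

  # write sequence out
  $seqout->write_seq($seq);

  close(F);
}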

The script is of course much more flexible, but the command-line one-liner
will work, assuming you have just .pep files in fasta format.
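
The script above reads only the first sequence from each file, so it is
worth checking the "one sequence per file" assumption before relying on it.
A rough sketch, counting fasta header lines (lines starting with '>') per
file; check_single_seq is a hypothetical helper name made up for this
example, and it assumes a POSIX shell with grep.

```shell
#!/bin/sh
# Print every file whose number of fasta header lines is not exactly 1,
# and return non-zero if any such file was found.
# check_single_seq is a hypothetical helper name, not part of any package.
check_single_seq() {
  status=0
  for f in "$@"; do
    [ -f "$f" ] || continue             # skip a glob that matched nothing
    # grep -c prints 0 and exits non-zero when nothing matches
    n=$(grep -c '^>' "$f" || true)
    if [ "$n" -ne 1 ]; then
      echo "$f: $n sequences"
      status=1
    fi
  done
  return $status
}
```

Run as: check_single_seq *.pep
No output means every file holds exactly one sequence.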

> Cheers
> Martin