[Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Jay Hannah
jay at jays.net
Fri Oct 13 16:24:48 UTC 2006
So I'm doing the following:
1) Using Bio::SeqIO to read in a genbank file and kick out fasta.
2) Reading that fasta file w/ command line formatdb.
3) Using that output for command line blastall.
4) Using Bio::SearchIO to read the blast results.
(If there's a better way, do tell. -grin-)
This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST.
my $seq_in = Bio::SeqIO->new(
-file => "<Organism1.genbank",
-format => "genbank",
-alphabet => "protein"
);
my $seq_out_protein = Bio::SeqIO->new(
-file => ">out",
-format => 'fasta',
-alphabet => 'protein'
);
while (my $inseq = $seq_in->next_seq) {
$inseq->molecule("protein");
$seq_out_protein->write_seq($inseq);
}
This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either.
I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format?
Am I missing something obvious?
Thanks,
j
More information about the Bioperl-l
mailing list