[Bioperl-l] Converting Genbank to fasta via SeqIO

Heikki Lehvaslaiho heikki at ebi.ac.uk
Mon Jul 21 10:29:13 EDT 2003


Wes,

There is no routine to do it, but I suppose there will be one if someone
goes to the trouble of making sure the id can be rebuilt correctly in
all cases. The following seems to work in your case, where the sequence
is from RefSeq.

	-Heikki

#------------------------------------------------------
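use strict;
use warnings;
use Bio::SeqIO;

# No -file or -fh given: read GenBank from STDIN, write fasta to STDOUT
my $in = Bio::SeqIO->new(-format => 'genbank');
my $out = Bio::SeqIO->new(-format => 'fasta');

while ( my $seq = $in->next_seq ) {
    # RefSeq accessions (e.g. NM_174198) contain an underscore;
    # other entries are skipped
    if ($seq->accession =~ /_/) {
        # primary_id holds the GI number parsed from the GenBank record
        my $newid = "gi|". $seq->primary_id. "|ref|".
            $seq->accession. ".". $seq->version. "|";
        $seq->id($newid);
        $out->write_seq($seq);
    }
}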
#------------------------------------------------------
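
For anyone who wants to take the id rebuilding further, here is a rough
sketch of the cases a general routine would probably have to handle. It is
plain Perl and does not touch BioPerl at all. The function name
ncbi_style_id is made up for illustration, and the rules in it are
assumptions: an underscore is taken to mean RefSeq ("ref"), everything else
is labelled "gb", and a GI is only used when it is all digits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): build an NCBI-style fasta id
# from the GI number, accession and version of a record.
sub ncbi_style_id {
    my ($gi, $accession, $version) = @_;

    # Without an accession there is nothing sensible to rebuild.
    return unless defined $accession && length $accession;

    # Append ".version" only when a version is known.
    my $acc_ver = (defined $version && length $version)
        ? "$accession.$version"
        : $accession;

    # Assumption: an underscore marks a RefSeq accession ("ref").
    # Everything else is labelled "gb" here, which ignores EMBL ("emb"),
    # DDBJ ("dbj") and other source databases.
    my $db = $accession =~ /_/ ? 'ref' : 'gb';

    # Assumption: a usable GI is all digits; otherwise drop the gi part.
    my $prefix = (defined $gi && $gi =~ /^\d+$/) ? "gi|$gi|" : '';

    return "$prefix$db|$acc_ver|";
}

# The values from the NM_174198 example
print ncbi_style_id(31342611, 'NM_174198', 2), "\n";
```

With the values from your entry this prints gi|31342611|ref|NM_174198.2|.
Inside the SeqIO loop the three arguments would be $seq->primary_id,
$seq->accession and $seq->version. The digits-only check is there because
primary_id may not hold a GI number for every record.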


On Mon, 2003-07-21 at 06:36, Wes Barris wrote:
> Hi,
> 
> I am using the following code to convert a genbank file into a fasta
> file:
> 
> my $seq_in = Bio::SeqIO->new('-file' => "<$infile", '-format' => 'genbank');
> my $seq_out = Bio::SeqIO->new('-file' => ">$outfile", '-format' => 'fasta');
> 
> while ( my $inseq = $seq_in->next_seq ) {
>     if ($inseq->accession =~ '_') {
>        $seq_out->write_seq($inseq);
>     }
> }
> 
> The genbank entry (NM_174198) results in the following defline
> in the fasta file:
> 
>  >TLR4 Bos taurus toll-like receptor 4 (TLR4), mRNA.
> 
> However, I prefer to have a defline containing the accession number similar to
> what is shown at the NCBI site:
> 
>  >gi|31342611|ref|NM_174198.2| Bos taurus toll-like receptor 4 (TLR4), mRNA
> 
> Is there a way to have the SeqIO routines do this?
-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________


