[Bioperl-l] get cds from genbank record
Heikki Lehvaslaiho
heikki at nildram.co.uk
Fri Jan 16 02:50:01 EST 2004
Perdo,
It should work again if you spesify the sequence format for the output:
my $out = new Bio::SeqIO(-file => ">$ARGV[0].tfa", -format => 'fasta');
It should have been spesified even before 1.4, I an surprised that it worked
without it.
-Heikki
On Thursday 15 Jan 2004 20:35, Pedro Antonio Reche wrote:
> !/usr/sbin/perl -w
> #use strict;
> use Bio::SeqIO;
> use Bio::Seq;
>
>
> my $in = new Bio::SeqIO(-file => "$ARGV[0]", -format => 'genbank');
> my $out = new Bio::SeqIO(-file => ">$ARGV[0].tfa");
>
> while( my $seq = $in->next_seq ) {
>
> foreach my $f ( grep { $_->primary_tag eq 'CDS' }
> $seq->top_SeqFeatures ) {
> my ($gname);
> if( $f->has_tag('gene') ) {
> ($gname) = $f->each_tag_value('gene');
> } elsif( $f->has_tag('product') ) {
> ($gname) = $f->each_tag_value('product');
> }
>
> my ($ref) = $f->has_tag('protein_id') &&
> $f->each_tag_value('protein_id');
> my ($gi) = $f->has_tag('db_xref') && $f->each_tag_value('db_xref');
>
> my ($translation) = $f->has_tag('translation') &&
> $f->each_tag_value('translation');
>
> unless( $gi && $ref && $gname && $translation ) {
> print STDERR "not fully annotated CDS ($gi,$ref,$gname),
> skipping...\n";
> next;
> }
> my $outseq = Bio::PrimarySeq->new(-seq => $translation,
> -display_id =>
> sprintf("gi|%s|gb|%s|%s",$gi,$gname,$ref));
> $out->write_seq($outseq);
> }
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________
More information about the Bioperl-l
mailing list