[Bioperl-l] extracting CDS portion of RefSeqs

Barry Moore bmoore at genetics.utah.edu
Wed Dec 14 12:14:50 EST 2005


Amit,

Don't take the subsequence and instantiate a new seqobj yourself; let
BioPerl do it.

my $seqobj = $feat->seq;   # a sequence object; $seqobj->seq gives the string
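
A fuller sketch of that approach, in case it helps. This is only one way
to do it, under a few assumptions: the input file name comes from
$ARGV[0], FASTA goes to STDOUT, and the helper name cds_entries is made
up for this example. Both $feat->seq and $feat->spliced_seq return
sequence objects rather than strings; spliced_seq is used here because
it should also cope with CDS features that have join() locations, which
a plain start..end slice would not.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return one sequence object
# per CDS feature of $seq, labelled "display_id:start..end".
sub cds_entries {
    my ($seq) = @_;
    my @entries;
    for my $feat ( $seq->get_SeqFeatures ) {
        next unless $feat->primary_tag eq 'CDS';

        # Let BioPerl build the sequence object for the feature.
        my $cds = $feat->spliced_seq;
        $cds->display_id(
            sprintf '%s:%d..%d', $seq->display_id, $feat->start, $feat->end
        );
        push @entries, $cds;
    }
    return @entries;
}

# Run only when a GenBank file is given on the command line.
if (@ARGV) {
    require Bio::SeqIO;
    my $in  = Bio::SeqIO->new( -file => $ARGV[0], -format => 'genbank' );
    my $out = Bio::SeqIO->new( -fh   => \*STDOUT, -format => 'fasta' );

    while ( my $seq = $in->next_seq ) {
        $out->write_seq($_) for cds_entries($seq);
    }
}
```

Creating the output stream once, outside the loop, avoids opening a new
writer for every feature.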

Barry

> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> bounces at portal.open-bio.org] On Behalf Of Amit Indap
> Sent: Wednesday, December 14, 2005 9:14 AM
> To: bioperl-l at portal.open-bio.org
> Subject: [Bioperl-l] extracting CDS portion of RefSeqs
> 
> Hi,
> 
> I want to extract the CDS portion of human refseqs. I downloaded the
> genbank flat file of the most recent Refseq release. I was going to
> parse the GenBank file and write out the CDS portion of the sequence
> like so:
> 
> my $seqio = Bio::SeqIO->new(-file   => $ARGV[0],
>                             -format => 'GenBank');
>
> while ( my $seq = $seqio->next_seq ) {
>     foreach my $feat ( $seq->get_SeqFeatures() ) {
>         if ( $feat->primary_tag eq 'CDS' ) {
>             my $start     = $feat->start;
>             my $end       = $feat->end;
>             my $seqstr    = $seq->subseq($start, $end);
>             my $displayid = $seq->display_name;
>             # my $seqobj = Bio::Seq->new( -display_id => "$displayid:$start..$end",
>             #                             -seq        => $seqstr );
>             # my $out = Bio::SeqIO->new(-format => 'Fasta');
>             # $out->write_seq($seqobj);
>
>             # print STDOUT "Location ", $feat->start, ":",
>             #     $feat->end, " GFF[", $feat->gff_string, "]\n";
>         }
>     }
> }
> --
> Amit Indap
> http://www.bscb.cornell.edu/Homepages/Amit_Indap/