[Bioperl-l] extracting CDS portion of RefSeqs
Amit Indap
indapa at gmail.com
Wed Dec 14 11:13:31 EST 2005
Hi,
I want to extract the CDS portion of the human RefSeqs. I downloaded the
GenBank flat file of the most recent RefSeq release. I was going to
parse the GenBank file and write out the CDS portion of each sequence
like so:
use strict;
use warnings;
use Bio::SeqIO;

my $seqio = Bio::SeqIO->new(-file   => $ARGV[0],
                            -format => 'GenBank');
# Read one record at a time so that $seq is defined for the feature loop.
while ( my $seq = $seqio->next_seq ) {
    foreach my $feat ( $seq->get_SeqFeatures() ) {
        if ( $feat->primary_tag eq 'CDS' ) {
            my $start = $feat->start;
            my $end = $feat->end;
            # start..end ignores strand and any join() in the CDS location
            my $seqstr = $seq->subseq($start,$end);
            my $displayid = $seq->display_name;
            #my $seqobj = Bio::Seq->new( -display_id => "$displayid:$start..$end",
            #                            -seq => $seqstr);
            # my $out = Bio::SeqIO->new(-format => 'Fasta');
            # $out->write_seq($seqobj);
            # print STDOUT "Location ",$feat->start,":",
            #     $feat->end," GFF[",$feat->gff_string,"]\n";
        }
    }
}
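
To check what the subseq($start,$end) call above should return, the
coordinate arithmetic can be sketched in plain Perl without BioPerl. This
is only a rough illustration: extract_region is a hypothetical helper, not
a BioPerl function, and the toy sequence is made up. It assumes 1-based,
inclusive coordinates (as in GenBank feature locations) and a single
contiguous range, so it does not handle join() locations.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the 1-based, inclusive
# region $start..$end of $dna, reverse-complemented when $strand is -1.
sub extract_region {
    my ($dna, $start, $end, $strand) = @_;
    die "bad coordinates $start..$end\n"
        if $start < 1 || $end < $start || $end > length $dna;
    # substr is 0-based and takes a length, so shift the start by one.
    my $region = substr($dna, $start - 1, $end - $start + 1);
    if (defined $strand && $strand == -1) {
        $region = reverse $region;           # scalar context: reverse the string
        $region =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    }
    return $region;
}

# Made-up sequence: two leading bases, a 9-base ORF, two trailing bases.
my $dna = 'GGATGAAATAGCC';
print extract_region($dna, 3, 11, 1), "\n";   # prints ATGAAATAG
```

For a CDS on the minus strand, the same call with a strand of -1 returns
the reverse complement of the region instead.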
--
Amit Indap
http://www.bscb.cornell.edu/Homepages/Amit_Indap/