[Bioperl-l] Pulling exons out of a Genbank mRNA

Amir Karger akarger at CGR.Harvard.edu
Mon Feb 13 20:57:08 UTC 2006

I'm trying to get the sequences of each exon in a gene. I have a genbank
file with mRNA and exon features (among others) that look like: 
     mRNA            join(complement(22257..22386),complement(22067..22186),
     exon            complement(13840..13962)

I want to make a FASTA file with 6 sequences corresponding to the 6 exons in
the mRNA above. I tried writing the below code, but it doesn't do what I
want. (You'll note that the code is stolen from the Bio::Seq and Feature

my $inseq = Bio::SeqIO->new(-file   => "<$file", -format => $format );
while (my $seq = $inseq->next_seq) {
    my @features = $seq->get_SeqFeatures(); # just top level
    foreach my $feat ( @features ) {
        my $type = $feat->primary_tag;
        if ($type eq "mRNA") {
                print "Feature ",$feat->primary_tag,
                      " starts ",$feat->start," ends ", $feat->end,
                      " strand ",$feat->strand,"\n";
                my @feats = $feat->get_SeqFeatures();
                print "Found ", scalar @feats, " sub-features\n";
        } elsif ($type eq "exon") {
                print "Feature ",$feat->primary_tag,
                      " starts ",$feat->start," ends ", $feat->end,
                      " strand ",$feat->strand,"\n";

When I run the above, it says that the mRNA features have no sub-features.
So how do I pull out the 6 sequences?

- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University

More information about the Bioperl-l mailing list