[Bioperl-l] Pulling exons out of a Genbank mRNA
Hilmar Lapp
hlapp at gmx.net
Mon Feb 13 23:58:46 UTC 2006
Why do you want subfeatures? This is GenBank format you're parsing,
right? Your mRNA features will have a split location. Loop over
$feat->location->each_Location() and call $seq->subseq() with the start
and end of each sublocation, reverse-complementing the result where the
sublocation is on the minus strand. If you don't know how to do this,
check out the implementation of $feature->spliced_seq().
This should be in the HOWTO. Is it not?
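A minimal sketch of the loop described above, not a tested recipe. The helper name exon_seqs() and the toy sequence and coordinates are made up for illustration. It assumes that each_Location() returns the sublocations in the order they were added (for a GenBank join(), the order in which they are listed) and that subseq() returns forward-strand bases, so minus-strand pieces are reverse-complemented by hand.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::Location::Split;
use Bio::Location::Simple;

# Hypothetical helper (not part of BioPerl): return one sequence string
# per sublocation of a feature's location.
sub exon_seqs {
    my ($seq, $feat) = @_;
    my @exons;
    for my $loc ($feat->location->each_Location) {
        # subseq() gives the forward-strand bases for these coordinates
        my $bases = $seq->subseq($loc->start, $loc->end);
        if (($loc->strand || 0) == -1) {
            # complement(...) sublocation: reverse-complement the piece
            $bases = scalar reverse $bases;
            $bases =~ tr/ACGTacgt/TGCAtgca/;
        }
        push @exons, $bases;
    }
    return @exons;
}

# Toy stand-in for a parsed GenBank entry (illustrative data only).
my $seq = Bio::Seq->new(-display_id => 'toy',
                        -seq        => 'ATGCCGTAAGCTTGA',
                        -alphabet   => 'dna');

# Equivalent of: mRNA join(complement(11..15),complement(2..5))
my $split = Bio::Location::Split->new;
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 11, -end => 15, -strand => -1));
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 2, -end => 5, -strand => -1));

my $feat = Bio::SeqFeature::Generic->new(-primary_tag => 'mRNA');
$feat->location($split);

# Print each exon as a FASTA record.
my @exons = exon_seqs($seq, $feat);
my $n = 0;
for my $exon (@exons) {
    $n++;
    print ">exon_$n\n$exon\n";
}
```

With a real file, the toy data would be replaced by the usual Bio::SeqIO loop, calling the helper on each feature whose primary_tag is "mRNA".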
-hilmar
On 2/13/06, Amir Karger <akarger at cgr.harvard.edu> wrote:
> I'm trying to get the sequences of each exon in a gene. I have a genbank
> file with mRNA and exon features (among others) that look like:
>      mRNA            join(complement(22257..22386),complement(22067..22186),
>                      complement(16753..17101),complement(13840..13962),
>                      complement(10649..10820),complement(502..3028))
>                      /gene="ENSG00000005812"
>                      /note="transcript_id=ENST00000355619"
>      exon            complement(13840..13962)
>                      /note="exon_id=ENSE00000802462"
>
> I want to make a FASTA file with 6 sequences corresponding to the 6 exons in
> the mRNA above. I tried writing the below code, but it doesn't do what I
> want. (You'll note that the code is stolen from the Bio::Seq and Feature
> HOWTOs.)
>
> my $inseq = Bio::SeqIO->new(-file => "<$file", -format => $format );
> while (my $seq = $inseq->next_seq) {
>     my @features = $seq->get_SeqFeatures(); # just top level
>     foreach my $feat ( @features ) {
>         my $type = $feat->primary_tag;
>         if ($type eq "mRNA") {
>             print "Feature ",$feat->primary_tag,
>                   " starts ",$feat->start," ends ", $feat->end,
>                   " strand ",$feat->strand,"\n";
>             my @feats = $feat->get_SeqFeatures();
>             print "Found ", scalar @feats, " sub-features\n";
>         } elsif ($type eq "exon") {
>             print "Feature ",$feat->primary_tag,
>                   " starts ",$feat->start," ends ", $feat->end,
>                   " strand ",$feat->strand,"\n";
>         }
>     }
> }
>
> When I run the above, it says that the mRNA features have no sub-features.
> So how do I pull out the 6 sequences?
>
> Thanks,
> - Amir Karger
> Computational Biology Group
> Bauer Center for Genomics Research
> Harvard University
> 617-496-0626
--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------