[Bioperl-l] extracting coordinates from EMBL join annotation

Hilmar Lapp hlapp@gnf.org
Mon, 28 Oct 2002 09:55:21 -0800


On Monday, October 28, 2002, at 05:47 AM, Zayed Albertyn wrote:

> Hello All
>
> I can't seem to get my head around this problem and I would like to know
> if anybody could please help me out. I need to do comparisons of
> sensitivity, and to this end I need to extract the coordinates from an
> EMBL file. I have code that extracts the sequence, but I would like the
> corresponding coordinates.
>
> Here is the code I would like to alter:
>
>
> my $embl_file = $ARGV[0];
> my $in = Bio::SeqIO->new(-file => $embl_file,
>                          -format => 'EMBL');
>
> my $seq;
> my $i;
>
> while (defined($seq = $in->next_seq())) {
>   foreach my $feature ($seq->top_SeqFeatures) {
>
>     if ($feature->primary_tag eq 'mRNA_span') {
>       my $new_seq_name = $seq->display_id . '_MRNA';
>       my $new_seq = Bio::Seq->new(-seq => '',
>                              -id => ($seq->id . '_MRNA'),
>                              -moltype => 'dna');
>
>       my $namef = $seq->display_id;
>       my @exons = $feature->sub_SeqFeature();

Feature tables in EMBL, GenBank, and Swiss-Prot don't have nested
features. Hence, there will be no sub-seqfeatures unless you create
them yourself.

As I believe I told you earlier, you need to access the
$feature->location() object in order to obtain the sub-locations of
split locations.
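
A minimal sketch of how that could look, assuming your BioPerl version
provides each_Location() on location objects (it returns the location
itself for a simple location and the individual pieces for a split
one). The helper name exon_coordinates and the tab-separated output
are only illustrations, not part of BioPerl; the 'mRNA_span' tag and
the strand-dependent ordering are taken from your code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return one [start, end, strand] triple per
# sub-location of a feature, ordered like the exon sort in the quoted code.
sub exon_coordinates {
    my ($feature) = @_;

    # A missing strand is reported as 0 here.
    my @coords = map { [ $_->start, $_->end, $_->strand || 0 ] }
                 $feature->location->each_Location;

    # Minus strand: highest start first; otherwise lowest start first.
    if (@coords && $coords[0][2] == -1) {
        @coords = sort { $b->[0] <=> $a->[0] } @coords;
    } else {
        @coords = sort { $a->[0] <=> $b->[0] } @coords;
    }
    return @coords;
}

# Driver: runs only when an EMBL file is given on the command line.
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'EMBL');

    while (my $seq = $in->next_seq) {
        foreach my $feature ($seq->top_SeqFeatures) {
            next unless $feature->primary_tag eq 'mRNA_span';
            my $i = 0;
            foreach my $c (exon_coordinates($feature)) {
                $i++;
                # name, start, end, strand
                printf "%s.Exon%d\t%d\t%d\t%d\n",
                       $seq->display_id, $i, @$c;
            }
        }
    }
}
```

On a split location object you can also call sub_Location() directly;
each_Location() just saves you the check for whether the location is
split at all.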

If all you need is the spliced sequence, call
$feature->spliced_seq() (available only from release 1.1.1 on).
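
A rough sketch of that route, assuming the feature is still attached to
the sequence it was parsed from (which it is when it comes out of
Bio::SeqIO). The helper name spliced_fasta and the '_MRNA' suffix are
only illustrations; the suffix is borrowed from your code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: format the spliced sequence of a feature as a
# single FASTA record with the given identifier.
sub spliced_fasta {
    my ($id, $feature) = @_;
    my $mrna = $feature->spliced_seq;   # sequence object for the joined spans
    return sprintf(">%s\n%s\n", $id, $mrna->seq);
}

# Driver: runs only when an EMBL file is given on the command line.
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'EMBL');

    while (my $seq = $in->next_seq) {
        foreach my $feature ($seq->top_SeqFeatures) {
            next unless $feature->primary_tag eq 'mRNA_span';
            print spliced_fasta($seq->display_id . '_MRNA', $feature);
        }
    }
}
```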

	-hilmar


>       open (FILE,">$namef.fa") || die "Error: $!";
>
>       if ($exons[0]->strand() == -1) {
>         @exons = sort { $b->start() <=> $a->start() } @exons;
>       } else {
>         @exons = sort { $a->start() <=> $b->start() } @exons;
>       }
>       $i = 0;
>       foreach my $subfeature (@exons) {
>         $i++;
>         # append each span of mRNA to the new sequence
>         $new_seq->seq($new_seq->seq() . $subfeature->seq->seq());
>         print FILE ">$namef".".Exon$i\n";
>         print FILE $subfeature->seq->seq(),"\n";
>       }
>     }
>   }
> }
>
>
> Thanks
>
> Zayed
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------