[Bioperl-l] extracting coordinates from EMBL join annotation
Hilmar Lapp
hlapp@gnf.org
Mon, 28 Oct 2002 09:55:21 -0800
On Monday, October 28, 2002, at 05:47 AM, Zayed Albertyn wrote:
> Hello All
>
> I can't seem to get my head around this problem and I would like to know
> if anybody could please help me out. I need to do comparisons of
> sensitivity and to this end I need to extract the coordinates from an
> EMBL file. I have code that extracts the sequence but I would like the
> corresponding coordinates.
>
> Here is the code I would hopefully like to alter
>
>
> my $embl_file = $ARGV[0];
> my $in = Bio::SeqIO->new(-file   => $embl_file,
>                          -format => 'EMBL');
>
> my $seq;
> my $i;
>
> while (defined($seq = $in->next_seq())) {
>     foreach my $feature ($seq->top_SeqFeatures) {
>
>         if ($feature->primary_tag eq 'mRNA_span') {
>             my $new_seq_name = $seq->display_id . '_MRNA';
>             my $new_seq = Bio::Seq->new(-seq     => '',
>                                         -id      => ($seq->id . '_MRNA'),
>                                         -moltype => 'dna');
>
>             my $namef = $seq->display_id;
>             my @exons = $feature->sub_SeqFeature();
Feature tables in EMBL, GenBank, and Swiss-Prot don't have nested
features. Hence, there will be no sub-seqfeatures unless you create
them.
As I believe I told you earlier, you need to access the
$feature->location() object in order to obtain the sub-locations of
a split location.
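A minimal sketch of reading coordinates off a split location, assuming a
BioPerl version in which each_Location() is available on location objects.
The feature is built by hand here (the coordinates 1..4 and 9..12 are made
up for illustration) so the snippet runs without an input file; a feature
read from an EMBL entry with a join(...) location should carry the same
kind of Bio::Location::Split object.

```perl
use strict;
use warnings;
use Bio::Location::Simple;
use Bio::Location::Split;
use Bio::SeqFeature::Generic;

# Hand-built stand-in for a feature annotated as "mRNA join(1..4,9..12)".
# The coordinates are illustrative only.
my $split = Bio::Location::Split->new();
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 1, -end => 4, -strand => 1));
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 9, -end => 12, -strand => 1));

my $feature = Bio::SeqFeature::Generic->new(-primary_tag => 'mRNA',
                                            -location    => $split);

# each_Location() yields the sub-locations of a split location; for a
# simple (unsplit) location it yields just that one location.
my @coords = map { [ $_->start, $_->end, $_->strand ] }
             $feature->location->each_Location;

# One line per exon-like span: index, start, end, strand.
printf "exon %d: %d..%d (strand %d)\n", $_ + 1, @{ $coords[$_] }
    for 0 .. $#coords;
```

In the loop from the original post, the same each_Location() call would
take the place of $feature->sub_SeqFeature().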
If all you need is the spliced sequence, call
$feature->spliced_seq() (only from release 1.1.1 on).
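As a sketch of the spliced_seq() route: the sequence and the
join(1..4,9..12) coordinates below are invented for illustration, and the
feature has to be attached to its sequence (add_SeqFeature() does that
here) so that spliced_seq() has residues to pull from.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::Location::Simple;
use Bio::Location::Split;
use Bio::SeqFeature::Generic;

# Toy 16-nt sequence; illustrative only.
my $seq = Bio::Seq->new(-id       => 'demo',
                        -seq      => 'AAAACCCCGGGGTTTT',
                        -alphabet => 'dna');

# Split location equivalent to join(1..4,9..12) on the forward strand.
my $split = Bio::Location::Split->new();
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 1, -end => 4, -strand => 1));
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 9, -end => 12, -strand => 1));

my $feature = Bio::SeqFeature::Generic->new(-primary_tag => 'mRNA',
                                            -location    => $split);

# Attaching the feature gives it access to the parent sequence.
$seq->add_SeqFeature($feature);

# spliced_seq() returns a sequence object; ->seq gives the residue string
# formed by joining the sub-location spans.
my $spliced = $feature->spliced_seq->seq;
print "$spliced\n";
```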
-hilmar
>             open (FILE,">$namef.fa") || die "Error: $!";
>
>             if ($exons[0]->strand() == -1) {
>                 @exons = sort { $b->start() <=> $a->start() } @exons;
>             } else {
>                 @exons = sort { $a->start() <=> $b->start() } @exons;
>             }
>             $i = 0;
>             foreach my $subfeature (@exons) {
>                 $i++;
>                 # append each span of mRNA to the new sequence
>                 $new_seq->seq($new_seq->seq() . $subfeature->seq->seq());
>                 print FILE ">$namef".".Exon$i\n";
>                 print FILE $subfeature->seq->seq(),"\n";
>             }
>         }
>     }
> }
>
>
> Thanks
>
> Zayed
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------