[Bioperl-l] Generating multiple fasta files from an embl file

Jason Stajich jason at cgt.duhs.duke.edu
Thu Jun 26 13:57:43 EDT 2003


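Call spliced_seq() on the CDS feature instead of seq(). It returns a
sequence object built from only the joined sub-locations, so the
intervening insertion elements are left out:
  my $geneseq = $cds_feature->spliced_seq();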
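
For context, here is a rough sketch of how that call might fit into the
loop from the original question. It is only an illustration, not a tested
script: the file names and the fallback naming scheme are made up, and it
assumes the CDS features carry a /gene qualifier at least some of the time.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical file names, for illustration only.
my $embl_file  = 'genome.embl';
my $fasta_file = 'genes.fasta';

my $in  = Bio::SeqIO->new(-file => $embl_file,     -format => 'embl');
my $out = Bio::SeqIO->new(-file => ">$fasta_file", -format => 'fasta');

while (my $genome = $in->next_seq) {
    my $n = 0;
    foreach my $feat ($genome->all_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        $n++;

        # Joined sequence of the sub-locations only; a complement(...)
        # location should come back reverse-complemented.
        my $geneseq = $feat->spliced_seq;

        # Use the /gene qualifier as the FASTA id when present,
        # otherwise fall back to a counter (naming scheme is an assumption).
        my ($name) = $feat->has_tag('gene') ? $feat->get_tag_values('gene') : ();
        $name = sprintf('%s_cds%d', $genome->display_id, $n) unless defined $name;
        $geneseq->display_id($name);
        $out->write_seq($geneseq);

        # If the individual pieces are wanted as well, walk the sub-locations.
        foreach my $loc ($feat->location->each_Location) {
            my $piece = $genome->trunc($loc->start, $loc->end);
            $piece = $piece->revcom if defined $loc->strand && $loc->strand < 0;
            # ... write or inspect $piece here ...
        }
    }
}
```

The outer loop variable is called $genome here so that it is not
overwritten inside the loop, which is what happens to $seq in the quoted
code below.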

-jason
On Thu, 26 Jun 2003, Adam Witney wrote:

> Hi,
>
> I am parsing an embl file and generating a multiple fasta file of the genes.
> In the embl file there are entries like this:
>
> FT   CDS             complement(join(104291..105577,107535..108365))
>
> The two joined pieces of DNA actually span two other genes (insertion
> elements)
>
> I am parsing the file like so:
>
> ......
> my $embl_obj = Bio::SeqIO->new('-file' => "$embl_file",
>                               '-format' => 'embl');
>
> while (my $seq = $embl_obj->next_seq())  # gets the genome sequence object
>   {
>    my @features = $seq->all_SeqFeatures;
>
>    foreach my $feat (@features)         # gets the genes
>      {
>       if($feat->primary_tag eq 'CDS')
>          {
>           my $seq_obj = $feat->seq;
>
>           $seq = $seq_obj->seq;
> ......
>
> However, the $seq variable now contains the whole piece of DNA from
> 104291-108365, i.e. including the two insertion elements.
>
> What I would like is to get the two joined fragments separately... Is there
> a way to do this with bioperl? Or a better way of doing the whole process
> here?
>
> Thanks
>
> Adam
>
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

