[Bioperl-l] Generating multiple fasta files from an embl file

Adam Witney awitney at sghms.ac.uk
Thu Jun 26 16:00:47 EDT 2003


Hi,

I am parsing an EMBL file and generating a multiple FASTA file of the genes.
In the EMBL file there are entries like this:

FT   CDS             complement(join(104291..105577,107535..108365))

The two joined pieces of DNA actually span two other genes (insertion
elements).

I am parsing the file like so:

......
my $embl_obj = Bio::SeqIO->new('-file' => "$embl_file",
                              '-format' => 'embl');

while (my $seq = $embl_obj->next_seq())  # gets the genome sequence object
  {
   my @features = $seq->all_SeqFeatures;
   
   foreach my $feat (@features)         # gets the genes
     {
      if($feat->primary_tag eq 'CDS')
         {
          my $seq_obj = $feat->seq;
          
          $seq = $seq_obj->seq;
......

However, the $seq variable now contains the whole piece of DNA from
104291 to 108365, i.e. including the two insertion elements.

What I would like is to get the two joined fragments separately, without
the DNA in between. Is there a way to do this with BioPerl? Or a better way
of doing the whole process here?
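
To make the desired result concrete, here is a minimal plain-Perl sketch
that does not use BioPerl at all. The toy sequence, the coordinates and the
revcomp helper are all made up for illustration. It only shows the intended
behaviour for a complement(join(...)) location, not how BioPerl handles it
internally.

```perl
use strict;
use warnings;

# Toy stand-in for the genome sequence (hypothetical data).
# Positions 5..8 play the role of the unwanted insertion elements.
my $genome = 'AAAACCCCGGGGTTTTACGT';

# Hypothetical 1-based [start, end] pairs, mirroring join(1..4,9..12).
my @pieces = ([1, 4], [9, 12]);

# Extract each joined piece on its own, skipping whatever lies between them.
my @fragments = map { substr($genome, $_->[0] - 1, $_->[1] - $_->[0] + 1) } @pieces;

# Reverse-complement helper (illustrative; handles upper-case ACGT only).
sub revcomp {
    my ($dna) = @_;
    my $rc = scalar reverse $dna;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# complement(join(...)): concatenate the pieces in order, then
# reverse-complement the concatenation as a whole.
my $spliced = revcomp(join '', @fragments);

print "fragment: $_\n" for @fragments;
print "spliced:  $spliced\n";
```

The point is that the sequence between the two pieces never appears in
either the separate fragments or the spliced result.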

Thanks

Adam

