[Bioperl-l] Modify / Delete features from Bio::Seq object?

simon andrews (BI) simon.andrews@bbsrc.ac.uk
Wed, 30 Jan 2002 09:45:00 -0000


I'm trying to put together a script which runs through an EMBL file and
removes or modifies the features contained within it.  I can parse through
the features OK, and pick out those I need to deal with, but I can't find
any methods which allow me to delete or modify existing features (only add
new ones).

I realise I could do this by creating a new sequence and selectively copying
features across, but as these sequences can be big I'd rather just modify an
existing one.  Also, I couldn't find a new() method defined in
Bio::SeqFeatureI.  How do you create a new feature?

In the code below, I'd like to delete each feature / tag whose name is
currently printed.  I've tried calling undef on those features (hoping that
they were references back to the Bio::Seq::RichSeq object), but this had no
effect on the sequence written back out.

Any help is greatly appreciated.

Simon.



#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;

my ($filein,$fileout) = @ARGV;

unless ($fileout){die "Command line is convert [infile] [outfile]\n";}

my $seqio_in = Bio::SeqIO -> new (-format => 'embl',
                                  -file   => $filein);

die "Couldn't read $filein\n" unless ($seqio_in);

my $seqio_out = Bio::SeqIO -> new (-format => 'embl',
                                   -file   => ">$fileout");

my $inseq = $seqio_in -> next_seq();

foreach my $feature ($inseq -> all_SeqFeatures()){

  my $tag = $feature -> primary_tag;

  if ($tag eq "exon" or $tag eq "prediction"){

    print "$tag\n"; # Need to remove these features

  }

  elsif ($tag eq 'CDS'){


    foreach my $alltag ($feature -> all_tags()){

      if ($alltag eq 'transcript' or $alltag eq 'cds'){

        print "\t$alltag\n"; # Need to remove these tags

      }
    }
  }
}

$seqio_out -> write_seq($inseq);
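
For comparison, here is a minimal sketch of one way the removals marked in the
comments above might be done in place, without building a second sequence
object. It rests on several assumptions: that the installed BioPerl provides
remove_SeqFeatures() on Bio::Seq (older releases called it
flush_SeqFeatures()), that the features are Bio::SeqFeature::Generic objects
offering has_tag() and remove_tag(), and that the features read from the EMBL
file are all top-level (the script above uses all_SeqFeatures(), which also
descends into sub-features). The idea is to detach every feature, then
re-attach only the ones worth keeping.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

my ($filein, $fileout) = @ARGV;
die "Command line is convert [infile] [outfile]\n" unless defined $fileout;

my $seqio_in  = Bio::SeqIO->new(-format => 'embl', -file => $filein);
my $seqio_out = Bio::SeqIO->new(-format => 'embl', -file => ">$fileout");

while (my $seq = $seqio_in->next_seq) {

    # Assumption: remove_SeqFeatures() detaches all top-level features
    # from the sequence and returns them as a list.
    my @features = $seq->remove_SeqFeatures;

    foreach my $feature (@features) {
        my $tag = $feature->primary_tag;

        # Features of these types are simply not re-attached.
        next if ($tag eq 'exon' || $tag eq 'prediction');

        if ($tag eq 'CDS') {
            # Guard with has_tag() in case remove_tag() objects to a
            # tag that is not present on this feature.
            foreach my $unwanted (qw(transcript cds)) {
                $feature->remove_tag($unwanted) if $feature->has_tag($unwanted);
            }
        }

        # Put the (possibly modified) feature back on the same sequence.
        $seq->add_SeqFeature($feature);
    }

    $seqio_out->write_seq($seq);
}
```

Only the list of feature references is shuffled here; the sequence data itself
is never copied, which should matter for large entries.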