[Bioperl-l] The way to extract a segment of an EMBL record?

Phillip San Miguel pmiguel at purdue.edu
Tue Jun 24 23:29:30 UTC 2008


    When I use the trunc method to pull a segment of a sequence object 
derived from large EMBL file, the annotations are not propagated. Do I 
need to specify a heavy weight sequence object somehow for annotation in 
the EMBL file to be carried along?

Phillip

PS The code I'm talking about is below. This produces what appears to be 
a valid EMBL file, but only bare-bones annotation is carried along, or 
generated on the fly:

ID   NC_003070; SV 1; linear; unassigned DNA; STD; UNC; 70001 BP.
XX
AC   NC_003070;
XX
DE   Arabidopsis thaliana chromosome 1, complete sequence.
XX
XX
FH   Key             Location/Qualifiers
FH
XX
SQ   Sequence 70001 BP; 20783 A; 14173 C; 13896 G; 21149 T; 0 other;


Here is the code:

use Bio::SeqIO;
use Bio::Seq;

my $rh_opts =parseOptions();

#create input sequence object
my  $in  = Bio::SeqIO->new(-file => $rh_opts->{INFILE} , -format =>   
$rh_opts->{F_IN} );

#pull in sequence. (Ignores all but first sequence)
my $seq =   $in->next_seq()
    ||  die "Hey, $rh_opts-{INFILE} contains no sequence of format 
$rh_opts->{F_IN}!";
$in->close();

#xtract segment of sequence to output
my $end     =(defined $rh_opts->{END})
            ? $rh_opts->{END}
            : $seq->length();
my $begin   = $rh_opts->{BEGIN};
my $segment =$seq->trunc($begin, $end);

#create output object
my $out = Bio::SeqIO->new(-file => ">$rh_opts->{OUTFILE}" , -format => 
$rh_opts->{F_OUT});
$out->write_seq($segment);




More information about the Bioperl-l mailing list