[Bioperl-l] avoiding feature parsing
Danny Rice
dwrice at indiana.edu
Fri Jun 10 01:41:46 EDT 2005
I'm cranking through a bunch of GenBank or FASTA files, each named by
its NCBI GI number. The large GenBank files take a huge amount of time
to parse because of all the feature info, but I am only interested in
the sequence. I've looked at the modules and read the docs but haven't
found good documentation on how to read a GenBank file without parsing
all the feature info. I tried
my $seqio = Bio::SeqIO->new(-file => "$dir/$gi", -format => "fasta");
and to my surprise it seems to parse the GenBank files correctly but
only gets the sequence, which seems to solve the problem. My questions
are: is this the expected behavior, can I rely on it working, and is
there any documentation on it? I suppose this figures out that I mean:
"I'm only interested in the sequence, but go ahead and figure out the
format of the input file if it isn't already in FASTA format." If there
is a more standard or faster way to just get the sequence from a GenBank
file, I'd be interested in that also.
-Danny
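
For reference, a GenBank flat file keeps the sequence in the block
between the ORIGIN line and the terminating "//" line, so the sequence
can be read without touching the FEATURES table at all. The sketch below
is plain Perl rather than BioPerl; genbank_sequence is a hypothetical
helper name, and the embedded record is made-up test data. It assumes
one record per file and a well-formed ORIGIN block.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the sequence of the
# first record read from a GenBank flat-file handle, ignoring everything
# before the ORIGIN line, including the FEATURES table.
sub genbank_sequence {
    my ($fh) = @_;
    my $in_origin = 0;
    my $seq       = '';
    while (my $line = <$fh>) {
        if ($in_origin) {
            last if $line =~ m{^//};    # end-of-record marker
            $line =~ s/[\s\d]+//g;      # drop position numbers and whitespace
            $seq .= $line;
        }
        elsif ($line =~ /^ORIGIN/) {
            $in_origin = 1;             # sequence lines start on the next line
        }
    }
    return $seq;
}

# Usage with an in-memory example record (made-up data, not a real entry).
my $record = <<'END';
LOCUS       TEST0001     12 bp    DNA
FEATURES             Location/Qualifiers
     source          1..12
ORIGIN
        1 acgtacgtac gt
//
END

open my $fh, '<', \$record or die "cannot open in-memory record: $!";
print genbank_sequence($fh), "\n";
close $fh;
```

This only illustrates the idea: it does no validation and returns an
empty string if no ORIGIN block is found.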