[Bioperl-l] SearchIO speed up
Chris Fields
cjfields at uiuc.edu
Mon Aug 14 13:54:14 UTC 2006
On Aug 14, 2006, at 8:04 AM, Sendu Bala wrote:
> aaron.j.mackey at gsk.com wrote:
>> A "pull parser" need not read everything (i.e. the entire file) into
>> memory, just the current/next chunk, right?
>
> The problem arises when you need random access to the input data in
> order to do what you need to do, like get just the next chunk or bit
> of information.
>
> So I don't see a way for a generalized pull-parser to cope with piped
> input, because most operations are going to have to use seek() to
> work, and you can't seek piped input.
>
> What I do at the moment, then, is on detecting piped input, I'm forced
> to read all the input data in one go and spit it out into seekable
> memory or a temp file. After which normal behaviour resumes - you
> don't read everything, just the bit you want.

The traditional route has been to use a tempfile. Bio::Root::IO has
several methods for creating tempdirs/tempfiles.

I would at least make a tempfile option available for the people who
deal with large BLAST files. I think the XML files can also be quite
long.
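
As a rough illustration of the fallback described above (not BioPerl
code, just a language-neutral sketch in Python): check whether the input
handle is seekable and, if it is not, copy it into a spooled temporary
file that stays in memory until it grows past a threshold and then
spills to disk. The name ensure_seekable and the 1 MB threshold are made
up for this example.

```python
import shutil
import tempfile


def ensure_seekable(stream, max_in_memory=1024 * 1024):
    """Return a seekable binary file object with the contents of `stream`.

    Hypothetical helper: `stream` is assumed to be a binary file-like
    object. Seekable inputs are returned untouched; anything else
    (e.g. a pipe) is copied once into a spooled temp file.
    """
    try:
        if stream.seekable():
            return stream
    except AttributeError:
        # Objects without a seekable() method are treated as non-seekable.
        pass

    # Held in memory up to max_in_memory bytes, then rolled over to disk.
    spooled = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    shutil.copyfileobj(stream, spooled)
    spooled.seek(0)
    return spooled
```

A parser could call ensure_seekable(sys.stdin.buffer) once up front and
use seek() freely afterwards; the cost is one full pass over the piped
input, which is why a disk-backed option matters for large reports.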

Speaking of XML, is the current idea to get this running on text-based
BLAST initially?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign