[Bioperl-l] Modifying BPlite

Hilmar Lapp lapp@gnf.org
Fri, 02 Mar 2001 15:16:01 -0800


"Mark, Terry" wrote:
> 
> I figured I would have to do something like this because of the way BPlite
> works, i.e. it only reads in hits on calls to nextSbjct, instead of reading
> all the hits into memory at once a la Blast.pm (presumably this is one of
> the things that makes it 'lite' ?).

Maybe. The 'lite' label refers primarily to the much better
maintainability of the code, which of course (well, in this case)
comes with a trade-off in functionality. Not slurping the whole file
may also reduce memory consumption: BLAST reports can be very big,
and Blast.pm does have some garbage collection problems, which may be
due to the complex back-linking of objects (parent() and friends).

> 
> However, I realized that calling 'tell' would NOT work if a script was
> called in a pipe, since these obviously provide no file position and no way
> to seek backwards.

Right. My suggestion would be the following: the methods that retrieve
the statistics trigger parsing of the stream up to the statistics
section (and on to the end, of course). The hits encountered on the
way are stored in a queue, which nextSbjct() consumes before reading
any more input (by then there will be no more input anyway). This
makes the statistics available on request, with the trade-off that
when they are requested before all subjects have been read, memory
consumption may increase because the remaining subjects/hits are
buffered. It is also compatible with taking input from any stream
(BTW socket streams are not seekable either), and someone who doesn't
request the statistics wouldn't even notice a difference.

This may also require some hacks in order to prevent the subject/hit
parsing code from eating up the whole statistics section after the
last hit.
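
To make the idea concrete, here is a rough sketch in Python rather
than Perl. It is not BPlite code. The class name LiteReport, the
methods next_hit() and statistics(), and the two markers (a hit starts
with '>', the trailing statistics start with 'Database:') are all
assumptions made for illustration, and the report format is heavily
simplified. The one-line push-back is one possible form of the "hack"
that keeps the hit parser from eating the statistics section.

```python
import io
from collections import deque


class LiteReport:
    """Toy forward-only parser for a simplified, BLAST-like report."""

    HIT_MARKER = ">"            # assumed: a line starting a new hit
    STATS_MARKER = "Database:"  # assumed: first line of the statistics

    def __init__(self, stream):
        self._lines = iter(stream)  # read forward only, never seek/tell
        self._pending = None        # one line of look-ahead, pushed back
        self._queue = deque()       # hits buffered by statistics()
        self._stats = None          # set once the statistics were read

    def _next_line(self):
        # Hand out the pushed-back line first, then read from the stream.
        if self._pending is not None:
            line, self._pending = self._pending, None
            return line
        return next(self._lines, None)

    def _is_stats(self, line):
        return line.lstrip().startswith(self.STATS_MARKER)

    def _parse_stats(self):
        # Everything from the marker to end of input is the statistics.
        stats = []
        line = self._next_line()
        while line is not None:
            if line.strip():
                stats.append(line.strip())
            line = self._next_line()
        self._stats = stats

    def _parse_hit(self):
        # Skip ahead to the next hit; stop if the statistics come first.
        line = self._next_line()
        while line is not None and not line.startswith(self.HIT_MARKER):
            if self._is_stats(line):
                self._pending = line
                self._parse_stats()
                return None
            line = self._next_line()
        if line is None:
            return None
        hit = [line.rstrip("\n")]
        while True:
            line = self._next_line()
            if line is None:
                break
            if line.startswith(self.HIT_MARKER) or self._is_stats(line):
                self._pending = line  # push back, do not swallow it
                break
            if line.strip():
                hit.append(line.rstrip("\n"))
        return hit

    def next_hit(self):
        # Buffered hits are consumed before any more input is read.
        if self._queue:
            return self._queue.popleft()
        return self._parse_hit()

    def statistics(self):
        # Parse ahead to the statistics, queueing hits met on the way.
        while self._stats is None:
            hit = self._parse_hit()
            if hit is None:
                break
            self._queue.append(hit)
        return self._stats or []
```

A caller that never asks for statistics() only ever goes through
_parse_hit() and holds one hit at a time. Only a caller that asks
early pays for the queue.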

May not be the best solution, was just off the top of my head.

	Hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp@gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------