[Bioperl-l] Problems with Bio::SearchIO

Chris Fields cjfields at illinois.edu
Tue Nov 11 14:47:15 UTC 2008


On Nov 11, 2008, at 1:59 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Nov 10, 2008, at 4:29 PM, Dan Bolser wrote:
>>> 2008/11/7 Chris Fields <cjfields at illinois.edu>:
>>>> On Nov 7, 2008, at 8:27 AM, Dan Bolser wrote:
>>>>> ...
>>>>> Looking closer I found that $parser->result_count() only gets set
>>>>> after calling $parser->next_result. Any way to force this? In some
>>>>> Perl objects I've seen a 'parse' method that kicks the object into
>>>>> (silently) calling all its get methods. Is there an equivalent (but
>>>>> apparently undocumented) method? Actually, I think it should kick
>>>>> itself when called... or not? Certainly the docs do not suggest that
>>>>> it won't return the number of results ("Function: Gets the number of
>>>>> Blast results that have been parsed."). So I think this is a bug.
>>>>
>>>> We could make it so that result_count() is eager (parses the results
>>>> and reports the total back). Not sure, but we could optionally cache
>>>> the already-parsed Result objects (that could run into memory issues
>>>> if one is parsing a ton of reports, so it needs to be off by default).
>>>
>>> I see (I think). Anyone first calling result_count() and *then*
>>> iterating over the results is getting a performance hit by effectively
>>> parsing the results twice? I would suggest that you make this function
>>> eager, but document the potential performance issue so that people can
>>> choose not to call it first. However, I don't think I can have
>>> understood correctly. How can its value be set correctly after calling
>>> next() only once?
>> It's highly possible that result_count() is meant to indicate the total
>> number of ResultI iterations parsed up to the point of being called (as
>> opposed to the total number of ResultI), but that isn't made exactly
>> clear.
>
> Yes, this is the case. I always thought that was pretty unambiguous
> from the function description. "the number of Blast results that have
> been parsed". Not "the number of Blast results".

We'll leave it as is then, and try implementing it in blastxml.
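
For anyone who needs the total, here is a minimal sketch of how the current
behaviour plays out, assuming a plain-text BLAST report in a placeholder file
called report.bls (the file name and the $total counter are made up for
illustration, they are not from this thread). It counts results by iterating
and prints result_count() along the way, so the running count is visible:

```perl
use strict;
use warnings;
use Bio::SearchIO;

# 'report.bls' is a placeholder file name (an assumption, not from the thread).
my $parser = Bio::SearchIO->new(-format => 'blast', -file => 'report.bls');

# Nothing has been parsed yet, so the running count may not be set at all.
my $before = $parser->result_count;
print "before iteration: ", (defined $before ? $before : 'undef'), "\n";

# $total is our own counter; it does not rely on result_count().
my $total = 0;
while (my $result = $parser->next_result) {
    $total++;
    # result_count() reports how many results have been parsed so far.
    my $so_far = $parser->result_count;
    printf "%s: parsed so far = %s\n",
        $result->query_name, (defined $so_far ? $so_far : 'undef');
}

print "total results in report: $total\n";
```

The report is read only once this way, which avoids the double-parsing cost
that an eager result_count() would incur.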

chris


