[Bioperl-l] SiteMatrix changes

Sendu Bala bix at sendu.me.uk
Fri Sep 1 12:25:57 UTC 2006


Sendu Bala wrote:
> skirov wrote:
>> Sounds OK with me. To summarize:
>> 1. Correction is disabled by default.
>> 2. Correction should be applied to all positions.
>> 3. Thresholds for IUPAC consensus can be user defined.
>> 4. A fix for IUPAC consensus calculation: change the default behavior.
>> 5. Document the options
>> Does this sound right?
> 
> Yes, sounds good to me. I'll code those up shortly.

This is now done. I didn't quite do it the way you suggested: the 
'threshold' for IUPAC consensus is implemented as a significance level for 
rounding the frequencies. This way we don't have to suffer some 
arbitrary cutoff that does unexpected things. You may also want to check 
the changes to the documentation (mostly in new() and the description) 
to make sure I understood and explained everything well enough.

The consensus string now also enforces the supplied or default 
threshold, and treats the threshold the way most people might think of 
such a thing - the minimum acceptable value (inclusive). This doesn't 
seem to actually change the answer for the few test matrices used by the 
test scripts (though the test script answers have changed since we're 
not doing pseudo-count correction anymore).
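
To make the "minimum acceptable value (inclusive)" reading concrete, here is 
a minimal sketch in Python rather than the actual SiteMatrix code. The 
function name consensus(), the list-of-dicts matrix layout and the 'N' 
fallback for positions below the threshold are all assumptions made for 
illustration only.

```python
def consensus(columns, threshold):
    """Build a consensus string from per-position base frequencies.

    `columns` is a list of dicts mapping base -> frequency (hypothetical
    layout, not the SiteMatrix internals). The most frequent base is kept
    only if its frequency is >= threshold, i.e. the threshold is inclusive.
    Positions that fail the test are written as 'N' (an assumed fallback).
    """
    out = []
    for freqs in columns:
        # Pick the base with the highest frequency at this position.
        base, freq = max(freqs.items(), key=lambda kv: kv[1])
        out.append(base if freq >= threshold else "N")
    return "".join(out)


matrix = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.5, "C": 0.3, "G": 0.1, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]

print(consensus(matrix, 0.5))  # second position passes: 0.5 >= 0.5
print(consensus(matrix, 0.6))  # second position now falls below the threshold
```

With an exclusive comparison (> instead of >=) the second position would 
already fail at a threshold of 0.5, which is the difference being described.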

One issue is that there's no way for the user to decide to do 
pseudo-count correction or not when using the PSM::IO modules. The 
correction should probably be farmed out to a separate method. I don't 
plan to do this myself.
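
As a rough idea of what a separate correction step could look like, here is 
a sketch in Python, not a proposal for the real API. The function names, the 
uniform background and the formula (count + pseudo * background) / 
(total + pseudo) are assumptions; this is one common pseudo-count scheme and 
not necessarily the one SiteMatrix uses.

```python
BASES = "ACGT"


def raw_frequencies(counts):
    """Convert raw counts for one position into uncorrected frequencies."""
    total = sum(counts.get(b, 0) for b in BASES)
    return {b: counts.get(b, 0) / total for b in BASES}


def pseudocount_correct(counts, background=None, pseudo=1.0):
    """Return pseudo-count corrected frequencies for one position.

    Hypothetical helper: the correction lives in its own function so the
    caller decides whether to apply it, whatever code produced `counts`.
    Assumed formula: (count + pseudo * background) / (total + pseudo).
    """
    if background is None:
        # Assumed uniform background over the four bases.
        background = {b: 0.25 for b in BASES}
    total = sum(counts.get(b, 0) for b in BASES)
    return {
        b: (counts.get(b, 0) + pseudo * background[b]) / (total + pseudo)
        for b in BASES
    }


counts = {"A": 8, "C": 0, "G": 2, "T": 0}
uncorrected = raw_frequencies(counts)     # C and T stay at exactly 0
corrected = pseudocount_correct(counts)   # C and T become small but non-zero
print(uncorrected)
print(corrected)
```

Keeping the correction out of the constructor in this way means the same 
choice is available to the user no matter where the matrix came from.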


