[Bioperl-l] Possible bug in Bio::Tools::SeqStats->get_mol_wt?

Roy Chaudhuri roy.chaudhuri at gmail.com
Thu Mar 24 13:47:27 UTC 2011

Hi all,

I have discovered a possible bug in BioPerl, although maybe it's my 
expectations that are wrong, not the code.

I noticed that when calculating molecular weights for a bunch of protein 
sequences using Bio::Tools::SeqStats->get_mol_wt, the values I was 
getting were slightly different from the ones given by EMBOSS pepstats. 
This was due to my protein sequences ending with *, since they were 
derived from translating annotated genes including the stop codon. 
Surprisingly (to me, at least), Bio::Seq->length counts the terminal *, 
so it returns one more than the number of amino acids. 
SeqStats->get_mol_wt calls Bio::Seq->length to determine the number of 
water molecules to subtract from the total molecular weight, so the 
reported weight for each of my sequences was too low by the weight of 
one water molecule. I'm not sure if this is a bug in 
get_mol_wt, in Bio::Seq->length, or if it's bad practice to use protein 
sequences with a terminal asterisk (I've never had a problem doing so 
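
To make the arithmetic concrete, here is a minimal sketch of the effect described above. It is not BioPerl code: it is a library-independent illustration in Python, and it assumes the weight is computed as the sum of free amino acid masses minus one water per peptide bond (length - 1). The function name mol_wt, the strip_stop flag, the example sequence, and the rounded average masses are all assumptions made for illustration.

```python
# Approximate average masses in daltons (rounded, illustrative values only).
WATER = 18.015
FREE_AA = {"G": 75.07, "A": 89.09, "M": 149.21}


def mol_wt(seq, strip_stop):
    """Sum free amino acid masses, then subtract one water per peptide bond.

    Hypothetical helper: when strip_stop is False, a trailing '*' is counted
    in the length, which mimics the behaviour described in the post.
    """
    residues = seq.rstrip("*") if strip_stop else seq
    # '*' has no entry in the mass table, so it adds nothing to the total...
    total = sum(FREE_AA.get(aa, 0.0) for aa in residues)
    # ...but it still inflates the length, so one extra water is subtracted.
    return total - (len(residues) - 1) * WATER


with_stop = mol_wt("MGA*", strip_stop=False)
without_stop = mol_wt("MGA*", strip_stop=True)
difference = without_stop - with_stop

print(f"counting '*':  {with_stop:.3f}")
print(f"stripping '*': {without_stop:.3f}")
print(f"difference:    {difference:.3f}")
```

Under these assumptions the two results differ by exactly one water mass, which matches the size of the discrepancy reported above. Stripping the trailing * before computing the weight would be one workaround on the caller's side.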


