[Bioperl-l] Bioperl-run: Testing alignments generated externally

Thu Oct 26 10:26:36 UTC 2006

Nathan Haigh wrote:
> Sendu Bala wrote:
>   
>> Nathan Haigh wrote:
>>     
>>> I'm thinking that it's not wise to test for things like
>>> overall_percentage_identity etc. in alignments that are generated by
>>> external software like T-Coffee, ClustalW etc. Changes to software
>>> algorithms/efficiency, bug fixes etc. may well alter the quality of the
>>> alignment produced in different versions and thus affect the value
>>> returned by such methods. Therefore, I think these methods should only
>>> be tested on alignments loaded directly from t/data.
>>>       
>> Did you discover some specific problem cases?
>>     
> My messages seem to be taking a while to come through, but, yes. It may
> be due to the software changing default parameters, but it makes testing
> the output for specific details pretty difficult and inconsistent. For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in a slightly different alignment:
> $aln = $factory->run('-type'    => 'profile',
>                      '-profile' => $aln1,
>                      '-seq'     => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note are the gaps on the last line of the sequences. In
> v4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg'), whereas in
> versions earlier than v4.45 this is 'gkn----mcg'.
>   
I'm not a T-Coffee user, but you usually come across
these problems when you use different scoring parameters
to align the sequences.

Could it be that they have simply changed the
default parameters for gap penalties and that kind of
thing? Is it possible to set them?

If so, you can just run the test with the scores defined
explicitly in the param hash instead of relying on the defaults.
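
To make the difference concrete, here is a minimal sketch in plain Perl
(no BioPerl needed) that compares the two CATH_RAT/1-133 fragments quoted
above. The helper names ungap and gap_count are made up for this example
and are not part of TCoffee.t or BioPerl. It only illustrates which
properties of the two fragments agree and which do not; it is not a
proposed replacement for the existing tests.

```perl
use strict;
use warnings;

# Fragments of CATH_RAT/1-133 exactly as quoted in this thread.
my $new_frag = 'gk-nm---cg';   # reported for T-Coffee 4.45
my $old_frag = 'gkn----mcg';   # reported for versions before 4.45

# Hypothetical helper: strip gap characters ('-' and '.') from a row.
sub ungap {
    my ($seq) = @_;
    (my $residues = $seq) =~ tr/-.//d;
    return $residues;
}

# Hypothetical helper: count '-' gap characters in a row.
sub gap_count {
    my ($seq) = @_;
    return ($seq =~ tr/-//);
}

# Properties one could compare between the two versions' output.
my %check = (
    identical     => $new_frag eq $old_frag,
    same_length   => length($new_frag) == length($old_frag),
    same_residues => ungap($new_frag) eq ungap($old_frag),
    same_gaps     => gap_count($new_frag) == gap_count($old_frag),
);

printf "%-14s %s\n", $_, ($check{$_} ? 'yes' : 'no') for sort keys %check;
```

A test that compares the gapped strings directly depends on the T-Coffee
version, while the ungapped residues and the row length are the same in
both fragments. Whether that holds for the whole alignment would need
checking against the real output.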

HTH

Remo