[Bioperl-l] Bioperl-run: Testing alignments generated externally

Nathan Haigh n.haigh at sheffield.ac.uk
Thu Oct 26 10:33:55 UTC 2006

Remo Sanges wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>> Nathan Haigh wrote:
>>>> I'm thinking that it's not wise to test for things like
>>>> overall_percentage_identity etc in alignments that are generated by
>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>>> alignment produced in different versions and thus affect the value
>>>> returned by such methods. Therefore, I think these methods should only
>>>> be tested from alignments loaded directly from t/data.
>>> Did you discover some specific problem cases?
>> My messages seem to be taking a while to come through, but, yes. It may
>> be due to the software changing default parameters, but it makes testing
>> the output for specific details pretty difficult and inconsistent. For
>> example, with T-Coffee, the following command from t/TCoffee.t
>> produces a slightly different alignment depending on the version:
>> $aln = $factory->run('-type'    => 'profile',
>>                      '-profile' => $aln1,
>>                      '-seq'     => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>> Of particular note are the gaps on the last line of the sequences. In
>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg'), whereas in
>> versions before 4.45 this is ('gkn----mcg').
> I'm not a T-Coffee user, but you usually come across
> these problems when you use different scoring parameters
> when aligning sequences.
> Could it be that they have simply changed the
> default parameters for gap penalties and that kind of
> thing? Is it possible to set them?
> If so, you can just run the test by defining
> the scores in the param hash instead of relying on the defaults.
> Remo
That is true, but it depends on whether the wrapper is complete
enough to be able to set all the parameters provided by the software.
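
As a rough sketch of the kind of check that should survive version changes
(plain Perl with Test::More, no BioPerl calls; the helper name strip_gaps
is hypothetical and not part of any wrapper), a test could compare the
residues of an aligned row instead of its exact gap layout. The two
CATH_RAT/1-133 fragments quoted above differ as strings but contain the
same residues:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: remove gap characters ('-' and '.') from one aligned row.
sub strip_gaps {
    my ($row) = @_;
    (my $residues = $row) =~ s/[-.]//g;
    return $residues;
}

# Fragments of the last line of CATH_RAT/1-133 as quoted in this thread.
my $row_445 = 'gk-nm---cg';    # T-Coffee 4.45
my $row_old = 'gkn----mcg';    # earlier versions

# An exact comparison is version dependent: the two rows are not identical.
isnt($row_445, $row_old, 'gap layout differs between versions');

# The ungapped residues are the same whichever version produced the row.
is(strip_gaps($row_445), strip_gaps($row_old), 'residues are preserved');
is(strip_gaps($row_445), 'gknmcg', 'ungapped fragment matches the input');
```

This only checks that the program ran and returned the input residues
intact. Values such as overall_percentage_identity would still be tested
against alignments loaded from t/data.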


