[Biojava-dev] multiple alignment

Jose Manuel Duarte jose.duarte at psi.ch
Wed Jun 17 08:08:31 UTC 2015


Hi Stefan

Just a couple of comments, but not much direct help.

From the source code I can see that the multiple alignment proceeds in
4 steps: 1) pairwise alignments for all pairs, 2) hierarchical
clustering into a guide tree, 3) progressive alignment and 4)
refinement. However, the refinement step doesn't seem to be implemented
yet (there's a TODO in the code). That might explain the poorer result.
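
To make steps 1 and 2 a bit more concrete, here is a minimal,
self-contained sketch. It is not the BioJava code: the class name, the
method names and the scoring values (match 1, mismatch -1, linear gap
-2) are assumptions picked for illustration only. It scores every pair
with a plain Needleman-Wunsch global alignment and then picks the
highest-scoring pair, which is the pair a guide tree would typically
join first.

```java
final class PairwiseStepSketch {
    // Illustrative scoring values (assumptions, not BioJava or Clustal defaults).
    static final int MATCH = 1, MISMATCH = -1, GAP = -2;

    // Needleman-Wunsch global alignment score with a linear gap penalty,
    // keeping only two rows of the dynamic-programming table.
    static int globalScore(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j * GAP;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i * GAP;
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
                cur[j] = Math.max(prev[j - 1] + sub,
                         Math.max(prev[j] + GAP, cur[j - 1] + GAP));
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Step 1: score every pair of sequences (symmetric matrix, diagonal unused).
    static int[][] allPairScores(String[] seqs) {
        int n = seqs.length;
        int[][] s = new int[n][n];
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                s[i][j] = s[j][i] = globalScore(seqs[i], seqs[j]);
        return s;
    }

    // First join of step 2: the highest-scoring pair is merged first.
    // Expects at least two sequences.
    static int[] closestPair(int[][] s) {
        int[] best = {0, 1};
        for (int i = 0; i < s.length; i++)
            for (int j = i + 1; j < s.length; j++)
                if (s[i][j] > s[best[0]][best[1]]) best = new int[] {i, j};
        return best;
    }

    public static void main(String[] args) {
        // The three sequences from the original post, with gaps removed.
        String[] seqs = {
            "TTGGGGCCTCTAAACGGGGTCTT",
            "TTGGGGCTCTAACGGGTCTT",
            "TTGGGGCCTCTAAACGGGTCTT"
        };
        int[] p = closestPair(allPairScores(seqs));
        System.out.println("first join: " + p[0] + " and " + p[1]);
    }
}
```

With these assumed scores the first and third sequences are joined
first, since they differ by a single deleted G. The second sequence is
then aligned against that pair in step 3, which is where the gap
placement you are unhappy with gets decided, and without step 4 it is
never revisited.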

Another thing to take into account is that there are a couple of known 
bugs in pairwise alignments at the moment:

https://github.com/biojava/biojava/issues/274

https://github.com/biojava/biojava/issues/213

Of those, #213 may be related to the problem you are seeing, but it's
hard to tell.

Jose


On 17.06.2015 03:07, stefan harjes wrote:
> Hi biojava,
>
> I am fighting with the multiple alignment of several DNASequences.
> When I use the biojava computation I get alignment errors regarding
> the gaps. ClustalX computes a much better result in comparison:
>
> biojava
> TTGGGGCCTCTAAACGGGGTCTT
> TTGGGGC-TCTAAC--GGGTCTT
> TTGGGGCCTCTAAACGGG-TCTT
>
> clustal
> TTGGGGCCTCTAAACGGGGTCTT
> TTGGGG-CTCT-AACGGG-TCTT
> TTGGGGCCTCTAAACGGG-TCTT
> ****** **** ****** ****
>
> The most important difference is the second gap in the middle 
> sequence, which is obviously better aligned in clustal. Any hints as 
> to how to improve the biojava parameters/algorithms?
>
> Cheers
> Stefan
> P.S.
> I already tried to implement the actual gap penalties that Clustal
> uses, which are 10/0.1 for the pairwise and 10/0.2 for the multiple
> alignment (i.e. I changed all Java short types to int, scaled all
> scoring parameters including the matrix by 10 and implemented two
> different gapPenalties in the two alignments). Unfortunately this does
> not change anything.
> Does anyone have a copy of the IUB scoring matrix? That would be my
> next try.
