[Biojava-l] comparison of the pairwise aligner to emboss' needle

Wim De Smet Wim.DeSmet at UGent.be
Mon Apr 18 15:22:26 UTC 2011


Hi all,

I've been generating global alignments with BioJava and comparing them 
with what EMBOSS needle returns, but I can't reproduce needle's 
alignment. The score BioJava returns is worse than needle's, so I'm not 
sure what's happening here.

The sequences are AB004720 and Y17238 (I didn't attach a fasta file to 
avoid spamming people, let me know if you want one). I align them with:
    // gap open -14, gap extend -4
    GapPenalty penalty = new SimpleGapPenalty((short) -14, (short) -4);
    // query and target are the two sequence strings
    PairwiseSequenceAligner<DNASequence, NucleotideCompound> aligner =
        Alignments.getPairwiseAligner(
            new DNASequence(query, AmbiguityDNACompoundSet.getDNACompoundSet()),
            new DNASequence(target, AmbiguityDNACompoundSet.getDNACompoundSet()),
            PairwiseSequenceAlignerType.GLOBAL,
            penalty, SubstitutionMatrixHelper.getNuc4_4());
    SequencePair<DNASequence, NucleotideCompound> alignment = aligner.getPair();

This gives me an alignment with only 23% similarity and a gap at the 
end. Varying the gap penalties can give me a gap at the start too, but 
that's about it. Aligning the same sequences in needle gives an 
alignment with a higher score (6784 vs (-)5862) and 94% similarity 
(which seems closer to home). I run needle with defaults (so it uses 
EDNAFULL) apart from a gapopen/gapextend of 14/4.
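
For an independent check of which score is plausible, a small standalone 
scorer can help. What follows is only a minimal sketch, not BioJava or 
EMBOSS code: the class and method names are made up for illustration, it 
handles plain A/C/G/T only (5 for a match, -4 for a mismatch), and it 
assumes a gap of length k costs open + (k - 1) * extend with end gaps 
charged like any other gap. Tools can differ on exactly these 
conventions, so treat them as assumptions rather than a description of 
what either program does.

```java
import java.util.Arrays;

// Standalone sketch of a global alignment score with affine gap penalties.
// Hypothetical helper for cross-checking; not part of BioJava or EMBOSS.
final class GlobalScoreCheck {
    // Sentinel for unreachable states; small enough to never overflow below.
    private static final int NEG = Integer.MIN_VALUE / 4;

    // Assumption: unambiguous bases only, 5 for a match and -4 for a mismatch.
    static int substitution(char a, char b) {
        return a == b ? 5 : -4;
    }

    // Best global score of q against t. open and extend are positive costs;
    // a gap of length k costs open + (k - 1) * extend, end gaps included.
    // Uses O(n * m) memory, which is fine for sequences of a few thousand bases.
    static int score(String q, String t, int open, int extend) {
        int n = q.length(), m = t.length();
        int[][] mat = new int[n + 1][m + 1];   // q[i-1] aligned to t[j-1]
        int[][] gapT = new int[n + 1][m + 1];  // q[i-1] aligned to a gap
        int[][] gapQ = new int[n + 1][m + 1];  // t[j-1] aligned to a gap
        for (int i = 0; i <= n; i++) {
            Arrays.fill(mat[i], NEG);
            Arrays.fill(gapT[i], NEG);
            Arrays.fill(gapQ[i], NEG);
        }
        mat[0][0] = 0;
        // Leading gaps: one gap run along the first column or first row.
        for (int i = 1; i <= n; i++) gapT[i][0] = -(open + (i - 1) * extend);
        for (int j = 1; j <= m; j++) gapQ[0][j] = -(open + (j - 1) * extend);

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = Math.max(mat[i - 1][j - 1],
                           Math.max(gapT[i - 1][j - 1], gapQ[i - 1][j - 1]));
                mat[i][j] = diag + substitution(q.charAt(i - 1), t.charAt(j - 1));
                // Either open a new gap or extend the one already running.
                gapT[i][j] = Math.max(
                        Math.max(mat[i - 1][j], gapQ[i - 1][j]) - open,
                        gapT[i - 1][j] - extend);
                gapQ[i][j] = Math.max(
                        Math.max(mat[i][j - 1], gapT[i][j - 1]) - open,
                        gapQ[i][j - 1] - extend);
            }
        }
        return Math.max(mat[n][m], Math.max(gapT[n][m], gapQ[n][m]));
    }

    public static void main(String[] args) {
        // Toy inputs; substitute the real sequences to check a reported score.
        System.out.println(score("ACGTACGT", "ACGTCGT", 14, 4));
    }
}
```

Running the two real sequences through something like this with 14/4 
shows which of the two reported scores is in the expected range under 
these conventions, although it won't match either tool exactly if their 
gap or end-gap handling differs.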

Could this be a bug or am I misunderstanding some of the options?

BTW, if I use a really large gapextend, say -4000, I also get a 
NullPointerException.

TIA,
Wim De Smet

-- 
Wim De Smet
http://www.straininfo.net/


