[Biojava-l] Setting anchors in AnchoredPairwiseSequenceAligner

Andreas Prlic andreas at sdsc.edu
Wed Nov 27 04:34:15 UTC 2013


needleman -wunsch is doing a global alignment, try smith waterman instead.

can you file a bug report on github?
https://github.com/biojava/biojava/issues

Thanks,

Andreas


On Tue, Nov 26, 2013 at 7:54 PM, Daniel Cameron <cameron.d at wehi.edu.au>wrote:

>  That API does make sense but unfortunately, it crashes when I try it.
> Having traced through the source & test cases from github, it appears that
> there isn't actually any test coverage for the (anchor != null) path in
> align().
>
> Additionally, align() also explicitly forces the final base of the query
> to align to the final base of the target which in my case is not supposed
> to be an anchor. I've copied my test cases with expected behaviour. Is
> there a bug track system I should be raising this with?
>
> Cheers
> Daniel
>
>     @Test
>     public void testNeedlemanWunschDNAAlignment() {
>         DNASequence query = new DNASequence("ACGTAACCGGTT",
> AmbiguityDNACompoundSet.getDNACompoundSet());
>         DNASequence target = new DNASequence("AACGTAACCGGTTACGTACGT",
> AmbiguityDNACompoundSet.getDNACompoundSet());
>         //                                    123456789012345678900
>         // Expected:                           |----------|
>         NeedlemanWunsch<DNASequence, NucleotideCompound> nw = new
> NeedlemanWunsch<DNASequence, NucleotideCompound>(query, target, new
> SimpleGapPenalty((short)5, (short)2), SubstitutionMatrixHelper.getNuc4_4());
>         AlignedSequence<DNASequence, NucleotideCompound> aligned =
> nw.getPair().getQuery();
>         assertEquals(2, (int)aligned.getStart().getPosition());
>         assertEquals(13, (int)aligned.getEnd().getPosition());
>     }
>     @Test
>     public void testAnchoredDNAAlignment() {
>         DNASequence query = new DNASequence("ACGTAACCGGTT",
> AmbiguityDNACompoundSet.getDNACompoundSet());
>         DNASequence target = new DNASequence("AACGTAACCGGTTACGTACGT",
> AmbiguityDNACompoundSet.getDNACompoundSet());
>         //                                    123456789012345678900
>         // Expected:                          |_----------|
>         AnchoredPairwiseSequenceAligner<DNASequence, NucleotideCompound>
> aligner = new AnchoredPairwiseSequenceAligner<DNASequence,
> NucleotideCompound>(query, target, new SimpleGapPenalty((short)5,
> (short)2), SubstitutionMatrixHelper.getNuc4_4());
>         int[] anchors = new int[query.getLength()];
>         for (int i = 0; i < anchors.length; i++) anchors[i] = -1;
>         anchors[0] = 0;
>         AlignedSequence<DNASequence, NucleotideCompound> aligned =
> aligner.getPair().getQuery();
>         assertEquals(1, (int)aligned.getStart().getPosition());
>         assertEquals(13, (int)aligned.getEnd().getPosition());
>
>     }
>
> On 27/11/2013 12:00 PM, Andreas Prlic wrote:
>
> Hi Daniel,
>
>  Clearly we need some documentation for this.
>
>  Looking at the source: if you get to AbstractMatrixAligner.align() you
> can see how the anchors are being used.
>
>  I just took a brief look, so I might be wrong, but I think the int[]
> array should have the length of the query sequence and each position
> indicates the counterpart in the target sequence. Positions that are <=0
> should be considered not to be anchored. If this is right, then this should
> be very close to what you were expecting.
>
>  Can you take a closer look and confirm if this works for you?
>
>  Thanks,
>
>  Andreas
>
> On Sun, Nov 24, 2013 at 11:22 PM, Daniel Cameron <cameron.d at wehi.edu.au>wrote:
>
>> Hello all
>>
>> I'm looking the BioJava API for anchored alignment and I'm unsure of how
>> to set alignment anchors. The API exposes int[] get/setAnchor() which is
>> confusing me somewhat and I'm unsure of what a BioJava anchor actually is.
>> My use case is the comparison of two sequences which I have a priori
>> knowledge of particular positions so I was expecting an API where I could
>> specify base positions in the query to correspond different positions in
>> the target. Eg: I was query base 4 to align to target base 10 and query
>> base 100 to align to target base 80. Is this possible?
>>
>> With anchor being only an int[] and the source looking like this is not
>> an array of paired values, how would one go about performing the above
>> alignment?
>>
>>
>> Thanks
>> Daniel Cameron
>>
>> ______________________________________________________________________
>> The information in this email is confidential and intended solely for the
>> addressee.
>> You must not disclose, forward, print or use it without the permission of
>> the sender.
>> ______________________________________________________________________
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
> ______________________________________________________________________
> The information in this email is confidential and intended solely for the
> addressee.
> You must not disclose, forward, print or use it without the permission of
> the sender.
> ______________________________________________________________________
>



More information about the Biojava-l mailing list