[Biojava-dev] Using DNASequence reverseComplement

Scooter Willis HWillis at scripps.edu
Mon May 17 16:31:33 UTC 2010


Trevor

Andy and I have been working on the concept of views and how to handle this so that it is not confusing, but we have not accomplished that goal yet. We have different approaches that we are still trying to reconcile. You may have come across a bug in methods that should be private, or a use case that we have not yet explored and covered with test cases.

The problem is that ReverseComplement implies the negative strand, and the API currently makes this confusing by exposing methods that allow these kinds of nonsensical use cases.
Comments below.


On May 17, 2010, at 11:24 AM, PATERSON Trevor wrote:

> Sorry for raising that behemoth earlier..
> 
> I have a separate problem with the DNASequence API -
> 
> Probably I just don't understand how to use the View objects
> 
> 
> If I make a DNASequence
> 
> DNASequence seq = new DNASequence("AAAAACCCCGGGTT");
> 
> i.e. length = 14,
> 
> I might reasonably want to get the ReverseComplement of bases 11-14, which should 'be' "AACC"
> 
> But I cannot manage to get this in one easy step....
> 
> seq.toString(): AAAAACCCCGGGTT --> FINE
> 
> seq.getReverseComplement().getSequenceAsString(): AACCCGGGGTTTTT --> FINE
> 
> But when I try to use bounds on this complement - methods refer back to the original seq's iterator, not the complement
> 
We would like to minimize the use of String as an intermediary between objects. Strings should be used for creating a sequence and for export. Andy has some sequence views where you pass a DNASequence to a SequenceView to get the desired transformation. I will let Andy comment on what he recommends. If you look through the test cases you will see what he has set up.
> seq.getReverseComplement().getSequenceAsString(11,14,Strand.POSITIVE): GGTT 
> 	i.e the same as seq.getSequenceAsString(11,14,Strand.POSITIVE)
> seq.getReverseComplement().getSequenceAsString(11,14,Strand.NEGATIVE): TTGG
> 	i.e the same as seq.getSequenceAsString(11,14,Strand.NEGATIVE)
> 
> Is this the desired behaviour? How would I get the desired reverseComplement fragment?
> 
> The only obvious way that I can see is  
> 
>                DNASequence subseq = new DNASequence(seq.getSequenceAsString(11, 14, Strand.POSITIVE));
>                System.out.println(""+ subseq.getReverseComplement().getSequenceAsString());
> 
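For reference, the expected result for that fragment can be checked with a minimal plain-Java sketch. This is not the BioJava API: the class and method names are hypothetical, it uses plain Strings purely for illustration, and it assumes 1-based inclusive bounds on the positive strand and only unambiguous upper-case A, C, G, T.

```java
final class SubrangeReverseComplement {

    /** Complement of a single DNA base; only upper-case A, C, G, T are handled. */
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default: throw new IllegalArgumentException("Unsupported base: " + base);
        }
    }

    /** Reverse complement of bases start..end (1-based, inclusive) of the given sequence. */
    static String reverseComplement(String seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IndexOutOfBoundsException("Invalid range " + start + "-" + end);
        }
        StringBuilder sb = new StringBuilder(end - start + 1);
        // Walk the requested window backwards, complementing each base.
        for (int i = end; i >= start; i--) {
            sb.append(complement(seq.charAt(i - 1)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String seq = "AAAAACCCCGGGTT";
        System.out.println(reverseComplement(seq, 11, 14));           // AACC
        System.out.println(reverseComplement(seq, 1, seq.length())); // AACCCGGGGTTTTT
    }
}
```

The window is selected first and then reverse complemented, which is the same order of operations as the two-step workaround quoted above.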
> _____________________________________________________________________________________________
> 
> On a related point I was mightily confused by the Strand.POSITIVE/Strand.NEGATIVE enumeration
> 
This is a struggle between the computer science domain and the biology domain. Currently Sense and Strand have meaning, and we need to stick with that, if only to force the correct vocabulary on the programmer who needs to discuss the code with biologists. A DNASequence or RNASequence has different interpretations depending on the context in which it is used. You may want to look at the sequence in both directions, and that does not imply reverseComplement behavior. For example, if you are working on DNA patterns for DNA docking/binding, you need to consider both strands of the DNA, as they form a 3D structure.
> I was naively interpreting them to refer to the strand of the DNA: 
> Whereas they in fact refer to the directionality of the Iterator *on the same Strand*
> 
> A better name might be Direction.FORWARDS/Direction.BACKWARDS?
> Positive and negative strand has loaded biological meaning for newbies like me ( sense versus antisense )
> So I made the assumption that a Strand.NEGATIVE call would itself reverseComplement
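To make the distinction concrete, here is a small plain-Java sketch that keeps "direction of reading" and "complement" as separate operations. It is not the BioJava API: the class, the Direction enum (borrowed from the naming suggestion above) and the method names are all hypothetical, and only upper-case A, C, G, T are handled.

```java
final class StrandVersusDirection {

    /** Hypothetical enum: the order in which the same strand is read. */
    enum Direction { FORWARDS, BACKWARDS }

    /** Reads the same strand in the given direction; no bases are changed. */
    static String read(String seq, Direction direction) {
        return direction == Direction.FORWARDS
                ? seq
                : new StringBuilder(seq).reverse().toString();
    }

    /** Complements every base without changing the reading order. */
    static String complement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (char base : seq.toCharArray()) {
            switch (base) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default: throw new IllegalArgumentException("Unsupported base: " + base);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fragment = "GGTT"; // bases 11-14 of AAAAACCCCGGGTT
        System.out.println(read(fragment, Direction.BACKWARDS));             // TTGG
        System.out.println(complement(fragment));                            // CCAA
        System.out.println(read(complement(fragment), Direction.BACKWARDS)); // AACC
    }
}
```

Reading backwards alone gives TTGG, which matches the Strand.NEGATIVE output reported above; AACC only appears when the complement is applied as well.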
> -- 
Keep the input/feedback coming as that is the best way for us to sort out the API.
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> 
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

Thanks

Scooter




