[Biojava-dev] Changes to Sequence in BioJava3

George Waldon gwaldon at geneinfinity.org
Wed Nov 3 15:35:24 UTC 2010


Hi Andy:

Note that the "reverse" of a sequence usually means the same sequence read in the opposite order, from the 3' end to the 5' end. If your method returns the reverse and complement of a sequence, I think it should be named getReverseComplement:

sequence: TGCG
reverse: GCGT
complement: ACGC
reverse & complement: CGCA
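
The four lines above can be reproduced with a small sketch on plain strings. StrandOps and its method names are hypothetical helpers made up for illustration; they are not part of the BioJava API, and the sketch assumes unambiguous DNA bases (A, C, G, T) only.

```java
// Hypothetical helper for illustration only; not BioJava code.
final class StrandOps {
    private StrandOps() {}

    /** Same bases, read in the opposite order. */
    static String reverse(String dna) {
        return new StringBuilder(dna).reverse().toString();
    }

    /** Watson-Crick partner of each base, order unchanged. */
    static String complement(String dna) {
        StringBuilder out = new StringBuilder(dna.length());
        for (char base : dna.toCharArray()) {
            out.append(complementOf(base));
        }
        return out.toString();
    }

    /** Complement each base, then reverse the order. */
    static String reverseComplement(String dna) {
        return reverse(complement(dna));
    }

    // Assumption: only the four unambiguous upper-case DNA bases are handled.
    private static char complementOf(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'G': return 'C';
            case 'C': return 'G';
            default:
                throw new IllegalArgumentException("Not an unambiguous DNA base: " + base);
        }
    }
}
```

Keeping reverse and complement as separate operations is what makes the naming question visible: only the composed method gives the opposite strand read 5' to 3'.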

Regards,
George

On Tue, Nov 2, 2010 at 8:16 AM, Andy Yates <ayates at ebi.ac.uk> wrote:

    Hi everyone,

    A heads-up for anyone with implementations already built on the Sequence interface: I'm proposing a few changes to it. They will break binary compatibility and change the methods you need to implement, but I'll handle the updates at the BioJava core end.

    1). Removal of getSequenceAsString(Integer,Integer,Strand)
    ** The implementation is patchy & buggy, often exposing data from backing stores

    2). Addition of SequenceView<C> getReverse()
    ** Will return the sequence on the reverse strand
    ** Also complemented if applicable

    3). Addition of isComplementable() to CompoundSet
    ** Used to support the above function

    This means substrings of Sequences are retrieved like so:

    DNASequence d = new DNASequence("ATGCGC");
    d.getSubSequence(2, 5).getSequenceAsString(); //Returns TGCG
    d.getSubSequence(2, 5).getReverse().getSequenceAsString(); //Returns CGCA
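
One way to read points 2 and 3 of the proposal together is that the reverse operation asks the compound set whether complementing applies before doing it. The sketch below is only a guess at that design on plain strings: SimpleCompoundSet, DnaSet, ProteinSet and ReverseSketch are hypothetical names invented here, and only the isComplementable() name comes from the proposal itself.

```java
// Hypothetical stand-in for a compound set; only isComplementable() is named in the proposal.
interface SimpleCompoundSet {
    boolean isComplementable();
    char complementOf(char compound);
}

// Assumed DNA alphabet: A, C, G, T only.
final class DnaSet implements SimpleCompoundSet {
    public boolean isComplementable() { return true; }
    public char complementOf(char c) {
        switch (c) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'G': return 'C';
            case 'C': return 'G';
            default: throw new IllegalArgumentException("Unknown base: " + c);
        }
    }
}

// Amino acids have no complement, so this set opts out.
final class ProteinSet implements SimpleCompoundSet {
    public boolean isComplementable() { return false; }
    public char complementOf(char c) {
        throw new UnsupportedOperationException("Proteins are not complementable");
    }
}

final class ReverseSketch {
    private ReverseSketch() {}

    /** Walks the sequence backwards, complementing only when the set allows it. */
    static String reverseStrand(String seq, SimpleCompoundSet set) {
        StringBuilder out = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            char c = seq.charAt(i);
            out.append(set.isComplementable() ? set.complementOf(c) : c);
        }
        return out.toString();
    }
}
```

Under these assumptions, reverseStrand("TGCG", new DnaSet()) gives the reverse complement, while reverseStrand("MKV", new ProteinSet()) only reverses the order.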
