[Biojava-dev] Changes to Sequence in BioJava3

PATERSON Trevor trevor.paterson at roslin.ed.ac.uk
Fri Nov 5 16:04:38 UTC 2010


Hi Andy et al.

I have just been looking at the changes to Sequence that you were discussing and checked in last week...

Just to let you know :)..

These were a bit awkward for me, as I have implemented my own Reader/BackingStore to lazy load sequences from Ensembl, and I hadn't implemented most of the methods that would be needed to use the new way of getting Sequence strings.

So as a temporary fix my subclass overrides the AbstractSequence method getSequenceAsString() to do it the old way through the Reader, as does the getSequenceAsString(Integer, Integer) method that I am using.
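
A minimal sketch of that kind of override, for illustration only. None of these types are the real BioJava or Ensembl API classes: SketchSequence, SketchReader, InMemoryReader and LazySketchSequence are hypothetical stand-ins, and the coordinates are assumed to be 1-based and inclusive, as in the getSubSequence(2, 5) example quoted further down.

```java
// Hypothetical stand-in for the library's abstract sequence class (not the BioJava type).
abstract class SketchSequence {
    // Default path: represents the new way of building the string,
    // which the lazy-loading subclass below does not support yet.
    public String getSequenceAsString() {
        throw new UnsupportedOperationException("new-style access not implemented");
    }
}

// Hypothetical stand-in for a reader/backing store that fetches residues on demand.
interface SketchReader {
    int getLength();
    String fetch(int start, int end); // assumed 1-based, inclusive
}

// In-memory reader used only to make the sketch runnable; a real one would query a database.
final class InMemoryReader implements SketchReader {
    private final String residues;
    int fetchCount = 0; // counts how often the store is actually hit

    InMemoryReader(String residues) { this.residues = residues; }

    @Override public int getLength() { return residues.length(); }

    @Override public String fetch(int start, int end) {
        if (start < 1 || end > residues.length() || start > end) {
            throw new IllegalArgumentException("bad range " + start + ".." + end);
        }
        fetchCount++;
        return residues.substring(start - 1, end);
    }
}

// The workaround: both string accessors bypass the default path and go straight to the reader.
final class LazySketchSequence extends SketchSequence {
    private final SketchReader reader;

    LazySketchSequence(SketchReader reader) { this.reader = reader; }

    @Override public String getSequenceAsString() {
        return reader.fetch(1, reader.getLength());
    }

    public String getSequenceAsString(Integer start, Integer end) {
        return reader.fetch(start, end);
    }
}
```

Nothing is fetched until one of the two accessors is called, so construction stays cheap; the price is that every call hits the store again unless the subclass adds its own caching.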

As we are trying to get a publishable version of our Ensembl API together (which will use your first BioJava release version), I don't want to spend much time altering things to do it the new way at this stage. If I get time (& money) I will have a look at implementing a fully functional reader using your new approach.


On a related tack, and something you have helped me out with before..

If we want to get the 'reverse' and 'complement' of a subsequence, it still seems to be the case that you need to make an intermediate Sequence object from the SubSequence View, as these methods aren't available on the View interface... Is that correct?
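
To make the intermediate-object pattern concrete, here is a small sketch on plain strings. It does not use the BioJava Sequence or SequenceView types: the class SubSequenceStrandSketch and its methods are hypothetical, coordinates are assumed to be 1-based and inclusive, and only the four unambiguous DNA bases are handled.

```java
final class SubSequenceStrandSketch {
    private SubSequenceStrandSketch() {}

    // Copy positions start..end (assumed 1-based, inclusive) into a new string.
    // This plays the role of the intermediate Sequence built from a view.
    static String subSequence(String seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IllegalArgumentException("bad range " + start + ".." + end);
        }
        return seq.substring(start - 1, end);
    }

    // Same bases, opposite order.
    static String reverse(String seq) {
        return new StringBuilder(seq).reverse().toString();
    }

    // Same order, each base swapped for its Watson-Crick partner.
    static String complement(String seq) {
        StringBuilder out = new StringBuilder(seq.length());
        for (int i = 0; i < seq.length(); i++) {
            out.append(complement(seq.charAt(i)));
        }
        return out.toString();
    }

    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'G': return 'C';
            case 'C': return 'G';
            default:
                throw new IllegalArgumentException("not an unambiguous DNA base: " + base);
        }
    }

    // The subsequence as read from the opposite strand.
    static String reverseComplement(String seq) {
        return complement(reverse(seq));
    }

    public static void main(String[] args) {
        String sub = subSequence("ATGCGC", 2, 5);
        System.out.println(sub);                    // TGCG
        System.out.println(reverse(sub));           // GCGT
        System.out.println(complement(sub));        // ACGC
        System.out.println(reverseComplement(sub)); // CGCA
    }
}
```

The copy in subSequence is the cost of this approach: a view-level reverse or complement would avoid materialising the intermediate object.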

As I have mentioned, we are trying to write the Ensembl API up, and as a demo of potential usage we have made a little plug-in for the Savant genome browser that uses the Ensembl Java API to pull chromosomes and annotations out of Ensembl. We have a SourceForge project for all this now ( http://jensembl.sourceforge.net/ ), so it will be excellent when we can tie our code to your first release version.

Cheers 
Trevor

Trevor Paterson PhD
email trevor.paterson at roslin.ed.ac.uk

Bioinformatics 
The Roslin Institute
The Royal (Dick) School of Veterinary Studies
University of Edinburgh
Scotland EH25 9PS
phone +44 (0)131 5274197
http://bioinformatics.roslin.ed.ac.uk/

The University of Edinburgh is a charitable body, registered in Scotland with registration number SC005336





-----Original Message-----
From: biojava-dev-bounces at lists.open-bio.org [mailto:biojava-dev-bounces at lists.open-bio.org] On Behalf Of Andy Yates
Sent: 03 November 2010 10:56
To: Andreas Prlic
Cc: biojava-dev
Subject: Re: [Biojava-dev] Changes to Sequence in BioJava3

It is, which is why I want people to check that their code still works. I can only run tests from my end :)

Andy

On 2 Nov 2010, at 21:54, Andreas Prlic wrote:

> thanks! looks like a major patch...
> A
> 
> On Tue, Nov 2, 2010 at 4:52 PM, Andy Yates <ayates at ebi.ac.uk> wrote:
>> As I said earlier, these changes were going in. They are now checked in. Can people check that their code still works against this? I've had to make some changes to core (obviously), genomic & alignment. Test cases all pass, but I'd be happier once everyone okays this.
>> 
>> If so, then I can push out a release.
>> 
>> Andy
>> 
>> On 2 Nov 2010, at 15:16, Andy Yates wrote:
>> 
>>> Hi everyone,
>>> 
>>> As a caution to people with implementations already built on the Sequence interface, I'm proposing a couple of changes to it. This will cause a binary class incompatibility & will have an impact on the methods you need to implement, but I'll sort them out at the BioJava core end.
>>> 
>>> 1). Removal of getSequenceAsString(Integer,Integer,Strand)
>>> ** The implementation is patchy & buggy, often exposing data from backing stores
>>> 
>>> 2). Addition of SequenceView<C> getReverse()
>>> ** Will return the sequence in the reverse strand
>>> ** Also complemented if applicable
>>> 
>>> 3). Addition of isComplementable() to CompoundSet
>>> ** Used to support the above function
>>> 
>>> This means substrings of Sequences are retrieved like so:
>>> 
>>> DNASequence d = new DNASequence("ATGCGC");
>>> d.getSubSequence(2, 5).getSequenceAsString(); //Returns TGCG
>>> d.getSubSequence(2, 5).getReverse().getSequenceAsString(); //Returns CGCA
>>> 
>>> To support -ve strand indexes you can use the Location objects (the returned Location is expressed in +ve coordinates):
>>> 
>>> Location l = Location.Tools.location(5, 2, Strand.NEGATIVE, d.getLength());
>>> SequenceView<NucleotideCompound> locationSeq = l.getSubSequence(d);
>>> locationSeq.getSequenceAsString(); //Returns CGCA
>>> 
>>> Hopefully the implications of these changes will be small & will benefit the code.
>>> 
>>> Andy
>>> 
>>> p.s. If you are wondering why I am not proposing a deprecation, it is because I do not want developers writing quite complex code depending on this functionality. If this was not an alpha release then deprecation would be the only way to go.
>>> 
>>> 
>>> _______________________________________________
>>> biojava-dev mailing list
>>> biojava-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>> 
>> --
>> Andrew Yates                   Ensembl Genomes Engineer
>> EMBL-EBI                       Tel: +44-(0)1223-492538
>> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
>> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
>> 

-- 
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/




