[Biojava-dev] biojava 3 progress

Scooter Willis HWillis at scripps.edu
Wed Mar 17 16:09:29 UTC 2010


Andy

Let me know if you have any major code changes to the core sequence handling that have been, or could be, checked in. So far I haven't needed to touch any of the core sequence code, but I want to avoid a merge if you have made any significant changes.

I should have code to check in today. If we can't come up with a better name, I will ask Andreas to create a biojava3-genes module and I can then check that code in for your review. The current problem is that we have ExonSequence extending DNASequence when it could also be described as a feature. One way to look at this is that a TranscriptSequence is also a feature of a DNA sequence, and only when you want a standalone class with internal links back to the parent sequence do you return a TranscriptSequence. The TranscriptFeature would have ExonFeature and IntronFeature as children. You can ask for an ExonSequence based on the ExonFeature. Once you get a ProteinSequence, you should be able to reverse the process and get back the TranscriptSequence, the corresponding ExonFeatures, and some sort of mapping from a protein sequence position back to the three DNA sequence positions that coded for it. This would need to handle the case where the end of one exon and the start of the next exon together code for a particular amino acid position.
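
To make the protein-to-DNA mapping concrete, here is a minimal sketch of the idea, not a proposal for the actual classes. CodonMappingSketch, Exon, toParentPosition and codonPositions are hypothetical names and are not existing BioJava API. It assumes forward-strand exons given in transcript order with inclusive 1-based coordinates on the parent sequence, and that translation starts at the first base of the first exon (no UTR, no reverse strand).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names are BioJava API.
final class CodonMappingSketch {

    /** An exon as inclusive, 1-based coordinates on the parent DNA sequence (forward strand assumed). */
    static final class Exon {
        final int start;
        final int end;

        Exon(int start, int end) {
            this.start = start;
            this.end = end;
        }

        int length() {
            return end - start + 1;
        }
    }

    /** Maps a 0-based offset in the spliced coding sequence to a parent DNA position. */
    static int toParentPosition(List<Exon> exons, int codingOffset) {
        int remaining = codingOffset;
        for (Exon exon : exons) {
            if (remaining < exon.length()) {
                return exon.start + remaining;
            }
            remaining -= exon.length(); // skip this exon and continue in the next one
        }
        throw new IndexOutOfBoundsException("coding offset " + codingOffset + " is past the last exon");
    }

    /** Returns the three parent DNA positions that code for a 1-based amino acid position. */
    static int[] codonPositions(List<Exon> exons, int aminoAcidPosition) {
        if (aminoAcidPosition < 1) {
            throw new IllegalArgumentException("amino acid positions are 1-based");
        }
        int firstOffset = (aminoAcidPosition - 1) * 3;
        int[] positions = new int[3];
        for (int i = 0; i < 3; i++) {
            // Each base is mapped on its own, so a codon may span an exon boundary.
            positions[i] = toParentPosition(exons, firstOffset + i);
        }
        return positions;
    }

    public static void main(String[] args) {
        List<Exon> exons = Arrays.asList(new Exon(101, 104), new Exon(201, 208));
        // The second amino acid uses the last base of exon 1 and the first two bases of exon 2.
        System.out.println(Arrays.toString(codonPositions(exons, 2))); // prints [104, 201, 202]
    }
}
```

Mapping each base separately is what lets a single codon straddle two exons, which is the boundary case mentioned above. Reverse-strand genes and a non-zero starting phase would need extra handling that this sketch leaves out.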

We also need to add the ability to have tracks as a way to group features. That way you can export the features on a particular track as a GFF/GFF3 file for importing into various genome browsers. For example, you might have one genome you are working on, with genes added from three different gene prediction algorithms, each organized as its own track. You should then be able to determine the overlaps of genes that were predicted and validated via BLAST against UniProt, and create another summary track of validated and non-validated genes. If the feature classes we put together can make this easy, then I think we will have a solid design.
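
As a rough illustration of the track idea, the sketch below groups features by track name, exports one track as GFF3 lines, and finds features on one track that overlap a feature on another. TrackSketch, Feature and FeatureTracks are hypothetical names, not existing BioJava classes. Writing the track name into the GFF3 source column is an assumption made for the example, and the BLAST validation step is not modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are BioJava API.
final class TrackSketch {

    /** A feature with inclusive, 1-based coordinates on a named sequence. */
    static final class Feature {
        final String seqId;
        final String type;
        final String id;
        final int start;
        final int end;
        final char strand;

        Feature(String seqId, String type, String id, int start, int end, char strand) {
            this.seqId = seqId;
            this.type = type;
            this.id = id;
            this.start = start;
            this.end = end;
            this.strand = strand;
        }

        /** True when both features lie on the same sequence and share at least one base. */
        boolean overlaps(Feature other) {
            return seqId.equals(other.seqId) && start <= other.end && other.start <= end;
        }

        /** One GFF3 line: seqid, source, type, start, end, score, strand, phase, attributes. */
        String toGff3(String source) {
            return String.join("\t", seqId, source, type, String.valueOf(start),
                    String.valueOf(end), ".", String.valueOf(strand), ".", "ID=" + id);
        }
    }

    /** Groups features by track name, keeping tracks in insertion order. */
    static final class FeatureTracks {
        private final Map<String, List<Feature>> tracks = new LinkedHashMap<>();

        void add(String track, Feature feature) {
            tracks.computeIfAbsent(track, k -> new ArrayList<>()).add(feature);
        }

        List<Feature> get(String track) {
            return tracks.getOrDefault(track, new ArrayList<>());
        }

        /** Exports a single track, using the track name as the GFF3 source column (an assumption). */
        String exportGff3(String track) {
            StringBuilder out = new StringBuilder("##gff-version 3\n");
            for (Feature f : get(track)) {
                out.append(f.toGff3(track)).append('\n');
            }
            return out.toString();
        }

        /** Features on trackA that overlap at least one feature on trackB. */
        List<Feature> overlapping(String trackA, String trackB) {
            List<Feature> hits = new ArrayList<>();
            for (Feature a : get(trackA)) {
                for (Feature b : get(trackB)) {
                    if (a.overlaps(b)) {
                        hits.add(a);
                        break; // one overlap is enough to report this feature
                    }
                }
            }
            return hits;
        }
    }
}
```

A summary track could then be built by adding the features returned by overlapping(...) under a new track name. The nested loop is quadratic and is only meant to show the shape of the API; a real implementation would probably want an interval index.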
 

Scooter




