[Biojava-dev] biojava 3 progress

Andreas Prlic andreas at sdsc.edu
Wed Mar 17 17:46:19 UTC 2010


I like biojava-feature as a module name for the GFF and feature-related
code. (Should we try to keep the module names singular?) Let me know if you
want me to create the module for this...
A

On Wed, Mar 17, 2010 at 9:09 AM, Scooter Willis <HWillis at scripps.edu> wrote:

> Andy
>
> Let me know if you have any major code changes for the core sequence
> handling that have been or could be checked in. So far I haven't needed to
> touch any of the core sequence code but want to avoid merging code if you
> have made any significant changes.
>
> I should have code to check in today and if we can't come up with a better
> name I will ask Andreas to create a biojava3-genes module and I can then
> check that code in for your review. The current problem is that we have
> ExonSequence extending DNASequence when it could also be described as a
> feature. One way to look at this is that a TranscriptSequence is also a
> feature of a DNA sequence, and only when you want a standalone class with
> internal links back to the parent sequence do you return a TranscriptSequence.
> The TranscriptFeature would have ExonFeature and IntronFeature as children.
> You can ask for an ExonSequence based on the ExonFeature. Once you get a
> ProteinSequence you should be able to reverse the process and get back the
> TranscriptSequence and the corresponding ExonFeatures and some sort of
> mapping from a protein sequence position back to the three DNA sequence
> positions that coded for it. This would need to handle the case where the
> end of one exon and the start of the next exon together code for a
> particular amino acid sequence position.
>
> We also need to add in the ability to have tracks as a way to group
> features. This way you can export features based on a particular track as a
> GFF/GFF3 file for importing into various genome browsers. For example, you
> might have one genome you are working on, with genes added in from three
> different gene prediction algorithms, each organized by a track. You should
> then be able to determine overlaps of genes that were predicted and
> validated via BLAST against UniProt, and create another summary track of
> validated and non-validated genes. If the feature classes we put together
> can make this easy then I think we will have a solid design.
>
>
> Scooter
>
>

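The protein-to-DNA mapping described in the quoted message, including the case where a codon spans two exons, could look roughly like the sketch below. It is only an illustration under several assumptions: ExonRange and CodonMapper are hypothetical names (not existing BioJava classes), coordinates are 1-based and inclusive on the forward strand, exons are listed in transcription order, and translation starts at the first base of the first exon.

```java
import java.util.List;

// Hypothetical sketch; these classes are not part of BioJava.
final class ExonRange {
    final int start; // 1-based, inclusive, on the parent DNA sequence
    final int end;   // 1-based, inclusive

    ExonRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    int length() {
        return end - start + 1;
    }
}

final class CodonMapper {
    /** Maps a 0-based offset in the spliced coding sequence to a parent DNA position. */
    static int toDnaPosition(List<ExonRange> exons, int codingOffset) {
        int remaining = codingOffset;
        for (ExonRange exon : exons) {
            if (remaining < exon.length()) {
                return exon.start + remaining;
            }
            remaining -= exon.length(); // skip this exon and continue in the next one
        }
        throw new IndexOutOfBoundsException("offset beyond last exon: " + codingOffset);
    }

    /** Returns the three DNA positions coding for a 1-based amino acid position. */
    static int[] codonPositions(List<ExonRange> exons, int aminoAcidPosition) {
        int first = (aminoAcidPosition - 1) * 3;
        return new int[] {
            toDnaPosition(exons, first),
            toDnaPosition(exons, first + 1),
            toDnaPosition(exons, first + 2)
        };
    }
}
```

With exons at 101-104 and 201-205, the second amino acid maps to positions 104, 201 and 202, so its codon straddles the exon boundary. Reverse-strand genes, UTRs and non-zero phase are left out and would need extra handling.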

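The idea of tracks as a way to group features, and of comparing gene predictions across tracks, might be sketched as follows. GeneFeature and TrackIndex are hypothetical names chosen for illustration, not a proposed API, and the track names in the usage note are placeholders. Coordinates are assumed to be 1-based and inclusive on a single sequence.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these classes are not part of BioJava.
final class GeneFeature {
    final String id;
    final String track; // name of the track this feature belongs to
    final int start;    // 1-based, inclusive
    final int end;      // 1-based, inclusive

    GeneFeature(String id, String track, int start, int end) {
        this.id = id;
        this.track = track;
        this.start = start;
        this.end = end;
    }

    boolean overlaps(GeneFeature other) {
        return start <= other.end && other.start <= end;
    }
}

final class TrackIndex {
    private final Map<String, List<GeneFeature>> byTrack = new LinkedHashMap<>();

    void add(GeneFeature feature) {
        byTrack.computeIfAbsent(feature.track, k -> new ArrayList<>()).add(feature);
    }

    /** All features in one track, e.g. as the input to a per-track GFF export. */
    List<GeneFeature> track(String name) {
        return byTrack.getOrDefault(name, new ArrayList<>());
    }

    /** Features in track a that overlap at least one feature in track b. */
    List<GeneFeature> overlapping(String a, String b) {
        List<GeneFeature> hits = new ArrayList<>();
        for (GeneFeature fa : track(a)) {
            for (GeneFeature fb : track(b)) {
                if (fa.overlaps(fb)) {
                    hits.add(fa);
                    break; // one overlap is enough to report fa
                }
            }
        }
        return hits;
    }
}
```

For example, after adding features tagged "predictorA" and "predictorB", calling overlapping("predictorA", "predictorB") returns the predictorA genes supported by the second track, which could then be copied into a summary track. The nested loop is quadratic; an interval index would be a better fit for whole genomes.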
