[Biojava-dev] biojava 3 progress

Andreas Prlic andreas at sdsc.edu
Wed Mar 17 22:14:40 UTC 2010


ok, a new module biojava3-genome is now in SVN...
A

On Wed, Mar 17, 2010 at 11:17 AM, Scooter Willis <HWillis at scripps.edu> wrote:

> Andreas
>
> The problem with putting feature classes in a separate module is that
> biojava-core sequences would then have a dependency on biojava-feature. A
> sequence needs to hold a collection of features so feature classes need to
> go in core. If features are created from GFF, the core module doesn't care
> where the features come from.
>
> We could go with biojava-genomes and code related to dealing with genomes
> goes in that module. If you like biojava-genome or biojava-genomes go ahead
> and create it and email me so I can check it out.
>
> Thanks
>
> Scooter
>
>
>
> On Mar 17, 2010, at 1:46 PM, Andreas Prlic wrote:
>
> I like biojava-feature as a module name for the GFF and feature-related
> code. (Should we try to keep the module names singular?) Let me know if you
> want me to create the module for this...
> A
>
> On Wed, Mar 17, 2010 at 9:09 AM, Scooter Willis <HWillis at scripps.edu> wrote:
>
>> Andy
>>
>> Let me know if you have any major code changes for the core sequence
>> handling that have been or could be checked in. So far I haven't needed to
>> touch any of the core sequence code, but I want to avoid merging code if you
>> have made any significant changes.
>>
>> I should have code to check in today and if we can't come up with a better
>> name I will ask Andreas to create a biojava3-genes module and I can then
>> check that code in for your review. The current problem is that we have
>> ExonSequence extending DNASequence when it could also be described as a
>> feature. One way to look at this is that a TranscriptSequence is also a
>> feature of a DNA sequence, and only when you want a standalone class with
>> internal links back to the parent sequence do you return a
>> TranscriptSequence. The TranscriptFeature would have ExonFeature and
>> IntronFeature as children. You can ask for an ExonSequence based on the
>> ExonFeature. Once you get a ProteinSequence you should be able to reverse
>> the process and get back the TranscriptSequence, the corresponding
>> ExonFeatures, and some sort of mapping from a protein sequence position
>> back to the three DNA sequence positions that coded for it. This would need
>> to handle the case where the end of one exon and the start of the next
>> exon together code for a particular amino acid sequence position.
>>
>> We also need to add the ability to have tracks as a way to group
>> features. This way you can export the features of a particular track as a
>> GFF/GFF3 file for importing into various genome browsers. For example, you
>> might have one genome you are working on, with genes added from three
>> different gene prediction algorithms, each organized by a track. You should
>> then be able to determine overlaps of genes that were predicted and
>> validated via BLAST against UniProt, and create another summary track of
>> validated and non-validated genes. If the feature classes we put together
>> can make this easy then I think we will have a solid design.
>>
>>
>> Scooter
>>
>>
>
>
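
The mapping from a protein position back to its three DNA positions, as
described in the quoted message, might be sketched roughly as below. This is
only an illustration under stated assumptions: CodonMapper and its methods are
hypothetical names, not BioJava classes, and the sketch assumes forward-strand
exons given as 1-based inclusive {start, end} pairs, sorted by start, with the
coding sequence beginning in frame at the first exon.

```java
import java.util.List;

// Hypothetical helper, not part of BioJava.
final class CodonMapper {

    // Returns the three 1-based DNA positions that code for the given
    // 1-based amino acid position. A codon may span two exons.
    static int[] codonPositions(List<int[]> exons, int aminoAcidPosition) {
        if (aminoAcidPosition < 1) {
            throw new IllegalArgumentException("position must be >= 1");
        }
        // 0-based offset of the codon's first base in the spliced sequence.
        int firstOffset = (aminoAcidPosition - 1) * 3;
        int[] positions = new int[3];
        for (int i = 0; i < 3; i++) {
            positions[i] = dnaPosition(exons, firstOffset + i);
        }
        return positions;
    }

    // Walks the exons in order to convert an offset in the spliced
    // sequence into a position on the parent DNA sequence.
    static int dnaPosition(List<int[]> exons, int splicedOffset) {
        int remaining = splicedOffset;
        for (int[] exon : exons) {
            int length = exon[1] - exon[0] + 1;
            if (remaining < length) {
                return exon[0] + remaining;
            }
            remaining -= length;
        }
        throw new IllegalArgumentException("offset beyond the coding sequence");
    }
}
```

With exons {10, 13} and {20, 25}, amino acid 2 is coded by the last base of
the first exon and the first two bases of the second, which is the split-codon
case mentioned in the message. Reverse-strand genes and non-zero starting
phases are not handled here.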

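The track idea from the quoted message could be illustrated in a similar
spirit. The sketch below is an assumption-laden example, not a proposed
BioJava API: TrackSummary, Feature, and summarize are hypothetical names, and
features are reduced to named 1-based inclusive intervals on a single
sequence. It splits the predicted genes of one track into "validated" and
"non-validated" summary tracks according to whether they overlap any feature
in a track of hits.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical classes, not part of BioJava.
final class TrackSummary {

    // Minimal stand-in for a feature: a named interval, 1-based inclusive.
    static final class Feature {
        final String name;
        final int start;
        final int end;

        Feature(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        // Two inclusive intervals overlap if each starts before the other ends.
        boolean overlaps(Feature other) {
            return start <= other.end && other.start <= end;
        }
    }

    // Builds two summary tracks keyed by name: predicted genes that overlap
    // at least one hit, and predicted genes that overlap none.
    static Map<String, List<Feature>> summarize(List<Feature> predicted,
                                                List<Feature> hits) {
        Map<String, List<Feature>> summary = new LinkedHashMap<String, List<Feature>>();
        summary.put("validated", new ArrayList<Feature>());
        summary.put("non-validated", new ArrayList<Feature>());
        for (Feature gene : predicted) {
            boolean validated = false;
            for (Feature hit : hits) {
                if (gene.overlaps(hit)) {
                    validated = true;
                    break;
                }
            }
            summary.get(validated ? "validated" : "non-validated").add(gene);
        }
        return summary;
    }
}
```

The nested loop is quadratic in the number of features, which is fine for a
sketch. A real implementation over whole genomes would more likely sort the
features or use an interval index.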

