[Biojava-dev] Code Update

Sat Feb 6 15:12:11 UTC 2010

Finally it's in. I've managed to get enough time to finish the  
transcription code off. The main features of this check-in are:

* DNA -> RNA -> Codon -> Peptide translation
* Support for all IUPAC tables
* New views for reversing sequences & complementing them
* Windowed sequence views giving portions of a sequence as requested
* TranscriptionEngine & TranscriptionEngine.Builder deal with the  
business of assembling the classes together as required
* Singletons provided from the classes they are in (e.g. IUPACParser  
has one) *but* no class requires a singleton!
* Utilities for working with IO streams & classpath resources (useful  
for testing)
* Test case shows 1000 translations of BRCA2 (from DNA) in 0.7 seconds
(on my MacBook Pro; YMMV); the test case will fail if it takes longer
than a second
** This is a vast improvement over my first attempt, which managed only
1 translation per second and so was not checked in
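
As a rough illustration of the DNA -> RNA -> Codon -> Peptide step, here is a stand-alone sketch. It does not use the checked-in classes: the class name TranslationSketch, its method names and the cut-down codon table are all assumptions made for this example, and the real code reads full IUPAC tables rather than a hard-coded map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch; not the BioJava API. */
class TranslationSketch {

    // Small excerpt of the standard genetic code (RNA codon -> one-letter amino acid).
    // A full table has 64 entries; '*' marks a stop codon.
    private static final Map<String, Character> CODONS = new HashMap<>();
    static {
        CODONS.put("AUG", 'M');
        CODONS.put("GCC", 'A');
        CODONS.put("UUU", 'F');
        CODONS.put("GGA", 'G');
        CODONS.put("AAA", 'K');
        CODONS.put("GUG", 'V');
        CODONS.put("UAA", '*');
        CODONS.put("UAG", '*');
        CODONS.put("UGA", '*');
    }

    /** Transcribe coding-strand DNA to RNA by replacing T with U. */
    static String transcribe(String dna) {
        return dna.toUpperCase().replace('T', 'U');
    }

    /** Translate RNA codon by codon; codons missing from the excerpt become 'X'. */
    static String translate(String rna, boolean trimStop) {
        StringBuilder peptide = new StringBuilder();
        // Step through non-overlapping windows of three; trailing bases are ignored.
        for (int i = 0; i + 3 <= rna.length(); i += 3) {
            String codon = rna.substring(i, i + 3);
            peptide.append(CODONS.getOrDefault(codon, 'X'));
        }
        // Optionally drop a single trailing stop, mirroring the boolean mentioned in this post.
        int last = peptide.length() - 1;
        if (trimStop && last >= 0 && peptide.charAt(last) == '*') {
            peptide.setLength(last);
        }
        return peptide.toString();
    }

    public static void main(String[] args) {
        String rna = transcribe("ATGGCCTTTTAA");
        System.out.println(rna);                  // AUGGCCUUUUAA
        System.out.println(translate(rna, true)); // MAF
    }
}
```

The loop over windows of three is the same idea as the windowed sequence views, only done on a plain String.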

Limitations are:

* Not much checking of the length of sequences given to the code; it
needs a strict & a lenient mode
* Stop codon trimming controlled by a boolean
* No init-met translation (very important as some programs get a bit  
annoyed if they're given a V as an initiator AA)
* Not sure if there is an easy way to ask whether a codon is a start
codon; I'm sure it can be done, just not as easily as we may want
* No way of correcting a peptide produced from a sequence that we
expect to translate badly
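
To make a few of these gaps concrete, here is a sketch of what a strict/lenient length check, a start codon query and init-met handling could look like. None of it is in the check-in: the class LimitationsSketch, its methods and the START_CODONS set are hypothetical, and the set of start codons is an assumption for illustration (real tables define their own initiators).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of some of the missing checks; names are illustrative only. */
class LimitationsSketch {

    // Assumed start codons (RNA) for the example; a real codon table supplies its own.
    private static final Set<String> START_CODONS =
            new HashSet<>(Arrays.asList("AUG", "GUG", "UUG"));

    /** Strict mode rejects lengths that are not a multiple of three; lenient mode drops the tail. */
    static String checkLength(String rna, boolean strict) {
        int remainder = rna.length() % 3;
        if (remainder == 0) {
            return rna;
        }
        if (strict) {
            throw new IllegalArgumentException(
                    "Length " + rna.length() + " is not a multiple of 3");
        }
        return rna.substring(0, rna.length() - remainder);
    }

    /** Simple lookup answering "is this codon a start codon?". */
    static boolean isStartCodon(String codon) {
        return START_CODONS.contains(codon);
    }

    /** Replace the first residue with M when the first codon is a recognised start codon. */
    static String applyInitMet(String rna, String peptide) {
        if (rna.length() < 3 || peptide.isEmpty()) {
            return peptide;
        }
        if (isStartCodon(rna.substring(0, 3))) {
            return "M" + peptide.substring(1);
        }
        return peptide;
    }

    public static void main(String[] args) {
        System.out.println(checkLength("AUGGC", false));   // AUG
        System.out.println(applyInitMet("GUGGCC", "VA"));  // MA
    }
}
```

With this, a GUG initiator that would otherwise come out as V is reported as M, which is the behaviour downstream programs tend to expect.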

However, it's workable & means that if you have a DNASequence you can,
technically, get a peptide by saying:

DNASequence s = getSeq();
ProteinSequence p = s.getRNASequence().getProteinSequence();

Now how's that for easy :)

Share & enjoy!

Andy