[Biojava-l] BioJava translation

Scooter Willis willishf at ufl.edu
Wed Oct 13 17:16:01 UTC 2010


BioJava3 has an additional validation layer, and it creates intermediate
objects going from a DNA sequence to an RNA sequence before applying the
appropriate translation rules to return a protein sequence. It could easily
be twice as fast if it went directly from a DNA sequence to a
ProteinSequence, which would put it at 8 seconds. We are going to carry a
performance penalty for setting everything up as proper objects versus
doing a simple String-to-String translation.
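
For comparison, here is a rough sketch of what a plain String-to-String
translation with a table lookup could look like in Java. This is not
BioJava3 code; the class name CodonTable and the method translate are made
up for illustration. It assumes the standard genetic code, reads frame 1
only, writes 'X' for any codon containing a character other than
A/C/G/T/U, and ignores a trailing partial codon. It accepts lower case
directly, so no separate upper-casing pass is needed.

```java
import java.util.Arrays;

// Hypothetical helper, not part of BioJava: direct codon-table translation.
public final class CodonTable {
    // Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
    // (each base position cycles through T, C, A, G).
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Maps an ASCII character to its base index (T/U=0, C=1, A=2, G=3) or -1.
    private static final int[] BASE_INDEX = new int[128];
    static {
        Arrays.fill(BASE_INDEX, -1);
        String bases = "TCAG";
        for (int i = 0; i < bases.length(); i++) {
            char upper = bases.charAt(i);
            BASE_INDEX[upper] = i;
            BASE_INDEX[Character.toLowerCase(upper)] = i;
        }
        BASE_INDEX['U'] = 0;
        BASE_INDEX['u'] = 0;
    }

    private CodonTable() { }

    private static int baseIndex(char c) {
        return c < 128 ? BASE_INDEX[c] : -1;
    }

    // Translates frame 1 of the given nucleotide string, one codon at a time.
    public static String translate(String dna) {
        int codons = dna.length() / 3;
        StringBuilder protein = new StringBuilder(codons);
        for (int i = 0; i < codons; i++) {
            int a = baseIndex(dna.charAt(3 * i));
            int b = baseIndex(dna.charAt(3 * i + 1));
            int c = baseIndex(dna.charAt(3 * i + 2));
            // If any index is -1, the bitwise OR is negative: unknown codon.
            protein.append((a | b | c) < 0
                ? 'X'
                : AMINO_ACIDS.charAt(16 * a + 4 * b + c));
        }
        return protein.toString();
    }
}
```

With this sketch, CodonTable.translate("ATGGCCTAA") returns "MA*". There is
no validation and no per-sequence object beyond the output buffer, which is
exactly the trade-off against the proper-object approach described above.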



On Wed, Oct 13, 2010 at 12:34 PM, Pjotr Prins <pjotr.public23 at thebird.nl> wrote:

> On Wed, Oct 13, 2010 at 05:25:41PM +0100, Andy Yates wrote:
> > That's great news and should be even faster once we get rid of the
> > requirement to upper case since you're having to parse the same sequence
> > twice.
> >
> > I wonder what the C version does to make itself even faster
>
> The EMBOSS implementation is fastest by a mile - takes less than 3
> seconds. But the code is, uhm, hard to read.
>
> I think table lookups will win in C, whatever you try. But it may be an
> interesting exercise if we can get close. Note I am perhaps not using the
> fastest JVM.
>
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)
>
> Pj.