[Biojava-l] issue with translating codons with N

Nick England nickengland at gmail.com
Fri Sep 20 14:16:39 UTC 2013


Everyone,

I've stepped through with a debugger, and this is a bad bug.

The code to translate from RNA->Protein does the following:
- Take the ASCII value of each of the 3 RNA bases, multiply the first by
16, the second by 4 and the third by 1, and add them up.
- Assume there won't be any collisions.

Here are the values which it then uses:

A:65
G:71
C:67
U:85
N:78
ANA: 1417
CAU: 1417
ANG: 1423
CGC: 1423

Notice any hash collisions?
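
The arithmetic can be checked with a minimal sketch. This is not the BioJava
source; the class and method names are hypothetical, and it only assumes the
16/4/1 weighting of ASCII values described above. In the standard genetic
code CAU is histidine (H) and CGC is arginine (R), which matches the HR
output reported for ANAANG.

```java
// Illustrative sketch only; not the BioJava implementation.
class CodonSumDemo {
    // Weighted sum as described above: 16 * first + 4 * second + 1 * third,
    // using the ASCII value of each base character.
    static int weightedSum(String codon) {
        return 16 * codon.charAt(0) + 4 * codon.charAt(1) + codon.charAt(2);
    }

    public static void main(String[] args) {
        // ANA and CAU share one value; ANG and CGC share another.
        for (String codon : new String[] {"ANA", "CAU", "ANG", "CGC"}) {
            System.out.println(codon + ": " + weightedSum(codon));
        }
    }
}
```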

I don't get why this wasn't done with a standard Java HashMap, which would
ensure that any collisions were resolved. This is a pretty critical bug for
a bioinformatics package.
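
As a rough sketch of the HashMap idea (hypothetical names, a deliberately
tiny codon table, and not a proposed patch to BioJava): keying the lookup on
the codon string itself means two different codons can never be mistaken for
each other, because HashMap falls back to equals() when hash codes collide.
Any codon missing from the table, including ones containing N, maps to X.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; the table holds just a few codons for the demo.
class CodonTableSketch {
    private static final Map<String, Character> TABLE = new HashMap<>();

    static {
        TABLE.put("CAU", 'H'); // histidine
        TABLE.put("CGC", 'R'); // arginine
        TABLE.put("UGU", 'C'); // cysteine
        TABLE.put("UAG", '*'); // stop
    }

    // Returns the amino acid for a codon, or 'X' when the codon is unknown.
    static char translate(String codon) {
        return TABLE.getOrDefault(codon, 'X');
    }

    public static void main(String[] args) {
        StringBuilder protein = new StringBuilder();
        for (String codon : new String[] {"ANA", "ANG", "CAU", "CGC"}) {
            protein.append(translate(codon));
        }
        System.out.println(protein);
    }
}
```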

Nick


On 20 September 2013 13:45, Nick England <nickengland at gmail.com> wrote:

> Hara,
>
> Hmm this is rather odd. I get the same issue with that sequence with a
> custom engine as well.
>
> My code has:
>     TranscriptionEngine.Builder builder = new TranscriptionEngine.Builder();
>     builder.initMet(false);
>     builder.translateNCodons(true);
>     builder.trimStop(false);
>     TranscriptionEngine engine = builder.build();
>     Sequence<AminoAcidCompound> seq =
>         engine.translate(new DNASequence("GTNTGTTAGTGT"));
>     assertEquals("XC*C", seq.toString());
>     Sequence<AminoAcidCompound> seq2 =
>         engine.translate(new DNASequence("ANAANG"));
>     System.out.println(seq2);
> The first sequence translates as expected, but your sequence is
> translating as HR, when it should be XX. This looks like a pretty bad bug!
>
> Nick
>
>
> On 19 September 2013 19:59, Hara Dilley <hdilley at sutrobio.com> wrote:
>
>> Hi,
>>
>> Is there an issue with the DNA translation in biojava3.core?
>> It appears that it wants to translate "N" in certain cases.
>> Executing:
>> new DNASequence("ANAANG").getRNASequence().getProteinSequence().getSequenceAsString();
>> will produce the amino acids HR.
>>
>> thanks
>> Hara
>>
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
>


