[Biojava-dev] RE: [Biojava-l] How to create a SymbolList with a String that contains illegal Char

David Huen david.huen at ntlworld.com
Thu Dec 11 03:42:23 EST 2003


On Thursday 11 Dec 2003 2:33 am, mark schreiber wrote:
> > -----Original Message-----
> > From: Matthew Pocock [mailto:matthew_pocock at yahoo.co.uk]
> > Sent: Wednesday, 10 December 2003 4:52 a.m.
> > To: mark schreiber
> > Cc: smh1008 at cus.cam.ac.uk; taoxu at bioinformatics.ubc.ca;
> > biojava-l at biojava.org
> > Subject: Re: [Biojava-l] How to create a SymbolList with a
> > String that contains illegal Char
> >
> > mark schreiber wrote:
> > >Is 'i' actually a legal symbol from the RNA alphabet, in terms of
> > >biojava? If not how should we define it? Would it be best modelled
> > >as an atomic symbol or some kind of ambiguity? Stretching back to my
> > >biochem undergrad days I think it should be atomic. That will mean
> > >the RNA alphabet's size is 5.
> >
> > Atomic. Our alphabets don't manage modifications well (e.g.
> > methylated DNA). Another thing to think about for v2.
>
> Hi -
>
> It's pretty simple to add this to nucleotide and to RNA, however we get
> into all sorts of trouble with making the complement table. The
> SimpleReversibleTranslationTable won't tolerate RNA having a size of 5
> when DNA only has a size of 4.
>
> I have done some investigations and inosine is complemented by 'c'. Is
> there a way to make an unequal translation table for this arrangement?
>

The trouble is that inosine is used in tRNAs to introduce wobble.  It is 
complementary to A, C and U within RNAs.  Inosine occurs in mRNA extremely 
rarely, usually as a consequence of RNA editing (particularly of 
mammalian glutamate receptor transcripts), where A is changed to I.  The 
ribosome translates the edited I as if it were a G under these 
circumstances.  So CAG is edited to CIG and read as CGG, for instance.
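
As a rough illustration of the two behaviours described above, here is a 
plain-Java sketch.  It is not BioJava code: the class InosineSketch and 
both method names are made up for this example, and it assumes the four 
standard bases have only their Watson-Crick partners.  The pairing lookup 
is one-to-many for 'i', which is exactly what a one-to-one reversible 
table cannot express.

```java
// Hypothetical helper for illustration only; nothing here is BioJava API.
class InosineSketch {
    // Bases that the given RNA base can pair with, as a string of symbols.
    // Assumption: standard bases get Watson-Crick partners only;
    // inosine pairs with a, c and u, so its entry has three partners.
    static String pairingPartners(char base) {
        switch (Character.toLowerCase(base)) {
            case 'a': return "u";
            case 'c': return "g";
            case 'g': return "c";
            case 'u': return "a";
            case 'i': return "acu";
            default:
                throw new IllegalArgumentException("not an RNA base: " + base);
        }
    }

    // An edited mRNA as the ribosome reads it: every inosine decoded as g.
    static String readByRibosome(String mrna) {
        return mrna.toLowerCase().replace('i', 'g');
    }

    public static void main(String[] args) {
        System.out.println(pairingPartners('i'));   // prints "acu"
        System.out.println(readByRibosome("cig"));  // prints "cgg"
    }
}
```

Keeping pairing and decoding as separate functions reflects that they are 
different questions: what I pairs with, and what an edited I is read as.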

Perhaps the better thing to do is to define a tRNA/snRNA alphabet.  Those 
can contain all the weird stuff and will never be translated, so they 
need no translation table.  That will deal with most of the cases 
(including the very many exotic edited bases in tRNAs), but it still 
doesn't solve the edited-mRNA case above, where your proposal seems sensible.
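
A minimal sketch of what a separate, never-translated alphabet might look 
like, again in plain Java rather than the BioJava API.  AlphabetSketch, 
its fields and the two symbol sets are hypothetical, and 'i' stands in 
for the many other modified bases a real tRNA alphabet would need.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in BioJava.
class AlphabetSketch {
    final String name;
    final String symbols;        // legal one-letter symbols
    final boolean translatable;  // false: no translation table is ever needed

    AlphabetSketch(String name, String symbols, boolean translatable) {
        this.name = name;
        this.symbols = symbols;
        this.translatable = translatable;
    }

    // Returns the symbols of seq, or throws if a character is not in this alphabet.
    List<Character> parse(String seq) {
        List<Character> out = new ArrayList<>();
        for (char ch : seq.toLowerCase().toCharArray()) {
            if (symbols.indexOf(ch) < 0) {
                throw new IllegalArgumentException(
                    "'" + ch + "' is not a symbol of " + name);
            }
            out.add(ch);
        }
        return out;
    }

    public static void main(String[] args) {
        AlphabetSketch rna = new AlphabetSketch("RNA", "acgu", true);
        AlphabetSketch trna = new AlphabetSketch("tRNA/snRNA", "acgui", false);

        // The tRNA/snRNA alphabet accepts inosine.
        System.out.println(trna.parse("gcic"));

        // The plain RNA alphabet keeps its size of 4 and rejects it.
        try {
            rna.parse("gcic");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this split the translatable RNA alphabet stays the same size as DNA, 
so the complement table issue only arises for the edited-mRNA case.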

> If anyone has some ideas I would like to hear them but won't be able to
> code them up till after Xmas as I'm off on holiday from tomorrow.
>
Holiday?  Hmmmph. <add Scroogian lines as necessary>


Regards,
David Huen

