[Biojava-l] DNA letters are lowercase???

Matthew Pocock matthew_pocock@yahoo.co.uk
Mon, 12 Aug 2002 17:52:37 +0100


Ryan Golhar wrote:
> Can anyone tell me why the letters for DNA (a,c,t,g) are lowercase in
> DNATools?

Hi Ryan,

The static methods used to retrieve the bases are named in lower case. The 
AtomicSymbol instances they return can be written out as lower or upper 
case, depending on the SymbolTokenization you use.

(Someone who knows): does the default tokenization for DNA use upper or 
lower case? I don't care either way.

Ryan: To maintain the upper/lower case info in chromatogram files we 
would need to do a little trickery. If you send a file (mixed case) and 
a couple of use-cases, we can probably sort this out quickly enough. If 
the case is important to you (e.g. you need to know where the uncertain 
calls are), we can do this, and if you want to discard this information 
then we can also do that trivially. I'm thinking thoughts like alignment 
of DNA against booleans (or 0/1) where A,1 would be A and supported 
(upper case), and T,0 would be T and not well supported (lower case).
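
To make the DNA-against-booleans idea concrete, here is a rough sketch in 
plain Java. It deliberately uses no BioJava classes: the class name 
CaseMaskSketch and the methods supportMask and render are made up for 
illustration, and how this would map onto a real alignment of symbol lists 
is left open.

```java
import java.util.Locale;

// Hypothetical helper, not part of BioJava: pairs each base with a
// boolean saying whether the call was well supported.
final class CaseMaskSketch {

    // Derive the support mask from a mixed-case read:
    // upper case = supported (true), lower case = uncertain (false).
    static boolean[] supportMask(String read) {
        boolean[] mask = new boolean[read.length()];
        for (int i = 0; i < read.length(); i++) {
            mask[i] = Character.isUpperCase(read.charAt(i));
        }
        return mask;
    }

    // Write the bases back out, choosing the case from the mask.
    static String render(String bases, boolean[] supported) {
        if (bases.length() != supported.length) {
            throw new IllegalArgumentException("bases and mask differ in length");
        }
        StringBuilder out = new StringBuilder(bases.length());
        for (int i = 0; i < bases.length(); i++) {
            char c = bases.charAt(i);
            out.append(supported[i] ? Character.toUpperCase(c)
                                    : Character.toLowerCase(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String read = "ACgTTaGC";                        // mixed-case input
        boolean[] mask = supportMask(read);              // keep the case info
        String bases = read.toLowerCase(Locale.ROOT);    // case-free sequence
        System.out.println(render(bases, mask));         // prints ACgTTaGC
    }
}
```

Discarding the case information is then just a matter of ignoring the mask; 
keeping it means carrying the mask alongside the sequence.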

Has this already been done?

Matthew

> 
> Some chromatogram files contain a mix of A,C,T,G and some lowercase letters
> for peaks that could not be determined with certainty.
> 
> Regardless, DNA is always represented with uppercase letters...
> 
> If there is no argument against it, can this be changed to upper case
> letters instead?
> 
> Ryan
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 


