[Biojava-l] DNA letters are lowercase???

Schreiber, Mark mark.schreiber@agresearch.co.nz
Tue, 13 Aug 2002 09:34:27 +1200


> (Someone who knows): does the default tokenization for DNA use upper or
> lower case? I don't care either way.
>

The default is lower case. I think this follows a convention, not very
often observed, in which DNA/RNA symbols are lower case and amino acids
are upper case.
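
One practical consequence is that sequence text which went in as upper
case may come back as lower case, so string comparisons are safer done
case-insensitively. A minimal sketch in plain Java follows. It makes no
BioJava calls, and the class and method names (DnaCase, toDefaultCase,
sameSequence) are hypothetical, used only for illustration.

```java
import java.util.Locale;

// Hypothetical helper, not part of BioJava.
final class DnaCase {
    // Fold a DNA string to lower case, matching the default described above.
    static String toDefaultCase(String dna) {
        return dna.toLowerCase(Locale.ROOT);
    }

    // Compare two DNA strings while ignoring letter case.
    static boolean sameSequence(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(toDefaultCase("ACGT"));        // prints "acgt"
        System.out.println(sameSequence("ACGT", "acgt")); // prints "true"
    }
}
```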
 
> Ryan: To maintain the upper/lower case info in chromatograph files we
> would need to do a little trickery. If you send a file (mixed case) and
> a couple of use-cases, we can probably sort this out quickly enough. If
> the case is important to you (e.g. you need to know where the uncertain
> calls are), we can do this, and if you want to discard this information
> then we can also do that trivially. I'm thinking thoughts like alignment
> of DNA against booleans (or 0/1) where A,1 would be A and supported
> (upper case), and T,0 would be T and not well supported (lower case).
> 
> Has this already been done?
> 
> Matthew
> 

There is currently no Boolean or binary alphabet in BioJava. However,
you could fake one by making a SubIntegerAlphabet (from 0 to 1) using
the static method on IntegerAlphabet. Strictly this would be binary
(0/1) rather than boolean (true/false), but it should do the same job.
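
To make the idea concrete, here is a rough sketch of the 0/1 pairing
Matthew describes: each base is stored in lower case alongside a flag
that is 1 for a supported (upper case) call and 0 for a poorly supported
(lower case) one. It is plain Java with an int array standing in for the
0..1 integer alphabet, so it does not use BioJava at all. In BioJava
the flag track would presumably come from something like
IntegerAlphabet.getSubAlphabet(0, 1), but treat that method name as an
assumption and check the Javadoc. CaseSupport and its methods are
hypothetical names.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of BioJava.
final class CaseSupport {
    // One flag per position: 1 if the base call was upper case, else 0.
    static int[] supportFlags(String mixedCase) {
        int[] flags = new int[mixedCase.length()];
        for (int i = 0; i < mixedCase.length(); i++) {
            flags[i] = Character.isUpperCase(mixedCase.charAt(i)) ? 1 : 0;
        }
        return flags;
    }

    // Rebuild the mixed-case read from the bases and their 0/1 flags.
    static String restoreCase(String bases, int[] flags) {
        if (bases.length() != flags.length) {
            throw new IllegalArgumentException("bases and flags differ in length");
        }
        StringBuilder sb = new StringBuilder(bases.length());
        for (int i = 0; i < bases.length(); i++) {
            char c = bases.charAt(i);
            sb.append(flags[i] == 1 ? Character.toUpperCase(c)
                                    : Character.toLowerCase(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String read = "ACgTtA";                       // mixed-case input
        int[] flags = supportFlags(read);             // the 0/1 track
        String bases = read.toLowerCase(Locale.ROOT); // default-case bases
        System.out.println(bases + " " + Arrays.toString(flags));
        System.out.println(restoreCase(bases, flags)); // prints "ACgTtA"
    }
}
```

Discarding the case information is then just a matter of dropping the
flag array and keeping the lower-case bases.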

- Mark