[Biojava-dev] Case-sensitive ProteinSequences
    Scooter Willis 
    HWillis at scripps.edu
       
    Wed Nov 30 02:08:23 UTC 2011
    
    
  
Once the amino acid sequence is loaded, we do not keep the upper or lower
case of each residue: every amino acid is a static reference to the
corresponding amino acid compound, which saves memory. FastaReader is fairly
flexible, though. You can write your own SequenceCreator that converts the
input to upper case, then parse the original upper/lower case pattern and
add it as a feature on the ProteinSequence. I am not sure this solves your
problem with the sequence alignment code, because I think the alignment
returns a new, aligned sequence. As an example, GeneFeatureHelper in the
biojava3-genome module has a method,
loadFastaAddGeneFeaturesFromUpperCaseExonFastaFile, that uses upper and
lower case in the FASTA file to designate exons.
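
A minimal sketch of that idea in plain Java, with no BioJava calls: convert
the raw FASTA residues to upper case for parsing, and record the lower-case
runs separately so they could later be attached as features. The class name
CaseFeatureSketch, the Range type, and both method names are hypothetical
and only illustrate the approach; wiring the ranges into a custom
SequenceCreator and onto a ProteinSequence is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of BioJava.
public class CaseFeatureSketch {

    /** Half-open range [start, end) of 0-based residue positions. */
    public static final class Range {
        public final int start;
        public final int end;

        public Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Upper-cases the residues so a case-sensitive compound set accepts them. */
    public static String toUpperResidues(String raw) {
        return raw.toUpperCase(Locale.ROOT);
    }

    /** Collects each maximal run of lower-case letters in the raw sequence. */
    public static List<Range> lowerCaseRuns(String raw) {
        List<Range> runs = new ArrayList<>();
        int runStart = -1; // -1 means we are not inside a lower-case run
        for (int i = 0; i < raw.length(); i++) {
            boolean lower = Character.isLowerCase(raw.charAt(i));
            if (lower && runStart < 0) {
                runStart = i;                      // a run begins here
            } else if (!lower && runStart >= 0) {
                runs.add(new Range(runStart, i));  // the run ended at i
                runStart = -1;
            }
        }
        if (runStart >= 0) {
            runs.add(new Range(runStart, raw.length())); // run reaches the end
        }
        return runs;
    }

    public static void main(String[] args) {
        String raw = "mkvLAEGHtr"; // made-up sequence: lower case = unaligned
        System.out.println(toUpperResidues(raw));
        System.out.println(lowerCaseRuns(raw));
    }
}
```

Keeping the case information as ranges beside the sequence, rather than as
extra compounds, leaves the compound set unchanged, so the scoring matrices
used by the alignment code still see only the standard amino acids.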
Thanks
Scooter 
On 11/29/11 8:29 PM, "Spencer Bliven" <sbliven at ucsd.edu> wrote:
>I'm currently trying to read a FASTA file which encodes some information
>in the case of each amino acid. Specifically, the FASTA contains an
>alignment where upper case letters are aligned and lower case are
>unaligned.
>
>The first problem I ran into was that lower-case letters are not valid as
>input to AminoAcidCompoundSet.getCompoundForString(String), which gets
>called indirectly from the FastaReader. This could be fixed by subclassing
>AminoAcidCompoundSet and calling toUpperCase() on the input. However, the
>second problem is that I need to extract that case information later on.
>My current solution is a subclass of AminoAcidCompoundSet which contains
>two copies of each amino acid, one upper and one lower. This seems like a
>very ugly solution and it breaks all the Alignment algorithms (due to
>missing amino acids in the scoring matrices). Does anyone have a better
>suggestion?
>
>Thanks,
>Spencer
>