[Biojava-l] case-sensitive sequences

ilhami visne ilhami.visne at gmail.com
Tue Feb 27 17:54:13 UTC 2007


Thank you for the quick answer. Here is the relevant part of my code:

BufferedReader br = new BufferedReader(new FileReader("seq.fasta"));
RichSequenceIterator iter = RichSequence.IOTools.readFastaDNA(br,null);
RichSequence rs = iter.nextRichSequence();

Richard Holland wrote:
> DNA is not case-sensitive. I suspect you are parsing the output of some
> sequencing software that uses case as a rough indicator of base-calling
> quality.
>
> The case will have been lost when the file was parsed, not at the moment
> you iterate over the resulting sequences. This means that you have to
> modify your file parsing method to become case-sensitive.
>
> The default DNA alphabet is not case-sensitive. It makes no distinction
> between upper and lower case, and will convert everything to one case.
>
> If you need to preserve case, you will need to use a custom alphabet
> which treats the cases differently, and also specify a tokenizer which
> is case-sensitive. See the help pages at http://biojava.org/ for help on
> creating new alphabets. Or, have a look at the ABITools.QUALITY alphabet
> in BioJava, which interprets the case and stores the quality scores
> separately.
>
> Note however that your custom alphabet is NOT the same as the original
> DNA alphabet, and so you may not be able to use it in all the standard
> transforms (RNA etc.). If you do want to use these then you will have to
> make a second copy of each sequence using the normal DNA alphabet and
> pass that copy to the routines.
>
> If you post to this list the code you are using to read the file, then I
> can show you where to insert the reference to this new alphabet.
>
> cheers,
> Richard
>
> Ilhami Visne wrote:
>   
>> My sequence files contain mixed-case symbols (TAATAACgagagg), and I am
>> now using RichSequenceIterator to iterate over the sequences.
>>
>> How can I tell BioJava to parse them case-sensitively? If I call the
>> seq.seqString() method, it should return the sequence exactly as it was
>> in the file, with upper and lower case preserved.
>>
>> Thanks.
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>>     
>   
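
To illustrate the "second copy" idea from the reply above, here is a minimal sketch in plain Java. It does not use BioJava at all and is not the custom-alphabet solution Richard describes; it only shows the principle of keeping the residues exactly as written in the file next to a single-case copy that could be handed to case-insensitive routines. The class CasePreservingFasta, the type FastaEntry and the method readFirst are hypothetical names made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;

public class CasePreservingFasta {

    /** One FASTA record (hypothetical helper type, not a BioJava class). */
    static final class FastaEntry {
        final String header;
        final String original;   // residues exactly as written in the file
        final String normalized; // single-case copy for case-insensitive tools

        FastaEntry(String header, String original) {
            this.header = header;
            this.original = original;
            this.normalized = original.toLowerCase(Locale.ROOT);
        }
    }

    /** Reads the first FASTA record without changing the case of any residue. */
    static FastaEntry readFirst(BufferedReader br) {
        try {
            String header = null;
            StringBuilder residues = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                if (line.startsWith(">")) {
                    if (header != null) {
                        break; // a second record starts here; stop reading
                    }
                    header = line.substring(1);
                } else if (header != null) {
                    residues.append(line);
                }
            }
            return header == null ? null : new FastaEntry(header, residues.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String fasta = ">seq1 example\nTAATAACgag\nagg\n";
        FastaEntry entry = readFirst(new BufferedReader(new StringReader(fasta)));
        System.out.println(entry.original);   // TAATAACgagagg
        System.out.println(entry.normalized); // taataacgagagg
    }
}
```

The normalized string could then be used to build an ordinary DNA sequence for the standard transforms, while the original string stays available wherever the case information is needed.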
