[Biojava-dev] About masking low complexity regions in Protein sequences

mark.schreiber at group.novartis.com mark.schreiber at group.novartis.com
Tue Apr 27 21:07:19 EDT 2004


Hi Xinuo -

I'm not entirely clear what you are asking. 

Do you mean: is BioJava able to detect low complexity regions? The answer 
is no, not directly, but it would be very easy to code whatever algorithm 
you want to use for this in BioJava. I would expect you to make use of 
the distribution package (org.biojava.bio.dist) to do this.
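
As a rough illustration of the kind of algorithm I mean, here is a minimal
sketch in plain Java (no BioJava classes, so it runs on its own). It slides a
fixed window along the sequence, computes the Shannon entropy of the residue
composition in each window, and replaces every residue in a low-entropy
window with 'X'. The class name EntropyMasker, the window size and the
threshold are all made up for the example; this is not SEG or any other
published method, just the simplest thing that shows the idea.

```java
import java.util.HashMap;
import java.util.Map;

final class EntropyMasker {

    /** Shannon entropy, in bits, of the residue composition of seq[start, end). */
    static double entropy(String seq, int start, int end) {
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = start; i < end; i++) {
            counts.merge(seq.charAt(i), 1, Integer::sum);
        }
        int n = end - start;
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / n;
            h -= p * (Math.log(p) / Math.log(2)); // log base 2
        }
        return h;
    }

    /**
     * Hard-masks with 'X' every residue that falls inside at least one
     * window whose entropy is below the threshold (illustrative values only).
     */
    static String mask(String seq, int window, double threshold) {
        char[] out = seq.toCharArray();
        for (int i = 0; i + window <= seq.length(); i++) {
            if (entropy(seq, i, i + window) < threshold) {
                for (int j = i; j < i + window; j++) {
                    out[j] = 'X';
                }
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // A made-up protein fragment with a poly-Q stretch in the middle.
        System.out.println(mask("MKVLAAGQQQQQQQQQQQQTRSDEWLKH", 8, 1.0));
    }
}
```

Note that a naive window like this also masks a residue or two flanking the
repeat, because any window that is mostly repeat falls below the threshold.
Inside BioJava you would probably hold the per-window counts in a
Distribution rather than a HashMap, but the logic would be the same.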

If you just want to annotate low complexity, you could simply add a feature 
covering the low complexity part of your sequence.
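
To add features you first need the coordinates of each low complexity
stretch. The sketch below is plain Java again, with a made-up class name
(LowComplexityRanges). It assumes you already have a hard-masked copy of the
sequence and turns each run of the mask character into a {start, end} pair.
The pairs are 1-based and inclusive, which is the convention BioJava
locations use. From memory, each pair would then go into a RangeLocation on
a Feature.Template that you pass to createFeature() on the sequence, but
please check the javadocs for the exact details.

```java
import java.util.ArrayList;
import java.util.List;

final class LowComplexityRanges {

    /**
     * Returns one {start, end} pair (1-based, inclusive) for every run of
     * maskChar in the given hard-masked sequence string.
     */
    static List<int[]> ranges(String masked, char maskChar) {
        List<int[]> out = new ArrayList<>();
        int runStart = -1; // 0-based start of the current run, or -1 if none
        for (int i = 0; i <= masked.length(); i++) {
            boolean inRun = i < masked.length() && masked.charAt(i) == maskChar;
            if (inRun && runStart < 0) {
                runStart = i;
            } else if (!inRun && runStart >= 0) {
                // The run covers 0-based [runStart, i - 1], i.e. 1-based [runStart + 1, i].
                out.add(new int[] {runStart + 1, i});
                runStart = -1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (int[] r : ranges("MKVLAAXXXXRSXX", 'X')) {
            System.out.println("low_complexity " + r[0] + ".." + r[1]);
        }
    }
}
```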

If you mean: does BioJava deal with mixed case soft masking (e.g. 
ATGAAGTATaaaaaataaaaataaaaatGTGTGA), where the lower case is low 
complexity? Then the short answer is that BioJava is case insensitive and 
does not. The longer answer is that you could define your own 
FiniteAlphabet and SymbolTokenization such that A is not equivalent to a. 
You might want to make a cross product alphabet of DNA and a binary 
alphabet {complex, not_complex} to achieve this result. For a clue, try 
looking at the org.biojava.bio.program.phred package to see how the Phred 
alphabet combines DNA and sequence quality in a cross product alphabet.
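
To make the cross product idea concrete, here is a small sketch in plain
Java. It does not use FiniteAlphabet or SymbolTokenization; the names
SoftMaskParser, MaskedBase and Complexity are invented for the example. Each
character of a soft-masked string becomes a pair of (base, complexity), which
is the same information a DNA x {complex, not_complex} cross product symbol
would carry.

```java
import java.util.ArrayList;
import java.util.List;

final class SoftMaskParser {

    /** The binary "alphabet" from the text above. */
    enum Complexity { COMPLEX, NOT_COMPLEX }

    /** One element of the cross product: a DNA base plus its complexity flag. */
    static final class MaskedBase {
        final char base;             // always upper case: A, C, G or T
        final Complexity complexity;

        MaskedBase(char base, Complexity complexity) {
            this.base = base;
            this.complexity = complexity;
        }

        @Override
        public String toString() {
            return base + "/" + complexity;
        }
    }

    /** Upper case means complex, lower case means not_complex (soft-masked). */
    static List<MaskedBase> parse(String softMasked) {
        List<MaskedBase> out = new ArrayList<>();
        for (char c : softMasked.toCharArray()) {
            char upper = Character.toUpperCase(c);
            if ("ACGT".indexOf(upper) < 0) {
                throw new IllegalArgumentException("Not a DNA base: " + c);
            }
            Complexity cx = Character.isLowerCase(c)
                    ? Complexity.NOT_COMPLEX
                    : Complexity.COMPLEX;
            out.add(new MaskedBase(upper, cx));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("ATGaaaTGA"));
    }
}
```

A custom tokenization in BioJava would do essentially this mapping in its
parsing step, and the reverse when writing the sequence back out.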

Good luck,

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax  +65 6722 2910





"Xinuo Chen" <xinuo.chen at warwick.ac.uk>
Sent by: biojava-dev-bounces at portal.open-bio.org
04/27/2004 06:28 PM

 
        To:     <biojava-dev at biojava.org>
        cc: 
        Subject:        [Biojava-dev] About masking low complexity regions in Protein sequences


Hello.

I am just wondering whether BioJava provides a facility to mask the low
complexity regions in protein sequences.

Could you please tell me whether BioJava provides this? If it does, could
you please tell me which class or package I should use?

Thank you.


Yours sincerely

Xinuo CHEN
Research Associate and PhD
High Performance System Group
Department of Computer Science
University of Warwick
U.K
