[Biojava-l] Alignment consensus calculation

mark.schreiber at novartis.com mark.schreiber at novartis.com
Fri May 19 02:16:00 UTC 2006

Hi -

To get a Distribution[] over an alignment you could use 
DistributionTools.distOverAlignment(a) or one of the other overloaded 
methods.

To get a consensus you could simply find the most frequent Symbol in each 
Distribution. To make a more sophisticated consensus you could have 
thresholds below which you would report an ambiguity.
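
As a rough illustration of the simple case, here is a minimal sketch in plain Java. It does not use BioJava at all: the class ColumnFrequencies and its methods are hypothetical names, the alignment is assumed to be a list of equal-length strings, and gaps are treated like any other character. It computes the relative frequency of each symbol per column (roughly what one Distribution per column would tell you) and then picks the most frequent symbol.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of BioJava. */
class ColumnFrequencies {

    /** One symbol -> relative frequency map per alignment column. */
    static List<Map<Character, Double>> frequencies(List<String> rows) {
        List<Map<Character, Double>> columns = new ArrayList<>();
        if (rows.isEmpty()) {
            return columns;
        }
        int length = rows.get(0).length();
        for (String row : rows) {
            if (row.length() != length) {
                throw new IllegalArgumentException("rows must have equal length");
            }
        }
        for (int col = 0; col < length; col++) {
            // Count symbols first, then normalise by the number of rows.
            Map<Character, Integer> counts = new TreeMap<>();
            for (String row : rows) {
                counts.merge(Character.toLowerCase(row.charAt(col)), 1, Integer::sum);
            }
            Map<Character, Double> freq = new TreeMap<>();
            for (Map.Entry<Character, Integer> e : counts.entrySet()) {
                freq.put(e.getKey(), e.getValue() / (double) rows.size());
            }
            columns.add(freq);
        }
        return columns;
    }

    /** Most frequent symbol per column; ties go to the alphabetically first symbol. */
    static String majorityConsensus(List<String> rows) {
        StringBuilder consensus = new StringBuilder();
        for (Map<Character, Double> freq : frequencies(rows)) {
            char best = '?';
            double bestFreq = -1.0;
            for (Map.Entry<Character, Double> e : freq.entrySet()) {
                if (e.getValue() > bestFreq) {
                    best = e.getKey();
                    bestFreq = e.getValue();
                }
            }
            consensus.append(best);
        }
        return consensus.toString();
    }
}
```

For example, ColumnFrequencies.majorityConsensus(Arrays.asList("acgt", "acga", "atga", "acgg")) gives "acga". The tie-breaking rule is an arbitrary choice made for the sketch.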

e.g. if the frequencies at one position were:

a = 0.50
t = 0.40
c = 0.0
g = 0.10

Your routine would need to decide if the consensus should be 'a' or 'w' or 
'd' (the IUPAC symbol for [atg]). You would probably use some sort of 
cutoff value. It might be a routine like this:

public SymbolList consensus(Alignment a, double threshold){
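
One possible way to fill in such a routine, sketched in plain Java rather than against the BioJava API: ThresholdConsensus is a hypothetical class, each column is assumed to arrive as a map from lower-case base to frequency, and the rule "keep every base whose frequency is at least the threshold, then report the IUPAC code for that set" is only one of several reasonable cutoff rules. Reporting 'n' when no base reaches the threshold is likewise an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of BioJava. */
class ThresholdConsensus {

    // IUPAC nucleotide codes, keyed by the bases they stand for (in a, c, g, t order).
    private static final Map<String, Character> IUPAC = new HashMap<>();
    static {
        String[][] codes = {
            {"a", "a"}, {"c", "c"}, {"g", "g"}, {"t", "t"},
            {"ag", "r"}, {"ct", "y"}, {"cg", "s"}, {"at", "w"}, {"gt", "k"}, {"ac", "m"},
            {"cgt", "b"}, {"agt", "d"}, {"act", "h"}, {"acg", "v"}, {"acgt", "n"}
        };
        for (String[] code : codes) {
            IUPAC.put(code[0], code[1].charAt(0));
        }
    }

    /** Consensus symbol for one column: all bases with frequency >= threshold are kept. */
    static char consensusSymbol(Map<Character, Double> freq, double threshold) {
        StringBuilder kept = new StringBuilder();
        for (char base : "acgt".toCharArray()) {
            if (freq.getOrDefault(base, 0.0) >= threshold) {
                kept.append(base);
            }
        }
        // Assumption: if nothing reaches the threshold, report the fully ambiguous 'n'.
        return kept.length() == 0 ? 'n' : IUPAC.get(kept.toString());
    }

    /** Consensus string over all columns of an alignment. */
    static String consensus(List<Map<Character, Double>> columns, double threshold) {
        StringBuilder result = new StringBuilder();
        for (Map<Character, Double> freq : columns) {
            result.append(consensusSymbol(freq, threshold));
        }
        return result.toString();
    }
}
```

With the example column above (taking g as 0.10 so that the frequencies sum to 1), a threshold of 0.5 gives 'a', 0.25 gives 'w' and 0.05 gives 'd'.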

It might be a method that others find useful so please post it back to the 
list.

Hope this helps,

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670

phone +65 6722 2973
fax  +65 6722 2910

"Nathan S. Haigh" <n.haigh at sheffield.ac.uk>
Sent by: biojava-l-bounces at lists.open-bio.org
05/18/2006 11:44 PM
Please respond to n.haigh

        To:     <biojava-l at lists.open-bio.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] Alignment consensus calculation

I was wondering if there were any methods for generating a consensus
sequence for alignments? Or any suggestions for calculating the frequency 
of symbols at each position in an alignment?
I had a look at DistributionTools after seeing a past e-mail to the list, 
but couldn't figure out if it would do the job as I'm new to Java.
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate
Room B2 211                                            Tel: +44 (0)114 22
Department of Animal and Plant Sciences                Mob: +44 (0)7742 
University of Sheffield                                Fax: +44 (0)114 22
Western Bank                                           Web:
S10 2TN                                                

