[Biojava-l] how to calculate consensus from a fasta file

mark.schreiber at group.novartis.com
Tue Jan 13 20:01:53 EST 2004

Hi Eric -

I'm not sure whether this will solve your problem, but you could make an 
Alignment object from the sequences and then use the methods of 
DistributionTools to get a Distribution object for each position in the 
Alignment. Each Distribution gives the frequency of each base at that 
position, which you could use to build a consensus. You can also use 
DistributionTools to calculate the information content or entropy at 
each position.
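
To make the idea concrete, here is a minimal sketch in plain Java. It does not use the BioJava API: the class and method names (DegenerateConsensus, columnCounts, consensus, entropyBits) and the minFraction cutoff are made up for illustration. It counts the bases in each column of pre-aligned, equal-length rows, maps the bases that reach the cutoff to the matching IUPAC degenerate symbol, and computes the Shannon entropy of a column.

```java
class DegenerateConsensus {

    // IUPAC symbols indexed by a 4-bit mask: A=1, C=2, G=4, T=8.
    private static final char[] IUPAC = {
        '-', 'A', 'C', 'M', 'G', 'R', 'S', 'V',
        'T', 'W', 'Y', 'H', 'K', 'D', 'B', 'N'
    };

    // Returns 0..3 for A, C, G, T; -1 for gaps and any other symbol.
    private static int baseIndex(char c) {
        return "ACGT".indexOf(Character.toUpperCase(c));
    }

    /** Counts A, C, G and T in every column; other symbols are ignored. */
    public static int[][] columnCounts(String[] rows) {
        int length = rows[0].length();
        int[][] counts = new int[length][4];
        for (String row : rows) {
            if (row.length() != length) {
                throw new IllegalArgumentException("rows must have equal length");
            }
            for (int i = 0; i < length; i++) {
                int b = baseIndex(row.charAt(i));
                if (b >= 0) {
                    counts[i][b]++;
                }
            }
        }
        return counts;
    }

    /**
     * Builds a consensus in which each column is the IUPAC symbol covering
     * every base whose share of the counted bases is at least minFraction.
     * A column with no bases gives '-'; a column where no base reaches the
     * cutoff gives 'N'.
     */
    public static String consensus(String[] rows, double minFraction) {
        StringBuilder sb = new StringBuilder();
        for (int[] col : columnCounts(rows)) {
            int total = col[0] + col[1] + col[2] + col[3];
            int mask = 0;
            for (int b = 0; b < 4; b++) {
                if (col[b] > 0 && col[b] >= minFraction * total) {
                    mask |= 1 << b;
                }
            }
            sb.append(total > 0 && mask == 0 ? 'N' : IUPAC[mask]);
        }
        return sb.toString();
    }

    /** Shannon entropy of one column in bits (0 = fully conserved). */
    public static double entropyBits(int[] col) {
        int total = col[0] + col[1] + col[2] + col[3];
        double h = 0.0;
        for (int c : col) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        String[] rows = {"ACGT", "ACGA", "ATGA", "AC-T"};
        System.out.println(consensus(rows, 0.25)); // prints AYGW
    }
}
```

Raising minFraction makes the consensus less degenerate: with 0.5 the same rows give ACGW, because the single T in the second column no longer reaches the cutoff.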

Alternatively, you could generate a Markov model that represents the 
alignment and gives a probabilistic representation of the consensus.
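
As a rough illustration of the probabilistic view, the sketch below uses a much simpler stand-in for a real Markov model: it treats every column as independent and has no insert or delete states, so it is closer to a position-specific frequency table than to a profile HMM. It is plain Java, not the BioJava API, and the names (ColumnProfile, build, logProbability, mostLikely) and the pseudocount parameter are assumptions made for this example.

```java
class ColumnProfile {

    private static final String BASES = "ACGT";

    /**
     * Estimates P(base | column) from aligned rows. The pseudocount is added
     * to every base so that unseen bases keep a small non-zero probability.
     * Gaps and non-ACGT symbols are ignored.
     */
    public static double[][] build(String[] rows, double pseudocount) {
        if (pseudocount <= 0) {
            throw new IllegalArgumentException("pseudocount must be positive");
        }
        int length = rows[0].length();
        double[][] probs = new double[length][4];
        for (String row : rows) {
            if (row.length() != length) {
                throw new IllegalArgumentException("rows must have equal length");
            }
            for (int i = 0; i < length; i++) {
                int b = BASES.indexOf(Character.toUpperCase(row.charAt(i)));
                if (b >= 0) {
                    probs[i][b] += 1.0;
                }
            }
        }
        // Turn the raw counts into smoothed probabilities, column by column.
        for (double[] col : probs) {
            double total = 4 * pseudocount;
            for (double c : col) {
                total += c;
            }
            for (int b = 0; b < 4; b++) {
                col[b] = (col[b] + pseudocount) / total;
            }
        }
        return probs;
    }

    /** Natural-log probability of an ungapped sequence under the profile. */
    public static double logProbability(double[][] profile, String seq) {
        if (seq.length() != profile.length) {
            throw new IllegalArgumentException("sequence length must match profile");
        }
        double sum = 0.0;
        for (int i = 0; i < profile.length; i++) {
            int b = BASES.indexOf(Character.toUpperCase(seq.charAt(i)));
            if (b < 0) {
                throw new IllegalArgumentException("not a DNA base: " + seq.charAt(i));
            }
            sum += Math.log(profile[i][b]);
        }
        return sum;
    }

    /** Most probable base per column; ties go to the first of A, C, G, T. */
    public static String mostLikely(double[][] profile) {
        StringBuilder sb = new StringBuilder();
        for (double[] col : profile) {
            int best = 0;
            for (int b = 1; b < 4; b++) {
                if (col[b] > col[best]) {
                    best = b;
                }
            }
            sb.append(BASES.charAt(best));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[][] profile = build(new String[] {"ACGT", "ACGA", "ATGA"}, 1.0);
        System.out.println(mostLikely(profile)); // prints ACGA
    }
}
```

A sequence close to the alignment scores a higher log probability than an unrelated one, which is the sense in which such a model represents the consensus probabilistically.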

Hope this helps


Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road
#04-14 The Capricorn
Singapore 117528

phone +65 6722 2973
fax  +65 6722 2910

Eric BELLARD <eric_bellard at yahoo.com>
Sent by: biojava-l-bounces at portal.open-bio.org
01/13/2004 09:35 PM
Please respond to eric

        To:     biojava-l at biojava.org
        Subject:        [Biojava-l] how to calculate consensus from a fasta file


I'd like to first thank you all for your great job on
this project.

I'm using BioJava in a project to store some
sequencing results.

In my application the user uploads sequences from a
FASTA file, and I'd like to build an alignment from them.

With your project, I can easily parse the fasta file
and get all the sequences. 

Let's consider the sequences as rows.
I'd like to calculate the consensus of each column using the DNA
degenerate alphabet.

Does BioJava implement a way to do this?

Thanks in advance.

