[Biojava-l] how to calculate consensus from a fasta file

Tue Jan 13 20:01:53 EST 2004

Hi Eric -

I'm not sure whether this will solve your problem, but you could make an
Alignment object from the sequences and then use the methods of
DistributionTools to get a Distribution object for each position in the
Alignment. These Distributions give the frequency of each base at each
position in the Alignment, which you could use to build a consensus.
You can also use DistributionTools to calculate the information content
or entropy at each position.
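
To make the idea concrete, here is a rough sketch in plain Java that does
not call BioJava at all. It counts the bases in each column, turns the
counts into an IUPAC degenerate symbol, and computes the Shannon entropy of
a column. The class name ColumnConsensus, its method names, and the
minFraction cut-off are all assumptions made up for this illustration; the
sequences are assumed to be already aligned and of equal length.

```java
import java.util.Arrays;
import java.util.List;

final class ColumnConsensus {
    private static final String BASES = "ACGT";
    // IUPAC codes indexed by a 4-bit mask (A=1, C=2, G=4, T=8).
    // Index 0 means no base qualified (e.g. an all-gap column) and maps to '-'.
    private static final String IUPAC = "-ACMGRSVTWYHKDBN";

    /** Counts A, C, G and T in every column; gaps and other symbols are skipped. */
    static int[][] counts(List<String> rows) {
        int width = rows.get(0).length();
        int[][] counts = new int[width][BASES.length()];
        for (String row : rows) {
            if (row.length() != width) {
                throw new IllegalArgumentException("rows must have equal length");
            }
            for (int col = 0; col < width; col++) {
                int b = BASES.indexOf(Character.toUpperCase(row.charAt(col)));
                if (b >= 0) {
                    counts[col][b]++;
                }
            }
        }
        return counts;
    }

    /**
     * Builds a degenerate consensus: a base contributes to a column's symbol
     * when it makes up at least minFraction of the counted bases there.
     */
    static String consensus(List<String> rows, double minFraction) {
        StringBuilder sb = new StringBuilder();
        for (int[] column : counts(rows)) {
            int total = Arrays.stream(column).sum();
            int mask = 0;
            for (int b = 0; b < column.length; b++) {
                if (column[b] > 0 && column[b] >= minFraction * total) {
                    mask |= 1 << b;
                }
            }
            sb.append(IUPAC.charAt(mask));
        }
        return sb.toString();
    }

    /** Shannon entropy of one column's base counts, in bits. */
    static double entropyBits(int[] column) {
        int total = Arrays.stream(column).sum();
        double h = 0.0;
        for (int c : column) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("ACGT", "ACGA", "ATGC", "AC-T");
        System.out.println(consensus(rows, 0.25));
        System.out.println(consensus(rows, 0.5));
    }
}
```

A lower minFraction gives a more degenerate consensus, because rare bases
are included in the symbol; a higher one tends towards a simple majority
base. Gaps are simply ignored here, which may or may not be what you want.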

Alternatively, you could generate a Markov model that represents the
alignment and so gives a probabilistic representation of the consensus.
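
As a much simpler stand-in for a full Markov model, the sketch below builds
a position-specific probability table (roughly the match-state emissions of
a profile HMM, with no insert or delete states and no transitions). It is
plain Java, not BioJava's HMM API; the class AlignmentProfile, its methods,
and the pseudocount parameter are hypothetical names chosen for this
example, and the input is assumed to be aligned DNA of equal length.

```java
import java.util.Arrays;
import java.util.List;

final class AlignmentProfile {
    private static final String BASES = "ACGT";
    private final double[][] prob; // prob[column][index into BASES]

    /** Estimates per-column base probabilities; pseudocount must be positive. */
    AlignmentProfile(List<String> rows, double pseudocount) {
        if (pseudocount <= 0.0) {
            throw new IllegalArgumentException("pseudocount must be > 0");
        }
        int width = rows.get(0).length();
        prob = new double[width][BASES.length()];
        for (int col = 0; col < width; col++) {
            // Start every base at the pseudocount so no probability is zero.
            Arrays.fill(prob[col], pseudocount);
            double total = pseudocount * BASES.length();
            for (String row : rows) {
                if (row.length() != width) {
                    throw new IllegalArgumentException("rows must have equal length");
                }
                int b = BASES.indexOf(Character.toUpperCase(row.charAt(col)));
                if (b >= 0) { // gaps and unknown symbols are skipped
                    prob[col][b] += 1.0;
                    total += 1.0;
                }
            }
            for (int b = 0; b < BASES.length(); b++) {
                prob[col][b] /= total;
            }
        }
    }

    /** Probability of seeing the given base at the given column. */
    double probability(int col, char base) {
        int b = BASES.indexOf(Character.toUpperCase(base));
        if (b < 0) {
            throw new IllegalArgumentException("not a DNA base: " + base);
        }
        return prob[col][b];
    }

    /** Natural-log probability of an ungapped sequence under the profile. */
    double logProbability(String seq) {
        if (seq.length() != prob.length) {
            throw new IllegalArgumentException("sequence length must match profile");
        }
        double sum = 0.0;
        for (int col = 0; col < seq.length(); col++) {
            sum += Math.log(probability(col, seq.charAt(col)));
        }
        return sum;
    }

    /** The most probable base at each column (first base wins on a tie). */
    String mostLikely() {
        StringBuilder sb = new StringBuilder();
        for (double[] column : prob) {
            int best = 0;
            for (int b = 1; b < column.length; b++) {
                if (column[b] > column[best]) {
                    best = b;
                }
            }
            sb.append(BASES.charAt(best));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        AlignmentProfile p =
                new AlignmentProfile(Arrays.asList("ACGT", "ACGT", "ATGA"), 1.0);
        System.out.println(p.mostLikely());
        System.out.println(p.logProbability("ATGA"));
    }
}
```

The table treats columns as independent, so it cannot model insertions,
deletions, or dependencies between neighbouring positions; a real profile
HMM would add states and transitions for those.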

Hope this helps

Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road
#04-14 The Capricorn
Singapore 117528

phone +65 6722 2973
fax  +65 6722 2910

Eric BELLARD <eric_bellard at yahoo.com>
Sent by: biojava-l-bounces at portal.open-bio.org
01/13/2004 09:35 PM
Please respond to eric

        To:     biojava-l at biojava.org
        cc: 
        Subject:        [Biojava-l] how to calculate consensus from a fasta file

Hi,

I'd like to first thank you all for your great work on
this project.

I'm using BioJava in a project to store some
sequencing results.

In my application the user uploads sequences from a
fasta file, and I'd like to build an alignment from them.

With your project, I can easily parse the fasta file
and get all the sequences. 

Let's consider the sequences as lines.
I'd like to calculate the consensus of each column using
the degenerate DNA alphabet.

Does BioJava implement a way to do this?

Thanks in advance.

Eric
