[Biojava-l] Evolutionary distances
vineith kaul
vineith at gmail.com
Tue Oct 23 06:59:29 UTC 2007
This is what I have. Thanks a lot for the help.
//Method to calculate the Kimura 2 parameter distance
public static double K2P(String sequence1,String sequence2){
// p = transitional differences (A<->G, T<->C); q = transversional differences (A/G<->C/T)
long p=0,q=0,numberOfAlignedSites=0;
char[] seq1array=sequence1.toCharArray();
char[] seq2array=sequence2.toCharArray();
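// Only compare positions present in both sequences, so a shorter second sequence cannot cause an ArrayIndexOutOfBoundsException
for(int i=0;i<Math.min(seq1array.length,seq2array.length);i++){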
// Number of aligned sites
if(((seq1array[i]=='a') ||
(seq1array[i]=='A')||(seq1array[i]=='g') ||
(seq1array[i]=='G')||(seq1array[i]=='c') || (seq1array[i]=='C') ||
(seq1array[i]=='t') || (seq1array[i]=='T')) && ((seq2array[i]=='a') ||
(seq2array[i]=='A')||(seq2array[i]=='c') ||
(seq2array[i]=='C')||(seq2array[i]=='t') ||
(seq2array[i]=='T')||(seq2array[i]=='g') || (seq2array[i]=='G'))) {
numberOfAlignedSites++;
}
if(((seq1array[i]=='a') || (seq1array[i]=='A')) &&
((seq2array[i]=='g') || (seq2array[i]=='G'))) {
p++;
}
else
if(((seq1array[i]=='g') || (seq1array[i]=='G')) &&
((seq2array[i]=='a') || (seq2array[i]=='A'))) {
p++;
}
else
if(((seq1array[i]=='t') || (seq1array[i]=='T')) &&
((seq2array[i]=='c') || (seq2array[i]=='C'))) {
p++;
}
else
if(((seq1array[i]=='c') || (seq1array[i]=='C')) &&
((seq2array[i]=='t') || (seq2array[i]=='T'))) {
p++;
}
else
if(((seq1array[i]=='a') || (seq1array[i]=='A')) &&
((seq2array[i]=='c') || (seq2array[i]=='C'))) {
q++;
}
else
if(((seq1array[i]=='a') || (seq1array[i]=='A')) &&
((seq2array[i]=='t') || (seq2array[i]=='T'))) {
q++;
}
else
if(((seq1array[i]=='g') || (seq1array[i]=='G')) &&
((seq2array[i]=='c') || (seq2array[i]=='C'))) {
q++;
}
else
if(((seq1array[i]=='g') || (seq1array[i]=='G')) &&
((seq2array[i]=='t') || (seq2array[i]=='T'))) {
q++;
}
else
if(((seq1array[i]=='t') || (seq1array[i]=='T')) &&
((seq2array[i]=='a') || (seq2array[i]=='A'))) {
q++;
}
else
if(((seq1array[i]=='t') || (seq1array[i]=='T')) &&
((seq2array[i]=='g') || (seq2array[i]=='G'))) {
q++;
}
else
if(((seq1array[i]=='c') || (seq1array[i]=='C')) &&
((seq2array[i]=='a') || (seq2array[i]=='A'))) {
q++;
}
else
if(((seq1array[i]=='c') || (seq1array[i]=='C')) &&
((seq2array[i]=='g') || (seq2array[i]=='G'))) {
q++;
}
}
// Proportions of transitions and transversions among the aligned sites
double pFrac = ((double)p)/numberOfAlignedSites;
double qFrac = ((double)q)/numberOfAlignedSites;
System.out.print(numberOfAlignedSites+"\t"+p+"\t"+q+"\t");
// K2P distance: d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
double dist = (-0.5 * Math.log(1.0 - (2.0 * pFrac) - qFrac)) -
(0.25 * Math.log(1.0 - (2.0 * qFrac)));
return dist;
}
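
The same calculation can probably be written more compactly by mapping each base to a small integer first. The following is only a sketch under assumptions, not BioJava code: the class name K2PSketch and the methods code and k2p are made up for illustration, the two sequences are assumed to be already aligned to the same length, and any character other than A, C, G, T (either case) is skipped, as in the method above.

```java
/** Hypothetical helper class for illustration; not part of BioJava. */
final class K2PSketch {

    /** Maps a base to 0..3 (A, G = purines; C, T = pyrimidines), or -1 for anything else. */
    private static int code(char base) {
        switch (Character.toUpperCase(base)) {
            case 'A': return 0;
            case 'G': return 1;
            case 'C': return 2;
            case 'T': return 3;
            default:  return -1; // gap or ambiguity code
        }
    }

    /** Kimura 2-parameter distance between two aligned sequences. */
    static double k2p(String seq1, String seq2) {
        if (seq1.length() != seq2.length()) {
            throw new IllegalArgumentException("sequences must be aligned to the same length");
        }
        long transitions = 0, transversions = 0, sites = 0;
        for (int i = 0; i < seq1.length(); i++) {
            int a = code(seq1.charAt(i));
            int b = code(seq2.charAt(i));
            if (a < 0 || b < 0) {
                continue; // skip sites where either base is not A, C, G or T
            }
            sites++;
            if (a == b) {
                continue; // identical bases, no difference to count
            }
            // Codes 0,1 are purines and 2,3 are pyrimidines, so equal a/2 and b/2
            // means both bases are in the same class, i.e. a transition.
            if (a / 2 == b / 2) {
                transitions++;
            } else {
                transversions++;
            }
        }
        if (sites == 0) {
            return Double.NaN; // no comparable sites
        }
        double p = (double) transitions / sites;   // proportion of transitions
        double q = (double) transversions / sites; // proportion of transversions
        // d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
        return -0.5 * Math.log(1.0 - 2.0 * p - q) - 0.25 * Math.log(1.0 - 2.0 * q);
    }
}
```

A call would look like K2PSketch.k2p(sequence1, sequence2). For very divergent sequences the argument of a logarithm can reach zero or go negative, in which case the result is infinite or NaN, so the caller may want to check for that.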
On 10/22/07, Richard Holland <holland at ebi.ac.uk> wrote:
>
> You should take a look at the latest 1.5 release, in the
> org.biojavax.bio.phylo packages. This code is the beginnings of some
> phylogenetics code that will perform tasks as you describe. The future
> plan is to extend this code to cover a wider range of use cases. Kimura2P
> is already implemented here, in
> org.biojavax.bio.phylo.MultipleHitCorrection.
>
> If you can't find code that will do what you want, but have written some
> before, then please do feel free to contribute it. Even if it is slow, I'm
> sure someone out there will be able to help optimise it!
>
> cheers,
> Richard
>
> On Sun, October 21, 2007 5:30 pm, vineith kaul wrote:
> > Hi,
> >
> > Are there functions to calculate evolutionary pairwise distances like
> > Kimura2P, Finkelstein etc. in BioJava?
> > I did write something on my own, but on large sequences it runs terribly
> > slow and I am not even sure if that's right.
> > --
> > Vineith Kaul
> > Masters Student Bioinformatics
> > The Parker H. Petit Institute for Bioengineering and Bioscience (IBB)
> > Georgia Tech, Atlanta
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >
>
>
> --
> Richard Holland
> BioMart (http://www.biomart.org/)
> EMBL-EBI
> Hinxton, Cambridgeshire CB10 1SD, UK
>
>
--
Vineith Kaul
Masters Student Bioinformatics
The Parker H. Petit Institute for Bioengineering and Bioscience (IBB)
Georgia Tech, Atlanta