[Biojava-l] Evolutionary distances

vineith kaul vineith at gmail.com
Tue Oct 23 06:59:29 UTC 2007


This is what I have. Thanks a lot for the help.


// Method to calculate the Kimura 2-parameter distance
public static double K2P(String sequence1,String sequence2){
        // p = transitional differences (A<->G & T<->C);
        // q = transversional differences (A/G<-->C/T)
        long p=0,q=0,numberOfAlignedSites=0;


        char[] seq1array=sequence1.toCharArray();
        char[] seq2array=sequence2.toCharArray();

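        // Stop at the shorter sequence so seq2array[i] cannot run out of bounds
        for(int i=0;i<Math.min(seq1array.length,seq2array.length);i++){
                // Count sites where both sequences have an unambiguous base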
                if(((seq1array[i]=='a') ||
(seq1array[i]=='A')||(seq1array[i]=='g') ||
(seq1array[i]=='G')||(seq1array[i]=='c') || (seq1array[i]=='C') ||
(seq1array[i]=='t') || (seq1array[i]=='T')) && ((seq2array[i]=='a') ||
(seq2array[i]=='A')||(seq2array[i]=='c') ||
(seq2array[i]=='C')||(seq2array[i]=='t') ||
(seq2array[i]=='T')||(seq2array[i]=='g') || (seq2array[i]=='G'))) {

                        numberOfAlignedSites++;
                }

                if(((seq1array[i]=='a') || (seq1array[i]=='A')) &&
((seq2array[i]=='g') || (seq2array[i]=='G'))) {
                        p++;
                }
                else
                if(((seq1array[i]=='g') || (seq1array[i]=='G')) &&
((seq2array[i]=='a') || (seq2array[i]=='A'))) {
                        p++;
                }
                else
                if(((seq1array[i]=='t') || (seq1array[i]=='T')) &&
((seq2array[i]=='c') || (seq2array[i]=='C'))) {
                        p++;
                }
                else
                if(((seq1array[i]=='c') || (seq1array[i]=='C')) &&
((seq2array[i]=='t') || (seq2array[i]=='T'))) {
                        p++;
                }
                else
                if(((seq1array[i]=='a') || (seq1array[i]=='A')) &&
((seq2array[i]=='c') || (seq2array[i]=='C'))) {
                                q++;
                        }
                else
                if(((seq1array[i]=='a') || (seq1array[i]=='A')) &&
((seq2array[i]=='t') || (seq2array[i]=='T'))) {
                                q++;
                        }
                else
                if(((seq1array[i]=='g') || (seq1array[i]=='G')) &&
((seq2array[i]=='c') || (seq2array[i]=='C'))) {
                                        q++;
                                }
                else
                if(((seq1array[i]=='g') || (seq1array[i]=='G')) &&
((seq2array[i]=='t') || (seq2array[i]=='T'))) {
                                        q++;
                                }
                else
                if(((seq1array[i]=='t') || (seq1array[i]=='T')) &&
((seq2array[i]=='a') || (seq2array[i]=='A'))) {
                                        q++;
                                }
                else
                if(((seq1array[i]=='t') || (seq1array[i]=='T')) &&
((seq2array[i]=='g') || (seq2array[i]=='G'))) {
                                        q++;
                                }
                else
                if(((seq1array[i]=='c') || (seq1array[i]=='C')) &&
((seq2array[i]=='a') || (seq2array[i]=='A'))) {
                                        q++;
                                }
                else
                if(((seq1array[i]=='c') || (seq1array[i]=='C')) &&
((seq2array[i]=='g') || (seq2array[i]=='G'))) {
                                        q++;
                                }




        }

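         // K2P: d = -0.5*ln(1 - 2P - Q) - 0.25*ln(1 - 2Q), where P and Q are the
         // proportions of transitions and transversions among the aligned sites.
         double transitionTerm = 1.0 - (2.0 * ((double)p)/numberOfAlignedSites) -
                 (((double)q)/numberOfAlignedSites);   // 1 - 2P - Q
         double transversionTerm = 1.0 - (2.0 * ((double)q)/numberOfAlignedSites);   // 1 - 2Q
         System.out.print(numberOfAlignedSites+"\t"+p+"\t"+q+"\t");
         double dist = (-0.5 * Math.log(transitionTerm)) - (0.25 * Math.log(transversionTerm));
         return dist;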
}
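
For comparison, here is a shorter sketch of the same calculation. It is not the BioJava implementation, only an illustration: the class name K2PSketch and the helper baseClass are made up for this example, and it assumes the two sequences are already aligned to the same length. Each base is classified once as a purine (A, G) or a pyrimidine (C, T); a mismatch within one class is a transition, and a mismatch across classes is a transversion. Sites with gaps or ambiguity codes are skipped, as in the method above.

```java
final class K2PSketch {

    // Returns 0 for a purine (A, G), 1 for a pyrimidine (C, T),
    // and -1 for anything else (gaps, N, other ambiguity codes).
    private static int baseClass(char c) {
        switch (Character.toUpperCase(c)) {
            case 'A': case 'G': return 0;
            case 'C': case 'T': return 1;
            default: return -1;
        }
    }

    // Kimura 2-parameter distance between two aligned sequences.
    // Assumption of this sketch: unequal lengths are rejected outright.
    static double distance(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException("sequences must be aligned to equal length");
        }
        long transitions = 0, transversions = 0, sites = 0;
        for (int i = 0; i < s1.length(); i++) {
            char a = Character.toUpperCase(s1.charAt(i));
            char b = Character.toUpperCase(s2.charAt(i));
            int ca = baseClass(a), cb = baseClass(b);
            if (ca < 0 || cb < 0) continue;   // skip gaps and ambiguous bases
            sites++;
            if (a == b) continue;             // identical bases: no difference
            if (ca == cb) transitions++;      // A<->G or C<->T
            else transversions++;             // purine <-> pyrimidine
        }
        if (sites == 0) {
            throw new IllegalArgumentException("no comparable sites");
        }
        double P = (double) transitions / sites;    // transition proportion
        double Q = (double) transversions / sites;  // transversion proportion
        // d = -0.5*ln(1 - 2P - Q) - 0.25*ln(1 - 2Q)
        return -0.5 * Math.log(1.0 - 2.0 * P - Q) - 0.25 * Math.log(1.0 - 2.0 * Q);
    }
}
```

Calling K2PSketch.distance(sequence1, sequence2) returns the distance without printing the counts. As with the method above, the result is NaN or infinite when the sequences are so divergent that a log argument is zero or negative.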

On 10/22/07, Richard Holland <holland at ebi.ac.uk> wrote:
>
> You should take a look at the latest 1.5 release, in the
> org.biojavax.bio.phylo packages. This code is the beginnings of some
> phylogenetics code that will perform tasks as you describe. The future
> plan is to extend this code to cover a wider range of use cases. Kimura2P
> is already implemented here, in
> org.biojavax.bio.phylo.MultipleHitCorrection.
>
> If you can't find code that will do what you want, but have written some
> before, then please do feel free to contribute it. Even if it is slow, I'm
> sure someone out there will be able to help optimise it!
>
> cheers,
> Richard
>
> On Sun, October 21, 2007 5:30 pm, vineith kaul wrote:
> > Hi,
> >
> > Are there functions to calculate evolutionary pairwise distances like
> > Kimura2P, Finkelstein, etc. in BioJava?
> > I did write something on my own, but on large sequences it runs terribly
> > slowly and I am not even sure if that's right.
> > --
> > Vineith Kaul
> > Masters Student Bioinformatics
> > The Parker H. Petit Institute for Bioengineering and Bioscience (IBB)
> > Georgia Tech, Atlanta
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >
>
>
> --
> Richard Holland
> BioMart (http://www.biomart.org/)
> EMBL-EBI
> Hinxton, Cambridgeshire CB10 1SD, UK
>
>


-- 
Vineith Kaul
Masters Student Bioinformatics
The Parker H. Petit Institute for Bioengineering and Bioscience (IBB)
Georgia Tech, Atlanta


