[Biojava-dev] Biojava implementation of CE algorithm

Andreas Prlic andreas at sdsc.edu
Wed Jun 8 01:55:39 UTC 2011


Hi Ambikesh,

The short answer is:

You reported that 3 protein pairs show different results. For two of
them I don't see any difference; for the third, differences in how the
data is parsed and represented can explain the slightly different
result.

The long answer:

Overall, the BioJava PDB file parser and data representation are used
for both jCE and jFATCAT. This is different from the C code that is
run by the CE and FATCAT servers. As such, one can expect some
minor differences in a few cases, since the underlying parsers can
interpret some PDB files slightly differently.

Here are the details about the 3 pairs:

2aza.A and 1paz - the server reports 123 amino acids and BioJava reports
120 here. In 1PAZ there are three residues listed in the SEQRES
records that have not been observed in the ATOM section (it has 120
groups with atoms). If I remember correctly, the original CE source can
use "empty" amino acids if they have not been observed. This would
explain the displayed numbers. BioJava only uses the amino acids that
have observed atoms. (Also, the server's chain ID is reported as "_",
which indicates an ancient version of the file. It now has chain ID
"A".)
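
To illustrate the SEQRES vs. ATOM point, here is a minimal sketch in plain
Java. It is not the code used by CE or by the BioJava parser; it only
counts residues by the fixed column positions of the PDB file format. The
class name, method names and the five-residue sample records are made up
for illustration and are not taken from 1PAZ.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of BioJava or CE.
class SeqresVsAtomCount {

    // Counts the residues declared in the SEQRES records of one chain.
    // PDB format: chain ID in column 12, residue names in columns 20-70.
    static int countSeqresResidues(List<String> lines, char chain) {
        int count = 0;
        for (String line : lines) {
            if (line.startsWith("SEQRES") && line.length() > 19
                    && line.charAt(11) == chain) {
                String names = line.substring(19, Math.min(line.length(), 70)).trim();
                if (!names.isEmpty()) {
                    count += names.split("\\s+").length;
                }
            }
        }
        return count;
    }

    // Counts the distinct residues of one chain that have at least one ATOM record.
    // PDB format: chain ID in column 22, residue number in columns 23-26,
    // insertion code in column 27.
    static int countObservedResidues(List<String> lines, char chain) {
        Set<String> seen = new HashSet<>();
        for (String line : lines) {
            if (line.startsWith("ATOM  ") && line.length() >= 27
                    && line.charAt(21) == chain) {
                seen.add(line.substring(22, 27));
            }
        }
        return seen.size();
    }

    // Made-up records: 5 residues in SEQRES, but only residues 3-5 have atoms.
    static List<String> sampleLines() {
        List<String> lines = new ArrayList<>();
        lines.add(String.format("SEQRES %3d %c %4d  %s", 1, 'A', 5, "MET ALA GLY SER LYS"));
        lines.add(String.format("ATOM  %5d %-4s %3s %c%4d ", 1, " N  ", "GLY", 'A', 3));
        lines.add(String.format("ATOM  %5d %-4s %3s %c%4d ", 2, " CA ", "GLY", 'A', 3));
        lines.add(String.format("ATOM  %5d %-4s %3s %c%4d ", 3, " CA ", "SER", 'A', 4));
        lines.add(String.format("ATOM  %5d %-4s %3s %c%4d ", 4, " CA ", "LYS", 'A', 5));
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = sampleLines();
        System.out.println("SEQRES residues:   " + countSeqresResidues(lines, 'A'));   // 5
        System.out.println("Observed residues: " + countObservedResidues(lines, 'A')); // 3
    }
}
```

A program that takes its residue count from SEQRES would report 5 for
this sample, while one that only keeps residues with observed atoms would
report 3. That is the same kind of gap as the 123 vs. 120 above.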

1cew.I and 1mol.A - What difference do you observe? The results look
exactly the same to me.

1cid and 2rhe - These also look the same to me.

Andreas


On Tue, Jun 7, 2011 at 12:56 PM, Ambikesh Jayal <ambi1999 at gmail.com> wrote:
> Hi All,
>
> There seems to be a discrepancy for some protein pairs between the results
> of the BioJava implementation of the CE algorithm and the implementation on
> the CE website http://cl.sdsc.edu/ce/ce_align.html
> For example, between the proteins [2aza.A] AND [1paz]. Other such
> examples are 1cew.I and 1mol.A, and 1cid and 2rhe.
>
> Is there some reason for this discrepancy?
>
> Results using BioJava implementation of CE algorithm
>
> ************* [2aza.A] AND [1paz] ************
> CE
> afpChain.getTotalRmsdOpt() 2.5267815014062553
> afpChain.getOptLength() 82
>
> Results using CE website http://cl.sdsc.edu/ce/ce_align.html
>
> ************* [2aza.A] AND [1paz] ************
> Rmsd = 2.9Å
> Aligned/gap positions = 84/49
>
>
>
> Kind Regards,
> Ambi.
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>



