[Bioperl-l] PDB ATOM records: name, segid, etc.

Andrew Dalke dalke@dalkescientific.com
Tue, 16 Jul 2002 04:06:51 -0600


Joe Krahn:
> I think 2.1 has SEGID documented. I assumed you left it out of the
> current version because later documents declared it obsolete.
> But, we crystallographers would like it re-instated.

Here's an email from John Westbrook, sent Oct 13, 1999 to the pdb list:

   Some clarification was requested regarding the impact
   of this change on existing software, and on whether
   the proposed changes would render older PDB files
   invalid.   The change proposed here will not alter
   the existing format description nor will we propose
   future changes that will reuse the column range
   currently assigned to SEGID.  We plan only to
   discontinue placing this information in the archival
   files produced by PDB.  Application software can
   continue to use the SEGID field as they require.

I'm not saying it hasn't changed since then.
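
For what it's worth, reading the field is trivial as long as the columns
stay reserved. A minimal sketch in Python, assuming the 2.x column layout
in which SEGID occupies columns 73-76 of ATOM/HETATM records; the names
parse_segid and ATOM_FMT are made up for this example and come from no
library:

```python
# Hypothetical helper, not from any library. Assumes the PDB 2.x layout
# in which columns 73-76 of ATOM/HETATM records hold SEGID.
def parse_segid(line):
    """Return the SEGID of an ATOM/HETATM record ('' if blank or absent).

    Returns None for any other record type.
    """
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    # Columns are 1-based and inclusive; Python slices are 0-based and
    # end-exclusive, so columns 73-76 are line[72:76]. Slicing a short
    # line just yields '', which covers files written without SEGID.
    return line[72:76].strip()


# Build a sample record from a format string so the columns line up:
# serial, name, altLoc, resName, chainID, resSeq, iCode, x, y, z,
# occupancy, tempFactor, segID, element.
ATOM_FMT = ("ATOM  %5d %-4s%1s%3s %1s%4d%1s   "
            "%8.3f%8.3f%8.3f%6.2f%6.2f      %-4s%2s")
sample = ATOM_FMT % (1, " CA ", "", "ALA", "A", 1, "",
                     11.104, 6.134, -6.504, 1.00, 20.00, "SEGA", "C")

print(repr(parse_segid(sample)))
```

Files written without the field need no special case, since short or
blank-padded lines come back as an empty string.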

> Could we just use Babel as a proxy for various formats, or do we want to
> avoid executable proxies?

It depends on what you want to do.  For everything related to
bioinformatics, you only need to support PDB.  None of the other formats
I listed (SDF, mol2) support the concept of a "residue" so you would
have to figure out the sequence from scratch.  So people just stick with
PDB.  Also, SD files have a limit of 999 atoms and 999 bonds, which makes
them useless for what's usually in PDB files.  These truly are small
molecule file formats.
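
To illustrate why the residue fields matter, here is a rough sketch (in
Python, not BioPerl) of reading a per-chain sequence straight off the
ATOM records. The function name chain_sequences and the THREE_TO_ONE
table are my own for this example. It only sees residues that have
coordinates, so it is not a substitute for SEQRES:

```python
# Hypothetical example code, not part of BioPerl or any other library.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(lines):
    """Map chain ID -> one-letter sequence, using ATOM records only."""
    seqs = {}
    last_seen = {}  # chain ID -> (resName, resSeq + iCode) of last residue
    for line in lines:
        if not line.startswith("ATOM  ") or len(line) < 26:
            continue
        resname = line[17:20]   # columns 18-20
        chain = line[21]        # column 22
        resid = line[22:27]     # columns 23-27: resSeq plus insertion code
        key = (resname, resid)
        if last_seen.get(chain) == key:
            continue            # another atom of the same residue
        last_seen[chain] = key
        # Unknown residue names become 'X' rather than raising an error.
        seqs[chain] = seqs.get(chain, "") + THREE_TO_ONE.get(resname.strip(), "X")
    return seqs
```

Nothing comparable works on an SD file, because there are no residue
names or numbers to group the atoms by.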

> 
> A related issue, what do you think about incorporating bond, angle, etc.
> data, and forcefield parameters?  And things like non-bonded
> interaction analysis, H-bond detection, etc? We have various bits of
> Perl code that we are trying to organize, and BioPerl's PDB module
> would be a good place to put it.

I don't do forcefields anymore.  It's a complex, detailed topic in
which I personally have little interest.  Luckily, I worked with
people who had more interest, and they coded that part up.  :)

Take a look at Konrad Hinsen's MMTK for something like that in Python.
   http://dirac.cnrs-orleans.fr/MMTK/

That should give you a lot of guidance on how to do that.

					Andrew
					dalke@dalkescientific.com