[Biopython-dev] PDB tidy script

Eric Talevich eric.talevich at gmail.com
Mon Apr 13 03:13:32 UTC 2009


Hi Thomas & everyone,

I've started a separate branch on GitHub for this work:
http://github.com/etal/biopython/tree/pdbtidy

I pushed one small change just now (partly to play with git branches), which
is basically the example code I gave earlier. It wraps PDBParser and
parse_pdb_header, and sticks a finger into PDBList too, so that parsing and
building a structure from a PDB file is a one-liner for both local and
RCSB-hosted files:

>>> from Bio import PDB
>>> prot = PDB.load('pdb2hmb.ent')
>>> dir(prot)
['__doc__', '__init__', '__module__', 'author', 'compound',
'deposition_date', 'head', 'journal', 'journal_reference', 'keywords',
'name', 'release_date', 'resolution', 'source', 'structure',
'structure_method', 'structure_reference']

Or:
>>> PDB.fetch('2hmb')
/usr/lib/python2.5/site-packages/Bio/PDB/PDBList.py:240: UserWarning: Retrieving ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/hm/pdb2hmb.ent.gz
  warn("Retrieving %s" % url)
<Bio.PDB.Loader._PDBLoader instance at 0x8d2ec4c>

(The warning is supposed to be a comment, but that cleanup is happening in
another branch: http://github.com/etal/biopython/tree/bug2754 ).
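
For anyone who wants to try the idea without checking out the branch, here is
a rough sketch of how such a wrapper could be put together from pieces already
in released Biopython (PDBParser, parse_pdb_header and PDBList). The names
LoadedPDB, load and fetch below are my own placeholders, not the code in the
branch, and the real _PDBLoader exposes the header fields as attributes rather
than a dict.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class LoadedPDB:
    """Hypothetical stand-in for the wrapper: header metadata plus structure."""
    header: Dict[str, Any]
    structure: Any


def load(path):
    """Parse a local PDB file into a LoadedPDB (sketch, not the branch's code)."""
    # Imported here so the module itself imports without Biopython installed.
    from Bio.PDB import PDBParser
    from Bio.PDB.parse_pdb_header import parse_pdb_header

    # Use the file name without its extension as the structure id.
    name = os.path.splitext(os.path.basename(path))[0]
    structure = PDBParser(QUIET=True).get_structure(name, path)
    return LoadedPDB(header=parse_pdb_header(path), structure=structure)


def fetch(pdb_id, directory="."):
    """Download an entry in PDB format into `directory`, then load it."""
    from Bio.PDB import PDBList

    path = PDBList().retrieve_pdb_file(pdb_id, pdir=directory, file_format="pdb")
    return load(path)
```

With Biopython installed, load('pdb2hmb.ent').header would then hold the parsed
header dictionary, and fetch('2hmb') would download the file first.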


My idea is to pull all of the parse_pdb_header data out of the PDBParser and
Structure classes, and store it in the PDBLoader wrapper instead. The
existing "header" attributes can point to the PDBLoader parent if it exists,
or temporarily contain None or "" if necessary to avoid breaking scripts,
according to the deprecation plan. Annotations could either stay in
Structure or move to Loader. Then we'd have a fast, lean, consistent
hierarchy of classes for 3D structure work, and an easy API for loading and
exploring PDB files interactively.
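
To make the "header points to the parent" idea concrete, here is a minimal
sketch using stand-in classes. SketchStructure and SketchLoader are
hypothetical names for illustration only; the attribute names (loader,
metadata) and the choice of DeprecationWarning are assumptions, not a decided
design.

```python
import warnings


class SketchStructure:
    """Stand-in for Structure: holds no header data of its own."""

    def __init__(self, id):
        self.id = id
        self.loader = None  # set by SketchLoader when one wraps this structure

    @property
    def header(self):
        # Old-style access keeps working during the deprecation period.
        warnings.warn(
            "Structure.header is deprecated; use the loader's metadata",
            DeprecationWarning,
            stacklevel=2,
        )
        if self.loader is None:
            return None  # no parent loader, so nothing to report
        return self.loader.metadata


class SketchLoader:
    """Stand-in for the wrapper: owns the metadata and the structure."""

    def __init__(self, structure, metadata):
        self.structure = structure
        self.metadata = dict(metadata)
        structure.loader = self  # back-reference used by Structure.header
```

An unwrapped SketchStructure returns None from header, while a wrapped one
hands back the loader's metadata, so existing scripts that read
structure.header would keep running while the data itself lives in one place.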

Part of the pdbtidy concept is to check that the PDB header is consistent
with the structure it represents, so I'd like the API for metadata to be
just as nice as the existing one for 3D structure.
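
As one example of the kind of check I have in mind (an illustration only, not
something in the branch yet): compare the chain IDs declared in the COMPND
records with the chain IDs that actually appear in the coordinate records.
The function names are made up, and the sketch ignores CHAIN lists that wrap
onto a continuation line as well as blank chain IDs.

```python
import re


def header_chain_ids(lines):
    """Chain IDs declared in 'CHAIN: A, B;' tokens of COMPND records."""
    ids = set()
    for line in lines:
        if line.startswith("COMPND"):
            # The record text starts after the continuation field (column 11).
            match = re.search(r"CHAIN:\s*([^;]+)", line[10:])
            if match:
                ids.update(c.strip() for c in match.group(1).split(",") if c.strip())
    return ids


def atom_chain_ids(lines):
    """Chain IDs found in column 22 of ATOM and HETATM records."""
    ids = set()
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")) and len(line) > 21:
            chain = line[21].strip()
            if chain:
                ids.add(chain)
    return ids


def chain_mismatches(lines):
    """Report chains named only in the header or only in the coordinates."""
    lines = list(lines)
    declared = header_chain_ids(lines)
    observed = atom_chain_ids(lines)
    return {
        "header_only": declared - observed,
        "coordinates_only": observed - declared,
    }
```

Two empty sets from chain_mismatches mean the header and the coordinates agree
on which chains exist; anything else is a candidate for tidying.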

So, this is just a start, but I hope the intent is clear enough that someone
will tell me to stop if the whole idea is misguided.

Thanks,
Eric


