[Biopython-dev] flex, setup.py and Bio.PDB.mmCIF (Bug 2619)

João Rodrigues anaryin at gmail.com
Fri Mar 15 15:53:41 UTC 2013


Hi Michiel,


> 1) PDBParser and mmCIFParser both produce Structure objects, with any
> additional information found in mmCIF files stored as additional attributes
> of Structure objects (and the same thing for PDB files);
>

This approach has a few advantages. First and most obvious, it makes
converting one file format to another seamless. Second, it reduces the code
to something easier to maintain and to extend. The disadvantage is that the
Structure objects might become a bit too bloated. On the other hand, we can
make them lighter and take advantage of Python's dynamic attributes (if I
need a b-factor, I just add atom.bfactor). This would also help a lot with
the current parser, which is quite "sluggish" for some purposes, and bring a
lot more flexibility (parsing pqr files, mol2 files, etc). All we'd need
would be a parser for each file format and a generic container that holds
the backbone of the structure and that we extend as we need. A simple flag
for the parser type would also make it easier to check whether function X
can be used on a particular structure.
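
As a rough illustration of the idea above (a lightweight container,
attributes added on demand, and a flag recording which parser built the
structure), here is a minimal sketch. LightAtom, LightStructure,
source_format and mean_bfactor are hypothetical names made up for this
example; none of them is existing Bio.PDB API.

```python
class LightAtom:
    """Hypothetical minimal atom: only a name and coordinates are required."""

    def __init__(self, name, coord, **extra):
        self.name = name
        self.coord = coord
        # Format-specific fields (bfactor, charge, ...) become plain
        # attributes, so an atom only carries what its parser provided.
        for key, value in extra.items():
            setattr(self, key, value)


class LightStructure:
    """Hypothetical generic container shared by all parsers."""

    def __init__(self, id, source_format):
        self.id = id
        # Flag set by the parser, e.g. "pdb", "mmcif", "pqr", "mol2".
        self.source_format = source_format
        self.atoms = []


def mean_bfactor(structure):
    """Average B-factor; only meaningful for formats that store one."""
    if structure.source_format not in ("pdb", "mmcif"):
        raise ValueError(
            "no B-factors in %r structures" % structure.source_format
        )
    values = [atom.bfactor for atom in structure.atoms]
    return sum(values) / len(values)


# A PDB-derived structure: atoms carry a bfactor attribute.
pdb_structure = LightStructure("demo", source_format="pdb")
pdb_structure.atoms.append(LightAtom("N", (1.0, 2.0, 3.0), bfactor=10.0))
pdb_structure.atoms.append(LightAtom("CA", (2.5, 2.0, 3.0), bfactor=20.0))

# A PQR-derived structure: atoms carry charge and radius instead.
pqr_structure = LightStructure("demo", source_format="pqr")
pqr_structure.atoms.append(
    LightAtom("N", (1.0, 2.0, 3.0), charge=-0.4, radius=1.8)
)
```

Calling mean_bfactor(pdb_structure) works, while
mean_bfactor(pqr_structure) fails early with a clear message instead of an
AttributeError somewhere deep inside the function.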


>
> 2) We make a module mmCIF with a function mmCIF.read that reads an mmCIF
> file and stores the information in a mmCIF.Record object that is optimized
> for storing mmCIF information. The mmCIFParser uses mmCIF.read, and pulls
> out the necessary information from the mmCIF.Record object to create a
> Structure object (which is free of mmCIF-specific stuff). Users can make
> Structure objects if that is all they need, or use mmCIF.read if they want
> to have all information in an mmCIF file.
>
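
The two-layer design quoted above (a format-specific reader, plus a parser
that pulls only the structural data out of the record) could look roughly
like the sketch below. It is a toy: read_record and to_atoms are
hypothetical names, the record is a plain dict rather than an mmCIF.Record
class, and the reader ignores quoting, multi-line values and most of the
real mmCIF syntax. Only the data item names (_cell.length_a,
_atom_site.Cartn_x, ...) come from the mmCIF dictionary.

```python
def read_record(text):
    """Toy mmCIF reader: maps item names to a value or a list of values."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    record = {}
    i = 0
    while i < len(lines):
        line = lines[i]
        if line == "loop_":
            i += 1
            keys = []
            # Column names of the loop, one per line.
            while i < len(lines) and lines[i].startswith("_"):
                keys.append(lines[i])
                record[lines[i]] = []
                i += 1
            # Data rows, whitespace-separated (no quoting support here).
            while i < len(lines) and not lines[i].startswith(
                ("_", "loop_", "data_")
            ):
                for key, value in zip(keys, lines[i].split()):
                    record[key].append(value)
                i += 1
        elif line.startswith("_"):
            # Single "name value" pair.
            key, value = line.split(None, 1)
            record[key] = value
            i += 1
        else:
            i += 1  # e.g. the data_ block header
    return record


def to_atoms(record):
    """Keep only the structural part: (atom name, (x, y, z)) tuples."""
    names = record["_atom_site.label_atom_id"]
    xs = record["_atom_site.Cartn_x"]
    ys = record["_atom_site.Cartn_y"]
    zs = record["_atom_site.Cartn_z"]
    return [
        (name, (float(x), float(y), float(z)))
        for name, x, y, z in zip(names, xs, ys, zs)
    ]


example = """data_demo
_cell.length_a 10.0
loop_
_atom_site.label_atom_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
N 1.0 2.0 3.0
CA 2.5 2.0 3.0
"""

record = read_record(example)   # everything in the file
atoms = to_atoms(record)        # only what a Structure would need
```

Someone who needs the cell dimensions works with the record directly;
someone who only needs coordinates never sees the mmCIF-specific items.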

I'm completely unfamiliar with mmCIF files. How much more information do
they have than a PDB file? And what kind of information is useful to
extract from them?

> Speaking of which, we have a Biopython Structural Bioinformatics FAQ (i.e.
> how to use the Bio.PDB module) on the Biopython website with additional
> information on Bio.PDB, including some information on things that are not
> in the main Biopython Tutorial. Perhaps this is a good time to integrate
> this FAQ into the main documentation?


We could also update it a bit while we're at it: it's been a while, some
things have changed here and there, and there are additions to cover too.

Best,

João



