[Biopython-dev] PDB tidy script, was: [Bug 275

Peter biopython at maubp.freeserve.co.uk
Sun Mar 22 15:53:21 UTC 2009


On Bug 2754 comment 12, I wrote:
http://bugzilla.open-bio.org/show_bug.cgi?id=2754#c12
>> I had a thought last night about this - how about we keep PERMISSIVE=1
>> as the default but offer a "very permissive" mode:
>>
>> PERMISSIVE=2 (or more), silently ignore problems, continue parsing.
>> PERMISSIVE=1 (or True), use stderr via the warnings module, continue
>> parsing.
>> PERMISSIVE=0 (or False), raise exceptions, halt parsing.
>>
>> It would offer an alternative way to silence the warnings in the unit
>> tests, and could be controlled at the level of individual tests - for
>> example where we want to make sure certain errors are caught.
>>
>> It might also be useful in ordinary scripts.

Eric replied:
> I like the idea. I still have to comb through the documentation for the
> warnings module some more, but I think it should be possible to do all of
> this through that API -- loading PERMISSIVE=0 turns the warnings into full
> exceptions, =1 makes them messages on stderr, and =2 switches them off.

It doesn't really matter - all the PDB construction warnings/errors go through
_handle_PDB_exception(), so this would be the least invasive place to
implement this.
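
As a rough illustration of the three levels, here is a minimal standalone
sketch. It is not the actual Bio.PDB code: the function name handle_problem
and both classes are hypothetical stand-ins, and it only assumes that every
problem is routed through a single handler, as described above.

```python
import warnings


class ConstructionError(Exception):
    """Hypothetical stand-in for a PDB construction exception."""


class ConstructionWarning(Warning):
    """Hypothetical stand-in for a PDB construction warning."""


def handle_problem(message, permissive=1):
    """Deal with one parsing problem according to the PERMISSIVE level.

    Returns normally whenever parsing should continue.
    """
    if not permissive:
        # PERMISSIVE=0 (or False): raise an exception, halt parsing.
        raise ConstructionError(message)
    if permissive == 1:
        # PERMISSIVE=1 (or True, since True == 1): issue a warning, which
        # goes to stderr under the default filters, then continue parsing.
        warnings.warn(message, ConstructionWarning)
    # PERMISSIVE=2 (or more): silently ignore the problem and continue.
```

A unit test that wants silence would then pass permissive=2, while a test
checking that a particular error is caught would pass permissive=0.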

> At some point I'd like to make a script called something like pdbtidy.py
> which parses a potentially not-quite-conformant PDB file in a permissive
> mode, lists all complaints (including things like discontinuously-numbered
> residues, atom collisions, psi-phi outliers, etc.), and writes out a fixed
> version of the file. The model for this is HTML Tidy. Do you think this
> would have a place in the Biopython distribution?

It sounds useful to me, it can probably go in the scripts subdirectory,
along with the PDB surface exposure script.

One drawback is that currently Bio.PDB's header parsing leaves a lot to
be desired, and very little of the header is output when saving a PDB file
(Thomas' focus is/was very much on the 3D data).

Peter


