[Biopython-dev] pull request: Handle MMCIF with multiple models (closes 2943)

Lenna Peterson arklenna at gmail.com
Mon Apr 23 23:05:03 UTC 2012


On Mon, Apr 23, 2012 at 4:10 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>
> Ack, I didn't look at that closely enough. Check out this patch to see
> the current situation:
> https://github.com/biopython/biopython/commit/abdab1a1132ec811f9636f8ba805bbb6cda6dbe9
>
> The models associated with a structure are numbered with a sequential
> integer id, starting from 0. It's always been like that in our PDB
> parser and we haven't changed it. To ensure that model numbers
> specified in the PDB file are preserved when writing the PDB back to
> file, the above patch introduced a new attribute on the Model object
> called serial_num (also an integer, equal to model.id unless specified
> otherwise). That attribute is only used when writing a new PDB file;
> Structure.__getitem__ still looks models up by Model.id as before.
>
> Perhaps that's surprising now that we read the serial numbers, but it
> kept backward compatibility. Plus, it preserves list-like behavior
> (item access via integers), even though the models are actually stored
> in a dict.
>
> So!
>
> In the mmCIF parser, the calls to structure_builder.init_model should
> be given two arguments instead of one: an integer id counting from 0,
> and then another integer (probably) containing the model "serial
> number" specified in the mmCIF file. In the event that an mmCIF file
> doesn't specify the model number, the serial number should be the same
> as the sequential id.
>
> Cool? This will also help us convert between PDB and mmCIF formats in
> the future.


Got it. I'm working on implementing the model id/serial_num
distinction for MMCIF.
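
Roughly, the split between the sequential id and the serial number
described above could look like the sketch below. The Model and
Structure classes here are simplified stand-ins written for
illustration, not the real Bio.PDB classes, and build_models is a
hypothetical helper; only the id and serial_num names come from the
discussion.

```python
class Model:
    """Stand-in model: `id` counts from 0, `serial_num` comes from the file."""

    def __init__(self, id, serial_num=None):
        self.id = id
        # Fall back to the sequential id when the file gives no model number.
        self.serial_num = id if serial_num is None else serial_num


class Structure:
    """Stand-in structure: models live in a dict keyed by their sequential id."""

    def __init__(self):
        self.child_dict = {}

    def add(self, model):
        self.child_dict[model.id] = model

    def __getitem__(self, id):
        # Item access uses the sequential id, never the serial number.
        return self.child_dict[id]


def build_models(file_model_numbers):
    """Hypothetical helper: pair each file model number with a 0-based id."""
    structure = Structure()
    for seq_id, serial in enumerate(file_model_numbers):
        structure.add(Model(seq_id, serial))
    return structure


# Model numbers as they might appear in a file: not contiguous, not 0-based.
structure = build_models([1, 2, 5])
print(structure[2].id, structure[2].serial_num)
```

Integer item access keeps working as before, while serial_num is only
consulted when the structure is written back out.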


> As for accessing the models by their serial number, using string keys
> seems like an effective workaround, but still obviously a workaround
> rather than an ideal situation. Let's discuss that a little more,
> perhaps file another bug when we've reached some consensus.


Er, I made and then lost (still haven't *quite* gotten the hang of git
rebase) a patch that applied int() to the MMCIF model numbers. I'll
add that back so both model and serial numbers are ints.
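
A minimal sketch of that int() conversion, assuming the model numbers
arrive from the mmCIF file as a column of strings (for example the
_atom_site.pdbx_PDB_model_num values, one per atom). The function
name and sample data are made up for illustration and are not taken
from the actual patch.

```python
def model_serials(model_num_column):
    """Convert per-atom model numbers to ints, collapsing consecutive repeats."""
    serials = []
    for raw in model_num_column:
        serial = int(raw)  # file values are strings such as "1" or "10"
        if not serials or serials[-1] != serial:
            serials.append(serial)
    return serials


# Hypothetical per-atom column: two atoms in model 1, three in 2, one in 10.
column = ["1", "1", "2", "2", "2", "10"]
serials = model_serials(column)

# Pair each serial number with a sequential integer id counting from 0.
id_to_serial = dict(enumerate(serials))
print(id_to_serial)
```

With both values stored as ints, comparing or sorting by serial number
behaves numerically, so model 10 sorts after model 2 rather than
before it as the string "10" would.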


Lenna


