[Biopython] Overhauling of Bio.PDB module

rob miller rob.miller.gh at gmail.com
Sat Oct 30 12:42:05 EDT 2021


For anyone interested in this thread on developing a single large array of
atom coordinates for chains in Bio/PDB, please have a look at
https://github.com/biopython/biopython/pull/3774 .  The atom array was not
the goal of my work; it came out of making the internal <-> XYZ coordinate
code work better with NumPy, so please consider it just a sample
implementation.

In this work, buildAtomArray() runs automatically when internal coordinates
(dihedral angles) are calculated; otherwise it has to be called explicitly.
After that, the Atom coordinates are views into the larger array.  Existing
code should therefore be unaffected, while new code can choose whether or
not to work with the atom array.
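
As a rough illustration of the "views into the larger array" idea, here is
a toy NumPy sketch.  It is not the code from the pull request: ToyAtom, the
atom names and the coordinate values are all made up for the example, and
the only NumPy behaviour it relies on is that basic indexing returns a view.

```python
import numpy as np

# One (N, 3) array holding the coordinates of every atom in a chain.
# The values are invented for illustration.
atom_array = np.array(
    [
        [0.0, 0.0, 0.0],
        [1.5, 0.0, 0.0],
        [1.5, 1.5, 0.0],
    ]
)


class ToyAtom:
    """Hypothetical stand-in for an atom object, not Bio.PDB.Atom."""

    def __init__(self, name, coord):
        self.name = name
        self.coord = coord  # a view into atom_array, not a copy


# Indexing a single row with an integer returns a view of that row.
atoms = [ToyAtom(n, atom_array[i]) for i, n in enumerate(["N", "CA", "C"])]

# Vectorized, in-place operation on the whole array: translate every atom.
atom_array += np.array([1.0, 0.0, 0.0])

# The per-atom views see the change without any per-atom loop.
ca = atoms[1]
shares_memory = np.shares_memory(ca.coord, atom_array)

# Writing through a view also updates the large array.
ca.coord[2] = 9.0

# All pairwise distances at once via broadcasting.
diff = atom_array[:, None, :] - atom_array[None, :, :]
dist = np.sqrt((diff**2).sum(axis=-1))
```

Note that the sketch only works because the array is modified in place
(`+=`, element assignment).  Rebinding `atom_array` or `ca.coord` to a new
array would silently break the link between the two.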

See https://github.com/rob-miller/rtm-biopython-scripts for sample code.

rob miller.


On Tue, Oct 22, 2019 at 11:41 PM Patrick Kunzmann <padix.kleber at gmail.com>
wrote:

> Hello Joao,
>
> sounds like a good idea. How do I join the 3DSIG Slack workspace?
>
> Best regards,
>
> Patrick Kunzmann
>
>
> On 16.10.19 19:54, João Rodrigues wrote:
>
> Hi Patrick,
>
> Thank you again for bringing this up. I do agree with you that this is a
> necessity.
>
> When Bio.PDB first showed up, there were not so many Python libraries out
> there for molecular structures. Now there are a few, so we should think
> carefully about what features we want to offer, so as not to overlap with
> others and duplicate effort. My opinion is that Biopython is very good at
> general handling of structures, allowing you to change fields, select bits,
> etc., and do very simple calculations like distances or superimpositions. At
> the CodeFest in Basel, we discussed how awesome it would be to have a
> selection language built into Biopython, allowing us to do something like
> `mol.select("chain A").write("chain_A.pdb")`. This also requires an
> overhaul of the data structures we use to store atomic data.
>
> I have some time in the next few months that I could spare to work on this.
> Interfacing with Biotite (and with other packages) would be interesting as
> well. I'll start a #biopython channel on the 3DSIG Slack so that we can
> coordinate efforts. How does this sound?
>
> Cheers and again, thanks for bringing this up!
>
> Joao
>
> Patrick Kunzmann <padix.kleber at gmail.com> wrote on Wednesday,
> 16/10/2019 at 09:38:
>
>> Hello Biopythoneers,
>>
>> at BOSC this year we talked about overhauling the Bio.PDB module.
>> The problem is that the atom coordinates are currently stored in a
>> separate NumPy array for each atom. This design prevents efficient
>> computation of all kinds of analyses (distances, angles,
>> superimpositions, etc.). One solution we discussed was to put the
>> coordinates of the entire structure in one NumPy array, and let the
>> Atom, Residue, Chain and Structure objects point to positions in this
>> array. The benefit of this approach is that functions could be applied
>> directly to the entire array, harnessing the power of vectorization.
>>
>> For the analysis we could adapt the vectorized functions from the Python
>> package Biotite, a project I am currently working on
>> (https://www.biotite-python.org/apidoc/biotite.structure.html). Usually,
>> these functions already accept the coordinates as a NumPy array, so I
>> think only a few tweaks would be necessary for each function.
>>
>> However, we would need one person or a small team to make the effort of
>> implementing the new structure types and adapting the analysis
>> functions. I could offer a pair of helping hands in the adaptation of the
>> analysis functions, but I don't have the time for anything more.
>>
>> So the question is: is there anyone out there who is willing to do this
>> work? Alternatively, I would propose writing a 'bridge' package between
>> Biopython and Biotite that converts the Biopython structure
>> representation into the Biotite representation and vice versa. I
>> think this solution is less elegant but would also require less effort.
>>
>> Best regards
>>
>> Patrick Kunzmann
>>
>> _______________________________________________
>> Biopython mailing list  -  Biopython at mailman.open-bio.org
>> https://mailman.open-bio.org/mailman/listinfo/biopython
>>