[Biopython] Overhauling of Bio.PDB module

Patrick Kunzmann padix.kleber at gmail.com
Mon Oct 21 10:20:31 UTC 2019


Hello Joao,

That sounds like a good idea. How do I join the 3DSIG Slack workspace?

Best regards,

Patrick Kunzmann


On 16.10.19 19:54, João Rodrigues wrote:
> Hi Patrick,
>
> Thank you again for bringing this up. I do agree with you that this is 
> a necessity.
>
> When Bio.PDB first showed up, there were not many Python libraries
> out there for molecular structures. Now there are a few, so we should
> think carefully about which features we want to offer, so that we do
> not overlap with others and duplicate effort. My opinion is that
> Biopython is very good at general structure handling, allowing you to
> change fields, select bits, etc., and do very simple calculations like
> distances or superimpositions. At the CodeFest in Basel, we discussed
> how great it would be to have a selection language built into
> Biopython, allowing something like
> `mol.select("chain A").write("chain_A.pdb")`. This would also require
> an overhaul of the data structures we use to store atomic data.
>
> I have some time in the next few months that I could spare to work on
> this. Interfacing with Biotite, and with other packages, would be
> interesting as well. I'll start a #biopython channel in the 3DSIG
> Slack so that we can coordinate efforts. How does this sound?
>
> Cheers and again, thanks for bringing this up!
>
> Joao
>
> Patrick Kunzmann <padix.kleber at gmail.com> wrote on Wednesday,
> 16/10/2019 at 09:38:
>
>     Hello Biopythoneers,
>
>     At BOSC this year we talked about overhauling the Bio.PDB module.
>     The problem is that the atom coordinates are currently stored in a
>     separate NumPy array for each atom. This design prevents efficient
>     computation of all kinds of analyses (distances, angles,
>     superimpositions, etc.). One solution we discussed was to put the
>     coordinates of the entire structure into one NumPy array and let
>     the Atom, Residue, Chain and Structure objects point to positions
>     in this array. The benefit of this approach is that functions
>     could be applied directly to the entire array, harnessing the
>     power of vectorization.
>
>     For the analysis we could adapt the vectorized functions from the
>     Python package Biotite, a project I am currently working on
>     (https://www.biotite-python.org/apidoc/biotite.structure.html).
>     These functions usually already accept the coordinates as a NumPy
>     array, so I think only a few tweaks would be necessary for each
>     function.
>
>     However, we would need one person or a small team to implement
>     the new structure types and adapt the analysis functions. I could
>     offer a pair of helping hands with adapting the analysis
>     functions, but I don't have the time for anything more.
>
>     So the question is: is there anyone out there who is willing to
>     do this work? Alternatively, I would propose writing a 'bridge'
>     package between Biopython and Biotite that converts the Biopython
>     structure representation into the Biotite representation and vice
>     versa. I think this solution is less elegant, but it would also
>     require less effort.
>
>     Best regards
>
>     Patrick Kunzmann
>
>     _______________________________________________
>     Biopython mailing list  - Biopython at mailman.open-bio.org
>     <mailto:Biopython at mailman.open-bio.org>
>     https://mailman.open-bio.org/mailman/listinfo/biopython
>
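
The single-array layout proposed in the original message could look
roughly like the sketch below. It is only an illustration of the idea,
not existing Bio.PDB or Biotite code: the AtomView class, its
attributes, and the coordinates are all hypothetical. Each atom object
keeps a reference to one shared (n_atoms, 3) NumPy array plus its row
index, so whole-structure operations run on the array without a Python
loop.

```python
import numpy as np


class AtomView:
    """Hypothetical atom object that stores no coordinates of its own."""

    def __init__(self, coords, index, name):
        self._coords = coords  # shared (n_atoms, 3) array, not a copy
        self.index = index
        self.name = name

    @property
    def coord(self):
        # Integer indexing on the first axis returns a view of one row.
        return self._coords[self.index]

    @coord.setter
    def coord(self, value):
        # Writes go straight into the shared array.
        self._coords[self.index] = value


# Made-up coordinates for three atoms (assumed to be in angstroms).
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 2.0],
])
atoms = [AtomView(coords, i, name) for i, name in enumerate(["N", "CA", "C"])]

# In-place translation of every atom; the atom views see the change.
coords += np.array([1.0, 0.0, 0.0])

# All pairwise distances at once via broadcasting: (n, 1, 3) - (1, n, 3).
diff = coords[:, np.newaxis, :] - coords[np.newaxis, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1))
```

One caveat with this design: operations have to modify the shared array
in place (`coords += shift`). Rebinding the name
(`coords = coords + shift`) creates a new array, and the atom objects
would keep pointing at the old one.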