[Biopython] Reading PDB files containing multiple copies of the same molecule
Alister Burt
alisterburt at gmail.com
Fri Nov 1 22:21:12 UTC 2019
Hi all,
Apologies if there’s an easy solution to this, but a quick Google search didn’t turn up anything!
I’m trying to use Bio.PDB.PDBParser.get_structure() to read a PDB file from a collaborator. The file contains multiple copies of a few different molecules, differentiated by the SEGID entry in columns 73-76 of each atom record.
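To make the column layout concrete, here is a minimal sketch of how the SEGID values can be read out of the fixed-width records using only the standard library. The function name `count_segids`, the helper `fake_record`, and the sample records are made up for illustration; coordinates are left blank because only the column positions matter here.

```python
from collections import Counter


def count_segids(pdb_lines):
    """Count ATOM/HETATM records per segment ID (columns 73-76, 1-based)."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # 1-based columns 73-76 correspond to the 0-based slice [72:76].
            counts[line[72:76].strip()] += 1
    return counts


def fake_record(serial, segid):
    """Hypothetical helper: pad a stub HETATM record so the SEGID starts at column 73."""
    return ("HETATM" + str(serial).rjust(5)).ljust(72) + segid


example_lines = [
    fake_record(1, "MEMA"),
    fake_record(2, "MEMA"),
    fake_record(3, "MEMB"),
    "END",
]
segid_counts = count_segids(example_lines)
print(dict(segid_counts))
```

Running something like this over the real file would show how many copies to expect after parsing.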
When trying to read this file I get the following error once for each atom in a chain which was already defined:
> /Users/alisterburt/anaconda/envs/py37/lib/python3.7/site-packages/Bio/PDB/PDBParser.py:291: PDBConstructionWarning: PDBConstructionException: ('H_POP', 26, ' ') defined twice at line 76812.
> Exception ignored.
> Some atoms or residues may be missing in the data structure.
> % message, PDBConstructionWarning)
This means the resulting Structure object only contains one copy of each molecule.
I know this SEGID entry is not part of the official PDB format. Does anyone have a quick solution that will allow me to read in all atoms from this file?
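One possible workaround, sketched here under the assumption that each SEGID marks exactly one complete copy, is to split the atom records by SEGID before parsing so that no residue is defined twice within a group. The names `split_by_segid` and `atom_line` are hypothetical, the records are invented, and this has not been checked against the real file.

```python
def atom_line(serial, name, chain, resseq, segid):
    """Build one fixed-width HETATM record for residue POP (demo data only)."""
    return (
        f"{'HETATM':<6}{serial:>5}{'':1}{name:<4}{'':1}{'POP':>3}{'':1}"
        f"{chain:1}{resseq:>4}{'':4}"
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}"
        f"{'':6}{segid:<4}"
    )


def split_by_segid(lines):
    """Group ATOM/HETATM records by the SEGID in columns 73-76 (1-based)."""
    groups = {}
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            segid = line[72:76].strip()
            groups.setdefault(segid, []).append(line)
    return groups


# Two copies of the same residue (chain H, residue 26), differing only in SEGID.
records = [
    atom_line(1, " P", "H", 26, "MEMA"),
    atom_line(2, " O1", "H", 26, "MEMA"),
    atom_line(3, " P", "H", 26, "MEMB"),
    atom_line(4, " O1", "H", 26, "MEMB"),
]
groups = split_by_segid(records)
for segid, group in groups.items():
    print(segid, len(group))
```

Each group could then be joined into text, wrapped in `io.StringIO`, and passed to `PDBParser.get_structure()` on its own, since that method accepts an open handle as well as a file name.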
Thanks in advance,
Alister