<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi João,<div class=""><br class=""></div><div class="">Thanks for the quick reply and solution, much appreciated!</div><div class=""><br class=""></div><div class="">Cheers,</div><div class=""><br class=""></div><div class="">Alister<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On 1 Nov 2019, at 23:27, João Rodrigues <<a href="mailto:j.p.g.l.m.rodrigues@gmail.com" class="">j.p.g.l.m.rodrigues@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="auto" class="">Hi Alister,<div dir="auto" class=""><br class=""></div><div dir="auto" class="">The Biopython parser identifies unique residues based on chain IDs. For a quick solution, you can use the pdb_segxchain tool from <a href="https://pypi.org/project/pdb-tools/" class="">https://pypi.org/project/pdb-tools/</a> to copy the segid into the chain ID field. Then re-read the file using Bio.PDB.</div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">Cheers, </div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">João </div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Friday, 1/11/2019, 15:21, Alister Burt <<a href="mailto:alisterburt@gmail.com" class="">alisterburt@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br class="">

<br class="">

Apologies if there’s an easy solution to this, but a quick Google search didn’t turn up anything!<br class="">

<br class="">

I’m trying to use Bio.PDB.PDBParser.get_structure() to read a PDB file from a collaborator. The file contains multiple copies of a few different molecules, differentiated by the SEGID entry in columns 73-76 of each atom record. <br class="">

<br class="">

When trying to read this file, I get the following warning once for each atom belonging to a residue that was already defined in the same chain:<br class="">

> /Users/alisterburt/anaconda/envs/py37/lib/python3.7/site-packages/Bio/PDB/PDBParser.py:291: PDBConstructionWarning: PDBConstructionException: ('H_POP', 26, ' ') defined twice at line 76812.<br class="">

> Exception ignored.<br class="">

> Some atoms or residues may be missing in the data structure.<br class="">

>   % message, PDBConstructionWarning)<br class="">

<br class="">

This means the resulting Structure object only contains one copy of each molecule.<br class="">

<br class="">

I know this SEGID entry is not part of the official PDB format. Does anyone have a quick solution that will allow me to read in all atoms from this file?<br class="">

<br class="">

Thanks in advance,<br class="">

<br class="">

Alister<br class="">

_______________________________________________<br class="">

Biopython mailing list  -  <a href="mailto:Biopython@mailman.open-bio.org" target="_blank" rel="noreferrer" class="">Biopython@mailman.open-bio.org</a><br class="">

<a href="https://mailman.open-bio.org/mailman/listinfo/biopython" rel="noreferrer noreferrer" target="_blank" class="">https://mailman.open-bio.org/mailman/listinfo/biopython</a></blockquote></div>

</div></blockquote></div><br class=""></div></body></html>