<div dir="ltr">Doesn't the Biopython polypeptide builder handle this sort of thing? (I think PPBuilder uses C-N distances, and CaPPBuilder uses CA-CA distances.) <div><br></div><div>E.g. for PDB entry 3beg (which has a chain break in both chain A and chain B):</div><div><br></div><div>Something like this:</div><div><br></div><div>from Bio.PDB.PDBParser import PDBParser</div><div>from Bio.PDB.Polypeptide import PPBuilder</div><div><br></div><div>structure = PDBParser().get_structure('3beg', '/path/to/3beg.pdb')</div><div>ppb = PPBuilder()</div><div>for model in structure:</div><div>    for chain in model:</div><div>        pp = ppb.build_peptides(chain)</div><div>        if len(pp) > 1: </div><div>            # More than one peptide means the chain has a break. Do something!</div><div>            print(chain, len(pp)) </div><div><div><br></div><div>This prints</div><div><Chain id=A> 2<br></div><div><Chain id=B> 2<br></div><div> </div></div><div><br></div><div>I'd take a look at the PPBuilder source first just to check it's really doing what you want, but if it's not quite right you can probably write your own subclass that does.</div><div><br></div><div>Cheers,</div><div><br></div><div>Fred</div><div><br></div><div>PS: I'd avoid writing your own parser unless there's a good reason for not using the Biopython one (or another existing one). The PDB file format has lots of quirks, as Lenna pointed out.<br></div><div><br></div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On 11 February 2015 at 02:21, Lenna Peterson <span dir="ltr"><<a href="mailto:arklenna@gmail.com" target="_blank">arklenna@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I want to point out that this approach relies on consecutive residue numbering as a proxy for "no chain break." I've run into cases where this is not true - the main place this breaks down is with insertion codes. 
Alternate locations may also present challenges.<div><br></div><div>A more robust method would be to check that the coordinates of sequential CA atoms are within the expected distance (about 3.8 Å).</div><div><br></div><div>For rolling one's own PDB parser, I'd recommend looking at the source code of Biopython's PDB parser for the column numbers that correspond to specific fields.</div><div><br></div><div>Cheers,</div><div><br></div><div>Lenna</div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 10, 2015 at 8:46 PM, David Shin <span dir="ltr"><<a href="mailto:davidsshin@lbl.gov" target="_blank">davidsshin@lbl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Patrick,<div><br></div><div>You should be able to write a script to do this (a shell script with some Python or awk).</div><div><br></div><div>Off the top of my head, you would do something like:</div><div><br></div><div>for each file:</div><div>    extract the lines with ^ATOM into a new file to make things easier</div><div>    read each line into some list</div><div>    subtract each line's residue number from that of the next line in the list</div><div>    if that value is > 1 </div><div>        print something (the file name, or some flag)</div><div>    else there are no breaks... can do something else if you want</div><div>end</div><div><br></div><div>The only tough part is splitting the fields on spaces. If, say, a protein had 1000 residues, then the 1000 would run into the chain ID. So that's something to consider. Using specific column numbers would be the better way. 
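<br><br>The column-based approach could look something like the sketch below. It is only an illustration and not code from this thread: the function name find_numbering_gaps is made up, and the column positions (chain ID in column 22, residue number in columns 23-26) are taken from the fixed-width ATOM record layout of the PDB format. As pointed out elsewhere in the thread, a jump in residue numbering is only a proxy for a chain break, and insertion codes are ignored here.
<pre>
```python
def find_numbering_gaps(pdb_lines):
    """Return (chain_id, prev_resseq, resseq) for each jump in residue numbering.

    Illustrative sketch: fields are sliced by fixed column, not split on spaces.
    """
    gaps = []
    last_seen = {}  # maps chain ID to the last residue number seen in that chain
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only look at the first model of a multi-model file
        if not line.startswith("ATOM  "):
            continue
        chain_id = line[21]        # column 22
        resseq = int(line[22:26])  # columns 23-26
        prev = last_seen.get(chain_id)
        # Atoms of the same residue give a difference of 0, neighbours give 1.
        if prev is not None and resseq - prev > 1:
            gaps.append((chain_id, prev, resseq))
        last_seen[chain_id] = resseq
    return gaps


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python find_gaps.py file1.pdb file2.pdb
    for path in sys.argv[1:]:
        with open(path) as handle:
            gaps = find_numbering_gaps(handle)
        if gaps:
            print(path, gaps)
```
</pre>
Because the fields are sliced by column, a residue number of 1000 directly after the chain ID (e.g. "A1000") is still read correctly.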
</div><div><br></div><div>That, and I'm not sure about the uniformity of PDB files that are really old.</div><div><br></div><div>Let me know if that helps; if not, I can maybe help out further.</div><div><br></div><div>Dave</div><div><br></div><div><br></div><div><br></div><div class="gmail_extra"><div><div><br><div class="gmail_quote">On Tue, Feb 10, 2015 at 2:24 PM, João Rodrigues <span dir="ltr"><<a href="mailto:j.p.g.l.m.rodrigues@gmail.com" target="_blank">j.p.g.l.m.rodrigues@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>Without manually checking every single one, there is no such list, at least that I know of. Your best bet could be to restrict yourself to the highest-resolution structures you can (the lowest resolution values); those structures are usually of very good quality.</div><div><br></div><div>Cheers,</div><div><br></div><div>João</div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">2015-02-10 22:35 GMT+01:00 PC <span dir="ltr"><<a href="mailto:patrick.cossins@inbox.com" target="_blank">patrick.cossins@inbox.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
I do know about the PISCES lists, but I want a list of PDB entries without any chain breaks.<br>
<br>
Is there such a list or a way to obtain such a list?<br>
<br>
Thank you.<br>
<br>
<br>
<br>
<br>
_______________________________________________<br>
Biopython mailing list - <a href="mailto:Biopython@mailman.open-bio.org" target="_blank">Biopython@mailman.open-bio.org</a><br>
<a href="http://mailman.open-bio.org/mailman/listinfo/biopython" target="_blank">http://mailman.open-bio.org/mailman/listinfo/biopython</a><br>
</blockquote></div><br></div>
</div></div><br></blockquote></div><br><br clear="all"><div><br></div></div></div><span><font color="#888888">-- <br><div><div>David Shin, Ph.D</div><div>Lawrence Berkeley National Labs</div><div>1 Cyclotron Road</div><div>MS 83-R0101</div><div>Berkeley, CA 94720</div><div>USA</div></div>
</font></span></div></div>
<br></blockquote></div><br></div>
</div></div><br></blockquote></div><br></div></div>