<div dir="ltr">Hi Sanjeev,<div><br></div><div>Check for breaks. As I told you, iterate over the amino acids and, for each consecutive pair (e.g. residues 1 and 2), check the distance between the &quot;C&quot; atom of residue 1 and the &quot;N&quot; atom of residue 2. This is a very well-defined distance (the peptide bond, about 1.33 Å). Alternatively, and more simply, check CA-CA distances: consecutive CA atoms are normally about 3.8 Å apart, so &gt;4Å usually means a gap.</div><div><br></div><div>Sometimes no chain identifier is assigned to a particular chain, which is why you get a blank chain ID. Check column 22 (the chain ID field) of the ATOM records in those PDB files.</div><div><br></div><div>Cheers,</div><div><br></div><div>João</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-10-26 11:31 GMT-05:00 Sanjeev Sariya <span dir="ltr">&lt;<a href="mailto:s.sariya_work@ymail.com" target="_blank">s.sariya_work@ymail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div style="color:#000;background-color:#fff;font-family:HelveticaNeue-Light,Helvetica Neue Light,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:16px"><div><br></div><div dir="ltr">Hi Joao,</div><div dir="ltr">Thank you for your response.</div><div dir="ltr">If not all residues are resolved in the crystal, then extracting the sequence from the PDB file wouldn&#39;t be a good call.</div><div dir="ltr"><br> </div><div><div dir="ltr">I will be working with a lot of PDB files [~100s or 1000s] in the near future. Is there any way I can find breaks in my PDB file?<br></div><div dir="ltr"><br></div><div dir="ltr">- Another doubt I have concerns printing the chain IDs in the script. Many times, I get chain &quot; &quot;, that is, a space. 
</div><div dir="ltr">In the script I sent, the code looks like:</div><div dir="ltr"><br></div><div dir="ltr">        # excerpt; needs: from Bio.PDB import PDBParser (i is the PDB file path)<br>        st=PDBParser(QUIET=True).get_structure(&#39;X&#39;,i)<br>        ko=st.get_chains()<br>        for chain in ko:<br>            print(chain.id) </div><div dir="ltr"><br></div><div dir="ltr">Why is a space present as the chain name? <br></div><div dir="ltr"><br></div><div dir="ltr">Thanks.<br></div><br></div><div><div class="h5"><div style="display:block"> <div style="font-family:HelveticaNeue-Light,Helvetica Neue Light,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:16px"> <div style="font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:16px"> <div dir="ltr"> <font face="Arial"> On Saturday, October 25, 2014 12:32 AM, João Rodrigues &lt;<a href="mailto:anaryin@gmail.com" target="_blank">anaryin@gmail.com</a>&gt; wrote:<br> </font> </div>  <br><br> <div><div><div><div dir="ltr">Hi there,<div><br clear="none"></div><div>The residue numbering in your PDB file is not continuous, and the jumps in numbering match regions of the structure that are missing residues. Open your PDB structure in PyMOL and you&#39;ll see. Alternatively, print the C-N distances (peptide bond) for consecutive residues: wherever they are larger than ~3Å, you have a break. 
<br clear="none"></div><div><br clear="none"></div><div>As for the discrepancy between the sequences in the FASTA file and the PDB file, that&#39;s just because not all residues are resolved in the crystal structure.</div><div><br clear="none"></div><div>Cheers,</div><div><br clear="none"></div><div>João</div></div><div><br clear="none"><div>2014-10-24 13:10 GMT-05:00 Sanjeev Sariya <span dir="ltr">&lt;<a rel="nofollow" shape="rect" href="mailto:s.sariya_work@ymail.com" target="_blank">s.sariya_work@ymail.com</a>&gt;</span>:<br clear="none"><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div style="color:#000;background-color:#fff;font-family:HelveticaNeue-Light,Helvetica Neue Light,Helvetica Neue,Helvetica,Arial,Lucida Grande,sans-serif;font-size:16px"><div dir="ltr">Hi All,</div><div dir="ltr">I&#39;m having a hard time using and understanding the Biopython PDB module.</div><div dir="ltr">./read_pdb_file.py 3OE6.pdb</div><div dir="ltr"><br clear="none"></div><div dir="ltr">I&#39;m attaching the Python script, PDB file, FASTA file and output to this mail.</div><div dir="ltr">I have the following doubts:</div><div dir="ltr">- When I print the sequence, I get it in broken pieces. Why?</div><div dir="ltr">- Also, the printed sequence doesn&#39;t match the FASTA file (attached).</div><div dir="ltr">- Am I making a silly mistake?</div><div dir="ltr"><br clear="none"></div><div dir="ltr">I am running the script as:<br clear="none"></div><div dir="ltr">python read_pdb_file.py 3OE6.pdb </div><div dir="ltr"><br clear="none"></div><div dir="ltr">Kindly help and guide.<br clear="none"></div><div dir="ltr"><br clear="none"></div></div></div></div><br clear="none">_______________________________________________<br clear="none">

Biopython mailing list  -  <a rel="nofollow" shape="rect" href="mailto:Biopython@mailman.open-bio.org" target="_blank">Biopython@mailman.open-bio.org</a><br clear="none">

<a rel="nofollow" shape="rect" href="http://mailman.open-bio.org/mailman/listinfo/biopython" target="_blank">http://mailman.open-bio.org/mailman/listinfo/biopython</a><br clear="none"></blockquote></div><br clear="none"></div></div></div><br><br></div>  </div> </div>  </div> </div></div></div></div></blockquote></div><br></div>
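<div>The C-N check suggested in the thread can be sketched with only the fixed columns of the ATOM records, which also shows where a blank chain ID comes from. This is a minimal sketch, not the script attached to the original mail: the function names parse_backbone and find_breaks and the 2.0 Å default cutoff are assumptions made for illustration (the thread itself mentions ~3 Å), and alternate locations are not treated specially.</div>

```python
import math


def parse_backbone(pdb_lines):
    """Collect N and C coordinates per residue from ATOM records.

    Keys are (chain, residue number, insertion code). The chain ID is
    column 22 of the record; a blank there gives the chain ID " ".
    """
    residues = {}  # dicts keep insertion order, i.e. file order
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name not in ("N", "C"):
            continue
        key = (line[21], int(line[22:26]), line[26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # If an atom has alternate locations, the last one read wins.
        residues.setdefault(key, {})[name] = xyz
    return residues


def find_breaks(pdb_lines, cutoff=2.0):
    """Return (previous key, next key, distance) for each suspected break.

    cutoff is in angstroms; 2.0 is an assumed value, comfortably above
    the peptide bond length of about 1.33.
    """
    residues = parse_backbone(pdb_lines)
    keys = list(residues)
    breaks = []
    for prev, curr in zip(keys, keys[1:]):
        if prev[0] != curr[0]:
            continue  # do not compare residues across different chains
        c_atom = residues[prev].get("C")
        n_atom = residues[curr].get("N")
        if c_atom is None or n_atom is None:
            continue  # incomplete backbone, nothing to measure
        dist = math.dist(c_atom, n_atom)
        if dist > cutoff:
            breaks.append((prev, curr, round(dist, 2)))
    return breaks


# Possible usage (the file name is taken from the thread):
# with open("3OE6.pdb") as handle:
#     for prev, curr, dist in find_breaks(handle):
#         print(prev, curr, dist)
```

<div>With Bio.PDB the same distance is available directly as next_residue["N"] - residue["C"], since subtracting two Atom objects returns the distance between them.</div>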