<div dir="ltr"><div><div>OK, after reading the MMCIFParser.py file, I found that &quot;len(resseq_list)&quot; is equivalent to &quot;len(chain.get_list())&quot;.<br><br>What remains to be worked out is how, when reading a CIF file, to get chain IDs that match those used in the PDB file, that is, to get &quot;auth_asym_id&quot; instead of &quot;label_asym_id&quot;: is there a built-in option in BioPython?<br><br></div>Thanks,<br></div>Riccardo<br></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr"><div style="text-align:center"><br></div><div style="text-align:center"><img src="https://plus.google.com/u/0/_/focus/photos/public/AIbEiAIAAABECPas2pe47LbIpQEiC3ZjYXJkX3Bob3RvKihkOWFmMzgzOWQ3ZTE1OWYzMGMyNmNhOGI2YmFmNzU2Mzk2NmZlY2NjMAER5cOjjj1HpOQcvD_9Ht-8vXsrPw?sz=32"> <i style="color:rgb(102,102,102)"><a href="http://chembioscripting.hol.es" target="_blank">X3D PyMOL Molecule Viewer (WebGL&ndash;powered)</a></i><i style="color:rgb(102,102,102)"><i style="color:rgb(102,102,102)"><br></i></i><span style="color:rgb(102,102,102);font-family:Verdana,Arial,Helvetica,sans-serif;font-size:16px"></span><i style="color:rgb(102,102,102)"><font face="arial, helvetica, sans-serif">ChemBioScripting | Gioacchino Riccardo Volpe<br></font></i></div></div></div></div>
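<div>The open question about &quot;auth_asym_id&quot; versus &quot;label_asym_id&quot; can be explored with a minimal sketch like the one below. It assumes a dictionary shaped like the one returned by <b>MMCIF2Dict</b> (from <b>Bio.PDB.MMCIF2Dict</b>), where each &quot;_atom_site&quot; key holds one list entry per atom row. The function name <b>label_to_auth_chain_ids</b> and the toy data are hypothetical and only illustrate the idea; this is not how MMCIFParser itself is implemented.<br><br></div>
<pre>
```python
from collections import OrderedDict


def label_to_auth_chain_ids(mmcif_dict):
    """Map each label_asym_id to the auth_asym_id values seen with it."""
    labels = mmcif_dict["_atom_site.label_asym_id"]
    auths = mmcif_dict["_atom_site.auth_asym_id"]
    mapping = OrderedDict()
    for label_id, auth_id in zip(labels, auths):
        seen = mapping.setdefault(label_id, [])
        if auth_id not in seen:
            seen.append(auth_id)
    return mapping


# Hypothetical _atom_site columns (one entry per atom row).
# Label chains C and D stand for ligand or water blocks that the
# author assigned to chains H and L; all values are made up.
toy_dict = {
    "_atom_site.label_asym_id": ["A", "A", "B", "B", "C", "D"],
    "_atom_site.auth_asym_id": ["H", "H", "L", "L", "H", "L"],
}

mapping = label_to_auth_chain_ids(toy_dict)
for label_id, auth_ids in mapping.items():
    print("label %s maps to auth %s" % (label_id, ",".join(auth_ids)))
```
</pre>
<div>With a real file, the dictionary would come from something like MMCIF2Dict(&quot;my_file.cif&quot;) (the file name is a placeholder). Several label chains can map to the same author chain, because mmCIF gives ligands and waters their own &quot;label_asym_id&quot;, so the two columns are not always interchangeable one to one.<br><br></div>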

<br><div class="gmail_quote">2015-01-26 19:36 GMT+01:00 Riccardo <span dir="ltr">&lt;<a href="mailto:mitma07@gmail.com" target="_blank">mitma07@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello to the BioPython mailing list,<br>I&#39;m using BioPython to calculate the dihedral angles in a protein together with the total number of residues for each chain. To count the residues, I used this construct:<br><br><div style="margin-left:40px"><b>&nbsp;&nbsp;&nbsp; resseq_list = []</b><br><b>&nbsp;&nbsp;&nbsp; for residue in chain:</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # print(residue)</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; residue_full_id = residue.get_full_id()</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # print(residue_full_id)</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # full id is (structure, model, chain, (hetflag, resseq, icode))</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; resseq = residue_full_id[3][1]</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # print(resseq)</b><br><b>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; resseq_list.append(resseq)</b><br><b>&nbsp;&nbsp;&nbsp; # print(resseq_list)</b><br><b>&nbsp;&nbsp;&nbsp; print(&quot;\nThe first residue of chain %s is %s&quot; % (chain.id, resseq_list[0]))</b><br><b>&nbsp;&nbsp;&nbsp; print(&quot;The last residue of chain %s is %s&quot; % (chain.id, resseq_list[-1]))</b><br><b>&nbsp;&nbsp;&nbsp; print(&quot;The total number of residues in chain %s is %s\n&quot; % (chain.id, len(resseq_list)))</b><br></div><br>but the chain IDs differ from those shown, for example, in PyMOL.<br><br>Trying

 to figure out the cause, I compared a PDB file with a CIF file for the same macromolecule and realized that it lies in the data items &quot;<b>_atom_site.label_asym_id</b>&quot; and &quot;<b>_atom_site.auth_asym_id</b>&quot; of the CIF file, which appear at character positions [27:28] and [88:89] of the ATOM rows in my CIF file.<br><br>After reading <a href="http://www.openstructure.org/docs/1.3/io/mmcif/" target="_blank">this page</a>, in particular the entry for &quot;<b>AddMMCifPDBChainTr (cif_chain_id, pdb_chain_id)</b>&quot;, I suspected that the BioPython CIF parser uses &quot;<b>label_asym_id</b>&quot; rather than &quot;<b>auth_asym_id</b>&quot;. So I opened the file <b>MMCIFParser.py</b> and indeed found, at line 37:<br><br><div style="margin-left:40px"><b>&nbsp;&nbsp;&nbsp; chain_id_list=mmcif_dict[&quot;_atom_site.label_asym_id&quot;]</b><br></div><br>I tried replacing it with:<br><br><div style="margin-left:40px"><b>&nbsp;&nbsp;&nbsp; chain_id_list=mmcif_dict[&quot;_atom_site.auth_asym_id&quot;]</b><br></div><br>and, after rerunning my script, the output matched what PyMOL reports for some of my test CIF files, but not for all of them.<br><br>Is there an option in BioPython that 

enables output directly in that format? If not, it might be a good idea to implement one, as described on <a href="http://www.openstructure.org/docs/1.3/io/mmcif/" target="_blank">that web page</a>.<br>Is there also a better way to get the total number of residues for each chain than the one I used?<br><br>Thanks a lot, and many greetings to the BioPython mailing list: this is my first time here!<br><br>Riccardo Volpe
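<div><br>On the residue-count question, the sketch below reproduces the first/last/total report without the intermediate loop. It is only an illustration under stated assumptions: it works on a plain list of tuples shaped like BioPython residue ids, (hetero flag, resseq, insertion code), and the helper <b>summarize_chain</b> and its <b>polymer_only</b> switch are hypothetical names, not part of BioPython. It also shows that a plain count includes hetero groups and waters, which may or may not be what is wanted.<br><br></div>
<pre>
```python
def summarize_chain(residue_ids, polymer_only=False):
    """Return (first_resseq, last_resseq, count) for one chain.

    residue_ids: tuples shaped like Biopython residue ids,
    (hetero_flag, resseq, insertion_code); hetero_flag is " " for
    standard residues, "W" for water, "H_xxx" for other hetero groups.
    """
    if polymer_only:
        # Keep only standard residues (blank hetero flag).
        residue_ids = [rid for rid in residue_ids if rid[0] == " "]
    if not residue_ids:
        return None, None, 0
    return residue_ids[0][1], residue_ids[-1][1], len(residue_ids)


# Hypothetical chain: three amino acids, one ligand, one water.
example_ids = [
    (" ", 10, " "),
    (" ", 11, " "),
    (" ", 12, " "),
    ("H_GOL", 201, " "),
    ("W", 301, " "),
]

first, last, total = summarize_chain(example_ids)
print("all residues: first=%s last=%s total=%s" % (first, last, total))

first_p, last_p, total_p = summarize_chain(example_ids, polymer_only=True)
print("polymer only: first=%s last=%s total=%s" % (first_p, last_p, total_p))
```
</pre>
<div>With a parsed structure, the input list could be built as [residue.id for residue in chain], and the unfiltered count then equals len(chain).<br></div>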

</div>

</blockquote></div><br></div>