<div dir="ltr">Hello to the BioPython mailing-list,<br>I'm using BioPython to calculate
the dihedral angles in a protein, together with the total number of
residues in each chain. To count the residues I use this
construct:<br><br><div style="margin-left:40px"><b>resseq_list = []</b><br><b>for residue in chain:</b><br><b>&nbsp;&nbsp;&nbsp;&nbsp;# full id is (structure id, model id, chain id, (hetero flag, resseq, insertion code))</b><br><b>&nbsp;&nbsp;&nbsp;&nbsp;residue_full_id = residue.get_full_id()</b><br><b>&nbsp;&nbsp;&nbsp;&nbsp;resseq = residue_full_id[3][1]</b><br><b>&nbsp;&nbsp;&nbsp;&nbsp;resseq_list.append(resseq)</b><br><b># report after the loop, once every residue number has been collected</b><br><b>print("\nThe first residue of chain %s is %s" % (chain.id, resseq_list[0]))</b><br><b>print("The last residue of chain %s is %s" % (chain.id, resseq_list[-1]))</b><br><b>print("The total number of residues in chain %s is %s\n" % (chain.id, len(resseq_list)))</b><br></div><br>However, the chain IDs it reports differ from those shown, for example, in PyMOL.<br><br>Trying
to figure out the cause, I compared a PDB file with a CIF file for the
same macromolecule and realized that the difference comes from the data items "<b>_atom_site.label_asym_id</b>" and "<b>_atom_site.auth_asym_id</b>" of the CIF file, which in my file occupy columns [27:28] and [88:89] of the ATOM rows.<br><br>After reading the OpenStructure mmCIF documentation <a href="http://www.openstructure.org/docs/1.3/io/mmcif/" target="_blank">here</a>, in particular "<b>AddMMCifPDBChainTr (cif_chain_id, pdb_chain_id)</b>", I suspected that the BioPython CIF parser uses "<b>label_asym_id</b>" rather than "<b>auth_asym_id</b>". So I opened the file <b>MMCIFParser.py</b> and indeed found, at line 37:<br><br><div style="margin-left:40px"><b> chain_id_list=mmcif_dict["_atom_site.label_asym_id"]</b><br></div><br>I tried replacing it with:<br><br><div style="margin-left:40px"><b> chain_id_list=mmcif_dict["_atom_site.auth_asym_id"]</b><br></div><br>and, after rerunning my script, the output matched what PyMOL reports for some test CIF files, but not for all.<br><br>Is there an option, in BioPython, that
makes the parser report the same chain IDs as PyMOL directly? If not, it might be a
good idea to implement one, along the lines of what is described on <a href="http://www.openstructure.org/docs/1.3/io/mmcif/" target="_blank">that web page</a>.<br>Also, is there a better way to get the total number of residues in each chain than the one I use above?<br><br>Thanks a lot, and many greetings to the BioPython mailing-list: this is my first time here!<br><br>Riccardo Volpe<div><div class="gmail_signature"><div dir="ltr"><div style="text-align:center"><br></div><div style="text-align:center"><i style="color:rgb(102,102,102)"><a href="http://chembioscripting.hol.es" target="_blank">X3D PyMOL Molecule Viewer (WebGL–powered)</a></i><br><i style="color:rgb(102,102,102)"><font face="arial, helvetica, sans-serif">ChemBioScripting | Gioacchino Riccardo Volpe<br></font></i></div></div></div></div>
</div>
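<div dir="ltr">To make the label/auth difference described in the post concrete, here is a minimal, standard-library-only sketch. It does not use BioPython: it reads a hypothetical, heavily trimmed <b>_atom_site</b> loop (the sample rows and the helper name <b>residues_per_chain</b> are invented for illustration) and groups residue numbers by either chain identifier. It assumes whitespace-separated values with no quoting and ignores insertion codes, so it should not be taken as a replacement for a real mmCIF parser.</div>

```python
# Hypothetical, trimmed _atom_site loop (invented for illustration, not a real entry).
SAMPLE_CIF = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.auth_seq_id
_atom_site.auth_asym_id
ATOM 1 N  MET A 1 10 H
ATOM 2 CA MET A 1 10 H
ATOM 3 N  GLY A 2 11 H
ATOM 4 N  ALA B 1 1  L
ATOM 5 CA ALA B 1 1  L
#
"""


def residues_per_chain(cif_text, chain_key):
    """Map each chain ID (taken from chain_key) to its residue numbers, in file order.

    Assumption: values are plain whitespace-separated tokens (no quoted strings).
    """
    columns = []  # _atom_site column names, in the order they are declared
    chains = {}   # chain ID -> list of distinct auth_seq_id values
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.startswith("_atom_site."):
            columns.append(line)
            continue
        if not columns:
            continue  # still before the _atom_site loop
        if not line or line.startswith(("#", "_", "loop_")):
            break     # the _atom_site loop has ended
        fields = line.split()
        if len(fields) != len(columns):
            continue  # skip rows this naive tokenizer cannot handle
        row = dict(zip(columns, fields))
        seen = chains.setdefault(row[chain_key], [])
        if row["_atom_site.auth_seq_id"] not in seen:
            seen.append(row["_atom_site.auth_seq_id"])
    return chains


for key in ("_atom_site.label_asym_id", "_atom_site.auth_asym_id"):
    for chain_id, seqs in residues_per_chain(SAMPLE_CIF, key).items():
        print("%s: chain %s, first %s, last %s, total %d"
              % (key, chain_id, seqs[0], seqs[-1], len(seqs)))
```

<div dir="ltr">In this invented sample the same atoms are grouped under chains A and B when keyed by <b>label_asym_id</b>, and under H and L when keyed by <b>auth_asym_id</b>, while the residue counts stay the same. That is the kind of mismatch in chain IDs described above.</div>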