[Biopython-dev] [Biopython - Bug #3433] (New) MMCIFParser fails on python3 for disordered atoms

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Tue May 28 07:50:41 UTC 2013


Issue #3433 has been reported by Alexander Campbell.

----------------------------------------
Bug #3433: MMCIFParser fails on python3 for disordered atoms
https://redmine.open-bio.org/issues/3433

Author: Alexander Campbell
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: 
URL: 


The new shlex-based parser works under Python 3, but it reveals that Python 3's changed comparison rules lead to unhandled exceptions when parsing disordered atoms. It also reveals that the occupancy and temperature factor attributes of Atom objects were never cast from str to float when parsed from mmCIF files.
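
To illustrate the failure mode, here is a minimal sketch that does not use Biopython. The values and the @is_higher@ helper are hypothetical stand-ins for an uncast occupancy token and the comparison in Atom.py; it assumes the stored value is numeric while the newly parsed value is still a str.

```python
# Hypothetical values: a numeric occupancy already stored, and a raw
# token from the mmCIF file that was never cast from str.
last_occupancy = 0.5
occupancy = "0.70"


def is_higher(new, last):
    """Stand-in for the `occupancy > self.last_occupancy` comparison."""
    return new > last


# Python 3 refuses to order str against float and raises TypeError.
# Python 2 would silently return a result based on the operand types.
try:
    is_higher(occupancy, last_occupancy)
    outcome = "compared"
except TypeError:
    outcome = "TypeError"

# Casting first gives a meaningful numeric comparison on both versions.
cast_outcome = is_higher(float(occupancy), last_occupancy)
```

Under Python 3 the uncast comparison raises, while the cast version compares numerically. That is the behavior the patch relies on.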

The comparison that raises the exception under Python 3 is at Atom.py line 333: @if occupancy>self.last_occupancy:@ . The exception can be prevented by modifying MMCIFParser.py to cast occupancy and temperature factor to float. The following patch is essentially a copy of the equivalent code in PDBParser.py:
<pre>
diff --git a/Bio/PDB/MMCIFParser.py b/Bio/PDB/MMCIFParser.py
index 64d16bc..4be6490 100644
--- a/Bio/PDB/MMCIFParser.py
+++ b/Bio/PDB/MMCIFParser.py
@@ -84,8 +84,15 @@ class MMCIFParser(object):
                 altloc=" "
             resseq=seq_id_list[i]
             name=atom_id_list[i]
-            tempfactor=b_factor_list[i]
-            occupancy=occupancy_list[i]
+            # occupancy & B factor
+            try:
+                tempfactor=float(b_factor_list[i])
+            except ValueError:
+                raise PDBConstructionException("Invalid or missing B factor")
+            try:
+                occupancy=float(occupancy_list[i])
+            except ValueError:
+                raise PDBConstructionException("Invalid or missing occupancy")
             fieldname=fieldname_list[i]
             if fieldname=="HETATM":
                 hetatm_flag="H"

</pre>

This patch was tested with the "mmCIF file for PDB structure 3u8h":http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=cif&compression=NO&structureId=3U8H , which triggers the mmCIF parsing exception under Python 3.2. After the patch, parsing raised no exceptions and the occupancy and bfactor attributes had the correct type (float). The patch was also tested under Python 2.7, where it worked and also produced the correct types. I haven't tested earlier versions of Python 2, but the simple syntax ought to work.

Could a dev apply this patch? Or better yet, suggest a patch that casts the types at the StructureBuilder level, which would make the conversion independent of the specific parser used? This is just a minimal quick fix; I'm sure a better solution is possible.
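
As a starting point for that discussion, a parser-independent cast might look roughly like the sketch below. Everything in it is hypothetical: @ConstructionError@ stands in for PDBConstructionException, and @to_float@ and @init_atom_fields@ are illustrative names, not existing StructureBuilder methods.

```python
class ConstructionError(Exception):
    """Hypothetical stand-in for PDBConstructionException."""


def to_float(value, label):
    """Cast a parsed field to float, or raise a construction error.

    Accepts values that are already numeric, so parsers that cast
    early and parsers that pass raw strings behave the same way.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        raise ConstructionError("Invalid or missing %s: %r" % (label, value))


def init_atom_fields(b_factor, occupancy):
    """Hypothetical hook a builder could call before creating an Atom."""
    return to_float(b_factor, "B factor"), to_float(occupancy, "occupancy")
```

For example, @init_atom_fields("12.5", "0.70")@ returns a pair of floats, while a placeholder token such as @"?"@ raises the construction error. Whether a missing value should raise or fall back to a default is a design decision left open here.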

