[Biopython] Some help to access "hidden" features :-)

Téletchéa Stéphane stephane.teletchea at inserm.fr
Thu Mar 7 21:26:37 UTC 2013


Dear biopythoners,

I am struggling to extract some information from a UniProt file.

a) Get the initial file, for instance
http://www.uniprot.org/uniprot/P02724.xml
b) parse it:

python
>>> from Bio import SeqIO
>>> record = list(SeqIO.parse("P02724.xml", "uniprot-xml"))
>>> print(record[0].dbxrefs)
...

>>> for i in record[0].dbxrefs:
...     if 'PDB:' in i:
...         print(i)
...
PDB:1AFO
PDB:1MSR
PDB:2KPE
PDB:2KPF

In the UniProt file, the 1AFO model is annotated with the NMR method
and a residue range that starts at 81 and ends at 120.

The corresponding entry in the xml file is:

<dbReference type="PDB" id="1AFO">
<property type="method" value="NMR"/>
<property type="chains" value="A/B=81-120"/>
</dbReference>
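
For illustration, the "chains" value packs the chain identifiers and a residue range into a single string. Below is a minimal sketch of splitting it, assuming the value is always made of comma-separated groups of the form IDS=START-END. The name parse_chains is a hypothetical helper, not part of Biopython.

```python
def parse_chains(value):
    """Split a UniProt 'chains' value such as 'A/B=81-120'.

    Returns a dict mapping each chain ID to a (start, end) tuple.
    Assumption: comma-separated groups of the form 'IDS=START-END'.
    """
    ranges = {}
    for group in value.split(","):
        # 'A/B=81-120' -> ids 'A/B', span '81-120'
        ids, span = group.strip().split("=")
        start, end = (int(n) for n in span.split("-"))
        for chain_id in ids.split("/"):
            ranges[chain_id] = (start, end)
    return ranges


print(parse_chains("A/B=81-120"))
```

Values that do not follow the assumed pattern would raise a ValueError here, so real data may need more defensive handling.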

According to the module source code
(http://biopython.org/DIST/docs/api/Bio.SeqIO.UniprotIO-pysrc.html),
it is possible to access these data; they are correctly handled:

        def _parse_dbReference(element):
            self.ParsedSeqRecord.dbxrefs.append(element.attrib['type'] + ':' + element.attrib['id'])
            #e.g.
            # <dbReference type="PDB" key="11" id="2GEZ">
            #   <property value="X-ray" type="method"/>
            #   <property value="2.60 A" type="resolution"/>
            #   <property value="A/C/E/G=1-192, B/D/F/H=193-325" type="chains"/>
            # </dbReference>


However, I am unable to get any further than the print in the loop above...

How can I extract this information for the 'i' object above?
Do I have to use another approach?
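
One possible alternative, sketched here under assumptions rather than as the Biopython way, is to read the XML directly with the standard library's xml.etree.ElementTree. The sketch assumes that the UniProt elements live in the http://uniprot.org/uniprot namespace. The name pdb_properties is hypothetical, and the inline SAMPLE document only stands in for P02724.xml.

```python
import xml.etree.ElementTree as ET

# Assumption: UniProt XML elements are declared in this namespace.
NS = "{http://uniprot.org/uniprot}"


def pdb_properties(root):
    """Map each PDB id to a dict of its <property> type/value pairs."""
    result = {}
    for ref in root.iter(NS + "dbReference"):
        if ref.get("type") != "PDB":
            continue
        result[ref.get("id")] = {
            prop.get("type"): prop.get("value")
            for prop in ref.findall(NS + "property")
        }
    return result


# Small inline document standing in for the downloaded file.
SAMPLE = """<uniprot xmlns="http://uniprot.org/uniprot">
<entry>
<dbReference type="PDB" id="1AFO">
<property type="method" value="NMR"/>
<property type="chains" value="A/B=81-120"/>
</dbReference>
</entry>
</uniprot>"""

props = pdb_properties(ET.fromstring(SAMPLE))
print(props["1AFO"]["method"], props["1AFO"]["chains"])
```

For the real file, the root element could be obtained with ET.parse("P02724.xml").getroot() and passed to the same function.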

Thanks a lot for your comments, links and remarks.

Stéphane

-- 
Equipe DSIMB - Dynamique des Structures et
des Interactions des Macromolécules Biologiques
INTS, INSERM-Paris-Diderot UMR-S665
6 rue Alexandre Cabanel - 75739 Paris cedex 15- France
Tél : +33 144 493 057
Fax : +33 147 347 431
http://www.dsimb.inserm.fr / http://steletch.free.fr




