[Biopython] Some help to access "hidden" features :-)
Téletchéa Stéphane
stephane.teletchea at inserm.fr
Thu Mar 7 21:26:37 UTC 2013
Dear biopythoners,
I am struggling to extract some information from a UniProt file.
a) Get the initial file, for instance
http://www.uniprot.org/uniprot/P02724.xml
b) parse it:
python
>>> from Bio import SeqIO
>>> record = list(SeqIO.parse("P02724.xml", "uniprot-xml"))
>>> print(record[0].dbxrefs)
...
>>> for i in record[0].dbxrefs:
...     if 'PDB:' in i:
...         print(i)
...
PDB:1AFO
PDB:1MSR
PDB:2KPE
PDB:2KPF
In the Uniprot file, there are annotations for the 1AFO model: NMR
method, starts at 81 and ends at 120.
The corresponding entry in the xml file is:
<dbReference type="PDB" id="1AFO">
<property type="method" value="NMR"/>
<property type="chains" value="A/B=81-120"/>
</dbReference>
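To make explicit what I would like to get out of the "chains" property, here is a small sketch that splits such a value into per-chain residue ranges. The helper name parse_chains is made up for this example, and the format (chain IDs separated by "/", then "=start-end", with groups separated by commas) is only assumed from the two values quoted in this message.

```python
def parse_chains(value):
    """Split a 'chains' value such as 'A/B=81-120' into (chain, start, end) tuples.

    Hypothetical helper: the format is inferred from the examples in this
    post, not taken from a UniProt specification.
    """
    result = []
    for group in value.split(","):
        group = group.strip()
        if not group:
            continue
        # 'A/B=81-120' -> chains 'A/B', span '81-120'
        chains, _, span = group.partition("=")
        start, _, end = span.partition("-")
        for chain_id in chains.split("/"):
            result.append((chain_id, int(start), int(end)))
    return result


print(parse_chains("A/B=81-120"))
# [('A', 81, 120), ('B', 81, 120)]
```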
According to the module source code
(http://biopython.org/DIST/docs/api/Bio.SeqIO.UniprotIO-pysrc.html),
it should be possible to access this data, which seems to be handled correctly:
def _parse_dbReference(element):
    self.ParsedSeqRecord.dbxrefs.append(element.attrib['type'] + ':' + element.attrib['id'])
    #e.g.
    # <dbReference type="PDB" key="11" id="2GEZ">
    #   <property value="X-ray" type="method"/>
    #   <property value="2.60 A" type="resolution"/>
    #   <property value="A/C/E/G=1-192, B/D/F/H=193-325" type="chains"/>
    # </dbReference>
However, I'm unable to get any further than the print step in the loop above ...
How can I extract this information for the 'i' object above?
Do I have to use another approach?
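The only alternative I can think of is to bypass SeqIO and read the dbReference elements straight from the XML with the standard library. Below is a rough sketch of that idea, not something taken from Biopython. It assumes that the file uses the default namespace http://uniprot.org/uniprot; the function name pdb_properties is invented for the example, and SAMPLE is just the fragment quoted above wrapped in a minimal root element.

```python
import xml.etree.ElementTree as ET

# Assumption: UniProt XML declares this default namespace on its root element.
UNIPROT_NS = "http://uniprot.org/uniprot"
NS = {"up": UNIPROT_NS}

# Trimmed-down stand-in for P02724.xml, containing only the fragment quoted above.
SAMPLE = """<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <dbReference type="PDB" id="1AFO">
      <property type="method" value="NMR"/>
      <property type="chains" value="A/B=81-120"/>
    </dbReference>
  </entry>
</uniprot>"""


def pdb_properties(root):
    """Map each PDB id to a dict of its <property> type/value pairs."""
    found = {}
    for ref in root.iter("{%s}dbReference" % UNIPROT_NS):
        if ref.get("type") != "PDB":
            continue
        props = {}
        for prop in ref.findall("up:property", NS):
            props[prop.get("type")] = prop.get("value")
        found[ref.get("id")] = props
    return found


root = ET.fromstring(SAMPLE)
info = pdb_properties(root)
print(info["1AFO"]["method"], info["1AFO"]["chains"])
```

For the real file, the root would come from ET.parse("P02724.xml").getroot() instead of the SAMPLE string. This works, but it duplicates parsing that UniprotIO already does, so I would prefer a way to reach the same values through the SeqRecord.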
Thanks a lot for your comments, links and remarks.
Stéphane
--
Equipe DSIMB - Dynamique des Structures et
des Interactions des Macromolécules Biologiques
INTS, INSERM-Paris-Diderot UMR-S665
6 rue Alexandre Cabanel - 75739 Paris cedex 15- France
Tél : +33 144 493 057
Fax : +33 147 347 431
http://www.dsimb.inserm.fr / http://steletch.free.fr