[Biopython] parser for KEGG pathways

Giovanni Marco Dall'Olio dalloliogm at gmail.com
Mon Jun 8 14:06:22 UTC 2009


Hi people,
I am writing a simple parser in python to read the KGML format, used to
store KEGG pathways (http://www.genome.jp/kegg/pathway.html).

Here is my code:
- http://github.com/dalloliogm/kegg-kgml-parser--python-/tree/master
and here you can find some details:
- http://bioinfoblog.it/2009/06/a-parser-for-kegg-pathways-in-python/

However, before I go further with this, I would like to ask whether you
know of any existing parser or library that does the same task in Python.
I have been looking for a while, but I could only find one library in R
and one in Ruby. Moreover, I do not have much experience with parsing XML,
and I am sure I will soon make many mistakes without noticing them.


At the moment I have just written a simple command-line tool which can be
used to parse a KGML file and draw it with matplotlib, convert it to other
formats, or play with it as a networkx graph object. However, the plan is
to refactor it into a small library.
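
To give an idea of the parsing step, here is a rough sketch using only
ElementTree. It is not the code from the repository above: the function
name parse_kgml and the toy document are made up for illustration, and it
only assumes the usual KGML layout of <entry> and <relation> elements
under a <pathway> root.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written KGML-like document, for illustration only
# (not a real KEGG pathway).
SAMPLE_KGML = """<pathway name="path:hsa00000" org="hsa" number="00000" title="Toy pathway">
  <entry id="1" name="hsa:1001" type="gene"/>
  <entry id="2" name="hsa:1002" type="gene"/>
  <entry id="3" name="cpd:C00001" type="compound"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
  <relation entry1="2" entry2="3" type="PCrel"/>
</pathway>
"""


def parse_kgml(text):
    """Return (title, nodes, edges) from a KGML string.

    nodes maps entry id -> attribute dict; edges is a list of
    (entry1, entry2, attribute dict) tuples. Hypothetical helper.
    """
    root = ET.fromstring(text)

    # Each <entry> becomes a node keyed by its id attribute.
    nodes = {}
    for entry in root.findall("entry"):
        nodes[entry.get("id")] = {
            "name": entry.get("name"),
            "type": entry.get("type"),
        }

    # Each <relation> becomes a directed edge entry1 -> entry2.
    edges = []
    for rel in root.findall("relation"):
        subtypes = [s.get("name") for s in rel.findall("subtype")]
        edges.append(
            (rel.get("entry1"), rel.get("entry2"),
             {"type": rel.get("type"), "subtypes": subtypes})
        )

    return root.get("title"), nodes, edges


title, nodes, edges = parse_kgml(SAMPLE_KGML)
```

The nodes and edges are kept as plain Python structures so the sketch has
no external dependency. They should fit networkx directly, e.g.
add_nodes_from(nodes.items()) and add_edges_from(edges) on a DiGraph.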

Unfortunately, I think it would be difficult to integrate this with
Biopython, because it needs a new external dependency (networkx -
http://networkx.lanl.gov/index.html) and it uses ElementTree as
included in Python 2.5; if I have understood correctly, Biopython uses a
different parser for XML.


-- 
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it


