Peter biopython at maubp.freeserve.co.uk
Thu Jun 19 13:38:29 UTC 2008

> Bio.CDD is a module with a parser for CDD (NCBI's Conserved Domain Database)
> records. The parser parses HTML pages from CDD's web site. Since the parser
> was written about six years ago, the CDD web site has changed considerably.
> Bio.CDD therefore cannot parse current HTML pages from CDD.

A couple of years ago, I wanted to get the CDD domain name and
description and ended up writing my own very simple and crude parser
to extract just this information.  Doing a proper job would mean
extracting lots and lots of fields, e.g.

I wonder whether the NCBI make any of this available as XML via Entrez.
I had a quick look and couldn't find anything.
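
If something like that did exist, pulling out just the name and
description would be fairly painless. Below is a rough sketch only,
assuming the records came back in the generic Entrez eSummary "DocSum"
layout. The sample XML, the field names "Title" and "Abstract", and the
function extract_name_and_description are all made up for illustration;
none of them are confirmed for CDD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the generic eSummary DocSum layout.
# The Id value and the field names are placeholders, not real CDD output.
SAMPLE_XML = """<eSummaryResult>
  <DocSum>
    <Id>12345</Id>
    <Item Name="Title" Type="String">example_domain</Item>
    <Item Name="Abstract" Type="String">Placeholder description text.</Item>
  </DocSum>
</eSummaryResult>"""


def extract_name_and_description(xml_text, name_field="Title",
                                 desc_field="Abstract"):
    """Return a list of (id, name, description) tuples, one per DocSum.

    name_field and desc_field are assumed field names; a missing field
    gives None for that slot.
    """
    root = ET.fromstring(xml_text)
    results = []
    for docsum in root.iter("DocSum"):
        # Map each Item's Name attribute to its text content.
        fields = {}
        for item in docsum.findall("Item"):
            fields[item.get("Name")] = item.text or ""
        results.append((docsum.findtext("Id"),
                        fields.get(name_field),
                        fields.get(desc_field)))
    return results


if __name__ == "__main__":
    for record in extract_name_and_description(SAMPLE_XML):
        print(record)
```

Keeping the field names as arguments means only the call needs changing
if the real records turn out to label things differently.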


