Hi,

What would be the most direct way of parsing XML files downloaded from the PubMed Central FTP site using Biopython? These files use the archivearticle.dtd, and when parsed with non-DTD-based code they produce broken paragraphs in the body of the document, due to the < > between <p> items of the body.

Thanks in advance,
Paulo
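
P.S. In case it helps to make the symptom concrete, here is a minimal sketch using only the standard library's `xml.etree.ElementTree` rather than Biopython. The `SAMPLE` fragment, the `<italic>` inline tag, and the helper name `paragraph_texts` are assumptions made up for illustration; real PubMed Central files are much larger and declare the DTD. It shows how reading only `.text` cuts a paragraph off at the first nested tag, while `itertext()` recovers the whole paragraph.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for an article body (not a real PMC file).
SAMPLE = "<body><p>Expression of <italic>BRCA1</italic> was reduced.</p></body>"


def paragraph_texts(xml_string):
    """Return the full text of each <p>, including text inside nested tags."""
    root = ET.fromstring(xml_string)
    # itertext() walks the element and all of its descendants in document order.
    return ["".join(p.itertext()) for p in root.iter("p")]


root = ET.fromstring(SAMPLE)
first_p = next(root.iter("p"))

# .text holds only the text that precedes the first child element.
truncated = first_p.text
# Joining itertext() gives the paragraph as a reader would see it.
full = paragraph_texts(SAMPLE)[0]

print(repr(truncated))
print(repr(full))
```

This only works around the broken paragraphs by flattening the inline markup; it does not validate against the DTD, so I would still like to know whether Biopython offers a more direct route.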