[BioRuby] GSOC: BioRuby PhyloXML: Validating XML

Diana Jaunzeikare rozziite at gmail.com
Fri Jul 3 15:01:39 UTC 2009


Hi all,

I have chosen to validate the input XML file at the parser's initialization
step, using the libxml validator against a specified schema. It is quick
(about 1 second on my machine for the tree of life XML) and it solves a lot
of problems. In my parser I no longer have to worry about the user supplying
an invalid XML file, and I don't have to do error checking for that case,
which reduces the parsing overhead.

I have a question: if the libxml validator finds something wrong with the
XML file (and for errors in general), where should the errors go? Should an
exception be raised, or should they be printed to stdout or to the error
output?

Another question: where should the phyloxml.xsd schema file go? Is
lib/bio/db/phyloxml.xsd fine? (That is the same place as phyloxml_parser.rb
and phyloxml_elements.rb.)
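
If the schema does sit next to the parser, the parser can locate it
relative to its own source file rather than the current working directory.
A minimal sketch using only the standard library follows; the helper name
and the constant are hypothetical.

```ruby
# Returns the absolute path of the file `name` located in the same
# directory as the source file `from`.
def sibling_path(from, name)
  File.join(File.dirname(File.expand_path(from)), name)
end

# Inside lib/bio/db/phyloxml_parser.rb this would resolve to
# lib/bio/db/phyloxml.xsd, wherever the library is installed.
SCHEMA_PATH = sibling_path(__FILE__, 'phyloxml.xsd')
```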

Using the validator, I understood that elements have to appear in the order
the schema specifies (for example, the name element of a phylogeny should
come before its clade element). Correct me if I am wrong. If that's the
case, it will allow me to simplify some code.

Have a good 4th July weekend!

Diana


