[BioRuby] [Wg-phyloinformatics] GSOC: BioRuby PhyloXML: Validating XML

Hilmar Lapp hlapp at gmx.net
Sat Jul 4 08:48:05 UTC 2009

On Jul 3, 2009, at 7:40 PM, Diana Jaunzeikare wrote:

> [...]
> Another question is where should phyloxml.xsd schema file go? Is
> lib/bio/db/phyloxml.xsd fine? (the same place where phyloxml_parser.rb
> and phyloxml_elements.rb are).
> What about not placing it anywhere and just using the one at: http://www.phyloxml.org/1.00/phyloxml.xsd
> I was considering it, but then that means that parser is dependent
> on the computer being online and accessing it through internet. If
> that's fine, then I can do that.

I agree, you wouldn't want that as a requirement. (Also, if you
downloaded it from there on the fly, you'd incur further overhead and
would need to provide a way to specify proxy parameters for users
who are behind a firewall.)
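
To make the offline option concrete, here is a rough sketch of
resolving a schema file that ships next to the parser. The helper name
phyloxml_schema_path and its override parameter are made up for
illustration and are not part of BioRuby; the sketch assumes the .xsd
sits in the same directory as the file that defines the helper.

```ruby
# Hypothetical helper (not BioRuby API): locate a bundled phyloxml.xsd
# without touching the network.
def phyloxml_schema_path(override = nil)
  # Let callers point at their own copy of the schema if they have one.
  return File.expand_path(override) if override

  # Otherwise assume the schema is installed alongside this source file.
  File.join(File.dirname(File.expand_path(__FILE__)), 'phyloxml.xsd')
end
```

The resulting path could then be handed to whatever schema-loading
call the parser uses, so validation works without a connection.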

Aside from that, it may be worth thinking about whether you want to
reject the entire file with an exception if a single element (tree or
annotation) fails to validate, as opposed to accepting all records
that validate and raising an exception only on the one that doesn't.

The latter is typically how stream parsers of various formats will  
behave (except that they'll stop and abort the stream upon  
encountering a record that is invalid), but it may not apply all that  
well to XML parsing. Just thought I'd raise the question.
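
To illustrate the two behaviours, a minimal sketch in plain Ruby
follows. It uses strings as stand-in records and a lambda as a
stand-in validator; the error class and both method names are
hypothetical, and no actual XML or schema validation is performed.

```ruby
# Hypothetical error class, for illustration only.
class InvalidRecordError < StandardError; end

# Option 1: check every record up front and reject the whole input
# if any single one fails.
def each_record_all_or_nothing(records, validator)
  bad = records.index { |r| !validator.call(r) }
  raise InvalidRecordError, "record #{bad} failed validation" if bad
  records.each { |r| yield r }
end

# Option 2: stream-style. Hand over each valid record as it is read
# and abort at the first invalid one.
def each_record_streaming(records, validator)
  records.each_with_index do |r, i|
    raise InvalidRecordError, "record #{i} failed validation" unless validator.call(r)
    yield r
  end
end

records = ['tree1', 'tree2', '', 'tree4'] # the empty string plays the invalid record
valid   = ->(r) { !r.empty? }

strict_seen = []
begin
  each_record_all_or_nothing(records, valid) { |r| strict_seen << r }
rescue InvalidRecordError
  # nothing was delivered to the caller
end

stream_seen = []
begin
  each_record_streaming(records, valid) { |r| stream_seen << r }
rescue InvalidRecordError
  # the records before the invalid one were already delivered
end
```

With the streaming variant the caller has already received the first
two records by the time the exception is raised, which is the
trade-off to weigh.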

: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
