[Bioperl-l] XML parser preference?

Chris Fields cjfields at uiuc.edu
Thu Aug 10 13:35:21 UTC 2006


Jurgen,

Thanks for pointing that out!  However, the problem is we want to  
keep the number of dependencies down; there are already four XML  
parser dependencies for Bioperl (XML::Twig is one, but XML::LibXML  
isn't).

Maybe new modules which require XML parsing should stick with four XML
parsers, though not the current four (XML::DOM, XML::Twig,
XML::Parser, XML::SAX).

Maybe we should pick four XML parsers, each with their own particular  
strengths:

1)  XML::SAX (SAX parsing; flexible, can use pure Perl, ExpatXS, etc.)
      Switch modules using XML::Parser to XML::SAX (done for
      Bio::SearchIO::blastxml)
2)  XML::LibXML (DOM parsing; maintained, up to date, fast)
      Switch modules using XML::DOM to XML::LibXML
3)  XML::Twig (DOM-like, SAX-based) - great for processing 'chunks'
      of XML
      Used in Bio::DB::Taxonomy::entrez
4)  XML::Simple (small XML) - very easy-to-use XML parser

Since they are currently available for most (all?) OSes, this
shouldn't be a problem.  What do you think, Mauricio?

Chris


On Aug 10, 2006, at 7:29 AM, Jurgen Pletinckx wrote:

> | I have no doubt that XML::LibXML is a great parser (I've used it a
> | few times), the problem with it is that it runs on top of libxml2's
> | C library. On *nix systems it's fairly simple to have this
> | dependency compiled and running, but what about having it under
> | other OS's (e.g. Windows)?
> |
> | Introducing XML::LibXML as a dependency into the toolkit will
> | probably place EUtilities as a module not usable by everyone,
> | especially those who use BioPerl in a OS where installing/compiling
> | C dependencies can be a headache.
>
> Regarding XML::LibXML, there does appear to be an up-to-date ppm
> package (which fetches libxml2.dll) at
>
> http://theoryx5.uwinnipeg.ca/ppms/XML-LibXML.ppd
>
> (and less than a week since the release of the corresponding
> version to cpan, too.)
>
> So the threshold for distribution to Windows, at least, is less
> high than it might have been.
>
> -- 
> Jurgen Pletinckx
> AlgoNomics NV
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





