[Bioperl-l] Bio::SeqIO::tinyseq
Dave Howorth
dhoworth at mrc-lmb.cam.ac.uk
Wed Jan 28 06:24:57 EST 2004
Heikki Lehvaslaiho wrote:
> The best way to do this is to ignore the root level of the XML, use Perl to
> parse entries out of it, and pass only the entry XML to the parser. This keeps
> the memory usage down and you can parse as large a file as you want.
Hmmm, it may be pragmatic but I'm sure there are other meanings of
'best'. By throwing away the root level you're losing any chance to
validate the document or use other XML tools and you run the risk of
making assumptions ...
> local $/ = "</seqDiff>\n";
... There's no reason to expect there will ALWAYS be a newline following
the tag in a valid XML file (suppose it's created by some XSLT tool that
cares nothing about readability). You're making an assumption above and
beyond the specification about how the XML is represented.
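
For illustration only, here is a minimal sketch of the same chunking idea with
the newline assumption removed: the record separator is the close tag alone.
The sample document and the split_entries helper are hypothetical. Note that
it still assumes the close tag is spelled exactly '</seqDiff>' (XML also
permits whitespace before the '>') and that the string never occurs inside a
comment or CDATA section, so it remains an assumption beyond the
specification, just a smaller one.

```perl
use strict;
use warnings;

# Hypothetical sample: two entries on ONE line, no newline after the close tag.
my $xml = '<root><seqDiff id="a">x</seqDiff><seqDiff id="b">y</seqDiff></root>';

sub split_entries {
    my ($text) = @_;

    # Read from an in-memory filehandle so the sketch is self-contained.
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

    # Close tag only, with no trailing "\n".
    local $/ = '</seqDiff>';

    my @entries;
    while (my $chunk = <$fh>) {
        # Skip the trailing remainder (e.g. "</root>") that has no close tag.
        next unless $chunk =~ m{</seqDiff>\z};

        # Drop anything before the entry's opening tag (e.g. "<root>").
        $chunk =~ s{\A.*?(?=<seqDiff[\s>])}{}s;

        push @entries, $chunk;
    }
    close $fh;
    return @entries;
}

my @entries = split_entries($xml);
print scalar(@entries), " entries\n";
```

With "</seqDiff>\n" as the separator, the same single-line input would come
back as one chunk containing the whole document, which defeats the purpose of
reading entry by entry.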
Cheers, Dave