[Bioperl-l] Bio::Root::IO reads URLs from -file

Allen Day allenday at ucla.edu
Tue Aug 10 15:29:37 EDT 2004


On Tue, 10 Aug 2004, Peter van Heusden wrote:

> Hilmar Lapp wrote:
> 
> > I lean with Ewan to -url as I like explicit commands better than 
> > possibly dubious magic behind the scenes ... imagine someone stores 
> > files by names that match their url ...
> >
> > There's one thing though that's important IMO that Jason brings up: I 
> > don't know how you implemented this but I think Bio::Root::IO must not 
> > be dependent on LWP or any such beast that doesn't come with perl.
> >
> I'm with the majority in that 'magic' creates possible confusion and 
> more room for error. As to Hilmar's idea of not depending on LWP, I 
> think this is also a good idea, and maybe the URL code can be a kind of 
> 'mixin' - i.e. implement it in another module and then have 
> Bio::Root::IO optionally add it as a plugin. What do you intend to do 
> with this capability? Is there going to be another module that depends 
> on the -url ability?

Bio::Root::IO::_initialize_io() now accepts a '-url' argument.

If present, and if LWP is loadable, _initialize_io() attempts to use
LWP::Simple::getstore() to download the URL to a local tempfile, and
assigns that tempfile to the equivalent of _initialize_io()'s '-file'
argument.  This works for HTTP, HTTPS, FTP, and all other protocols
supported by LWP.  If a request fails, a retry loop tries the fetch a
few more times.
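
A minimal sketch of the LWP path described above, not the committed
code.  The helper name fetch_url_to_tempfile(), the 'retries' and
'fetcher' options, and the default of 3 attempts are assumptions made
for illustration; only LWP::Simple::getstore() and the tempfile idea
come from the description.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of Bio::Root::IO): download $url to a
# temporary file and return its path, or return undef if LWP cannot be
# loaded so the caller can fall back to another mechanism.
sub fetch_url_to_tempfile {
    my ($url, %opt) = @_;
    my $retries = $opt{retries} || 3;    # assumed default, not from the post
    my $fetch   = $opt{fetcher};         # injectable for offline testing

    unless ($fetch) {
        # Load LWP::Simple at run time so it stays an optional dependency.
        eval { require LWP::Simple; 1 } or return;
        $fetch = sub { LWP::Simple::getstore(@_) };
    }

    # UNLINK => 1 removes the tempfile when the program exits.
    my ($fh, $tempfile) = tempfile(UNLINK => 1);
    close $fh;

    for my $attempt (1 .. $retries) {
        # getstore() returns an HTTP-style status code; 2xx means success.
        my $status = $fetch->($url, $tempfile);
        return $tempfile if $status >= 200 && $status < 300;
    }
    die "Could not fetch $url after $retries attempts\n";
}
```

The returned path would then be handed on as if it had been given as
'-file'.  Loading LWP::Simple inside an eval keeps Bio::Root::IO usable
on a Perl installation without LWP.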

If LWP is not loadable, _initialize_io() uses Bio::Root::HTTPget to open a
socket to the file's host and sets '-fh' to read from this socket.  This
only works for HTTP.  There is no retry loop here, as Bio::Root::HTTPget
throws an error if it can't open the socket.  It's possible to modify
Bio::Root::HTTPget to do retries, but I didn't feel like poking around
in there.

Still remaining to be done:

  [1] add -url to the documentation
  [2] check for clashing '-file' or '-fh' arguments
  [3] add tests to t/RootIO.t for https and ftp retrievals
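
Item [2] could look roughly like the sketch below.  This is only an
illustration of the check, under the assumption that '-url' should be
rejected when combined with '-file' or '-fh'; the helper name
check_source_args() is hypothetical, and plain die() stands in for the
throw() that a Bio::Root::Root object would normally use.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of the input-source arguments
# that were supplied, and die if '-url' is combined with another one.
sub check_source_args {
    my (%args) = @_;
    my @given = grep { defined $args{$_} } qw(-url -file -fh);
    if (defined $args{'-url'} && @given > 1) {
        die "Conflicting input sources: @given\n";
    }
    return @given;
}
```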

Regarding another module depending on this, yes, there will be one, that's
the only reason I added this :).  I have a new FeatureIO subsystem.  One
format it can parse is GFF v3.  Valid GFF v3 requires features to be typed
according to the Sequence Ontology or an extension thereof.  As part of
the parse it downloads the Sequence Ontology DAG-Edit files, parses them
into a Bio::Ontology, and returns Bio::SeqFeatureI objects with
Annotation::OntologyTerms attached.

I will commit the FeatureIO code soon.

-Allen


> 
> Peter
> 

