[BioRuby] RFC Caching (was BioRuby standards)

Pjotr Prins pjotr2008 at thebird.nl
Mon Sep 29 12:34:11 UTC 2008


Hi Naohisa,

On Thu, Sep 25, 2008 at 11:58:17PM +0900, Naohisa GOTO wrote:
> I agree it is good to have microarray support, if it is useful.
> Could you please show short examples and use cases of the
> microarray support?

You mean things like loading a file and reading a probe? There are unit
tests for that in BioLib. I'll expand on the Tutorial once this goes
into BioRuby.

> Question: Does the microarray support work on Ruby 1.9?
> Most part of bioruby still do not support Ruby 1.9,
> though some code can run on Ruby 1.9.

I will test my sources with 1.9. Should be no problem - no legacy
stuff in there.

> In the current implementation, the singleton object stores
> @subdir, and it is the same as a global variable.
> For example, If a user want to get both GEO and ArrayExpress
> (hopefully supported in the future), and I wrote a code
> like this:
> 
>    Bio::Microarray::Cache.set('/home/who/.bioruby-cache')
>    obj1 = Bio::Microarray::GEO::GSE.new('GSE1')
>    obj2 = Bio::Microarray::ArrayExpress.new('Acc2')
>    obj3 = Bio::Microarray::GEO::GSE.new('GSE3')
>    obj4 = Bio::Microarray::ArrayExpress.new('Acc4')
> 
> In this case, how to specify sub directory?
> Or, am I misunderstanding @subdir?

Well, hey! You are making life a little difficult for me here. In an
earlier mail you wrote:

> Note that some classes use Tempfile class, a standard bundled
> class with Ruby by default, and the Tempfile class depends
> on environment variables (TMPDIR, TMP, etc.).

So I introduced tmpdir - which I had to remove later. Also you wrote:

> I think cache isn't suitable for standard, because its purpose
> may differ from program (or class, module, etc.) to program.

So I introduced a cache specific to the GEO module. This Cache
definition is for GEO and used as such. There are no conflicts with
other modules now - as there are none. Loading on demand is not a
solution - as that would be unusable.

The upside of a Singleton is that a cache gets defined once - and is
not part of the normal interfaces. Modules can define their own
subdirectories in the Cache. That would be OK.
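
To make that concrete, here is a minimal sketch of a singleton cache
whose root is set once and where each module asks for its own
subdirectory. This is not the actual Bio::Microarray::Cache code: the
class name SketchCache, the method dir_for and the subdirectory names
are assumptions made up for illustration.

```ruby
require 'singleton'
require 'fileutils'
require 'tmpdir'

# Hypothetical sketch, not the real Bio::Microarray::Cache.
class SketchCache
  include Singleton

  attr_reader :root

  # Define the cache root once for the whole process.
  def set(root)
    @root = root
    FileUtils.mkdir_p(@root)
    self
  end

  # Each module requests its own subdirectory under the shared root
  # (dir_for is an assumed name, used here for illustration only).
  def dir_for(subdir)
    raise 'cache root not set' unless @root
    path = File.join(@root, subdir)
    FileUtils.mkdir_p(path)
    path
  end
end

# Usage: one root, separate subdirectories per module.
root    = Dir.mktmpdir('sketch-cache')
cache   = SketchCache.instance.set(root)
geo_dir = cache.dir_for('geo')
ae_dir  = cache.dir_for('arrayexpress')
```

With something like this the root stays a single global setting, while
GEO and a future ArrayExpress module would not step on each other's
files.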

Let's not take this further until someone wants to build on this
cache. It is not my itch to scratch. Like you wrote earlier, a cache
implementation is non-trivial. Right. I wasn't intending to do that.
The cache we have now is safe and sufficient for this module.

I'll stick in a warning not to use the cache for other purposes. OK?

> > It is a class factory. I'll have a think.
> 
> I suggest Bio::Microarray::GEO::XML.new(acc).

Not sure about that. The definition of 'new' is tied to initializing a
class. Here we have a factory method, and the two need to be
distinguished. Code should really document itself. I think my 'create'
is actually fine for a factory, but does anyone have another
suggestion? These examples all use 'create':

  http://www.scribd.com/doc/396559/gof-patterns-in-ruby
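
A rough sketch of the distinction I mean, with made-up names: 'new'
always returns an instance of the class it is called on, while a
'create' factory may return an instance of a different class depending
on its argument. The module SketchGEO, its classes and the accession
patterns below are hypothetical and are not the actual GEO code.

```ruby
# Hypothetical classes for illustration only.
module SketchGEO
  class Record
    attr_reader :acc

    # 'new' ends up here: it initializes an instance of this very class.
    def initialize(acc)
      @acc = acc
    end
  end

  class GSE < Record; end
  class GDS < Record; end

  module XML
    # Factory method: inspects the accession and returns an instance
    # of whichever class matches it.
    def self.create(acc)
      case acc
      when /\AGSE\d+\z/ then GSE.new(acc)
      when /\AGDS\d+\z/ then GDS.new(acc)
      else raise ArgumentError, "unknown accession: #{acc}"
      end
    end
  end
end

series  = SketchGEO::XML.create('GSE1')
dataset = SketchGEO::XML.create('GDS5')
puts series.class
puts dataset.class
```

The caller can see from the name 'create' that the returned object need
not be an instance of XML itself.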

Pj.


