[BioRuby] RFC Caching (was BioRuby standards)

Naohisa Goto ngoto at gen-info.osaka-u.ac.jp
Mon Sep 29 20:26:39 UTC 2008


Hi Pjotr,

> Hi Naohisa,
> 
> On Thu, Sep 25, 2008 at 11:58:17PM +0900, Naohisa GOTO wrote:
> > I agree it is good to have microarray support, if it is useful.
> > Could you please show short examples and use cases of the
> > microarray support?
> 
> You mean, like load file, read probe? There are unit tests for that
> in BioLib. I'll expand on the Tutorial once this goes into BioRuby.

OK.

> > Question: Does the microarray support work on Ruby 1.9?
> > Most parts of BioRuby still do not support Ruby 1.9,
> > though some code can run on Ruby 1.9.
> 
> I will test my sources with 1.9. Should be no problem - no legacy
> stuff in there.

For now, don't worry if it fails to run on Ruby 1.9.
We will migrate to 1.9 gradually after Ruby 1.9.1 is released,
not now.

> > In the current implementation, the singleton object stores
> > @subdir, and it is the same as a global variable.
> > For example, suppose a user wants to get data from both GEO and
> > ArrayExpress (hopefully supported in the future) and writes code
> > like this:
> > 
> >    Bio::Microarray::Cache.set('/home/who/.bioruby-cache')
> >    obj1 = Bio::Microarray::GEO::GSE.new('GSE1')
> >    obj2 = Bio::Microarray::ArrayExpress.new('Acc2')
> >    obj3 = Bio::Microarray::GEO::GSE.new('GSE3')
> >    obj4 = Bio::Microarray::ArrayExpress.new('Acc4')
> > 
> > In this case, how would one specify the subdirectory?
> > Or, am I misunderstanding @subdir?
> 
> Well, hey! You are making life a little difficult for me here. In an
> earlier mail you wrote:
> 
> > Note that some classes use Tempfile class, a standard bundled
> > class with Ruby by default, and the Tempfile class depends
> > on environment variables (TMPDIR, TMP, etc.).
> 
> So I introduced tmpdir - which I had to remove later. Also you wrote:
> 
> > I think cache isn't suitable for standard, because its purpose
> > may differ from program (or class, module, etc.) to program.
> 
> so I introduced a cache specific to the GEO module. This Cache
> definition is for GEO and used as such. There are no conflicts with
> other modules now - as there are none. Loading on demand is not a
> solution - as that would be unusable.

The name "Bio::Microarray::Cache" sounds as if this were common
to all microarray classes. 

To make it clear that the cache is only for GEO, please move the class
under Bio::Microarray::GEO, i.e. rename it from
Bio::Microarray::Cache to Bio::Microarray::GEO::Cache.
In addition, please move the file to bio/db/microarray/ncbi_geo/cache.rb
(there is no need to put it under bio/io, because it is specific to GEO
and not intended to be used with other classes/modules).
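
A minimal sketch of the layout I have in mind. Here GEO is assumed to be
a module, "set" mirrors the call in the example quoted above, and
"path_for" and the 'geo' subdirectory name are hypothetical names used
only for illustration; the real class may well differ.

```ruby
# Would live in bio/db/microarray/ncbi_geo/cache.rb
require 'singleton'

module Bio
  module Microarray
    module GEO  # assumption: GEO is a module acting as a namespace

      # Internal use only: cache for GEO downloads.
      # Users should not use it directly for other purposes.
      class Cache
        include Singleton

        attr_accessor :root

        # Sets the cache root directory once for the whole process.
        def self.set(dir)
          instance.root = dir
        end

        # Hypothetical helper: builds the path of a cached GEO entry.
        # The 'geo' subdirectory is fixed here because the class is
        # GEO-specific, so no shared @subdir state is needed.
        def path_for(acc)
          File.join(root, 'geo', acc)
        end
      end

    end
  end
end

Bio::Microarray::GEO::Cache.set('/home/who/.bioruby-cache')
puts Bio::Microarray::GEO::Cache.instance.path_for('GSE1')
```

With the class under the GEO namespace, a future ArrayExpress module
could define its own cache class without sharing any state with this one.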

> The upside of a Singleton is that a cache gets defined once - and is
> not part of the normal interfaces. Modules can define their own
> subdirectories in the Cache. That would be OK.
> 
> Let's not take this further until someone wants to build on this
> cache. It is not my itch to scratch. Like you wrote earlier, a cache
> implementation is non-trivial. Right. I wasn't intending to do that.
> The cache we have now is safe and sufficient for this module.
> 
> I'll stick in a warning not to use the cache for other purposes. OK?

OK. BioRuby already has many classes/modules/methods whose
documentation carries warnings such as "users should not use it
directly", "internal use only", etc.

> > > It is a class factory. I'll have a think.
> > 
> > I suggest Bio::Microarray::GEO::XML.new(acc).
> 
> Not sure about that. The definition of 'new' is tied to initializing a
> class. Here we have a factory method, we need to distinguish. Code
> should really document itself. I think my 'create' is actually fine
> for a factory, but if anyone has another suggestion? These examples
> all use 'create':
> 
>   http://www.scribd.com/doc/396559/gof-patterns-in-ruby

"create" will be used if no better suggestion is given.
That said, many bioscientists may not know much about design patterns.
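
For readers unfamiliar with the pattern, here is a rough sketch of how a
"create" factory differs from "new". The GSE and GDS classes below are
bare stand-ins, and dispatching on the accession prefix is only my
assumption about how the factory might choose a class; it is not taken
from the actual implementation.

```ruby
module Bio
  module Microarray
    module GEO  # assumption: GEO is a module acting as a namespace

      # Stand-in classes for illustration only.
      class GSE
        attr_reader :acc
        def initialize(acc)
          @acc = acc
        end
      end

      class GDS
        attr_reader :acc
        def initialize(acc)
          @acc = acc
        end
      end

      module XML
        # Factory method: unlike "new", which always returns an instance
        # of its own class, "create" chooses the class to instantiate
        # from the accession string.
        def self.create(acc)
          case acc
          when /\AGSE\d+\z/ then GSE.new(acc)
          when /\AGDS\d+\z/ then GDS.new(acc)
          else raise ArgumentError, "unknown GEO accession: #{acc}"
          end
        end
      end

    end
  end
end

obj = Bio::Microarray::GEO::XML.create('GSE1')
puts obj.class
```

The caller never names GSE or GDS directly, which is the reason for
calling the method "create" rather than "new".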

-- 
Naohisa Goto <ngoto at gen-info.osaka-u.ac.jp>



