[BioRuby] Thread-safety of alignment

Naohisa GOTO ngoto at gen-info.osaka-u.ac.jp
Tue Jan 26 15:00:04 UTC 2010


Hi Andrew,

On Tue, 26 Jan 2010 23:12:35 +1100
Andrew Grimm <andrew.j.grimm at gmail.com> wrote:

> Hi Naohisa Goto,
> 
> I tried creating a new factory in each thread, but I sometimes (but
> not always) have errors.

Please tell me your Ruby version and BioRuby version:
 % ruby -v
 % ruby -rbio -e 'puts Bio::BIORUBY_VERSION_ID'
(If you are using BioRuby 1.2.1 or earlier, use the following instead:
 % ruby -rbio -e 'p Bio::BIORUBY_VERSION'
)

> Is the code in http://github.com/agrimm/bioruby-alignment-threading-replication/blob/master/test/test_multithreaded_alignment.rb
> correct? Does it cause problems for anyone else?

The "rescue RuntimeError" in line 15 may hide problems.
In my environment, the RuntimeError seems to be raised
in lib/bio/alignment.rb. The error message I observed
without the rescue was
"alignment result is inconsistent with input data",
and the output file created by ClustalW was unexpectedly empty.
It might be a bug in Ruby's Tempfile, but I am not sure.
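
As a rough sketch of what I mean (this is not the code from the test file),
each thread can record the exception instead of discarding it, so that the
message is still visible after all threads have finished. The method
run_alignment below is a hypothetical stand-in for the real ClustalW call,
and it fails on purpose for odd thread numbers only to show the reporting.

```ruby
require 'thread'

# Hypothetical stand-in for the per-thread alignment work. In the real test
# this would call the alignment factory; here odd inputs fail on purpose.
def run_alignment(n)
  raise RuntimeError, "alignment result is inconsistent with input data" if n.odd?
  "ok #{n}"
end

errors  = Queue.new   # thread-safe container for failures
results = Queue.new   # thread-safe container for successes

threads = (0...4).map do |i|
  Thread.new(i) do |n|
    begin
      results << run_alignment(n)
    rescue RuntimeError => e
      # Record the failure instead of silently swallowing it.
      errors << [n, e.class, e.message]
    end
  end
end
threads.each(&:join)

# Drain the queue and report every failure after all threads are done.
failures = []
failures << errors.pop until errors.empty?
failures.sort!
failures.each { |n, klass, msg| warn "thread #{n}: #{klass}: #{msg}" }
```

With this shape the test can still continue after a failure, but the error
class and message are kept for inspection.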

With Ruby 1.8.7, errors are sometimes observed.
  % ruby -v
  ruby 1.8.7 (2010-01-10 patchlevel 249) [i686-linux]
  ruby 1.8.7 (2009-04-08 patchlevel 160) [i686-linux]
  ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]

With Ruby 1.9.1-p378, no errors occurred when I ran it several times.
  % ruby -v
  ruby 1.9.1p378 (2010-01-10 revision 26273) [i686-linux]

> Some of the errors I get include the ones seen at http://gist.github.com/286775

The message "ERROR: Multiple sequences found with same name
(found 0 at least twice)!" is reported by ClustalW, and
it indicates duplicate sequence names in the input file. Maybe
the contents of two files were unexpectedly concatenated or mixed,
possibly due to a bug in Tempfile, but I am not sure.
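
One way to probe the Tempfile hypothesis separately from ClustalW might be
a small script like the following. It is only a sketch under my own
assumptions: the "probe" basename, the thread count, and the FASTA-like
content are arbitrary. Each thread writes its own content to its own
Tempfile and reads it back, and any file whose content differs is recorded.

```ruby
require 'tempfile'
require 'thread'

mismatches = Queue.new   # collects [thread number, path] for every bad file

threads = (0...8).map do |i|
  Thread.new(i) do |n|
    expected = ">seq#{n}\nACGT\n"
    tmp = Tempfile.new("probe")
    begin
      tmp.write(expected)
      tmp.close                        # flush to disk but keep the file
      actual = File.read(tmp.path)
      mismatches << [n, tmp.path] unless actual == expected
    ensure
      tmp.close!                       # close (if needed) and delete the file
    end
  end
end
threads.each(&:join)

if mismatches.empty?
  puts "no mixed or empty temporary files observed"
else
  warn "#{mismatches.size} temporary file(s) had unexpected content"
end
```

A clean run does not prove that Tempfile is free of the problem; it only
means the problem was not reproduced in that run.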

> It's possible that the issues are caused by problems in tempfile
> itself (which may have been fixed in August 2009 according to the
> changelog).

Another possibility is the resource limits of the machine:
the number of child processes, total memory size, etc.
If the limits are exceeded, a new child clustalw process might
fail to start, or running clustalw processes might be
killed. This would also cause empty or truncated result files
and lead to Ruby-level errors.
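
If resource limits are the cause, capping the number of jobs that run at
the same time may help. The following is a minimal sketch of a fixed-size
worker pool. The limit of 2 and the method run_job are assumptions for
illustration; run_job stands in for starting one clustalw process and
waiting for its result.

```ruby
require 'thread'

max_workers = 2   # assumed limit; choose a value that suits the machine

# Hypothetical stand-in for running one clustalw job.
def run_job(job)
  job * 2
end

jobs = Queue.new
(1..6).each { |j| jobs << j }

results = Queue.new
workers = Array.new(max_workers) do
  Thread.new do
    loop do
      job = begin
        jobs.pop(true)     # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break              # no more jobs, so this worker finishes
      end
      results << [job, run_job(job)]
    end
  end
end
workers.each(&:join)

# Gather the results in job order.
collected = []
collected << results.pop until results.empty?
collected.sort!
```

At most max_workers jobs are in progress at any moment, however many
jobs are queued.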

Thanks,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

> 
> Thanks,
> 
> Andrew
> 
> On Thu, Jan 21, 2010 at 12:50 AM, Naohisa GOTO
> <ngoto at gen-info.osaka-u.ac.jp> wrote:
> > Hi,
> >
> > On Wed, 20 Jan 2010 23:09:19 +1100
> > Andrew Grimm <andrew.j.grimm at gmail.com> wrote:
> >
> >> Is alignment intended to be thread-safe in bioruby? If so, should I
> >> use the same alignment factory between threads, or a separate one in
> >> each thread?
> >
> > It is not confirmed to be thread-safe, so it is safe to use
> > separate one in each thread.
> >
> > Currently, in BioRuby, manipulating the same object from different
> > threads is not intended. When manipulating the same object from
> > different threads is needed, using mutex is recommended.
> >
> > For library developers, it is encouraged to write thread-safe
> > code if possible, but not mandatory.
> >
> > Naohisa Goto
> > ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> >
> >>
> >> Andrew
> >> _______________________________________________
> >> BioRuby Project - http://www.bioruby.org/
> >> BioRuby mailing list
> >> BioRuby at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioruby
> >
> >
