[BioRuby] Biosprint bepipred implementation

George Githinji georgkam at gmail.com
Fri Sep 17 07:25:35 UTC 2010


Hi,
The Biosprint started here in Nairobi yesterday. Most of the
participants are new to Ruby, and some are new to programming in general.
There are about 4 CS guys who are experts in other
languages (PHP, Java, Perl and Python).
We are using git for version control and have forked GeorgeG's BioRuby
fork on GitHub.

Task 1 Description
BepiPred predicts the location of linear B-cell epitopes in proteins
using a combination of a hidden Markov model and a propensity scale
method. The method is described in the following article:
#   Improved method for predicting linear B-cell epitopes.
#   Jens Erik Pontoppidan Larsen, Ole Lund and Morten Nielsen
#   Immunome Research 2:2, 2006.

We are implementing a wrapper class for the BepiPred linear B-cell epitope
prediction tool. Specifically, we want to:

1) Be able to call it from within BioRuby as follows:
  # === Examples
  #
  #   require 'bio'
  #   seq_file = 'test.fasta'
  #
  #   factory = Bio::Bepipred.new(seq_file)
  #   report = factory.query
  #   report.class # => Bio::Bepipred::Report

2) Have the report class take the BepiPred predictions and format them as GFF3

3) Document the tasks

4) Write unit tests for the methods.
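
For item 2, a GFF3 record is one line of nine tab-separated columns
(seqid, source, type, start, end, score, strand, phase, attributes), with
"." for undefined fields. A minimal sketch of the formatting step, assuming
each prediction has already been parsed into a sequence name, a start, an
end and a score. The Prediction struct, the helper names, and the
'bepipred' source and 'epitope' type values are illustrative assumptions,
not BepiPred's actual output format:

```ruby
# Hypothetical value object for one predicted epitope region.
# The field names are assumptions, not taken from BepiPred's output.
Prediction = Struct.new(:seqid, :start, :stop, :score)

# Formats one prediction as a single GFF3 line (nine tab-separated columns).
# Strand, phase and attributes are left undefined (".") in this sketch.
def to_gff3_line(pred, source = 'bepipred', type = 'epitope')
  [pred.seqid, source, type, pred.start, pred.stop, pred.score,
   '.', '.', '.'].join("\t")
end

# Formats a list of predictions as a GFF3 document with the version header.
def to_gff3(predictions)
  lines = ['##gff-version 3'] + predictions.map { |pred| to_gff3_line(pred) }
  lines.join("\n") + "\n"
end

puts to_gff3([Prediction.new('seq1', 10, 18, 0.52)])
```

Keeping the line formatter separate from the document formatter should
make item 4 easier, since each can be unit tested on its own.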


We have divided ourselves into 4 groups to accomplish this task.

A couple of questions:
1) What is the best development workflow, in particular for testing the
development version while we are still working on it?

2) What is the best way to call a command-line program from within
BioRuby? For example, I have this:


require 'bio/command'
require 'shellwords'

module Bio

  # == Description
  #
  # A wrapper for Bepipred linear B-cell epitope prediction program.
  #
  # === Examples
  #
  #   require 'bio'
  #   seq_file = 'test.fasta'
  #
  #   factory = Bio::Bepipred.new(seq_file)
  #   report = factory.query
  #   report.class # => Bio::Bepipred::Report
  #
class Bepipred
  autoload :Report, 'bio/appl/bepipred/report'

  # Creates a new Bepipred execution wrapper object
  def initialize(program='bepipred',score_threshold=0.35,file_name='')
    @program = program
    @score_threshold = score_threshold
    @file_name = file_name
  end

  # name of the program ('bepipred' in UNIX/Linux)
  attr_accessor :program

  # options
  attr_accessor :score_threshold

  # return the names of the sequences in the given FASTA file
  def sequence_names(file)
    sequence_names = []
    Bio::FlatFile.auto(file) do |f|
      f.each do |entry|
        sequence_names << entry.definition
      end
    end
    sequence_names
  end

  # TODO create a list of query sequences


  #TODO create a commandline as an array cmd
  def make_command
    cmd = [@program, "-t #{@score_threshold}", @file_name]
  end

  #query the file
  def query(file_name)
    cmd = make_command
    exec_local(cmd)
  end

  # TODO create a parser class for the output
  # parse_results

 private
 # Executes bepipred when called locally.
 # The input is a file name or a path to the file containing protein
 # sequences in FASTA format.
 # This method does not work.
 # There could be a bug in the way the cmd argument is created.
 def exec_local(cmd)
   Bio::Command.query_command(cmd)
 end

end
end

This does not seem to work.
Please assist. Thanks.
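
One possible cause, offered as an untested sketch rather than a confirmed
diagnosis: make_command puts "-t 0.35" into a single array element. If the
array is handed to the program without going through a shell (an
assumption about how Bio::Command runs it), bepipred would receive the
flag and its value as one argument instead of two. The sketch below uses
only Ruby's standard library (Open3) instead of Bio::Command so that it
stands alone; bepipred_command and run_command are hypothetical helper
names, and the -t flag is simply the one used in the code above.

```ruby
require 'open3'

# Hypothetical helper (not part of BioRuby): builds the argument vector.
# The option and its value are separate elements, because no shell is
# involved to split "-t 0.35" into two words.
def bepipred_command(program, score_threshold, file_name)
  [program, '-t', score_threshold.to_s, file_name]
end

# Hypothetical helper: runs the command without a shell and returns its
# standard output, raising if the program exits with a non-zero status.
def run_command(cmd)
  stdout, status = Open3.capture2(*cmd)
  raise "#{cmd.first} exited with status #{status.exitstatus}" unless status.success?
  stdout
end

cmd = bepipred_command('bepipred', 0.35, 'test.fasta')
# With bepipred installed and test.fasta present, one would then call:
#   output = run_command(cmd)
```

If the same array is passed to Bio::Command.query_command instead, the
point about separate elements should still apply, but I have not verified
that.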



-- 
---------------
Sincerely
George
KEMRI/Wellcome-Trust Research Program
Skype: george_g2
Blog: http://biorelated.wordpress.com/


