[BioRuby] Codeml parser

Fri Jan 8 16:29:07 UTC 2010

Maybe it'd be a good idea to start thinking at a level removed from actual
code, and create some general design documents first. Maybe we should
* describe what we actually want to achieve with the bioruby toolkit: should
it be a library foremost, or should it rather be an interface to run other
programs (e.g. BLAST)?
* make a high-level overview of different parts of bioruby:
  - how do we handle file formats: are the files actual objects, or do they
merely describe a biological entity? E.g. does a FASTA file merit the
instantiation of a FASTA object, or is it nothing more than a container of
Sequence objects?
  - how do different parts of the library interact? Should we have a Root
class such as in bioperl? What type of class should be used to interface
with the world (e.g. file parsing)? What type of class should be used to
actually contain the object data (e.g. annotated sequence)?

When that's done: come up with general guidelines for coding, e.g. always
use keyword-based argument lists or something (just an example).

jan.

2010/1/5 Jan Aerts <jan.aerts at gmail.com>

> All,
>
> It appears that the pre-hackathon bioruby meeting will be very useful :-)
> Why don't we use that time to focus on the bit-more-distant future of
> bioruby: bioruby 2.0? We could discuss what it should look like without
> having to worry about backward compatibility. Topics:
> * documentation style (I happen to agree with Naohisa on that)
> * class hierarchy: how would we organize the information if we had to start
> from scratch? (maybe we should follow bioperl's lead with a Root class?)
> * coding style
> * general interface decisions
> * ...
>
> jan.
>
> PS: Still don't know if I can make it to Japan. Will know this afternoon
> (broken foot might interfere...)
>
> 2010/1/5 Pjotr Prins <pjotr.public14 at thebird.nl>
>
> Hi Naohisa,
>>
>> First I thought you were kidding. But then I realise you are serious.
>>
>> I don't think we need to document every simple class variable/accessor
>> to accept this source code. That is overkill. If you don't understand
>> lnL or alpha, don't use it. We are not in the business of documenting
>> for documenting's sake.  Documenting lnL and alpha will be like:
>>
>> "Retrieve the lnL value from the Report"
>>
>> "Retrieve the alpha value from the Report"
>>
>> etc. etc. I don't think we should be doing that. Standard 1to-1
>> relations are obvious and don't need lots of text in the code base.
>>
>> If someone feels like filling in these obvious statements, fine. It
>> really goes against my grain. Do we document every single accessor?
>> Note the previous implementation did no such thing. That code was
>> accepted fine (and partially written by you).
>>
>> > Details of +buf+ (class, contents, etc) should also be written in RDoc.
>> > It is recommended to use the style written in the README_DEV.rdoc, or
>> > the style used in the Ruby source code.
>>
>> You mean the contents of the input buffer, which is the content of the
>> input file? I see many places in Bioruby where no such a thing is
>> done.  Why become strict on this now? If you want a different
>> descriptive name for the variable - that is fine. Propose me
>> a better name.
>>
>> > >      def to_a
>> > >        [ @position, @aaref, @probability, @omega ]
>> > >      end
>> > What is the purpose of the method?
>>
>> Access converter. Convenience, really. You can remove it if you
>> dislike it so much. I use it for testing and to write to a file. Could
>> be to_s too, but that fixates the format.
>>
>> > >    class PositiveSites < Array
>> >
>> > To inherit Array and to create original container class is discouraged.
>> > In BioRuby, we have deprecated Bio::Features and Bio::References in
>> > version 1.3.0, although they do not inherit Array but have an array
>> > in the object. (The classes still exist only for backward compatibility,
>> > in lib/bio/compat/features.rb and references.rb).
>>
>> PositiveSites object has the all the features of a list (ie Array). I
>> think inheritance is what it should be. It is an is_a relationship.
>> Adding a @list will just add code. Not only for initialization, but
>> also for iterators. I only see how we can move backwards from readable
>> code. Nor is it good OOP practice. Inheritance is not *always* bad,
>> though I agree it is used too quickly (in general).
>>
>> > In this case, except initialize, only a method named "graph" is added.
>> > I think it is good to add the graph method in the Report class and
>> > using an Array for storing PositiveSite objects.
>>
>> This is awful. The graph is a feature of PositiveSites, and not of the
>> report *parser*. To keep things simple it is best practise to have
>> functionality where it belongs. It is good OOP design. Your proposal
>> means the Report class becomes less obvious in what it is. Look how
>> clean it is now!
>>
>> What do other people think on this list. I am at a disadvantage here.
>>
>> I would like this code accepted in Bioruby, so other people can use
>> it. I disagree with most of above 'criticism'. I certainly balk at the
>> last non-OOP ones. This is not the first time I am really unhappy. I
>> can't believe how much trouble I have to go to for a simple class,
>> which, as it happens, has a perfectly acceptable implementation by
>> most measures.
>>
>> Pj.
>>
>> _______________________________________________
>> BioRuby Project - http://www.bioruby.org/
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
>>
>
>