[BioRuby] bioruby documentation

Ryan Raaum rlr215 at nyu.edu
Mon Mar 6 18:46:12 UTC 2006


Hi all (again!),

Putting the formalization into a more concrete perspective, compare:

an example from the bioperl docs:
http://doc.bioperl.org/releases/bioperl-1.0.1/Bio/Tools/SeqStats.html

and an example from the Ruby on Rails docs:
http://api.rubyonrails.org/classes/ActionController/Base.html

The bioperl example is very formalized, so it is true that nothing is 
left out.  However, it doesn't read very well and most of the method 
documentation ends up being highly repetitive:  (To caricature... :)

Title   : do_something
Usage   : Object.do_something
Function: does something
Returns : something
Args    : precursor to something

Whereas (in my mind) the rails documentation reads very well: simple 
methods are simply documented, and complex methods are documented in 
detail.  If the arguments are absent or obvious, don't talk about them; 
if the arguments are tricky, do talk about them. And so on.  No one 
really *wants* to document, and if documenting is annoying (= overly 
formalized), no one will.
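
To make the contrast concrete, here is a rough sketch of the two styles side by side as rdoc comments. ToySeq and both of its methods are made up for illustration; they are not BioRuby or Rails code.

```ruby
# Hypothetical class, for illustration only.
class ToySeq
  def initialize(str)
    @str = str
  end

  # Templated style: every field is filled in, even when it adds nothing.
  #
  # Title::    length
  # Usage::    seq.length
  # Function:: returns the length
  # Returns::  the length
  # Args::     none
  def length
    @str.length
  end

  # Narrative style: say only what the signature does not.
  #
  # Returns the fraction of G and C characters as a Float between 0 and 1.
  # Case is ignored.  An empty sequence gives 0.0 rather than an error.
  #
  #   ToySeq.new('atgc').gc_fraction   #=> 0.5
  def gc_fraction
    return 0.0 if @str.empty?
    @str.downcase.count('gc') / @str.length.to_f
  end
end
```

The second comment is shorter to write, and it is the only one of the two that tells the reader something the method name did not.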

I think a consistent, relatively formalized overview is good, but that 
overly formalized method and attribute documentation guidelines 
ultimately mean that little to no documentation will get done because 
it's too annoying (in most real-world open source projects).

Best,

Ryan

On Mar 6, 2006, at 1:21 PM, jan aerts (RI) wrote:

> Ryan,
>
> Nice piece of doc. I completely agree that the level of formalization 
> is entirely open to discussion. And I completely understand your 
> concerns. But on the other hand, a formalized list of things to be 
> described can, in my opinion, _help_ developers document their code 
> rather than keep them from doing so. You can see it as a 
> checklist of things to document. In your piece of code, you describe 
> several aspects of the subseq method, but for every new method you 
> describe, you'd need to keep this list of things in the back of your 
> head ("did I mention that it returns itself?", 
> "did I mention what the defaults for the arguments are?", ...). If we 
> had this list accessible on the wiki for any developer, he/she 
> could copy it into their code and fill it in like a checklist. I 
> suspect that would make things much easier on the developer (but 
> that's my own view, of course).
>
> You're right that rdoc already takes care of argument lists, but it 
> only lists them, instead of describing them. And in many instances, a 
> bioruby user would have to know what the arguments actually are 
> (including their defaults) without going into the code. Ergo: 
> arguments should be documented.
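
A small sketch of the point about listing versus describing arguments. DocDemo.window is a hypothetical helper, not part of BioRuby; Method#parameters is standard Ruby and stands in here for what can be read off a signature alone.

```ruby
# Hypothetical module, for illustration only.
module DocDemo
  # Returns +width+ characters of +seq+, beginning at the one-based
  # position +start+.
  #
  # seq::   the sequence (a String) to cut from
  # start:: one-based position of the first character to return
  # width:: number of characters to return; defaults to 3 (one codon)
  def self.window(seq, start, width = 3)
    seq[start - 1, width]
  end
end

# The signature alone yields names and whether each argument is
# required or optional, but not the default value or its meaning.
p DocDemo.method(:window).parameters
# prints [[:req, :seq], [:req, :start], [:opt, :width]]
```

The fact that start is one-based and that width defaults to one codon appears only in the comment, which is the part a user reads without opening the source.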
>
> What do you think?
> jan
>
>
> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
> Sent: Mon 3/6/2006 3:14 PM
> To: jan aerts (RI)
> Cc: bioruby at open-bio.org
> Subject: Re: [BioRuby] bioruby documentation
>
>
> Hello again everyone!
>
>>
>> What do you think of using a standardized or (sounds ugly :) formal
>> format? Does your documentation include some of the
>> synopsis/description/function/what it returns/arguments things? Do you
>> think it is useful/feasible to put them in that format?
>
> I think a reasonable standardization is a good thing, especially at the
> overview level of the class or module or whatever.  Here's an example
> of what I've been writing for method documentation:
>
> (This is for subseq in Bio::Sequence::Common)
>
>    # Returns a new sequence containing the subsequence identified by the
>    # start and end numbers given as parameters.  *Important:* Biological
>    # sequence numbering conventions (one-based) rather than ruby's
>    # (zero-based) numbering conventions are used.
>    #
>    #   s = Bio::Sequence::Generic.new('atggaatga')
>    #   puts s.subseq(1,3)                      #=> "atg"
>    #
>    # Start defaults to 1 and end defaults to the entire existing string,
>    # so subseq called without any parameters simply returns a new sequence
>    # identical to the existing sequence.
>    #
>    #   puts s.subseq                           #=> "atggaatga"
>    #
>
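
For readers without the BioRuby source at hand, the behaviour described in that comment can be sketched on a plain String subclass. ToySequence is hypothetical and is not the actual Bio::Sequence::Common implementation; the argument check in particular is an assumption made for this sketch.

```ruby
# Hypothetical stand-in for a sequence class, for illustration only.
class ToySequence < String
  # Returns a new ToySequence holding positions first..last, counted
  # from 1 and inclusive at both ends.  The receiver is not modified.
  def subseq(first = 1, last = length)
    # Assumption for this sketch: reject ranges that make no sense
    # in one-based coordinates.
    raise ArgumentError, 'need 1 <= first <= last' if first < 1 || last < first

    # Shift both ends by one to reach ruby's zero-based indexing.
    self.class.new(self[(first - 1)..(last - 1)])
  end
end

s = ToySequence.new('atggaatga')
puts s.subseq(1, 3)   # prints atg
puts s.subseq         # prints atggaatga
```

The two calls match the examples in the comment: one-based inclusive coordinates, and a full copy when no arguments are given.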
> So, I haven't been writing enormously formal specs - which seem like a
> bit of overkill for most of the methods, and rdoc takes care of the
> basics of argument lists.  Otherwise I note what to expect in return,
> or if the method does or does not modify the current object.  Also if
> there are any things that are dangerous or tricky...  I also give an
> example for all methods.
>
> It seems to me, and this is surely open to discussion, that formalizing
> the individual method descriptions too much makes them enormously
> tedious to write - so much so that very few will ever get written.
> BUT, on the class or module level, I think a certain amount of
> formalization is good, so that the overviews are reasonably consistent.
>
> Best,
>
> -Ryan
>
>>
>> Thanks,
>> jan.
>>
>>> -----Original Message-----
>>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
>>> Sent: 06 March 2006 14:41
>>> To: jan aerts (RI)
>>> Subject: Re: [BioRuby] bioruby documentation
>>>
>>> Good Morning All,
>>>
>>> I've had similar thoughts to Jan, and am a couple methods away
>>> from completely documenting Bio::Sequence::* .  I was hoping
>>> to send that in to Toshiaki later today.  I haven't yet
>>> written a synopsis or description for them, mainly because I
>>> was using the process of documenting all the methods as a way
>>> of thoroughly understanding the use and structure of the
>>> classes.  If the documentation I've currently written is seen
>>> as reasonable and accepted, I would then add the overview
>>> documentation for those classes and files.
>>>
>>> Is there somewhere we can note which parts different people
>>> are working on documenting, so as to avoid any duplication of effort?
>>>
>>> Best!
>>>
>>> -Ryan
>>>
>>> On Mar 6, 2006, at 9:21 AM, jan aerts (RI) wrote:
>>>
>>>> Hi all,
>>>>
>>>> Given the posts about bioruby documentation in the last few months, my
>>>> own experiences with bioruby and a bit of encouragement from Toshiaki,
>>>> I'd like to commence documenting bioruby classes (in CVS) that are not
>>>> documented yet, and to standardize the documentation format for those
>>>> that already have documentation.
>>>>
>>>> Documentation would take the form of rdoc, so that it would be
>>>> browsable via the www.bioruby.org/rdoc website.
>>>>
>>>> Some guidelines that I would like to use in the documentation:
>>>> (1) Each class should have a description and synopsis. If there is a
>>>> unit test at the bottom, this can easily be tweaked into a synopsis.
>>>> If such a unit test is available, 'documenting' would mean (at least
>>>> in the first round) 'tweaking and copying the unit test into a comment
>>>> in front of the class'. Alternatively, unit tests and documentation
>>>> could be combined into one (as Ara and Pjotr discussed), but I'm not
>>>> experienced enough in ruby yet to do this in a simple, transparent way.
>>>> (2) Given the effort developers have put into writing the classes, it
>>>> would be nice if bioruby could reach as wide an audience as possible.
>>>> What I believe would help tremendously is a standardized format for
>>>> documentation. By this I mean that the following information is given
>>>> for each method (sort of like in bioperl documentation):
>>>>     * synopsis
>>>>     * description
>>>>     * function
>>>>     * what it returns
>>>>     * any arguments
>>>> (3) It should be made clear to the user if a class should be used
>>>> directly, or if it just supports other classes (e.g.
>>>> Bio::Sequence::Format). Additional important info would be interaction
>>>> with other classes (e.g. "how does the sequence class interact with
>>>> the embl class?"). Original module writers have an important role in
>>>> describing this context.
>>>> (4) Enclose the copyright information between '#--' and '#++', as it
>>>> distracts the user from what he/she wants to know. (It _is_ important,
>>>> but not for the average user...)
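
As a sketch of guideline (4): rdoc stops processing a comment at a line containing '#--' and resumes at '#++', so the copyright lines stay in the source file but are left out of the generated pages. ToyAligner, the copyright holder and the licence line below are placeholders, not taken from any BioRuby file.

```ruby
#--
# Copyright (C) 2006 A. N. Author   (placeholder; hidden from rdoc output)
# License: see the project's licence file (placeholder)
#++
#
# = DESCRIPTION
# ToyAligner is a hypothetical class that exists only to show where the
# two directives go.
#
# = SYNOPSIS
#   ToyAligner.matches('atgc', 'atcc')   #=> 3
#
class ToyAligner
  # Returns the number of positions at which +a+ and +b+ carry the same
  # character.  Positions beyond the end of the shorter string never match.
  def self.matches(a, b)
    a.chars.zip(b.chars).count { |x, y| x == y }
  end
end
```

Only the DESCRIPTION and SYNOPSIS sections should then appear on the class page, which matches the idea of keeping the copyright text present but out of the user's way.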
>>>>
>>>>
>>>> Example of class documentation (from sequence.rb):
>>>> # = DESCRIPTION
>>>> # The Bio::Sequence class generically describes a nucleic or amino
>>>> # acid sequence and is a superclass of Bio::Sequence::NA and
>>>> # Bio::Sequence::AA. Most methods that can be used on Bio::Sequence
>>>> # objects are described in Bio::Sequence::Common, Bio::Sequence::NA
>>>> # and Bio::Sequence::AA.
>>>> #
>>>> # If possible, create sequence objects using the Bio::Sequence::NA or
>>>> # Bio::Sequence::AA classes instead, as the Bio::Sequence class will
>>>> # have to guess the type of sequence you're talking about.
>>>> #
>>>> # = SYNOPSIS
>>>> #   # Create a nucleic or amino acid sequence
>>>> #   dna = Bio::Sequence::NA.new('atgcatgcATGCATGCAAAA')
>>>> #   rna = Bio::Sequence::NA.new('augcaugcaugcaugcaaaa')
>>>> #   aa = Bio::Sequence::AA.new('ACDEFGHIKLMNPQRSTVWYU')
>>>> #
>>>> #   # Print it out
>>>> #   puts dna.to_s
>>>> #   puts aa.to_s
>>>> #
>>>> #   # Get a subsequence, bioinformatics style (first nucleotide is '1')
>>>> #   puts dna.subseq(2,6)
>>>> #
>>>> #   #...more examples from the unit test
>>>>
>>>> Example of method documentation (from sequence.rb):
>>>>   # Usage:
>>>>   #    my_seq = Bio::Sequence.new('AGGCACGAT')
>>>>   #    my_na = my_seq.na
>>>>   # Function::   Converts the Bio::Sequence object into a
>>>>   #              Bio::Sequence::NA object
>>>>   # Returns::    a Bio::Sequence::NA object
>>>>   # Arguments::  none
>>>>   def na
>>>>     @seq = NA.new(@seq)
>>>>     @moltype = NA
>>>>   end
>>>>
>>>> As the time I can work on this is limited, expect to see gradual
>>>> additions to the cvs repository. Any other people wishing to help out
>>>> are very welcome!!
>>>>
>>>> Of course, I promise not to touch other people's code, unless they
>>>> explicitly tell me to.
>>>>
>>>> Any thoughts/suggestions on this?
>>>>
>>>> Kind regards,
>>>>
>>>> Jan Aerts, PhD
>>>> Bioinformatics Group
>>>> Roslin Institute
>>>> Roslin, Scotland, UK
>>>> +33 131 527 4200
>>>>
>>>>
>>>
>>>
>
>



