[BioRuby] Bioruby HTML output

Toshiaki Katayama ktym at hgc.jp
Tue Jan 19 16:21:54 UTC 2010


Dear Pj,

On 2010/01/19, at 23:34, Pjotr Prins wrote:

> On Tue, Jan 19, 2010 at 09:41:31PM +0900, Toshiaki Katayama wrote:
>> All we need to do is to add these methods in every database class
>> comprehensively.
>> 
>> I think this is simple enough and beautiful.
>> I'll attach a primitive pseudo code in below.
>> Comments are welcome.
> 
> I agree with Tomoaki it is too restrictive. What, indeed, if we want
> to present the HTML in a different way?

Hmm. Could you give me some use cases?

You could override the output_html method or, to be more generic,
use a template engine.
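
As a minimal sketch of both options (the class names TemplatedEntry and
TableEntry and the render method are hypothetical, not part of BioRuby;
ERB from the Ruby standard library stands in for "some template engine"):

```ruby
require 'erb'

# Hypothetical entry class used only for illustration; not a BioRuby class.
class TemplatedEntry
  attr_reader :entry_id, :definition

  def initialize(entry_id, definition)
    @entry_id = entry_id
    @definition = definition
  end

  # Default HTML rendering supplied by the class itself.
  def output_html
    "<h1>#{ERB::Util.html_escape(entry_id)}</h1>"
  end

  # Generic alternative: render a caller-supplied ERB template.
  # The template can call this entry's methods through the binding.
  def render(template)
    ERB.new(template).result(binding)
  end
end

# Option 1: override output_html in a subclass to present the entry differently.
class TableEntry < TemplatedEntry
  def output_html
    "<tr><td>#{ERB::Util.html_escape(entry_id)}</td></tr>"
  end
end

entry = TableEntry.new('K00001', 'alcohol dehydrogenase')
puts entry.output_html
# Option 2: the caller decides the layout through a template.
puts entry.render('<p><%= entry_id %>: <%= definition %></p>')
```

With the template approach the presentation lives outside the class, so
a different layout needs no change to the entry class itself.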


> 
> The second comment is that I dislike the way the current files like
> sequence.rb and alignment.rb are mushrooming in size. There is much
> too much in there, which discourages people from diving in. I believe
> code should be readable, and easy to understand/digest.

I agree that some files have become too large to learn and/or maintain.
But if we try to change the structure of the current code base,
we need to define clear criteria beforehand.

If we split files into sub-files, people then need to look through
a larger number of files, and it may also slow down the loading of
the BioRuby library. It is a matter of balance.

In either case, the lack of a good guide for reading through the BioRuby
library might be the essential issue.


> 
> Sticking in output 'details', like HTML generation, does not help.
> 
> I really would like all HTML to be in one sub-tree. Also XML, RDF and
> whatnot. When it is 'business' logic it should be in database. When it
> is output transformations it is not 'business' logic any longer.

I'm not sure about HTML, but FASTA and RDF, for example, are tightly
related to the original database format/contents. So, I proposed
having methods that generate a formatted string in each database class.

There can be many ways to design OO class trees, and finding the best
way to represent/abstract things is always a difficult task.

At some point, we may refactor to produce BioRuby 2.0.
Before doing that, we can discuss how to arrange all classes/code cleanly.
We may need someone who understands the entire structure/contents of
the current codebase and is willing to design a better one with good sense.


> 
> Don't you think the Sequence, or KEGG, object should not care about
> HTML? Or RDF, or plotting? Those are separate functionalities. They
> share common access patterns - which are part of the DB class.

Again, we can take either approach. My current proposal is a conservative one.
Just add these functionalities to each class, as the class knows what is in it
and what is the best way to represent its contents.

If we move the formatting/plotting functionality into a separate class,
it might be something like the Bio::FlatFile class, which knows the header
line format of every database entry. Or we may design a better one.
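
A rough sketch of that alternative, assuming a hypothetical data-only
GeneEntry and a hypothetical EntryFormatter that collects the output
logic in one place (neither exists in BioRuby; this is not how
Bio::FlatFile is implemented):

```ruby
# Hypothetical data class: holds the entry contents and knows nothing about output.
GeneEntry = Struct.new(:entry_id, :definition, :sequence)

# Hypothetical formatter: all output transformations live here.
class EntryFormatter
  def initialize(entry)
    @entry = entry
  end

  # FASTA text with the sequence wrapped at +width+ characters per line.
  def fasta(width = 60)
    header = ">#{@entry.entry_id} #{@entry.definition}"
    body = @entry.sequence.scan(/.{1,#{width}}/).join("\n")
    "#{header}\n#{body}\n"
  end

  # Very small HTML fragment; a real one would also need escaping.
  def html
    "<h1>#{@entry.entry_id}</h1>\n<pre>#{@entry.sequence}</pre>\n"
  end
end

gene = GeneEntry.new('demo1', 'example gene', 'atgcatgcatgc')
puts EntryFormatter.new(gene).fasta(8)
```

The trade-off is that the formatter has to know the layout of every
entry class it supports, which is the knowledge the conservative
proposal keeps inside each class.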

Anyway, I'm listening now. So, please don't stick to HTML only;
think about a global design to which we can plan to migrate.


> 
> Finally, why not use method names? What is the added value of 
> 
>  output(:html)
> 
> over 
> 
>  output_html
> 
> Pj.

Maybe from an aesthetic viewpoint?

I think it looks better, and we can easily switch the output format
depending on the context without modifying the calling code.
I have something like the @media rule in CSS (screen, print, etc.) in mind.

if used_for_semantic_web?
  format = :rdf
  # add code here to prepare for the Semantic Web
elsif used_for_blast?
  format = :fasta
  # add code here to prepare for BLAST
end

# we don't need to change the following line in any context
entry.output(format)
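
The two naming styles do not have to exclude each other. As a minimal
sketch, assuming a hypothetical FormatEntry class (not BioRuby's actual
API), output(format) can simply forward to the matching output_<format>
method:

```ruby
# Hypothetical entry class used only for illustration; not a BioRuby class.
class FormatEntry
  # Formats this class is willing to dispatch to.
  FORMATS = [:fasta, :html].freeze

  def initialize(entry_id, sequence)
    @entry_id = entry_id
    @sequence = sequence
  end

  # output(:fasta) forwards to output_fasta, so both call styles work.
  def output(format)
    unless FORMATS.include?(format)
      raise ArgumentError, "unknown format: #{format.inspect}"
    end
    public_send("output_#{format}")
  end

  def output_fasta
    ">#{@entry_id}\n#{@sequence}\n"
  end

  def output_html
    "<h1>#{@entry_id}</h1><pre>#{@sequence}</pre>"
  end
end

entry = FormatEntry.new('demo1', 'atgc')
format = :fasta               # chosen at run time, depending on the context
puts entry.output(format)
```

The explicit FORMATS list keeps output from dispatching to arbitrary
methods when the format symbol comes from outside the program.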

Toshiaki





