[Bioperl-l] question about the nature of bioperl

Peter Kos <kos@rite.or.jp>
Wed, 21 Aug 2002 16:52:40 +0900


Hi,

Neither Bioperl nor the developers need my defense, nevertheless let 
me tell you my personal opinion.

Perl is designed in a way that one does not actually have to know 
Perl to start using it.
Bioperl, intertwined bloated monster or not, works the same way. 
Start with the smallest bit and you will love it. Or at least 
appreciate it. As will your supervisor, eventually.

Many years ago I took part in a workshop given by Tim Hubbard. At 
the end he asked us to write down some short take-home messages in 5 
points. In almost everybody's report one of the five points was: "the 
differences/conversion between file formats are a pain ..."
Most of the general Bioperl documentation starts with the sequence 
file format converter as an example. Catherine's nice tutorial 
(copyright Pasteur Inst.) has as many as three or four variants, as 
far as I can recall. You do not even need to invent/write these ten 
lines; you just copy them into a new file and you never have to worry 
about sequence file formats again for the rest of your life.
It is true, though, that you need to copy another ten lines if you 
want another converter for the alignment files. Or modify your first 
converter to recognize that in the case of 
msf/clustalw/phylip/bl2seq/WhatTheHeckEver format it should use 
AlignIO rather than SeqIO.
And you do not need to know how and why to use the more abstract 
machinery if you do not need it. You don't even need to know (or 
care) about it. However, these things are there for you in case you 
need some of them. (For example if you need "a factory class 
capable of instantiating SeqAnalysisParserI implementing parsers." Or 
similar.)
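
For illustration, the converter described above (SeqIO for sequence 
files, AlignIO for alignment files) might be sketched roughly like 
this. The helper names io_class_for and convert, and the short list 
of alignment formats, are my own assumptions for the sketch; only the 
SeqIO/AlignIO calls themselves come from Bioperl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Alignment formats named in the message; everything else is treated as a
# sequence format. This list is illustrative, not exhaustive.
my %ALIGNMENT_FORMAT = map { $_ => 1 } qw(msf clustalw phylip bl2seq);

# Hypothetical helper: pick the Bioperl I/O class for a format name.
sub io_class_for {
    my ($format) = @_;
    return $ALIGNMENT_FORMAT{ lc $format } ? 'Bio::AlignIO' : 'Bio::SeqIO';
}

# Hypothetical helper: convert $infile ($informat) to $outfile ($outformat).
sub convert {
    my ($infile, $informat, $outfile, $outformat) = @_;
    my $class = io_class_for($informat);
    die "Cannot convert $informat to $outformat\n"
        if $class ne io_class_for($outformat);
    eval "require $class; 1" or die $@;    # load Bio::SeqIO or Bio::AlignIO

    my $in  = $class->new(-file => $infile,     -format => $informat);
    my $out = $class->new(-file => ">$outfile", -format => $outformat);

    if ($class eq 'Bio::AlignIO') {
        while (my $aln = $in->next_aln) { $out->write_aln($aln) }
    }
    else {
        while (my $seq = $in->next_seq) { $out->write_seq($seq) }
    }
    return;
}

# Usage: convert.pl infile informat outfile outformat
convert(@ARGV) if @ARGV == 4;
```

Whether a given format can be written as well as read depends on the 
Bioperl version installed, so treat the format list as a starting 
point only.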


As for your current project, reading in the fasta file: to me it 
sounds quite straightforward, like
use Bio::SeqIO;
$in = Bio::SeqIO->new(-file => "inputfilename", -format => 'Fasta');
$seqobj = $in->next_seq();

> strip everything but sequence characters
$seq = $seqobj->seq;
I cannot imagine how on Earth it could be done "seemingly more 
simply" than this.
Similarly, "a simple blast parser" cannot be simpler than the one 
that has already been written for you.
> ... grabs sequences by user-defined hit-def keywords
like (after some startup lines)
while ( my $subject = $blast_report->nextSbjct() ) {
	if ($subject->name() =~ /$hitdef/) {
		# do whatever
	}
}
Again, it is difficult to do this "seemingly more simply".
You need not even wait with this until you are pressed for 
time/results.  <;o)
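
The "startup lines" mentioned above might look roughly like the 
sketch below. I am assuming here that $blast_report is a 
Bio::Tools::BPlite object (the nextSbjct call suggests it); that 
module shipped with Bioperl releases of this period and may not be 
present in later ones. The helper name matching_subjects and the 
command-line handling are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the names of all subjects (hits) in a BLAST
# report whose name contains the user-supplied keyword.
sub matching_subjects {
    my ($fh, $hitdef) = @_;

    # Loaded lazily, so the script still compiles where this module is absent.
    require Bio::Tools::BPlite;
    my $blast_report = Bio::Tools::BPlite->new(-fh => $fh);

    my @names;
    while ( my $subject = $blast_report->nextSbjct ) {
        # \Q...\E makes the keyword match literally, not as a regex
        push @names, $subject->name if $subject->name =~ /\Q$hitdef\E/;
    }
    return @names;
}

# Usage: grab_hits.pl blast_report.txt keyword
if (@ARGV == 2) {
    my ($report_file, $hitdef) = @ARGV;
    open my $fh, '<', $report_file or die "Cannot open $report_file: $!\n";
    print "$_\n" for matching_subjects($fh, $hitdef);
    close $fh;
}
```

Instead of collecting names, the loop body is where you would "do 
whatever" with each matching subject.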


The point I want to make is that you do not have to read Mark's 
whole local public library in order to learn the adventures of 
Pinocchio. It is true that Bioperl as a whole is really getting to be 
a bit scary with respect to size and complexity, but that is because 
bioinformatics is getting to be a bit scary with respect to size and 
complexity.

I would (weakly) suggest you first read and "steal" the examples and 
scripts provided, rather than (or much before) trying to understand 
the whole network of classes, inheritances, factory concept and other 
funny stuff.
Moreover, I would STRONGLY recommend that you NOT write any Blast 
parsers. Even if you use nothing else from Bioperl but one of these 
parsers, installing Bioperl for this reason alone is much simpler 
than writing a very basic, unstable, undocumented and error-prone 
parser yourself.

Use Bioperl. And have fun.
Peter

> Thanks for your reply.  I am relatively new to open source development and
> had not considered that bit about "no one praises you".  I started looking
> into bioperl a little while ago, and asked my supervisor why we don't use
> it.  His answer was that using anything from bioperl seems to require lots
> of other things used just so; in other words, it's too intertwined with
> itself.  And indeed, between CPAN modules for parsing etc. and our own
> code, we do well enough, seemingly more simply.  Your points were all well
> made however, I appreciate the insight.  I personally am interested in
> further exploring bioperl.
>
> Best regards,
>
> Nathanael Kuipers
>

............................................................................
Peter B. Kos, Ph.D.
 (RITE)
E-mail: kos@rite.or.jp