[Bioperl-l] question about the nature of bioperl
Peter Kos
kos@rite.or.jp
Wed, 21 Aug 2002 16:52:40 +0900
Hi,
Neither Bioperl nor the developers need my defense, nevertheless let
me tell you my personal opinion.
Perl is designed in a way that one does not actually have to know
Perl to start using it.
Bioperl, intertwined bloated monster or not, works the same way.
Start with the smallest bit and you will love it. Or at least
appreciate it. As will your supervisor, eventually.
Many years ago I took part in a workshop given by Tim Hubbard. At
the end he asked us to write some short take-home messages in 5
points. In almost everybody's report one of the five points was: "the
differences/conversion between file formats are a pain ..."
Most of the general Bioperl documentation starts with the sequence
file format converter as an example. Catherine's nice tutorial
(copyright Pasteur Inst.) even has three or four variants of it, as
far as I can recall. You do not even need to invent/write these ten
lines: you just copy them into a new file and you do not have to
worry about sequence file formats for the rest of your life.
It is true though, that you need to copy another ten lines if you
want another converter for the alignment files. Or modify your first
converter to recognize that in case of
msf/clustalw/phylip/bl2seq/WhatTheHeckEver format it should use
AlignIO rather than SeqIO.
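The converter idea from the last two paragraphs could be sketched
roughly as below. This is only a minimal sketch, not code taken from
the Bioperl documentation: the helper names io_class_for and convert
and the %ALIGN_FORMAT list are made up for illustration, and it
assumes that the input format alone decides whether Bio::AlignIO or
Bio::SeqIO is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Alignment formats named in this message (illustrative list, extend as needed).
my %ALIGN_FORMAT = map { $_ => 1 } qw(msf clustalw phylip bl2seq);

# Return the name of the Bioperl I/O class assumed to handle this format.
sub io_class_for {
    my ($format) = @_;
    return $ALIGN_FORMAT{ lc $format } ? 'Bio::AlignIO' : 'Bio::SeqIO';
}

# Read $infile in $informat and write it to $outfile in $outformat.
sub convert {
    my ($infile, $informat, $outfile, $outformat) = @_;
    my $class = io_class_for($informat);
    eval "require $class; 1" or die $@;    # load only the class we need

    my $in  = $class->new(-file => $infile,     -format => $informat);
    my $out = $class->new(-file => ">$outfile", -format => $outformat);

    if ($class eq 'Bio::AlignIO') {
        while (my $aln = $in->next_aln) { $out->write_aln($aln) }
    }
    else {
        while (my $seq = $in->next_seq) { $out->write_seq($seq) }
    }
}

# Command-line use: infile informat outfile outformat
convert(@ARGV) if @ARGV == 4;
```

Saved under a name such as convert.pl (hypothetical), it would be
run as: perl convert.pl in.msf msf out.aln clustalw
Bioperl itself will complain if the chosen class cannot write the
requested output format.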
And you do not need to know how and why to use the abstract phenomena
if you do not need to use them. You don't even need to know (nor
care) about them. However, these things are there for you in case you
need to use some of them. (For example if you need "a factory class
capable of instantiating SeqAnalysisParserI implementing parsers." Or
similar.)
As for your current project, reading in the fasta file: it sounds
quite straightforward to me, something like
use Bio::SeqIO;
my $in = Bio::SeqIO->new(-file => "inputfilename", -format => 'Fasta');
my $seqobj = $in->next_seq();
> strip everything but sequence characters
my $seq = $seqobj->seq;
I cannot imagine how on Earth it could be done "seemingly more
simply" than this.
Similarly, "a simple blast parser" cannot be simpler than the one
that is already written for you.
> ... grabs sequences by user-defined hit-def keywords
like (after some startup lines)
while ( my $subject = $blast_report->nextSbjct() ) {
    if ( $subject->name() =~ /$hitdef/ ) {
        # do whatever
    }
}
Again, it is hard to see how this could be done "seemingly more
simply".
You may not even need to wait with this till you are pressed for
time/results. <;o)
The point I want to make is that you need not read Mark's
whole local public library in order to learn the adventures of
Pinocchio. It is true that Bioperl as a whole is really getting to be
a bit scary with respect to size and complexity, but it is because
bioinformatics is getting to be a bit scary with respect to size and
complexity.
I would (weakly) suggest you first read and "steal" the examples and
scripts provided, rather than (or much before) trying to understand
the whole network of classes, inheritances, factory concept and other
funny stuff.
Moreover, I would STRONGLY recommend that you NOT write any Blast
parsers. Even if you use nothing else from Bioperl but one of
these parsers, installing Bioperl for this reason alone is still
much simpler than writing a very basic, unstable, undocumented and
error-prone parser yourself.
Use Bioperl. And have fun.
Peter
> Thanks for your reply. I am relatively new to open source
> development and had not considered that bit about "no one praises
> you". I started looking into bioperl a little while ago, and asked
> my supervisor why we don't use it. His answer was that using
> anything from bioperl seems to require lots of other things used
> just so; in other words, it's too intertwined with itself. And
> indeed, between CPAN modules for parsing etc. and our own code, we
> do well enough, seemingly more simply. Your points were all well
> made however, I appreciate the insight. I personally am interested
> in further exploring bioperl.
>
> Best regards,
>
> Nathanael Kuipers
>
............................................................................
Peter B. Kos, Ph.D.
(RITE)
E-mail: kos@rite.or.jp