Bioperl: article for Dr. Dobb's Journal
Ewan Birney
birney@sanger.ac.uk
Fri, 9 Oct 1998 17:34:55 +0100 (BST)
On Fri, 9 Oct 1998, Lincoln Stein wrote:
> Ewan Birney writes:
> > It's a very nice article. How much do you need cut out? Here are some
> > suggestions:
>
> The article needs to be cut by about 50% (I'd actually asked to make
> it a two-parter originally, but got turned down). If I cut out the
> alignment stuff there will still need to be some substantial trimming
> in the rest. Alternatively, I could focus on the alignment algorithm
> entirely, and this is what the editor has suggested. I hate leaving
> out all the OO stuff, however.
>
To be honest I think the OOP stuff is more important than the algorithm
and the fact that perl is the *ideal* language to glue and provide a
development 'framework' is v. important. But the algorithm might look
more sexy to people. I'd go with the OOP-Perl angle, to show that Perl is
more than a web/systems glue language.
[snip]
> > I think your biggest saving would be to drop the alignment class stuff
> > altogether. It's sad because that's where this stops being simple
> > data structures and starts getting interesting (and of course, I find
> > alignments v.interesting), but I think trying to explain OOP-perl,
> > bioinformatics and dynamic programming all in one small article is taking
> > on quite a job.
>
> Do you think the alignment part is strong enough to stand on its own?
> The code actually runs pretty slowly and uses a lot of memory (and
> uses a horrible trick in which strings are turned into numbers
> automagically). Maybe I should focus on the algorithm and then show
> how it can be turned into an XS module.
>
Perhaps. Does DDJ really want an explanation of dynamic programming? It
isn't very 'perly' then, and a lot of people have written about dynamic
programming a lot (i.e. you'd have to watch out that you didn't tick off
some computer science types by your explanation - I tend to do this a lot
<shucks>).
I think it is foolish to write dynamic programming in perl if it is a
serious thing to be used in anger. DP is a v. cpu intensive algorithm
which is almost perfect for a RISC chip + a good C optimiser. I think the
algorithm -> C implementation + C API -> bioperl integration via XS is a
much more realistic example of this... It makes the article much more
'here is a complex algorithm that we want to provide sensibly for non-C
users to use'.
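
To make the 'algorithm -> C implementation' step concrete, here is a
minimal sketch of the sort of C kernel that could sit behind an XS
wrapper. It is not the code in the bioperl CVS tree: the function name
sw_score, the simple match/mismatch scoring and the linear gap penalty
are all assumptions made for illustration (a real protein aligner would
use a substitution matrix and would also recover the alignment itself,
not just the score).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of bioperl): best local-alignment score
 * using the Smith-Waterman recurrence with a linear gap penalty.
 * 'mismatch' and 'gap' are expected to be negative numbers.
 * Only two rows of the DP matrix are kept, so memory is O(strlen(b)).
 * Returns -1 if memory cannot be allocated. */
int sw_score(const char *a, const char *b, int match, int mismatch, int gap)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a), lb = strlen(b);
    int *prev = calloc(lb + 1, sizeof *prev);  /* row i-1, starts at zero */
    int *curr = calloc(lb + 1, sizeof *curr);  /* row i */
    int best = 0;

    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }

    for (size_t i = 1; i <= la; i++) {
        curr[0] = 0;  /* a local alignment may start anywhere */
        for (size_t j = 1; j <= lb; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            int up   = prev[j] + gap;      /* gap in b */
            int left = curr[j - 1] + gap;  /* gap in a */
            int s = 0;                     /* floor at 0: restart alignment */
            if (diag > s) s = diag;
            if (up > s)   s = up;
            if (left > s) s = left;
            curr[j] = s;
            if (s > best) best = s;
        }
        /* the row just filled becomes the previous row */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    free(prev);
    free(curr);
    return best;
}
```

A function of this shape takes plain C strings and returns an int, so an
XS wrapper around it can stay very thin, and the Perl side only ever sees
an ordinary subroutine or object.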
I might point out of course that the current dump from the bio-perl cvs
directory has a protein Smith-Waterman implementation written in C and
stuck in via XS - it produces a Bio::SimpleAlign object which is a pure
perl object. Quite an interesting starting point if you are looking for
pre-cooked implementations... (guess who wrote it <grin>)
>
> > b) I think the point about perl is that not only is it a rapid development
> > cycle but that existing command line based solutions can be worked into
> > it, as can C based APIs (a la AcePerl and the bioperl alignment
> > routines).
>
> Very good point. I'll add that to the intro.
>
I've been claiming that Perl (not java) is the ideal driver language for
'components' of code that you want to put together - some components
written in Perl, some in C/C++, some CORBA'ized. (I had some odd looks at
Objects in bioinformatics when I said that...).
There are lots of things you can focus on in this article. I guess you're
going to have to weigh up 'readability', 'sexiness' and 'importance'.
I'm happy to reread anything if you like. Have fun!
> Lincoln
>
> --
> ========================================================================
> Lincoln D. Stein Cold Spring Harbor Laboratory
> lstein@cshl.org Cold Spring Harbor, NY
> ========================================================================
>
Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/
=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================