[Bioperl-l] DNA Smith-Waterman?
Ewan Birney
birney@ebi.ac.uk
Sun, 8 Dec 2002 18:18:11 +0000 (GMT)
On Sat, 7 Dec 2002, Yee Man wrote:
>
> Hi,
>
> I am a newbie programmer at Stanford Human Genome Center. We are
> doing some DNA sequence alignments here. Occasionally, we would like to
> align DNAs using some sort of Dynamic Programming algorithm a la
> Smith-Waterman.
>
> I found that Ewan Birney wrote a module for protein Smith-Waterman
> already. Is he going to extend it to DNA as well?
>
I have this in my Wise2 package (the command-line program is dnal; see
www.ebi.ac.uk/Wise2), but I have not bothered to link it into Perl through
the XS system, although a lot of the stubs are there.
> I just wrote a very simple C program that does Smith-Waterman for
> DNA sequences. It is using Matrix Space with Gotoh's Improvement. Is this
> the fastest implementation you can do with Smith-Waterman?
There are a series of different improvements to Smith-Waterman you can
make, some of which are published and some of which are just "knowledge"
swapped between different practitioners. Some of the best people at this
are Phil Green (who wrote SWAT, which has some improvements there) and Guy
Slater (in my group at the EBI), but there are a large number of people
with different algorithmic tricks for speed. To be honest, I am not sure
what Gotoh's Improvement is.
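
For readers following the thread: "Gotoh's Improvement" presumably refers
to Gotoh's affine-gap recurrences, which keep the alignment at O(mn) time
by tracking two extra "gap open" states alongside the main score matrix.
The sketch below is only an illustration under that assumption, not the
code from Wise2, SWAT or the poster's C program. The function name and the
scoring values are hypothetical, and it returns only the best local score,
using linear memory.

```python
def sw_affine_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps (Gotoh-style recurrences).

    A gap of length k scores gap_open + (k - 1) * gap_extend.
    All scoring values here are illustrative assumptions.
    """
    neg_inf = float("-inf")
    n = len(b)
    h_prev = [0] * (n + 1)        # H values for the previous row
    f = [neg_inf] * (n + 1)       # F: best score ending in a vertical gap, per column
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (n + 1)
        e = neg_inf               # E: best score ending in a horizontal gap, this row
        for j in range(1, n + 1):
            # Either open a new gap from H or extend the existing one.
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f[j] = max(h_prev[j] + gap_open, f[j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: never let a cell drop below zero.
            h = max(0, h_prev[j - 1] + s, e, f[j])
            h_cur[j] = h
            if h > best:
                best = h
        h_prev = h_cur
    return best
```

With these example scores, sw_affine_score("ACGTACGT", "ACGTTACGT") aligns
all eight bases of the first sequence around a single one-base gap.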
To be more honest, once you have written a piece of code with a sensible
memory layout, further tuning will only bring the runtime down to about
0.9 or 0.8 of what it was, at best. To tackle more problems in less time
you need to move to heuristics such as BLAST, exonerate, BLAT or SSAHA,
which all use different seed-extension-alignment strategies. There is
again a lot of distributed knowledge in this area - you are more than
welcome to get stuck in and try some ideas out, but I'd definitely suggest
you check out some of these programs first.
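
As a rough illustration of the seed-and-extend idea those programs share,
here is a toy sketch: index the exact k-mers of the target, look up each
query k-mer, and extend every hit without gaps until the score drops too
far below its best value. It does not reproduce the actual method of
BLAST, exonerate, BLAT or SSAHA; all function names, the k-mer size, the
scores and the drop-off threshold are assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(target, k):
    """Map every k-mer in target to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(target) - k + 1):
        index[target[pos:pos + k]].append(pos)
    return index


def _extend(query, target, qi, ti, step, match, mismatch, xdrop):
    """Ungapped extension in one direction; returns (best gain, length kept)."""
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= ti < len(target):
        score += match if query[qi] == target[ti] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > xdrop:
            break                 # fallen too far below the best score seen
        qi += step
        ti += step
    return best, best_len


def seed_and_extend(query, target, k=4, match=1, mismatch=-2, xdrop=5):
    """Return ungapped hits as (score, query_start, target_start, length)."""
    index = build_kmer_index(target, k)
    hits = set()                  # a set, so seeds on one diagonal collapse
    for qpos in range(len(query) - k + 1):
        for tpos in index.get(query[qpos:qpos + k], ()):
            right, rlen = _extend(query, target, qpos + k, tpos + k, 1,
                                  match, mismatch, xdrop)
            left, llen = _extend(query, target, qpos - 1, tpos - 1, -1,
                                 match, mismatch, xdrop)
            hits.add((k * match + left + right,
                      qpos - llen, tpos - llen, k + llen + rlen))
    return sorted(hits, reverse=True)
```

A real tool would then run a full dynamic-programming alignment only
around the best hits, which is where the speed-up over plain
Smith-Waterman comes from.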
>
> I know XS and Perl, so if no one is going to extend that protein
> Smith-Waterman to cover DNA, I can probably insert my code to do that.
> What do you guys think?
If you would like to do this, great! Come up with a sensible directory
structure that could be worked into bioperl-ext (i.e., a standard Perl XS
extension) and model the Bioperl bridge code after pSW; then tar it up and
I would be happy to take a look at it for inclusion.
>
> Thanks
> Yee Man
>
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------