[Bioperl-l] An update for my DNA Smith-Waterman code

Ewan Birney birney at ebi.ac.uk
Sun Jan 26 15:32:21 EST 2003



> > I tried it, but it seems to me that ssearch34 displays only the score. How
> > can I get the alignment? How can I set the gap penalties? I read the
> > accompanying fasta3x.doc but I still can't figure it out...
>
> Make sure you are using a recent (version 3.4) distribution.
>
> % ssearch34 -n -q -H -d 0 -b 1 t1.fa t2.fa
>
> ( the -f and -g parameters change the gap penalties for gap open and gap
> extend, respectively; if you have an older version of the package, the gap
> open penalty *includes* the first gap extension -- more recent versions do
> not include the first gap extension in the gap open penalty )
>
> This calculates scores only, no alignments. On my 1GHz G4, running
> altivec'ed SW, this takes 1.6 seconds to calculate the SW score (32767)
> (non-altivec'ed on this pair, with osearch34 [not SWAT-optimized], it takes
> 5.6 seconds).
>
> If I change the -d 0 to -d 1, ssearch34 also calculates the alignment
> using the linear-space, divide-and-conquer method.  This part is *not*
> altivec accelerated (yet), and does, as you describe, make multiple passes
> through the matrix to get start/end coordinates.  That takes about 9
> additional seconds to generate the alignment.
>
> On this same machine, your align.pl on t1.fa and t2.fa takes 33 seconds.
> Running on seqs of length 1 shows that the various perl/C binding overhead
> takes about 0.5 seconds, so I'm guessing that's not the source of the
> difference.
>
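
For readers following the thread, here is a minimal sketch (in Python, not the
FASTA code) of the two points in the quoted message: the two gap-penalty
conventions, and a score-only Smith-Waterman pass that keeps one row of the
matrix in memory. The function names gap_cost and sw_score, the scoring values
(match 5, mismatch -4, gap open 12, gap extend 4), and the use of positive
penalty numbers are assumptions made for illustration; they are not FASTA
defaults, and this is not the SWAT-optimized or altivec'ed algorithm.

```python
NEG_INF = float("-inf")


def gap_cost(length, gap_open, gap_extend, open_includes_first_extend=False):
    """Total (positive) penalty for one gap of `length` residues.

    open_includes_first_extend=True models the older convention, where the
    open penalty already covers the first gapped residue.
    """
    if length <= 0:
        return 0
    if open_includes_first_extend:
        return gap_open + (length - 1) * gap_extend
    return gap_open + length * gap_extend


def sw_score(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=4):
    """Score-only local alignment with affine gaps, O(len(b)) memory.

    Uses the newer convention: a gap of length k costs
    gap_open + k * gap_extend.
    """
    n = len(b)
    first = gap_open + gap_extend      # cost of the first gapped residue
    h_prev = [0] * (n + 1)             # best scores for the previous row
    f = [NEG_INF] * (n + 1)            # best score ending in a gap in b
    best = 0
    for ca in a:
        h_cur = [0] * (n + 1)
        e = NEG_INF                    # best score ending in a gap in a
        for j in range(1, n + 1):
            e = max(h_cur[j - 1] - first, e - gap_extend)
            f[j] = max(h_prev[j] - first, f[j] - gap_extend)
            s = match if ca == b[j - 1] else mismatch
            h = max(0, h_prev[j - 1] + s, e, f[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev = h_cur
    return best
```

Under these assumptions, gap_cost(3, 12, 4) and
gap_cost(3, 16, 4, open_includes_first_extend=True) give the same total, which
is the conversion to keep in mind when moving penalties between older and newer
versions. Recovering the alignment itself needs either the full matrix or extra
passes, which is why the alignment step costs additional time.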

It sounds like the SWAT implementation in FASTA is considerably faster, and
it is also likely to be less buggy because it is widely used.

- Yee - would you like to look into building the necessary XS bindings to
this code, if that is possible?

- Aaron - is there any licensing problem to prevent this happening? (ie,
the code from Fasta being repackaged in bioperl-ext)


For people's information, the pSW code comes from my Wise2 system, which is
definitely not a fast implementation, so if we could replace the protein SW
implementation and add a DNA SW implementation using the Fasta/SWAT code, that
would be great.





> -Aaron
>
> --
>  Aaron J Mackey
>  Pearson Laboratory
>  University of Virginia
>  (434) 924-2821
>  amackey at virginia.edu
>