[Biopython] pairwise sequence alignment programs in biopython

Peter Cock p.j.a.cock at googlemail.com
Tue Jul 10 23:12:47 UTC 2018


Hi John,

The Align.PairwiseAligner code is new in Biopython 1.72, and
better support for longer sequences was one of the improvements.

You would probably find it useful to read over the pull request:
https://github.com/biopython/biopython/pull/1655
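
For reference, here is a minimal sketch of setting gap penalties on
Align.PairwiseAligner and scoring a 2500-residue sequence against itself.
It is not taken from the pull request. The test sequence and the penalty
values are made up for illustration, and it assumes the mode,
open_gap_score and extend_gap_score attributes are available in the
installed Biopython release:

```python
from Bio import Align

# Made-up test sequence: the 20 standard amino acids repeated to 2500 residues.
seq = "ACDEFGHIKLMNPQRSTVWY" * 125

aligner = Align.PairwiseAligner()
aligner.mode = "global"           # global alignment; "local" is the other mode
aligner.open_gap_score = -10.0    # illustrative penalty for opening a gap
aligner.extend_gap_score = -0.5   # illustrative penalty for extending a gap

# score() computes only the optimal score, without building the alignments.
score = aligner.score(seq, seq)
print(len(seq), score)
```

Match and mismatch scores are left at the library defaults here, so the
absolute score depends on the installed version; aligner.align(seq, seq)
would return the alignments themselves rather than just the score.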


Peter

On Tue, Jul 10, 2018 at 7:51 PM, John Berrisford <jmb at ebi.ac.uk> wrote:
> Hi
>
>
>
> I’m looking at performing pairwise alignments of polymer sequences in
> biopython.
>
> These will be protein or nucleotide sequences. They may include non-standard
> residues which will be denoted as X.
>
> The sequences will be of varying length from around 20 residues up to
> several thousand residues – put simply the range of sequences in the PDB.
>
>
>
> I’m looking for the best tool to do this in Biopython.
>
>
>
> So far I have performed tests with pairwise2 and Align.PairwiseAligner.
>
> From my tests it seems that pairwise2 has a limit of ~2000 residues – i.e.
> if I give it a sequence of 2500 residues to compare against itself it
> crashes. PairwiseAligner seems to be able to handle much longer sequences
> without issue.
>
>
>
> I need to be able to set gap penalties – which is possible in both of these
> programs.
>
>
>
> So my questions are:
>
> Are these the only options in Biopython? I would prefer a Python
> implementation rather than something that requires external compilation,
> e.g. EMBOSS Needle.
>
> Are these the best options?
>
> Are they both maintained / stable?
>
> Are they comparable in their results?
>
> Is the sequence length limitation in pairwise2 a known issue? A quick
> Google search suggests most people use pairwise2, which is strange given
> this limitation.
>
>
>
> Thank you
>
>
>
> John
>
>
>
> --
>
> John Berrisford
>
> PDBe
>
> European Bioinformatics Institute (EMBL-EBI)
>
> European Molecular Biology Laboratory
>
> Wellcome Genome Campus
>
> Hinxton
>
> Cambridge CB10 1SD UK
>
> Tel: +44 1223 492529
>
>
>
> https://www.pdbe.org
>
> https://www.facebook.com/proteindatabank
>
> https://twitter.com/PDBeurope
>