[Biopython] pairwise sequence alignment programs in biopython

Markus Piotrowski Markus.Piotrowski at ruhr-uni-bochum.de
Wed Jul 11 06:03:45 UTC 2018


Dear John,

I rewrote pairwise2 in 2016. Michiel, who wrote the new 
PairwiseAligner, ran some comparisons and, as far as I remember, 
pairwise2 was comparable to the new PairwiseAligner in terms of maximum 
sequence length. Also, in my experience, a limit of ~2000 residues is 
very low for pairwise2; I could usually align sequences of about 7000 
residues. Could you report a test case, e.g. two sequence IDs and your 
pairwise2 alignment parameters?
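
A test case of that kind could look roughly like the sketch below. It is 
only an illustration: the helper names, the random stand-in sequences and 
the scoring values (match 1, mismatch -1, gap open -10, gap extend -0.5) 
are assumptions, not the parameters from the original report. A real 
report should use the actual sequences and parameters that failed.

```python
import random

# 20 standard amino acids plus X for non-standard residues.
PROTEIN_ALPHABET = "ACDEFGHIKLMNPQRSTVWY" + "X"


def make_test_sequence(length, seed=0):
    """Return a reproducible random protein sequence (stand-in for a real one)."""
    rng = random.Random(seed)
    return "".join(rng.choice(PROTEIN_ALPHABET) for _ in range(length))


def score_with_pairwise_aligner(seq, match=1.0, mismatch=-1.0,
                                gap_open=-10.0, gap_extend=-0.5):
    """Globally align seq against itself with Bio.Align.PairwiseAligner."""
    from Bio import Align  # requires Biopython

    aligner = Align.PairwiseAligner()
    aligner.mode = "global"
    aligner.match_score = match
    aligner.mismatch_score = mismatch
    aligner.open_gap_score = gap_open
    aligner.extend_gap_score = gap_extend
    return aligner.score(seq, seq)


def score_with_pairwise2(seq, match=1.0, mismatch=-1.0,
                         gap_open=-10.0, gap_extend=-0.5):
    """Globally align seq against itself with Bio.pairwise2 (full traceback)."""
    from Bio import pairwise2  # requires a Biopython release that ships pairwise2

    alignments = pairwise2.align.globalms(
        seq, seq, match, mismatch, gap_open, gap_extend,
        one_alignment_only=True,
    )
    return alignments[0][2]  # third field of an alignment is its score


def main(lengths=(20, 2500)):
    """Print the score, or the exception raised, for each length and aligner."""
    aligners = (
        ("PairwiseAligner", score_with_pairwise_aligner),
        ("pairwise2", score_with_pairwise2),
    )
    for length in lengths:
        seq = make_test_sequence(length)
        for name, func in aligners:
            try:
                print(length, name, func(seq))
            except Exception as exc:  # report the failure instead of stopping
                print(length, name, "failed:", type(exc).__name__, exc)


if __name__ == "__main__":
    main()
```

Reporting the Biopython version and the exact error message alongside 
such a script makes the problem much easier to reproduce.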

Best,
Markus

On 10.07.2018 at 20:51, John Berrisford wrote:
>
> Hi
>
> I’m looking at performing pairwise alignments of polymer sequences in 
> biopython.
>
> These will be protein or nucleotide sequences. They may include 
> non-standard residues which will be denoted as X.
>
> The sequences will be of varying length from around 20 residues up to 
> several thousand residues – put simply the range of sequences in the PDB.
>
> I’m looking for the best tool to use to do this in biopython
>
> So far I have performed tests with pairwise2 and Align.PairwiseAligner.
>
> From my tests it seems that pairwise2 has a limit of ~2000 residues – 
> i.e. if I give it a sequence of 2500 residues to compare against 
> itself it crashes. PairwiseAligner seems to be able to handle much 
> longer sequences without issue.
>
> I need to be able to set gap penalties – which is possible in both of 
> these programs.
>
> So my questions are:
>
> Are these the only options in Biopython? I would prefer a Python 
> implementation rather than something that requires external 
> compilation, e.g. EMBOSS Needle.
>
> Are these the best options?
>
> Are they both maintained / stable?
>
> Are they comparable in their results?
>
> Is the limitation in sequence length in pairwise2 a known issue? A 
> quick google search suggests most people use pairwise2, which is 
> strange given its sequence length limitation.
>
> Thank you
>
> John
>
> --
>
> John Berrisford
>
> PDBe
>
> European Bioinformatics Institute (EMBL-EBI)
>
> European Molecular Biology Laboratory
>
> Wellcome Genome Campus
>
> Hinxton
>
> Cambridge CB10 1SD UK
>
> Tel: +44 1223 492529
>
> https://www.pdbe.org <https://www.pdbe.org/>
>
> https://www.facebook.com/proteindatabank
>
> https://twitter.com/PDBeurope
>

-- 
_________________________________
Dr. Markus Piotrowski
Privatdozent/Akademischer Rat
Lehrstuhl für Molekulargenetik und Physiologie der Pflanzen
ND 3/49
Universitätsstr. 150
44801 Bochum

Tel. xx49-(0)234-3224290
Fax. xx49-(0)234-3214187

http://www.ruhr-uni-bochum.de/pflaphy/Seiten_dt/PG_Piotrowski_d.html
http://homepage.ruhr-uni-bochum.de/Markus.Piotrowski/Index.html
