[Biopython-dev] Sorting alignments

Eric Talevich eric.talevich at gmail.com
Sat Jun 23 16:04:38 UTC 2012


On Sat, Jun 23, 2012 at 9:48 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Hi all,
>
> This branch extends the MultipleSeqAlignment's sort method to
> accept a key function and a reverse option (just like lists under
> Python 3 - there is no need for a cmp argument):
>
> https://github.com/peterjc/biopython/tree/align-sort
>
> This was prompted by a BioStars question,
>
> http://www.biostars.org/post/show/47562/is-there-a-way-to-sort-a-biopython-alignment-by-a-feature-other-then-id/
>
> Does this seem like a good idea?


Seems cool to me. I've normally had to operate on the alignment._records
attribute to do this sort of thing, so it's nice to have an officially
sanctioned method.


> Can anyone think of a nicer
> example for custom sort ordering for the doctest or Tutorial?
>

I sometimes sort by unaligned sequence length. In an alignment, that is
the reverse of gappiness:

>>> aln.sort(key=lambda rec: rec.seq.count('-'), reverse=True)
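To make the resulting order concrete, here is a rough sketch using plain
Python objects in place of alignment records. The Rec tuple, the gap_count
helper and the sequences are made up for illustration and are not Biopython
classes; the same kind of key function is what would be passed to aln.sort()
on the branch.

```python
from collections import namedtuple

# Hypothetical stand-in for a SeqRecord; seq is a plain str here.
Rec = namedtuple("Rec", ["id", "seq"])

records = [
    Rec("a", "ACGT--AC"),
    Rec("b", "ACGTTTAC"),
    Rec("c", "A----TAC"),
]


def gap_count(rec):
    """Return the number of gap characters in the aligned sequence."""
    return rec.seq.count("-")


# Most gaps first, which is the same as shortest unaligned sequence first.
by_gappiness = sorted(records, key=gap_count, reverse=True)
ids = [rec.id for rec in by_gappiness]
```

Dropping reverse=True would put the longest unaligned sequence on top
instead.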

Other use cases could include sorting by sequence weights, given a function
for calculating them, or according to the ordering in a phylogenetic tree.
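Both of these reduce to looking up a per-record value inside the key
function. A minimal sketch, again with plain Python stand-ins rather than
Biopython objects: the leaf_order list and the weights dict are assumed
inputs (for example, tip names read off a tree, or the output of some
weighting function), and all names and numbers are invented.

```python
from collections import namedtuple

# Hypothetical stand-in for a SeqRecord.
Rec = namedtuple("Rec", ["id", "seq"])

records = [
    Rec("human", "ACGT"),
    Rec("mouse", "ACGA"),
    Rec("chimp", "ACGT"),
]

# Assumed tip order of a tree, top to bottom.
leaf_order = ["human", "chimp", "mouse"]
rank = {name: i for i, name in enumerate(leaf_order)}

# Sort records to follow the tree's tip order.
by_tree = sorted(records, key=lambda rec: rank[rec.id])
tree_ids = [rec.id for rec in by_tree]

# Assumed per-sequence weights, computed elsewhere.
weights = {"human": 0.3, "mouse": 0.6, "chimp": 0.1}

# Heaviest sequence first.
by_weight = sorted(records, key=lambda rec: weights[rec.id], reverse=True)
weight_ids = [rec.id for rec in by_weight]
```

A record whose id is missing from the lookup table raises KeyError here;
rank.get(rec.id, len(rank)) would push unknown records to the end instead.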

-E


