[Biopython-dev] [Bug 3066] Iterating/looping over columns/rows of a MultipleSeqAlignment

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Apr 29 11:02:42 UTC 2010


http://bugzilla.open-bio.org/show_bug.cgi?id=3066





------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk  2010-04-29 07:02 EST -------
(In reply to comment #2)
> Two things:
> 
> 1. Is this implementation fast? It basically transposes the alignment as a
> list-of-lists, right? So:
> 
> return zip(*self)
> 
> or:
> 
> from itertools import izip
> return (''.join(col) for col in izip(*self))

I haven't done any profiling yet - using itertools would be worth trying.
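
For reference, here is a minimal sketch of the transposition idea quoted above. It uses plain strings as a stand-in for the alignment rows (an assumption for illustration; the real alignment holds record objects), and the helper name iter_columns is hypothetical. Note that on Python 3 the built-in zip is already lazy, so itertools.izip (Python 2 only) is not needed there.

```python
# Stand-in for an alignment: each row is a plain string of equal length.
# (Hypothetical data; a real alignment object holds records, not strings.)
rows = ["ACGT", "AC-T", "ACGA"]


def iter_columns(rows):
    """Yield each alignment column as a string, one at a time."""
    # zip(*rows) transposes rows into columns. On Python 3 zip returns
    # an iterator, so the columns are produced lazily.
    return ("".join(col) for col in zip(*rows))


columns = list(iter_columns(rows))
print(columns)  # ['AAA', 'CCC', 'G-G', 'TTA']
```

Whether this beats an explicit loop over column indices would still need profiling on realistically sized alignments.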

> 2. On the topic of efficiency -- have you encountered a situation where
> having an alignment as a NumPy character array would have helped?

Not personally, but these iterators should facilitate creating a NumPy
character array from our alignment object. I was also pondering adding
an explicit "as_array" or "to_array" method which would require NumPy
at runtime. However, I would rather keep the core of Biopython without
any NumPy dependency.
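
As a rough sketch of how such a method could keep NumPy optional, the import can be deferred until the conversion is actually requested. The function name to_array and the use of plain strings for rows are assumptions for illustration, not the actual Biopython API.

```python
def to_array(rows):
    """Return the rows as a 2-D NumPy array of single characters."""
    # Imported here rather than at module level, so NumPy is only
    # required when this function is actually called.
    import numpy

    return numpy.array([list(row) for row in rows], dtype="U1")


# Hypothetical stand-in data: equal-length strings, one per row.
rows = ["ACGT", "AC-T", "ACGA"]

try:
    arr = to_array(rows)
except ImportError:
    print("NumPy is not installed; array conversion is unavailable")
else:
    print(arr.shape)           # (rows, columns)
    print("".join(arr[:, 2]))  # third column as a string
```

With this layout, everything else keeps working without NumPy, and only callers who ask for an array need it installed.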

Peter

