[Biopython-dev] antiparallel ?

Andrew Dalke dalke at acm.org
Fri Aug 11 13:11:16 EDT 2000


thomas at cbs.dtu.dk wrote:

>How are people changing sequences to antiparallel with biopython ?
>Currently I use
>
>    def complement(self, seq):
>        return string.join(map(lambda x:IUPACData.ambiguous_dna_complement[x],
>                               map(None,seq)),'')

Two things here.  First, I like working in Seq space rather than with
strings.  Which means I just realized there's no way to get the complement
table for an alphabet.  (Well, there is a way using the PropertyManager
and setting the values in IUPACEncodings.  It's just not being done.)
If there were, then this would be a function in utils (not a method) and
work like:

def complement(seq):
  alphabet = seq.alphabet
  table = default_manager.resolve(alphabet, "complement_table")
  new_data = []
  for c in seq.data:
    new_data.append(table[c])
  return Seq(string.join(new_data, ''), alphabet)
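
For illustration, here is a rough, self-contained sketch of the same idea in
Python 3 syntax.  The PropertyManager lookup is replaced by a plain dictionary,
and SimpleSeq and COMPLEMENT_TABLES are hypothetical stand-ins, not Biopython
names; the table covers only a few letters.

```python
# Hypothetical stand-ins: neither name below comes from Biopython.
COMPLEMENT_TABLES = {
    "dna": {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"},
}


class SimpleSeq:
    """Minimal sequence object: raw string data plus an alphabet label."""

    def __init__(self, data, alphabet):
        self.data = data
        self.alphabet = alphabet


def complement(seq):
    """Return a new SimpleSeq with each letter complemented, same alphabet."""
    # This dictionary lookup stands in for the manager's "complement_table".
    table = COMPLEMENT_TABLES[seq.alphabet]
    new_data = [table[c] for c in seq.data]
    return SimpleSeq("".join(new_data), seq.alphabet)
```

The point of the design is that the caller hands in a sequence object and gets
one back, so the alphabet travels with the data instead of being lost in a
bare string.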

If I weren't trying to get things done for BOSC, I would fix things now :(

Second, there's no need to do the map(None, seq) since a string is a
sequence-like object.  That is,

def spam(c):
  print "Character", repr(c)
  return c
map(spam, "Andrew")

prints

Character 'A'
Character 'n'
Character 'd'
Character 'r'
Character 'e'
Character 'w'
['A', 'n', 'd', 'r', 'e', 'w']

Also, doing the map(lambda x: IUPACData.ambiguous_dna_complement[x], ...)
is slower than
  x = []
  for c in seq:
    x.append(IUPACData.ambiguous_dna_complement[c])
  return string.join(x, '')

because the lambda adds function call overhead for every character.  Also,
a loop is easier for most people to understand.
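
A sketch for checking the speed claim on your own interpreter, written in
Python 3 syntax.  The four-letter COMPLEMENT table and both function names are
made up for the example; the timings depend on the Python version and machine,
so no result is shown here.

```python
import timeit

# Small local table used only for this comparison (not the IUPACData one).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_map(seq):
    """Complement via map() and a lambda: one extra call per character."""
    return "".join(map(lambda x: COMPLEMENT[x], seq))


def complement_loop(seq):
    """Complement via an explicit loop and list.append()."""
    out = []
    for c in seq:
        out.append(COMPLEMENT[c])
    return "".join(out)


if __name__ == "__main__":
    seq = "ACGT" * 2500
    for fn in (complement_map, complement_loop):
        seconds = timeit.timeit(lambda: fn(seq), number=200)
        print(fn.__name__, round(seconds, 4))
```

Both functions return the same string, so the only thing being compared is
the per-character overhead.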

>    def reverse(self, seq):
>        r = map(None, seq)
>        r.reverse()
>        return string.join(r,'')

instead of "r = map(None, seq)" try "r = list(seq)"
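
A minimal sketch of that version as a plain function in Python 3 syntax,
where map(None, seq) is no longer accepted anyway:

```python
def reverse(seq):
    """Return seq with its characters in reverse order."""
    r = list(seq)      # one list element per character
    r.reverse()        # reverses the list in place
    return "".join(r)
```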

>    def antiparallel(self, seq):
>        s = self.complement(seq)
>        s = self.reverse(s)
>        return s

If you are interested in performance, you could repeat the code for
complement, except adding a ".reverse()" before the string.join.  This
would prevent the extra conversion from list -> string -> list.
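
A sketch of that single-pass version in Python 3 syntax.  The small COMPLEMENT
table and the function name are chosen for the example and stand in for
IUPACData.ambiguous_dna_complement:

```python
# Stand-in table for the example; covers only the unambiguous letters and N.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Complement each letter, then reverse the list before joining once."""
    out = [COMPLEMENT[c] for c in seq]
    out.reverse()      # still a list here, so no string -> list round trip
    return "".join(out)
```

The list is built once, reversed in place, and joined once, which is the
saving described above.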

Is it usually called "antiparallel"?  I'm used to "rc" or
"reverse_complement".  I believe bioperl calls it "rc", so for
consistency that is what I would lean towards, except that it's too
short a name for my preferences.

                    Andrew





