[Biopython-dev] next release closer (?)

Andrew Dalke dalke at acm.org
Thu Nov 30 00:30:56 EST 2000

>> C:\biopython-0.90-d03\Tests>python test_prosite.py
>> Patterns: 'A.' 'A' '(A)'
>> Traceback (most recent call last):
>>   File "test_prosite.py", line 88, in ?
>>     m = p.search(Seq.Seq(x))
>>   File "c:\biopyt~1.90-\Bio\Prosite\Pattern.py", line 168, in search
>>     m = self.grouped_re.search(buffer(seq.data), pos, endpos)
>> TypeError: an integer is required
>>                                      Cayte
>  It's OK with the latest Pattern.py

I checked in the CVS logs since I wanted to ensure that it was a proper
code fix and not some side effect of perhaps another bug.  Looks like
Brad fixed that on 2000/09/27 with the following:
<         m = self.grouped_re.search(buffer(seq.data), pos, endpos)
>         if endpos:
>             m = self.grouped_re.search(buffer(seq.data), pos, endpos)
>         else:
>             m = self.grouped_re.search(buffer(seq.data), pos)

<         m = self.grouped_re.match(buffer(seq.data), pos, endpos)
>         if endpos:
>             m = self.grouped_re.match(buffer(seq.data), pos, endpos)
>         else:
>             m = self.grouped_re.match(buffer(seq.data), pos)

This would indeed have caused the problem you identified, and updating
to the newer version properly fixed it.

The root cause of the problem was a difference between Python 1.5.2's
re module and 2.0's sre.  In the first module, the "search" method is
defined in Python as:

  def search(self, string, pos=0, endpos=None):

In the second, it's defined in C as:

    int start = 0;
    int end = INT_MAX;
    if (!PyArg_ParseTupleAndKeywords(args, kw, "O|ii:search", kwlist,
                                     &string, &start, &end))

which, translated into Python, is equivalent to:

  def search(self, string, pos=0, endpos=sys.maxint):

There's little anyone could have done to guard against this change in
the underlying Python API.
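
To see the failure in isolation, here is a minimal sketch that forwards
a None endpos to a compiled pattern, the way the pre-fix code did.  It
assumes a current Python 3 interpreter, where compiled patterns are
still implemented in C and still require integers for pos and endpos.
The function name and the test string are made up for illustration,
and the buffer(seq.data) wrapper from Pattern.py is left out.

```python
import re

pattern = re.compile("A.")


def search_passing_none(text, pos=0, endpos=None):
    # Mirrors the pre-fix call: endpos is forwarded even when it is None.
    return pattern.search(text, pos, endpos)


try:
    search_passing_none("GATTACA")
    outcome = "matched"
except TypeError:
    # The C implementation rejects None where it expects an integer.
    outcome = "TypeError"

print(outcome)
```

A Python-level definition with endpos=None never hits this, because it
can check for None itself before slicing or delegating.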

Also, BTW, when we switch to Python 2.0, I suggest changing
Pattern.py's Prosite.search so that endpos defaults to sys.maxint
instead of the None it does now.  This keeps it compatible with the
Python API and removes the if-branches from the code.  I don't like
branches, since they are harder to test fully.
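
A rough sketch of what the branch-free version could look like follows.
PatternSketch is a hypothetical stand-in, not the actual class in
Pattern.py, and it takes a plain string rather than a Seq object.  It
is written for Python 3, so it uses sys.maxsize, the modern counterpart
of sys.maxint.

```python
import re
import sys


class PatternSketch:
    """Hypothetical stand-in for the Prosite pattern class."""

    def __init__(self, regex):
        self.grouped_re = re.compile(regex)

    def search(self, text, pos=0, endpos=sys.maxsize):
        # endpos is always an integer, so it can be passed straight
        # through without an if/else on None.
        return self.grouped_re.search(text, pos, endpos)


p = PatternSketch("(A.)")
m_default = p.search("GATTACA")        # whole string is searched
m_bounded = p.search("GATTACA", 0, 1)  # only "G" is searched
```

With the default, the call behaves as if no endpos were given; with an
explicit endpos, the same single code path handles the bound.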

