[Biopython-dev] Giving the SeqRecord a length? Evaluating it as a boolean

Peter biopython at maubp.freeserve.co.uk
Tue Jun 10 11:37:42 UTC 2008


Something we've discussed before is making the SeqRecord more like a
Seq object, perhaps even subclassing it.  I've got a patch on Bug 2507
to make some small steps in this direction - accessing elements of the
sequence by indexing the SeqRecord, i.e. letter = my_seq_record[5], or
iterating over the letters in a SeqRecord's sequence.
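
To make the idea concrete, here is a minimal sketch of indexing and
iteration being delegated to the underlying sequence.  ToyRecord is a
hypothetical stand-in using a plain string for the sequence; it is not
the SeqRecord class itself and not necessarily what the patch on Bug
2507 does.

```python
class ToyRecord(object):
    """Hypothetical stand-in for a SeqRecord (illustration only)."""

    def __init__(self, seq, id="<unknown id>"):
        self.seq = seq  # a plain string here; assumed, for simplicity
        self.id = id

    def __getitem__(self, index):
        # Delegate indexing to the underlying sequence.
        return self.seq[index]

    def __iter__(self):
        # Iterate over the letters of the underlying sequence.
        return iter(self.seq)


my_seq_record = ToyRecord("ACGTACGT", id="example")
letter = my_seq_record[5]
letters = [x for x in my_seq_record]
```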

http://bugzilla.open-bio.org/show_bug.cgi?id=2507

In addition, I would like to give the SeqRecord a length, allowing
len(my_seq_record) rather than len(my_seq_record.seq).  However, this
has a side effect on the evaluation of a SeqRecord as a boolean.
Before, any SeqRecord evaluated as True, but if we add the __len__
method then any SeqRecord with a zero length sequence will evaluate as
False (Python falls back on __len__ when no __nonzero__ is defined).
This is a real issue; for example, you can have GenBank files without
a sequence (see our unit test cases).  One case where this matters is
if you are using an iterator via the .next() method and had been
checking for a returned None by using "if record:" (something some of
the older unit tests were doing).  You would have to start using
"if record is not None:" instead.
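
The side effect can be reproduced with two toy classes.  This is only a
sketch: RecordWithoutLen and RecordWithLen are hypothetical names, and
a plain empty string stands in for the missing sequence.

```python
class RecordWithoutLen(object):
    """Hypothetical record with no __len__: instances are always true."""

    def __init__(self, seq):
        self.seq = seq


class RecordWithLen(RecordWithoutLen):
    """Same record, but with the proposed __len__ added."""

    def __len__(self):
        return len(self.seq)


old_style = RecordWithoutLen("")  # e.g. a GenBank entry with no sequence
new_style = RecordWithLen("")

# With no __len__ (and no __nonzero__/__bool__), the object is true.
old_truth = bool(old_style)

# With __len__ returning 0, "if record:" now treats the record as false,
# so it can no longer be told apart from None by a truth test.
new_truth = bool(new_style)

# The explicit comparison still works in both cases.
old_explicit = old_style is not None
new_explicit = new_style is not None
```

So a loop written as "if record:" would silently skip (or stop at) a
record with an empty sequence, while "if record is not None:" behaves
the same with or without __len__.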

If the old behaviour is desirable (evaluating a SeqRecord as a boolean
is always True), we could implement a __nonzero__ method to preserve
it, see: http://docs.python.org/ref/customization.html
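
A rough sketch of that option, again with a hypothetical toy class
rather than the real SeqRecord.  Note that __nonzero__ is the Python 2
name of the hook; Python 3 calls it __bool__, so the sketch defines
both to run on either.

```python
class AlwaysTrueRecord(object):
    """Hypothetical record with a length that is still always true."""

    def __init__(self, seq):
        self.seq = seq  # plain string used as a stand-in sequence

    def __len__(self):
        return len(self.seq)

    def __bool__(self):
        # Python 3 hook: takes priority over __len__ in truth tests.
        return True

    # Python 2 looks for __nonzero__ instead of __bool__.
    __nonzero__ = __bool__


empty_record = AlwaysTrueRecord("")
length = len(empty_record)
truth = bool(empty_record)
```

With this in place len() reports zero for an empty sequence, while
existing "if record:" checks keep behaving as before.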

What do people think?  Would adding a __len__ method to the SeqRecord
cause trouble?

Peter


