[Biopython-dev] [Biopython] Subclassing Seq and SeqRecord

Peter biopython at maubp.freeserve.co.uk
Tue Nov 24 16:30:21 UTC 2009


On Tue, Nov 24, 2009 at 4:17 PM, Jose Blanca <jblanca at btc.upv.es> wrote:
> On Tuesday 24 November 2009 16:52:25 Peter wrote:
>> Thinking about use-cases, sometimes a subclass will want the
>> methods to return Seq objects, sometimes the same class.
>>
>> The UnknownSeq too sometimes can return another UnknownSeq,
>> but must often return a Seq object.
>
> I've been thinking about that and I don't think it's a problem. If the
> subclass wants to return the parent class, it can choose to do so. I'm just
> proposing to change the behaviour of the parent class.

Yes - but it means any existing subclasses will need updating
(fairly easy for those included with Biopython) which could be
a big problem for end user scripts (especially if anyone wants
to target old and new versions of Biopython).

>> The BioSQL DBSeq on the other hand always returns a Seq
>> object for all its methods. The fact that the Seq __add__ and
>> __radd__ use __class__ was the cause of a bug in that adding
>> DBSeq objects didn't work.
>
> I hadn't realized that problem existed. Was that a bug in the BioSQL
> project that could be solved, or a design problem related to my proposal?

It was just a bug in Biopython's BioSQL wrappers, fixed by adding
explicit __add__ and __radd__ methods to the DBSeq class since
it couldn't safely use the default methods of the Seq class. Your
proposal would require further similar changes to the DBSeq class
to override *all* the Seq returning methods, so that each one
returns a Seq object rather than attempting to create a DBSeq
object with the wrong __init__ arguments.
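
To illustrate the failure mode, here is a minimal sketch using stand-in
classes rather than the real Seq and DBSeq. The names SimpleSeq, LazySeq
and FixedLazySeq, and their constructor arguments, are hypothetical. The
parent builds its result via self.__class__, which breaks a subclass whose
__init__ takes different arguments unless the subclass overrides the method:

```python
class SimpleSeq:
    """Hypothetical stand-in for a Seq-like parent class."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        return self._data

    def __add__(self, other):
        # The parent builds the result using the caller's own class.
        return self.__class__(str(self) + str(other))

    def __radd__(self, other):
        return self.__class__(str(other) + str(self))


class LazySeq(SimpleSeq):
    """Hypothetical database-backed subclass with a different __init__."""

    def __init__(self, store, key):
        self._store = store
        self._key = key

    def __str__(self):
        # The sequence is looked up on demand, not held as a string.
        return self._store[self._key]


class FixedLazySeq(LazySeq):
    """Same subclass, but addition explicitly returns the parent class."""

    def __add__(self, other):
        return SimpleSeq(str(self) + str(other))

    def __radd__(self, other):
        return SimpleSeq(str(other) + str(self))


store = {"a": "ACGT", "b": "TTGA"}

# Inherited __add__ calls LazySeq(<one string>), which does not match
# LazySeq.__init__(store, key), so the addition raises TypeError.
try:
    LazySeq(store, "a") + LazySeq(store, "b")
    inherited_add_failed = False
except TypeError:
    inherited_add_failed = True

# With the explicit overrides, addition works and yields the parent class.
fixed = FixedLazySeq(store, "a") + FixedLazySeq(store, "b")
```

In this sketch only the two addition methods need overriding. If every
method of the parent used self.__class__, a subclass like LazySeq would
need one override per method.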

The point is that while your proposed change will make some tasks
easier (e.g. writing an extended Seq subclass that adds a new
method or changes an existing method), it will make other tasks
much harder (e.g. the DBSeq class).
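
The "easier" side of that trade-off can be sketched the same way. The
classes below are hypothetical stand-ins, not Biopython code: one parent
hard-codes its own class as the return type, the other uses self.__class__
as proposed. A subclass that only adds a method (here an assumed gc_count)
keeps that method on results only in the second case:

```python
class FixedReturnSeq:
    """Hypothetical parent whose methods always return the parent class."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        return self._data

    def upper(self):
        return FixedReturnSeq(self._data.upper())


class SameClassSeq:
    """Hypothetical parent whose methods return the caller's own class."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        return self._data

    def upper(self):
        return self.__class__(self._data.upper())


class GcMixin:
    """Adds one extra method; the constructor is left unchanged."""

    def gc_count(self):
        return sum(self._data.count(base) for base in "GCgc")


class GcSeqA(GcMixin, FixedReturnSeq):
    pass


class GcSeqB(GcMixin, SameClassSeq):
    pass


a = GcSeqA("acgt").upper()  # plain FixedReturnSeq, gc_count is lost
b = GcSeqB("acgt").upper()  # still a GcSeqB, gc_count is available
```

This only works because GcSeqB keeps the parent's __init__ signature. A
subclass with a different constructor would fail instead.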

Peter


