[Biopython-dev] [Bug 2639] SeqRecord.init doesn't check for arguments for their types

bugzilla-daemon at portal.open-bio.org
Mon Nov 10 08:58:52 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2639


dalloliogm at gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SeqRecord.init doesn't check|SeqRecord.init doesn't check
                   |for arguments to their types|for arguments for their
                   |                            |types




------- Comment #5 from dalloliogm at gmail.com  2008-11-10 03:58 EST -------
(In reply to comment #4)
> (In reply to comment #3)
> > Created an attachment (id=1041)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1041&action=view)
> > add a check for the seq argument in seqrecord, to be a Seq object and not None
> >
> > This patch adds a check for the seq argument in SeqRecord.
> > If seq is None (by default), it raises a ValueError Exception.
> > If it is a Seq object, it saves it as self.seq.
> > If it is another kind of object (string, list, integer), it is converted to a
> > string, and then used to instantiate a seq object.
> 
> I was deliberately not checking the seq argument. 

OK, understood. I hadn't thought of these cases.
However, not having a Seq causes errors that are difficult to understand in
other functions that use SeqRecord.
For example, if you do:

>>> from Bio.SeqRecord import SeqRecord
>>> a = SeqRecord(id = '1')
>>> a.format('fasta')

you get the error:
<type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'tostring'

This could scare a Biopython newbie; an exception like 'error - current
SeqRecord object doesn't have a Seq' would be better.
What do you think about creating a 'NullSeq' object, which represents a Seq with
no value, and using it as the default for SeqRecord?
Later we could modify other functions like .format and Seq.translate to
intercept these objects and return the right error message.
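
To make the 'NullSeq' idea concrete, here is a minimal sketch. It does not use Biopython itself: NullSeq, MissingSequenceError and the stand-in Record class are all hypothetical names invented for this illustration, and the real SeqRecord.format does much more than this.

```python
class MissingSequenceError(ValueError):
    """Hypothetical error raised when a record has no sequence."""


class NullSeq:
    """Hypothetical placeholder meaning 'this record has no sequence'."""

    def _missing(self, *args, **kwargs):
        raise MissingSequenceError(
            "current SeqRecord object doesn't have a Seq")

    # Any sequence operation on the placeholder gives the same clear error.
    tostring = translate = complement = _missing

    def __len__(self):
        return 0

    def __repr__(self):
        return "NullSeq()"


class Record:
    """Stand-in for SeqRecord, reduced to what the example needs."""

    def __init__(self, seq=None, id="<unknown id>"):
        # Use the placeholder instead of leaving self.seq as None.
        self.seq = NullSeq() if seq is None else seq
        self.id = id

    def format(self, format_name):
        if format_name != "fasta":
            raise ValueError("only 'fasta' is sketched here")
        return ">%s\n%s\n" % (self.id, self.seq.tostring())
```

With this, Record(id='1').format('fasta') fails with "current SeqRecord object doesn't have a Seq" instead of an AttributeError on NoneType. The cost is that code which currently tests for "seq is None" would have to learn about the placeholder.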


> There are several reasonable
> use cases:
> 
> * a Seq object (normal) or a subclass of it.
> * a MutableSeq object (seems reasonable, note this is not a subclass of Seq)
> * None (seems a good way to handle sequence records where we don't know the
> sequence - for example some GenBank files).
> * a user defined sequence object which implements the Seq API but does not
> subclass Seq or MutableSeq (this is more difficult to check).
> 
> > I thought that someone could use an integer (e.g.: 010100010101101) as a
> > sequence, and in this case, the integer is first converted to a string
> > (otherwise Seq() would return an error).
> 
> Note that if someone did want to use some weird numerical sequence, then the
> SeqRecord object should NOT be trying to do anything special (guessing what is
> intended). The user should create a suitable Seq object themselves (ideally
> with a numerical alphabet object).  Explicit rather than implicit (Zen of
> python).
> 
> --
> 
> Note that I'm not 100% happy with the type checking we've just added.  See
> "duck-typing" and interfaces versus types,
> http://www.python.org/doc/2.5.2/tut/node18.html#l2h-46
> 
> The checks I've added shouldn't be too constraining - but maybe they should use
> interface checking instead (or just revert back to no checking).
> 
> Any comments from other people?  This should be being CC'd to the dev mailing
> list.
> 
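
As an illustration of the "interface checking" alternative mentioned in the quoted comment, a duck-typing check could look roughly like the sketch below. The function name check_seq_argument and the choice of required methods are assumptions made for this example, not what was committed to Biopython.

```python
def check_seq_argument(seq, required=("__len__", "__getitem__")):
    """Accept None, or any object that looks sequence-like.

    Hypothetical helper: it checks for methods (the interface)
    rather than for a particular class such as Seq or MutableSeq.
    """
    if seq is None:
        # Unknown sequence, e.g. some GenBank records.
        return seq
    missing = [name for name in required if not hasattr(seq, name)]
    if missing:
        raise TypeError(
            "seq argument lacks sequence methods: %s" % ", ".join(missing))
    return seq
```

A check like this accepts Seq, MutableSeq and user-defined sequence classes alike, and rejects an integer. Note that it would also accept a plain string or list, so it is looser than a type check.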

