[Bioperl-l] RE: SeqIO fails on masked sequences
Hilmar Lapp
hlapp at gmx.net
Mon Jan 10 03:13:35 EST 2005
On Sunday, January 9, 2005, at 05:05 PM, Wes Barris wrote:
>>> Hilmar Lapp wrote:
>>>
>>>> You should not require by default that all sequences in one file be of
>>>> the same type (alphabet). We never have required this, nor documented
>>>> that it is a (not enforced) requirement, and so there may be people out
>>>> there relying on this 'feature'.
>>>
>>> Mixing both DNA and protein sequences in one file and then attempting
>>> to process it seems like kind of a bizarre thing to want to do. If
>>> the alphabet is explicitly specified, isn't there a way to make that
>>> take precedence?
>> Why are you then able to set the alphabet of a SeqIO object if
>> whenever you call next_seq() it tries to guess the alphabet of the
>> sequence anyway? It seems more logical to me, that the user can
>> specify the alphabet without worrying about bioperl guessing it, and
>> getting it wrong, or not setting it at all.
>
> I am guessing that you meant to direct this question to Hilmar because
> I agree with you. If one specifies the alphabet, bioperl should not
> subsequently try to guess it.
Right, that's what I agree with too. If an alphabet set for the stream
gets reset to undef after every sequence then I'd call that a bug.

My point was, if the user doesn't specify the alphabet, then don't make
assumptions that you don't absolutely have to make. You had suggested
to guess the alphabet from the first sequence in this case and then
assume every subsequent sequence in that stream will have that same
alphabet. That's what I think is not a good idea and not necessary
either. If the user doesn't preset the alphabet, just keep on guessing
for every new sequence.
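
To make the policy concrete, here is a rough sketch. It is written in
Python only to keep it self-contained; it is not the Bio::SeqIO code, and
the names guess_alphabet and read_seqs as well as the 85% threshold are
made up for illustration. The guessing heuristic is deliberately naive.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical heuristic: letters counted as "nucleotide-like".
# Note that X (a common masking character) is deliberately not included.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_alphabet(seq: str) -> str:
    """Naively guess 'dna' or 'protein' from the letter composition."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        return "unknown"
    fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    # Assumed cut-off, chosen only for this illustration.
    return "dna" if fraction >= 0.85 else "protein"


def read_seqs(
    records: Iterable[Tuple[str, str]],
    alphabet: Optional[str] = None,
) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, sequence, alphabet) for each record in the stream.

    If the caller preset an alphabet, it is used for every record and is
    never reset. Otherwise the alphabet is guessed separately for each
    record; the guess for the first record is not carried over.
    """
    for seq_id, seq in records:
        if alphabet is not None:
            yield seq_id, seq, alphabet
        else:
            yield seq_id, seq, guess_alphabet(seq)


records = [
    ("dna1", "ACGTACGTAC"),
    ("prot1", "MKVLAAGIVG"),
    ("masked", "XXXXXXXXACGT"),
]

guessed = [alpha for _, _, alpha in read_seqs(records)]
preset = [alpha for _, _, alpha in read_seqs(records, alphabet="dna")]
```

Without a preset, each record gets its own guess, so a mixed file still
works, and a heavily masked record may be guessed wrong by a heuristic like
this one. With alphabet="dna" no guessing happens at all.
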
Mixing alphabets is indeed bizarre but people who do bizarre things are
everywhere.
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------