[Bioperl-l] GCG MSF format alignments
Hilmar Lapp
lapp@gnf.org
Fri, 09 Mar 2001 18:42:27 -0800
Peter Schattner wrote:
>
> > Regarding the ~ characters ... my primitive understanding suggests this
> > throws an exception in Bio::PrimarySeq::seq with a message that '[some
> > sequence ~~~] does not look healthy'.
>
> I've had this problem too in other contexts. Bio::PrimarySeq is
> currently rather strict about what it allows in a sequence. I would
> prefer to see it issue a warning rather than die when it comes across a
> bad character (what do you think about that, Ewan? Hilmar? Jason?)

I actually like the exception being thrown there, even though I've bumped
into it several times, too. Every time, however, it indicated a bug.

I'd prefer to solve this by creating a specialized class that
specifically allows the additional characters used in, e.g.,
alignments. This should go under Bio::Seq::<AlignedSeq>.pm
(replace <...> with whatever you like better). Ideally, Bio::PrimarySeq
(or maybe better Bio::PrimarySeqI?) would have a method validate_seq()
that returns a boolean, and seq() would throw an exception depending on
its return value. A deriving implementation would then only have to
override validate_seq(). (As a matter of fact, there is no
validate_seq() right now :|)
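
The design proposed above can be sketched roughly as follows. This is an
illustration in Python rather than Perl, and it is not existing BioPerl
code: the class names, the allowed character sets, and the error wording
are all assumptions made for the example.

```python
import re


class PrimarySeq:
    """Hypothetical stand-in for a strict sequence class (not BioPerl's API)."""

    def __init__(self, seq=""):
        self._seq = ""
        self.seq = seq  # assignment goes through the validating setter

    def validate_seq(self, value):
        """Return True if value is acceptable. Assumption: letters only."""
        return re.fullmatch(r"[A-Za-z]*", value) is not None

    @property
    def seq(self):
        return self._seq

    @seq.setter
    def seq(self, value):
        # The setter does not inspect characters itself; it only asks
        # validate_seq(), so subclasses can change the rule in one place.
        if not self.validate_seq(value):
            raise ValueError(f"sequence [{value}] does not look healthy")
        self._seq = value


class AlignedSeq(PrimarySeq):
    """Hypothetical subclass that also accepts alignment gap characters."""

    def validate_seq(self, value):
        # Assumption: '~', '.' and '-' are the extra characters to allow.
        return re.fullmatch(r"[A-Za-z~.\-]*", value) is not None


# The strict class still rejects gap characters ...
try:
    PrimarySeq("~~~ACGT")
except ValueError as err:
    print("rejected:", err)

# ... while the specialized class accepts them.
print("accepted:", AlignedSeq("~~~ACGT--AC..").seq)
```

The point of the sketch is that the strict default stays in place, so a
bad character in an ordinary sequence still surfaces as an exception,
and only the class meant for alignments relaxes the check.
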
Does this make sense to anyone?
Hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp@gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------