[Bioperl-l] GCG MSF format alignments

Hilmar Lapp lapp@gnf.org
Fri, 09 Mar 2001 18:42:27 -0800


Peter Schattner wrote:
> 
> > Regarding the ~ characters ... my primitive understanding suggests this
> > throws an exception in Bio::PrimarySeq::seq with a message that '[some
> > sequence ~~~] does not look healthy'.
> 
> I've had this problem too in other contexts.  Bio::PrimarySeq is
> currently rather strict about what it allows in a sequence.  I would
> prefer to see it issue a warning rather than die when it comes across a
> bad character (what do you think about that, Ewan? Hilmar? Jason?)

I actually like the exception being thrown there. I have bumped into
it several times too, but every time it indicated a bug.

I'd actually prefer this being solved by creating a specialized class
that specifically allows for additional characters as they are used in
e.g. alignments. This should go under Bio::Seq::<AlignedSeq>.pm
(replace <...> with whatever you like better). Ideally Bio::PrimarySeq
(or maybe better Bio::PrimarySeqI?) would have a method validate_seq()
that returns a boolean, and seq() would throw an exception whenever it
returns false. Then, in your deriving implementation, you'd only have
to override validate_seq(). (As a matter of fact, there is no
validate_seq() right now :| )
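
To make the idea a bit more concrete, here is a rough sketch of what I
have in mind. It is only an illustration: Sketch::PrimarySeq and
Sketch::AlignedSeq are made-up stand-ins and not existing BioPerl
classes, the allowed character sets are my assumption, and the error
text only approximates the real message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::PrimarySeq (not the real BioPerl code).
package Sketch::PrimarySeq;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->seq($args{seq}) if defined $args{seq};
    return $self;
}

# Returns true if the string is an acceptable sequence.
# Assumed rule for the strict base class: letters and '*' only.
sub validate_seq {
    my ($self, $str) = @_;
    return $str =~ /\A[A-Za-z*]+\z/ ? 1 : 0;
}

# Getter/setter: the setter delegates the check to validate_seq(),
# so subclasses never need to touch seq() itself.
sub seq {
    my ($self, $value) = @_;
    if (defined $value) {
        $self->validate_seq($value)
            or die "[$value] does not look healthy\n";
        $self->{seq} = $value;
    }
    return $self->{seq};
}

# Hypothetical alignment-aware subclass: overrides only the validation.
package Sketch::AlignedSeq;
our @ISA = ('Sketch::PrimarySeq');

# Additionally allows the gap/padding characters '-', '.' and '~'.
sub validate_seq {
    my ($self, $str) = @_;
    return $str =~ /\A[A-Za-z*.~-]+\z/ ? 1 : 0;
}

package main;

# The aligned class accepts MSF-style padding ...
my $aligned = Sketch::AlignedSeq->new(seq => '~~~ACGT--ACGT...');
print "aligned accepted: ", $aligned->seq, "\n";

# ... while the strict base class still throws on it.
my $plain = eval { Sketch::PrimarySeq->new(seq => '~~~ACGT') };
print "plain rejected: $@" unless defined $plain;
```

The point of splitting it this way is that the strict behaviour stays
the default, and only code that explicitly asks for an aligned
sequence gets the relaxed character set.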

Does this make sense to anyone?

	Hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp@gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------