New Bio::Seq and Bio::Seq::Parse (.025 BETA)
Georg Fuellen
fuellen@dali.Mathematik.Uni-Bielefeld.DE
Mon, 17 Mar 1997 15:04:40 +0000 (GMT)
Hi,
> Hi Chris,
>
> > The location is: http://www.ayf.org/~c_raffi/bioperl/top.html
>
> Nifty logo! Only, I'm confused about what the code is supposed [sic] to
> be doing. (Also, the name of the project is 'bioperl', not 'bio::perl'.
> That would imply a bio/perl.pm module, whereas bioperl doesn't imply
> anything at all :)
:-> For me, the logo somehow reinforces the idea that "Perl is obfuscated",
which is not the message we'd like to get across, I think.
I'd suggest keeping the logo on Dag's page and postponing the issue;
finalizing a logo costs time that is better spent on the code right
now. In April, I hope things will be different :-)
> > I wrote a crude Parse.pm that serves as an interface to ReadSeq and made
> > the appropriate changes to Seq.pm.
>
> Great!
>
> Had a quick look at it; it seems quite reasonable and the changes in
> Seq.pm are also appropriate. One comment is that it would be much more
> efficient to pass around references than potentially huge strings.
>
> However, these modifications don't deal with the bigger issue of what
> to do about the strings v. files problem, that I mention in the 5th
> paragraph of:
>
> http://www.hrz.uni-bielefeld.de/mailinglists/BCD/vsns-bcd-perl/9702/0003.html
>
> Is the parse function in Bio::Seq supposed to take 1 or 2 parameters (as
> documented) or 4 params (as coded). The problem arises because of some of
> my inefficient legacy design at the very outset, but I think there's a
> solution.
>
> -=-
>
> A few other nits from a _very_ cursory look-through
>
> @SeqForm appears never to be created
>
> I would change [@%]SeqForm to [@%]SeqFmt, or even [@%]seq_fmt (to be
> consistent with the rest of the naming).
I think then we should have seq_ffmt.
Then again, doesn't SeqForm hint at the fact that these variables are
very special ?
> The names of formats in SeqForm, etc., should be all lower-case for the
> reasons discussed earlier on this list. (Because is it FastA or fasta or
> Fasta? GenBank, Genbank, or genbank? If it is always lowercase, there's
> no ambiguity.)
>
> There's no 'valid' field to indicate whether or not the object is indeed
> valid for any operation. For example, if setseq is used to set an invalid
> sequence.
What if we don't allow this to happen ?
If we keep the object valid all the time ?
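
To make the idea concrete, here is a minimal sketch of a setter that refuses bad input, so the object never enters an invalid state and no separate 'valid' field is needed. The package name My::Seq, the accessor seq, and the accepted alphabet (IUPAC nucleotide codes plus '-') are assumptions for illustration only; this is not the actual Bio::Seq code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence class; not the real Bio::Seq.
package My::Seq;

sub new {
    my ($class, $seq) = @_;
    my $self = bless { seq => '' }, $class;
    if (defined $seq) {
        # Refuse to construct an object around an invalid sequence.
        $self->setseq($seq) or return;
    }
    return $self;
}

# Assumed alphabet: IUPAC nucleotide codes plus the gap character '-'.
sub setseq {
    my ($self, $seq) = @_;
    # On invalid input, leave the stored sequence untouched and report failure.
    return unless defined $seq && $seq =~ /\A[ACGTURYKMSWBDHVN-]+\z/i;
    $self->{seq} = $seq;
    return 1;
}

sub seq { $_[0]{seq} }

package main;

my $s = My::Seq->new('ACGT');
$s->setseq('ACG!T') or print "rejected, still: ", $s->seq, "\n";
```

With this approach a failed setseq leaves the previous sequence in place, so every other method can assume a valid object.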
> to-do: more validity checking, such as in setseq
>
> A "_undef" parameter (or something like it) needs to be available to unset
> various options
>
> Functions which can return an invalid result (such as parse_bad) should
> return undef rather
You mean, rather than 0 ? I thought zero and the null string ("")
are interpreted as false, and returning 0 or "" seems the standard
convention, no ?
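
One case where the difference matters is a function for which 0 (or "") is a legitimate answer: a caller can then separate "failed" from "zero" only with defined. A small sketch of that case follows; the function count_gaps and its accepted character set are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count gap characters. A count of 0 is a valid answer,
# so 0 cannot double as the failure value here.
sub count_gaps {
    my ($seq) = @_;
    # A bare return yields undef in scalar context and an empty list in
    # list context, so the failure is false in both.
    return unless defined $seq && $seq =~ /\A[A-Za-z-]+\z/;
    my $n = () = $seq =~ /-/g;   # number of matches, via list assignment
    return $n;
}

for my $seq ('AC-G-T', 'ACGT', 'AC?GT') {
    my $n = count_gaps($seq);
    print defined($n) ? "$seq: $n gap(s)\n" : "$seq: invalid\n";
}
```

For functions that can never legitimately return 0 or "", a plain truth test on the result works just as well.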
best wishes,
georg