thoughts/questions on Seq.pm and Parse.pm

Chris Dagdigian cdagdigian@genetics.com
Mon, 17 Mar 1997 11:58:35 -0400


It's getting tough keeping track of the 'open' issues that should be
resolved; I've tried to distill most of them into the "ToDo" file in the
Seq.pm distribution. I'm limited both by time and by programming skill
(working on Seq.pm has been a trial by fire: learning by doing), which
limits the number of 'real' contributions I can make.

I've got some questions/comments about several of the issues, so here goes:

Parse.pm
--------
I wrote two basic methods needed to get things in Seq.pm working, without
thinking much about the overall interface scheme. Any
guidance/code/observations on method names, interface or implementation
would be appreciated.


Seq.pm
------

o One major interface change that needs to happen SOON is changing
Dna_to_Rna(), Rna_to_Dna() and translate() so that they return biosequence
objects instead of strings. I tried for a little while to do this, using
the Perl-OO-tutorial as a guide, but kept running into problems with
scoping. I'm also not sure if the "Right Way" involves returning an object
or a ref. to an object. I don't want to waste any more time on this if
the answer involves a tiny piece of code that is immediately obvious to
someone on this list. So, if someone knows the "Right Way" to do this,
please let me know!
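
In Perl an object is simply a blessed reference, so "returning an object"
and "returning a ref. to an object" come to the same thing. A minimal sketch
of a conversion method that returns a new object rather than a string
follows. The package name My::Seq and the id/seq/type fields are hypothetical
stand-ins, not the actual Seq.pm internals.

```perl
use strict;
use warnings;

package My::Seq;   # hypothetical stand-in for the real Seq.pm class

sub new {
    my ($class, %args) = @_;
    my $self = {
        id   => defined $args{id}   ? $args{id}   : 'No_Id_Given',
        seq  => defined $args{seq}  ? $args{seq}  : '',
        type => defined $args{type} ? $args{type} : 'Dna',
    };
    return bless $self, $class;   # bless returns the (now blessed) reference
}

sub seq  { return $_[0]{seq} }
sub type { return $_[0]{type} }

# Build and return a NEW object; the original is left untouched.
# ref($self) gives the caller's class, so subclasses get their own type back.
sub Dna_to_Rna {
    my ($self) = @_;
    (my $rna = $self->{seq}) =~ tr/Tt/Uu/;
    return ref($self)->new(id => $self->{id}, seq => $rna, type => 'Rna');
}

package main;

my $dna = My::Seq->new(id => 'test1', seq => 'ATGGCT');
my $rna = $dna->Dna_to_Rna;
print $rna->seq, "\n";   # prints AUGGCU
```

Because every lexical is declared with my() inside the method, nothing leaks
between calls, which may sidestep the scoping trouble described above.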

o Non-fatal use of Parse.pm if ReadSeq does not exist or is not configured
I wrapped some code in an eval{} block in Seq.pm that tries to politely
figure out whether Parse.pm is available; it checks for the presence of an
exported "OK" variable in Parse.pm. Is this the right approach? Seq.pm
should be able to run with or without Parse.pm without emitting any
obvious error messages.
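
One common way to make a module optional is to require it at run time inside
eval{} and remember whether that succeeded. The sketch below takes that
route instead of testing an exported "OK" variable; the helper name
try_load() is made up for illustration.

```perl
use strict;
use warnings;

# Try to load a module at run time. Returns 1 on success, 0 on failure,
# and never dies or prints anything itself.
sub try_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    local $@;                                 # do not clobber the caller's $@
    return eval { require $file; 1 } ? 1 : 0;
}

my $have_parse = try_load('Parse');

if ($have_parse) {
    print "Parse.pm loaded; file parsing enabled\n";
}
else {
    print "Parse.pm not available; file parsing disabled\n";
}
```

Methods that need Parse.pm can then check $have_parse and return quietly
(or carp once) when it is false.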


o Site-specific configuration issues.
Right now, Seq.pm does not have to be edited by users, but Parse.pm and the
test scripts do. I'm going to hit the POD docs for MakeMaker, etc. and try
to figure out how to set up a system where users edit a ".config" file or
somesuch and the resulting info is used to automatically tweak Parse.pm and
Seq.pm during the 'make' process. Again, any help/suggestions on this would
be appreciated.
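
A rough sketch of the config-file idea, independent of MakeMaker: read
"key = value" lines and substitute them into a template containing @KEY@
placeholders. The key name READSEQ, the path shown, and both helper names
are assumptions made up for this example; how the step would be hooked into
'make' is left open.

```perl
use strict;
use warnings;

# Parse "key = value" lines from a filehandle; '#' starts a comment.
sub read_config {
    my ($fh) = @_;
    my %conf;
    while (my $line = <$fh>) {
        $line =~ s/#.*//;   # strip comments
        next unless $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/;
        $conf{$1} = $2;
    }
    return \%conf;
}

# Replace each @KEY@ placeholder with its configured value.
# Unknown placeholders are left as they are.
sub fill_template {
    my ($template, $conf) = @_;
    $template =~ s/\@(\w+)\@/exists $conf->{$1} ? $conf->{$1} : "\@$1\@"/ge;
    return $template;
}

# An in-memory "file" keeps the example self-contained (hypothetical path).
my $config_text = "# site settings\nREADSEQ = /usr/local/bin/readseq\n";
open my $fh, '<', \$config_text or die "cannot open in-memory file: $!";
my $conf = read_config($fh);
close $fh;

print fill_template(q{my $READSEQ = '@READSEQ@';}, $conf), "\n";
```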


o Proposed validity markers
  - A marker that would be set to 'false' whenever Seq.pm makes a call to carp()
  - A marker to specify a valid/invalid biosequence object
Are these permutations of the same idea or two different things? I'm also
not sure how to implement either one.
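
One possible reading, sketched below, treats them as two separate flags:
"clean" records whether any warning has been issued for the object, while
"valid" describes the sequence data itself. All names here (My::Seq, _warn,
is_clean, is_valid, codon_count) are hypothetical, and the character check
is only a placeholder for whatever validity test Seq.pm would really use.

```perl
use strict;
use warnings;

package My::Seq;   # hypothetical stand-in for Seq.pm
use Carp;

sub new {
    my ($class, %args) = @_;
    my $self = bless { seq => '', clean => 1, valid => 1 }, $class;
    $self->set_seq($args{seq}) if defined $args{seq};
    return $self;
}

# All warnings go through here, so the "clean" marker is always updated.
sub _warn {
    my ($self, $msg) = @_;
    $self->{clean} = 0;
    carp $msg;
}

sub set_seq {
    my ($self, $seq) = @_;
    if ($seq =~ /[^A-Za-z*-]/) {   # placeholder validity test
        $self->{valid} = 0;
        $self->_warn("sequence contains non-sequence characters");
    }
    $self->{seq} = $seq;
    return $self;
}

# Warns about a partial trailing codon, but the object itself stays valid.
sub codon_count {
    my ($self) = @_;
    my $len = length $self->{seq};
    $self->_warn("length $len is not a multiple of 3") if $len % 3;
    return int($len / 3);
}

sub is_clean { $_[0]{clean} }
sub is_valid { $_[0]{valid} }

package main;

my $s = My::Seq->new(seq => 'ATGC');
$s->codon_count;   # carps about the length
print "valid=", $s->is_valid, " clean=", $s->is_clean, "\n";   # valid=1 clean=0
```

Under this reading an object can be valid yet no longer clean, which
suggests the two markers are not quite the same idea.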


o Default constructor ID
Steve commented that the default constructor ID should be changed from
"No_Id_Given" to "No_Id" plus a unique number. Assigning a number is easy
enough but how would you keep track of "unique" numbers assigned? Is there
a way to save state or remember these numbers each time new() is called? I
think I see the potential problems that objects with the same 'ID' field
could cause but I'm unsure how a 'unique' naming process would work.
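
One simple way to save state between calls to new() is a file-scoped
lexical counter in the module, as sketched below. The package name My::Seq
is a hypothetical stand-in. Note that the numbers are unique only within one
running program; they start over on the next run.

```perl
use strict;
use warnings;

package My::Seq;   # hypothetical stand-in for Seq.pm

# Lives as long as the program runs and is visible only in this file,
# so every call to new() sees the value left by the previous call.
my $next_id = 1;

sub new {
    my ($class, %args) = @_;
    my $id = defined $args{id} ? $args{id} : 'No_Id_' . $next_id++;
    return bless { id => $id }, $class;
}

sub id { $_[0]{id} }

package main;

my @seqs = (My::Seq->new, My::Seq->new, My::Seq->new(id => 'given'));
print join(', ', map { $_->id } @seqs), "\n";   # No_Id_1, No_Id_2, given
```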

o translate() treats ambiguity inconsistently
Steve mentioned this, but I want to be sure that I understand the problem:
it looks like the code deals with "N" unknown bases but does not deal
with any of the other IUPAC symbols for ambiguity. Is this what you were
pointing out, Steve?
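
If that is the problem, one way to treat all IUPAC symbols uniformly is to
expand each codon into every concrete codon it could stand for and emit an
amino acid only when all expansions agree. The sketch below does that with
the standard genetic code; it is not the current Seq.pm code, and the
choice of 'X' for an unresolved residue is an assumption.

```perl
use strict;
use warnings;

# Standard genetic code, built from the conventional TCAG-ordered string.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# IUPAC nucleotide codes and the bases each one stands for (U read as T).
my %iupac = (
    A => 'A',   C => 'C',   G => 'G',   T => 'T',   U => 'T',
    R => 'AG',  Y => 'CT',  S => 'CG',  W => 'AT',  K => 'GT', M => 'AC',
    B => 'CGT', D => 'AGT', H => 'ACT', V => 'ACG', N => 'ACGT',
);

# Translate one codon; return 'X' unless every possible expansion agrees.
sub translate_codon {
    my ($codon) = @_;
    my @expansions = ('');
    for my $sym (split //, uc $codon) {
        my $set = $iupac{$sym};
        return 'X' unless defined $set;   # not an IUPAC symbol
        my @next;
        for my $prefix (@expansions) {
            for my $base (split //, $set) {
                push @next, $prefix . $base;
            }
        }
        @expansions = @next;
    }
    my %seen;
    $seen{ $codon_table{$_} } = 1 for @expansions;
    my @aa = keys %seen;
    return @aa == 1 ? $aa[0] : 'X';
}

# Translate frame 1; any trailing partial codon is ignored.
sub translate {
    my ($seq) = @_;
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length($seq); $pos += 3) {
        $protein .= translate_codon(substr($seq, $pos, 3));
    }
    return $protein;
}

print translate('ATGGCNTAYRAT'), "\n";   # prints MAYX
```

Here GCN and TAY still resolve (to A and Y), while RAT could be either N or
D and so comes out as X.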



Sorry for the length!

Regards,
Chris Dagdigian
cdagdigian@genetics.com