[Biopython-dev] Restriction analysis package.

Brad Chapman chapmanb at uga.edu
Tue May 18 06:00:56 EDT 2004


Hi;

[...description of Rana package...]
[...http://sourceforge.net/projects/rana...]
> However, the code which deals with the restriction enzymes is more 
> mature (I would say beta). I tested the results it gives against the 
> restriction analysis facilities of EMBOSS for common vectors (pBR322, 
> pGEMs, ...) and it is ok. I will release this part under python license. 
> For the moment, I have removed the class which allows meta-analysis 
> (full restriction analysis, limited to Blunt, ...) as it has not been 
> tested with Biopython DNA objects.
> 
> The code itself does not follow exactly the Biopython convention for 
> coding (class methods are in lower case with underscore as in Python). 
> But, this was the convention for the whole Rana package. This can be 
> changed eventually.
> 
> I release a new package in the Rana project (ranaBiopython-0.1) which 
> contains the package to be included in Biopython. I have tested it 
> quickly and it seems to work fine.

Thanks for putting this together. The code looks very useful and I'd
definitely like to see it work towards being included in Biopython,
if that's what you'd like. A few comments on it:

1. First, if you'd like to include this in Biopython, you would
have to be willing to license the code under the Biopython license.
I see different references to the GPL and Python license within your
package. I'm not at all the type of person who argues about
licensing issues, but we just need to keep the Biopython
distribution under one license.

2. The way this is organized right now puts two different types of
functionality together -- building the enzyme dictionary by
downloading and parsing Rebase, and the actual enzyme dictionary
itself. For Biopython, the public functionality you'd want to expose
would be the enzyme dictionary and the useful functions you have
within that. The downloading and parsing work would be something
that you, or another developer, would do on a monthly or whatever
basis to keep the enzyme dictionary up to date within Biopython.
Thus I'd propose organizing the code like:

Bio/Restriction/__init__.py --> The current Restriction.py
Bio/Restriction/Restriction_Dictionary.py --> the dictionary
Bio/Restriction/_Update/ --> The Update, RanaConfig and 
RestrictionCompiler code to do the updates and regenerate the
dictionary.

ranacompiler.py should live somewhere like Scripts/restriction
and be run from there, instead of sitting in site-packages.

3. Going along with reorganizing the code base, I'd propose changing
the updating scripts a bit. Storing databases and things into
site-packages is generally not a good idea, since that is meant
for Python code, and also requires the user to mess around with
either running scripts as root or changing permissions -- more work
than is really necessary. What I'd do is store the Database and
Updates information into, say, the current directory where the user
runs the scripts. Additionally, the Restriction_Dictionary.py would
be generated there. Then, once the update scripts have finished
running, you have a new Restriction_Dictionary.py to copy over and
check into CVS.
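
To make point 3 a bit more concrete, here is a rough sketch of what
"generate the dictionary in the current directory" could look like.
This is only an illustration, not the Rana code: the function name,
the rest_dict variable name and the shape of the enzyme data are all
assumptions on my part.

```python
import os
import pprint


def write_restriction_dictionary(enzymes, out_dir=None,
                                 filename="Restriction_Dictionary.py"):
    """Write a plain dict of enzyme data as an importable Python module.

    Hypothetical helper: the name, the `rest_dict` variable and the
    layout of `enzymes` are assumptions made for illustration only.
    """
    # Default to the directory the user runs the script from, so nothing
    # is written into site-packages and no root permissions are needed.
    if out_dir is None:
        out_dir = os.getcwd()
    path = os.path.join(out_dir, filename)

    with open(path, "w") as handle:
        handle.write("# Generated file, do not edit by hand.\n")
        handle.write("rest_dict = ")
        # pprint gives a valid Python literal for plain dicts, lists,
        # strings and numbers, which keeps the generated file readable.
        handle.write(pprint.pformat(enzymes))
        handle.write("\n")
    return path
```

Calling something like write_restriction_dictionary(parsed_enzymes),
where parsed_enzymes is whatever the Rebase parsing step produces,
would leave a fresh Restriction_Dictionary.py next to the downloaded
files, ready to be copied into Bio/Restriction and checked in.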

Hopefully these make some sense. I really like the catalyse and
search functionality on the enzyme classes -- it's a nice interface
design and it would be great to have in Biopython.

Please do let me know what you think about the licensing and change
proposals and we can keep moving forward towards getting this in
Biopython. Thanks again for the work so far!

Brad


