[Biopython-dev] Developmental policies

Peter peter at maubp.freeserve.co.uk
Sat Jan 10 17:46:38 UTC 2009


On Sat, Jan 10, 2009 at 4:48 PM, Tiago Antão <tiagoantao at gmail.com> wrote:
> This whole discussion is very interesting. In fact, whatever are the
> conclusions I think they should be labeled "official policy" and put on
> the Wiki.

That sounds good.

> The biggest problem that I've faced is that, whenever I am doing
> something, I don't know the level of acceptability with other
> developers. I tend to put everything to discussion before I commit it
> and whenever I say something I might get completely different answers
> from time to time and from different people. The end result is that I
> defer from committing things because of issues that are raised in an
> ad-hoc fashion.

Asking before doing things is in general a good plan.  Sadly not
everyone will be free to respond at any one time - but I agree with
you that having more of the de facto policy written out explicitly
would help.

> There should be a page clarifying things like:
> 1. Are contributions that have a small target audience accepted?

Historically, yes, this has happened - although my impression is that
the bar was perhaps set too low.  I would say some things were
accepted without sufficient documentation and tests.  The problem with
small interest modules is that if the original developer moves on, in
the absence of any apparent users, the module gets abandoned.  This
seems to explain several of the smaller modules we've deprecated in
the last couple of years.

On the other hand, some things will start with a small target audience
that will grow.  If I was confident that the developer concerned would
stick around for several years and was prepared to deal with
documentation, unit tests and bug fixes then I would be much happier
about including something, even if it might have a relatively small
target audience initially.

> 2. Use of foreign libraries (e.g., SciPy)?

I think the current stance has been to try and minimise 3rd party
dependencies, other than the special case of Python wrappers for
command line tools.  This makes it much easier for beginners to install
and use Biopython, and lowering the barrier to entry is a good thing.

There are practical points here too.  In general, 3rd party
dependencies can be a pain (e.g. our Martel parsers broke when
mxTextTools changed their API between 2.0 and 3.0).  Similarly they
can restrict the distribution of Biopython (e.g. NumPy isn't yet
available on Windows for Python 2.6), and will also be a potential
road block for moving to Python 3.  As another example, a small part
of Bio.PDB uses flex in a parser, and again this makes building and
distributing it a real pain (so much so, that it's been commented out
by default).

However, run time only dependencies (like pure Python libraries and
command line tools) are not such an issue for packaging/distribution,
e.g. ReportLab (used in Bio.Graphics only).  If SciPy were to be used
by part of Bio.PopGen, and this didn't affect packaging/distribution
then this might be OK.
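
To illustrate what a run time only dependency could look like in
practice, here is a minimal sketch.  The helper names optional_import
and require_scipy are hypothetical (they are not existing Biopython
functions); the idea is just that the module still imports cleanly
when SciPy is absent, and only complains when a SciPy-based feature
is actually called.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical: a module that can use SciPy but must still import without it.
scipy = optional_import("scipy")


def require_scipy():
    """Raise a clear error only when a SciPy-based feature is actually used."""
    if scipy is None:
        raise ImportError("SciPy is required for this feature; please install it.")
    return scipy
```

Written this way, installing and importing the package does not need
SciPy at all, so packaging/distribution should be unaffected.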

> 3. Code management policies. Branches?  Adding new code? Breaking interfaces?

Biopython has historically worked from a stable trunk.  As a
consequence we try and avoid breaking interfaces, instead adopting a
gradual deprecation of an old interface when adding a new interface,
or adding enhancements in a backwards compatible manner.
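
As a rough sketch of the gradual deprecation approach, the old
interface can be kept as a thin wrapper that warns and then calls the
new one.  The names old_function and new_function are made up for
illustration, and this is only one way of doing it, using the
standard library warnings module.

```python
import warnings


def new_function(data):
    """Hypothetical replacement interface."""
    return sorted(data)


def old_function(data):
    """Hypothetical old interface, kept working while users migrate."""
    warnings.warn(
        "old_function is deprecated; please use new_function instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    return new_function(data)
```

Existing scripts keep working for a release or two, and users get a
message telling them what to switch to before the old name is removed.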

> 4. New developers

I think there is something written down about this already...

> 5. Legal issues

Try and avoid them?  What did you mean in particular?

> 6. Interop with non-free software

This is linked to the legal issues question.  Many of the tools we
link to like BLAST aren't open source, but are "free" as in cost.  I
don't think we have any examples of non-free software.

> 7. Code quality strategies. Code review? Testing?

Code review:
For new code in a specialist area, it can be difficult to get a
qualified second opinion on the approach, but existing developers can
at least comment on the coding style.  For existing code, my
impression is module owners have been trusted to make changes to
"their" code without review - and generally speaking this has worked
out OK.  Although if anyone spots someone making a change they disagree
with, then please do raise it.  I would hope any larger change had
some discussion beforehand - possibly via enhancement entries on
Bugzilla.

Testing:
I'd strongly resist adding any new module without an accompanying
test, and wish this had been a firm policy from day one.
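
For anyone unsure what an accompanying test might look like, here is
a small sketch using the standard library unittest module.  The
function gc_fraction is a hypothetical stand-in for whatever the new
module provides; it is not taken from Biopython.

```python
import unittest


def gc_fraction(sequence):
    """Hypothetical function under test: fraction of G and C in a sequence."""
    if not sequence:
        return 0.0
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / float(len(sequence))


class GCFractionTests(unittest.TestCase):
    def test_mixed(self):
        self.assertAlmostEqual(gc_fraction("ACGT"), 0.5)

    def test_empty(self):
        # An empty sequence should not cause a division by zero.
        self.assertEqual(gc_fraction(""), 0.0)

    def test_case_insensitive(self):
        self.assertAlmostEqual(gc_fraction("ggcc"), 1.0)
```

Even a handful of checks like these would show if a later change
breaks the module, which matters if the original author has moved on.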

> 8. Multiplatform issues

Ideally everything should be cross platform (like Python itself).
There are exceptions to this - in particular some 3rd party tools are
not cross platform.  I personally use and test on Windows, Linux and
Mac - and I believe Michiel does too.

> I am not saying a big document. But as questions arise, just discuss
> them, arrive at a decision and document them. It becomes tiring having
> to answer the same questions about code that you want to submit over
> and over again and with different issues every time.
> One can live with decisions that are disliked, but it is much more
> difficult to live when the playing ground is moving all the time.

I'm sorry if you've had that feeling.  However, circumstances change.
As I recall when you first asked about using SciPy as a dependency,
Biopython was still using Numeric instead of NumPy - so using SciPy
had to wait until after that transition.  Now that we have moved to
NumPy, I think you have a much stronger case.

Peter



