[Biopython-dev] Bio.MarkovModel; Bio.Popgen, Bio.PDB documentation

Michiel de Hoon mjldehoon at yahoo.com
Mon Oct 6 10:13:18 UTC 2008


> > When I was looking at the NumPy-dependent modules, I
> > got the impression that Bio.MarkovModel can be
> > simplified now that it's using the new NumPy.
> 
> That's good, but of limited benefit in itself.

Well, currently Bio.MarkovModel uses a C extension module Bio.cMarkovModel. If we can achieve the same speed or better by making use of NumPy, then we won't need this C extension module and we can simplify Biopython.
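As a rough illustration of the idea (a sketch only, not the actual Bio.MarkovModel or Bio.cMarkovModel code), the forward recursion of a hidden Markov model can be written in log space with NumPy so that the sum over previous states is vectorized rather than done in a Python or C loop. The function name and argument names here are made up for the example, and it assumes a NumPy version that provides np.logaddexp.

```python
import numpy as np


def forward_log(log_initial, log_transition, log_emission, observations):
    """Return the log-likelihood of an observation sequence under an HMM.

    log_initial:    shape (N,),   log P(state i at the first position)
    log_transition: shape (N, N), log P(next state j | current state i)
    log_emission:   shape (N, M), log P(symbol k | state i)
    observations:   non-empty sequence of symbol indices in range(M)
    """
    # Initialise with the first observed symbol.
    log_alpha = log_initial + log_emission[:, observations[0]]
    for symbol in observations[1:]:
        # Entry [i, j] is log(alpha_i * a_ij); reducing over axis 0 sums
        # over the previous state i for every next state j at once.
        log_alpha = np.logaddexp.reduce(
            log_alpha[:, np.newaxis] + log_transition, axis=0
        )
        log_alpha = log_alpha + log_emission[:, symbol]
    # Sum over the final states to get the total log-likelihood.
    return np.logaddexp.reduce(log_alpha)
```

Whether something along these lines matches the speed of the C extension would still have to be measured.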

> > 2) If not, should this module be kept?
> 
> I would say yes.
> ...
> To me, having this remain as a "top level" module in
> Biopython would give it higher status and visibility than
> hiding it away in the example scripts.

OK, let's keep it as a module then. We now have several small modules related to supervised learning, each as a separate Bio.<module> (LogisticRegression, MaxEntropy, kNN, NaiveBayes, and arguably MarkovModel), which to me looks a bit messy. It may be a good idea to collect these in a single Bio.Supervised package, though this is not urgent.

I'd be happy to set up a new chapter in the tutorial about these supervised learning modules (a long time ago I wrote a section on logistic regression for the cookbook). While we're on the subject, I think that the Bio.PopGen and Bio.PDB sections of the cookbook chapter in the tutorial should be promoted to separate chapters, since these modules are fairly big and have good documentation.

--Michiel.