[Biopython-dev] Operations on kmers?

Peter Cock p.j.a.cock at googlemail.com
Mon Feb 6 12:26:37 UTC 2017


Hi Alexey,

On Fri, Feb 3, 2017 at 3:14 AM, Alexey Morozov
<alexeymorozov1991 at gmail.com> wrote:
> I've written a little library for k-mer based analyses, and eventually
> decided to upload it to PyPI. Guess what? There are several modules for that
> (https://pypi.python.org/pypi?%3Aaction=search&term=kmer&submit=search),
> including my own (kmers).

I have used khmer from Python (http://khmer.readthedocs.io/en/v2.0/)
but have not looked into the more recent options.

> Is there a chance one of those can make it to Biopython?

Potentially, although it may not be a good fit.

> It's usually better to have a single universally available library
> than a bunch of incompatible ones.

Yes :)

> I'm willing to work on it, probably in cooperation with folks that made
> other packages, but I don't have the slightest idea whether it's gonna be
> accepted to Biopython. k-mers are still somewhat obscure, after all. If it
> is, where, in your opinion, does it belong? A separate Bio.Kmers module,
> a submodule of Bio.Statistics or Bio.Cluster? Something else?

If Biopython were to have some k-mer support, probably a separate
top-level module, Bio.kmers (lower case as per PEP8 unless
constrained by historical choices, so not Bio.Kmers) would be best.
Or perhaps putting it under Bio.SeqUtils might work too?

My gut feeling is that a one- or two-person effort would struggle to
match some of the existing Python libraries focused on k-mers,
especially on performance. However, if there is interest from
the Biopython community, that would be great.
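
For anyone unfamiliar with the term, here is a minimal pure-Python
sketch of k-mer counting, just to illustrate the sort of basic
operation being discussed. The function name count_kmers is
hypothetical; it is not part of Biopython, khmer, or any of the PyPI
packages mentioned above, and a dictionary-based approach like this
would not compete with dedicated libraries on speed or memory use.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count overlapping k-mers in a sequence (hypothetical helper).

    Accepts a plain string, or any object whose str() gives the
    sequence letters; the input is upper-cased before counting.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    text = str(sequence).upper()
    # Slide a window of width k along the sequence, one letter at a time.
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


counts = count_kmers("ATGATG", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

Sequences shorter than k simply give an empty Counter here; a real
module would also need to decide how to treat ambiguous letters and
reverse complements.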

Regards,

Peter

