[Biopython-dev] Refactoring motif analysis code

Bruce Southey bsouthey at gmail.com
Wed Dec 3 02:33:26 UTC 2008


On Tue, Dec 2, 2008 at 3:39 AM, Bartek Wilczynski
<bartek at rezolwenta.eu.org> wrote:
> On Mon, Dec 1, 2008 at 10:07 PM, Giovanni Marco Dall'Olio
> <dalloliogm at gmail.com> wrote:
>
>> Thanks for all these changes.
>> I remember that I wrote a mail to TAMO's authors when I was using it.
>> They seemed to be interested in integrating the code with biopython,
>> so maybe the license issue could be overcome.
>> It's up to you, whether you want to reimplement all the functions they
>> have or not.
>
> I have to say I haven't done anything yet towards integrating TAMO
> with biopython.
> So far, my own code was doing the job for me, and since there was a
> certain learning curve to get into TAMO,
> I didn't look closely into it. I've looked more carefully now at it
> and I have two general thoughts:
> - There are a number of features in TAMO for which there is no
> counterpart in Bio.Motif. Just by looking at module names I've found:
>  - MDscan parser
>  - their own EM motif finding scheme (some kind of EM method)
>  - several motif comparison functions from MotifCompare
>  - a lot of nice little methods for motifs like textLogo, giflogo, etc.
> - There is quite an overlap between biopython and TAMO. They
> implemented their own Sequence handling, FASTA Parser, clustering
> module etc.  There will be some gruntwork with integrating their code
> into Biopython (finding and reconciling the overlaps)
>
> I also have to say that I'm a bit scared by the copyright statements in
> the TAMO code, saying it belongs to the Whitehead institute. I don't
> want to be overly pessimistic, but the process of releasing this code
> under biopython license might be slow.
>
> What I think is the best way to go is to clean up the current mess with
> Bio.Alignace and Bio.MEME, and then ask people for contributions.
> If TAMO developers would be willing to contribute I'll be happy to
> help with integration into biopython. It will take some time anyway,
> so I wouldn't delay the inclusion of Bio.Motif into Biopython.
>
> cheers
> Bartek
>
>
>
> --
> Bartek Wilczynski
> ==================
> Postdoctoral fellow
> EMBL, Furlong group
> Meyerhoffstrasse 1,
> 69012 Heidelberg,
> Germany
> tel: +49 6221 387 8433
>

I would agree that you should ignore TAMO and just focus on developing
a suitable framework to integrate Alignace and MEME as you have
indicated. I would presume that the other motif finding applications
will also fit into that framework.

Unless the TAMO code is under a BSD-style or equivalent license that
is compatible with Biopython you must stop looking at it. I know it is
hard to avoid as the code comes up on Google with a simple search. If the
TAMO code gets suitably licensed, then fine but until then it can
cause major problems that can involve the whole Biopython project
(even including GPLed code can do this).

Bruce


