[Biopython-dev] Fwd: Fast instance search of motif in a sequence

Sefa Kılıç sefakilic at gmail.com
Wed Feb 13 02:40:12 UTC 2013


Hi Michiel,

Thanks for the reply. As you said, _pwm.c appears to do the same thing; I
missed that part of the code. However, it does not seem to be mentioned in
the tutorial, and it might be useful to mention it there.

Anyway, re-implementing it was good practice. Thank you!

Sefa Kilic



On Tue, Feb 12, 2013 at 9:06 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:

> Hi Sefa,
>
> Bio.Motif._Motif.search_instances() searches for exact instances of a
> motif, but it looks like your code searches for instances based on their
> PSSM score. In that case, isn't it the same as the current code in
> Bio/Motif/_pwm.c (or Bio/motifs/_pwm.c)?
>
> Best,
> -Michiel.
>
> --- On Tue, 2/12/13, Sefa Kılıç <sefakilic at gmail.com> wrote:
>
> > From: Sefa Kılıç <sefakilic at gmail.com>
> > Subject: [Biopython-dev] Fwd: Fast instance search of motif in a sequence
> > To: biopython-dev at biopython.org
> > Date: Tuesday, February 12, 2013, 6:18 PM
> > Hi all,
> >
> > I am working on comparative genomics and frequently use the Motif module
> > of Biopython. One of my most frequent operations is to build a motif from
> > a set of sites and search a sequence for instances that are similar to
> > the motif [Bio.Motif._Motif.search_instances()].
> >
> > The problem is that the sequence being searched is huge: usually it is
> > the genome sequence itself plus its reverse complement. For example,
> > scanning the E. coli genome and its reverse complement with a motif of
> > length ~20 takes almost a minute on my machine.
> >
> > To make it faster, I implemented a C version with a Python interface so
> > that it can be called from Python. It is much faster, taking about 2.5
> > seconds.
> >
> > The current implementation can be found at:
> >
> > https://github.com/sefakilic/yassi
> >
> > If anyone is interested and it is appropriate, I would like to modify
> > the current implementation and integrate it into Biopython.
> >
> > Thanks!
> >
> > Sefa Kilic
> >
>
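
The distinction drawn in this thread, exact-instance search versus search by
PSSM score, can be illustrated with a minimal pure-Python sketch. This is not
the code in Bio/Motif/_pwm.c or in yassi. The function names, the pseudocount
of 1, the uniform 0.25 background, and the restriction to uppercase A/C/G/T
are all assumptions made for illustration only.

```python
import math


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (one dict per column) from equal-length sites.

    Assumes a uniform background and a fixed pseudocount (illustrative choices).
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def exact_search(sites, sequence):
    """Yield (position, window) where the window equals one of the known sites."""
    length = len(sites[0])
    known = set(sites)
    for pos in range(len(sequence) - length + 1):
        window = sequence[pos:pos + length]
        if window in known:
            yield pos, window


def pssm_search(pssm, sequence, threshold):
    """Yield (strand, position, score) for windows scoring at least threshold.

    Both strands are scanned; '-' positions index into the reverse complement.
    """
    length = len(pssm)
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for pos in range(len(seq) - length + 1):
            window = seq[pos:pos + length]
            score = sum(column[base] for column, base in zip(pssm, window))
            if score >= threshold:
                yield strand, pos, score
```

With a score threshold, pssm_search can report windows that resemble the
motif without matching any input site exactly, which exact_search cannot.
The per-window Python loop is also the kind of hot path that a C
implementation speeds up on genome-sized input.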



