[Biojava-l] Dp newbie question!!

Mark Schreiber markjschreiber at gmail.com
Sat Mar 8 14:41:44 UTC 2008


Hi Alex -

Good to know that the cookbook is helpful in getting you started.  You
are correct about how to read the state path (matches, inserts and
deletes).
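
As a rough illustration of reading such a path, here is a minimal sketch in
plain Java.  It assumes the usual profile-HMM convention that match ('m') and
insert ('i') states each emit one symbol while delete ('d') states emit none,
and that the labels are whitespace-separated as in your output.  The class and
method names are made up for this example and are not part of BioJava.

```java
public class StatePathPositions {

    /**
     * For each label in the path, returns the 0-based index of the sequence
     * symbol that state emits, or -1 for a silent (delete) state.
     * Assumption: labels start with 'm', 'i' or 'd'; anything that is not
     * 'm' or 'i' is treated as silent.
     */
    public static int[] emittedPositions(String path) {
        String trimmed = path.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        String[] labels = trimmed.split("\\s+");
        int[] positions = new int[labels.length];
        int next = 0; // index of the next unconsumed sequence symbol
        for (int i = 0; i < labels.length; i++) {
            char kind = labels[i].charAt(0);
            if (kind == 'm' || kind == 'i') {
                positions[i] = next++; // emitting state consumes one symbol
            } else {
                positions[i] = -1;     // silent state consumes nothing
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        int[] pos = emittedPositions("m-1 d-2 m-3 i-3 i-3 m-4");
        // prints [0, -1, 1, 2, 3, 4]
        System.out.println(java.util.Arrays.toString(pos));
    }
}
```

The sequence positions lined up with 'm' labels are where the motif was
placed; a long run of 'i' labels means the model is skipping over sequence it
could not align to the motif.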

The limitation of the model you probably used is that it doesn't loop
back on itself, so by definition it can find only one match to a
repeated motif.  There are two ways to deal with this.  One would be to
wire up the model (set the transition alphabets and probabilities) so
that the model can repeat, or at least repeat the motif part.  The other
would be to apply the model to a sliding window.  The second approach
requires less understanding of the DP package but is much less
efficient, and if you are interested in interpreting the forward and
backward probabilities it would be a bit hard to correct for the
sliding window.
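
The sliding-window idea can be sketched roughly as follows.  This is plain
Java, not BioJava code: the scorer argument is a hypothetical stand-in for
"run the profile HMM on this window and return its log odds", and
identityScore is only a toy scorer so that the example runs on its own.  The
class name, method names and threshold are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class SlidingWindowScan {

    /**
     * Scores every window of the given length and returns the 0-based start
     * offsets of the windows whose score reaches the threshold.
     * The scorer is hypothetical; in practice it would wrap the DP call.
     */
    public static List<Integer> scan(String seq, int window, double threshold,
                                     ToDoubleFunction<String> scorer) {
        List<Integer> hits = new ArrayList<>();
        for (int start = 0; start + window <= seq.length(); start++) {
            String sub = seq.substring(start, start + window);
            if (scorer.applyAsDouble(sub) >= threshold) {
                hits.add(start);
            }
        }
        return hits;
    }

    /** Toy scorer for demonstration only: counts positions identical to the motif. */
    public static double identityScore(String windowSeq, String motif) {
        int n = Math.min(windowSeq.length(), motif.length());
        int same = 0;
        for (int i = 0; i < n; i++) {
            if (windowSeq.charAt(i) == motif.charAt(i)) {
                same++;
            }
        }
        return same;
    }

    public static void main(String[] args) {
        String motif = "ACGT";
        List<Integer> hits = scan("TTACGTTTACGTT", motif.length(), 4.0,
                w -> identityScore(w, motif));
        System.out.println(hits); // prints [2, 8]
    }
}
```

Every window is scored from scratch, which is where the inefficiency comes
from, and with a real model neighbouring windows around a true hit will tend
to score well too, so overlapping hits would need to be merged.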

Hope this helps.

- Mark

On Sat, Mar 8, 2008 at 4:22 AM, alex johansson
<alex.johansson1 at gmail.com> wrote:
> Hi,
>
> I am a cell biology student with a growing interest in BioJava. Although I
> have only very basic BioJava experience, the cookbook
> examples make it easy to get around the API. My question is very basic and
> might sound very stupid. I followed the cookbook example on creating a HMMER-
> like profile HMM, made a profile from a set of 12 training sequences (19 bp),
> and tested it with a test sequence in which the motif occurs twice. Below is the
> output from the program:
>
> Log Odds = 43.786769243019506
> m-1 m-2 m-3 m-4 m-5 m-6 d-7 m-8 m-9 m-10 m-11 m-12 m-13 m-14 m-15 m-16 m-17
> d-18 i-18 d-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19
> i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 i-19 m-20 i-20
>
> My question is how to interpret these results. How do I know whether the motif
> occurs twice, and where it is located in the test sequence? I know that 'm'
> stands for match, and 'i' and 'd' stand for insert and delete transitions in
> the path from start to end.
>
> I'd certainly appreciate it if the BioJava gurus out there could spend some of
> their valuable time explaining this.
>
> Thank you for your time,
>
> Cheers,
> Alex J


