[EMBOSS] non-overlapping matches in fuzznuc?

Peter Rice pmr at ebi.ac.uk
Thu Oct 13 08:44:33 UTC 2011


On 13/10/2011 08:45, Jon Ison wrote:
> Hi chaps (Aengus !)
>
> If I understood Aengus' message, what's needed is something that simply combines overlapping hits
> (for a given pattern) into one or more non-overlapping "regions of hits" and reports those regions, e.g.
>
>     Start     End  Strand Pattern_name Mismatch Sequence
>        54      65       + pattern1            5 GCCAAATAAGGG
>       104     115       + pattern1            5 CCTAAATAAGGG
>       179     188       + pattern1            2 CCTTGCTTGG
>       190     200       + pattern1            6 CCGATTAGAGC
>
> Mismatch in this case is reporting the sum of mismatches from before.  A column for number of
> (sub)matches would also be needed.  Is that right Aengus?

I'm not sure that adding the mismatches is sound. I'd expect to report 
just the best hit from each set of overlapping matches.
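
A minimal sketch of that best-hit idea, written as post-processing in 
Python rather than as anything that exists in fuzznuc today. The Hit 
record and the best_non_overlapping() function are hypothetical names, 
"best" is assumed to mean fewest mismatches (ties keep the leftmost 
hit), and hits are assumed to be grouped per pattern and strand with 
1-based inclusive coordinates as in the report:

```python
from typing import List, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    """One report row (hypothetical record, not an EMBOSS type)."""
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    strand: str
    pattern: str
    mismatches: int
    sequence: str


def best_non_overlapping(hits: List[Hit]) -> List[Hit]:
    """Keep one hit per run of overlapping hits (same pattern and strand).

    Hits are chained: a hit joins the current run if it starts at or
    before the furthest end seen so far in that run. The hit with the
    fewest mismatches represents the run.
    """
    best: List[Hit] = []
    run_key: Optional[Tuple[str, str]] = None
    run_end = 0
    ordered = sorted(hits, key=lambda h: (h.pattern, h.strand, h.start, h.end))
    for hit in ordered:
        key = (hit.pattern, hit.strand)
        if key == run_key and hit.start <= run_end:
            # Overlaps the current run: extend it, keep the better hit.
            run_end = max(run_end, hit.end)
            if hit.mismatches < best[-1].mismatches:
                best[-1] = hit
        else:
            # Start a new run with this hit as its representative.
            best.append(hit)
            run_key = key
            run_end = hit.end
    return best
```

Whether runs should be chained like this, or whether a hit should only 
be compared with the current best hit, is an open design choice.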

> The above might give a useful result depending on the input pattern.  It would I think be easy
> enough to implement.

This is a report output, so post-processing could be done by trimming 
the results before output using an associated qualifier.

I'm still not sure how useful it would be. We need more feedback from 
other users on this one, please!

Peter Rice
EMBOSS Team



