[EMBOSS] fuzznuc pattern expansion

Bernd Web bernd.web at gmail.com
Mon Nov 7 12:24:36 UTC 2011


Dear all,

Is it possible, or could it be made possible, to see the (numeric)
positions of the mismatches in the fuzznuc output file?
E.g. the example output file shows the number of mismatches, but not where they are located:
http://emboss.sourceforge.net/apps/release/6.4/emboss/apps/fuzznuc.html#output.4
# pat2                1 cg(2)c(3)taaccctagc(3)ta
  605     624       + pat2: cg(2)c(3)taaccctagc(3)ta        1
cggccctaaccctaacccta

Clearly, we can find the positions of the mismatches by comparing the
supplied pattern with the reported match, but that would not be the preferred route.
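
For reference, the manual comparison could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not part of EMBOSS: the function names expand_pattern and mismatch_positions are made up here, only fixed repeat counts such as c(3) are expanded (no ranges, brackets or other pattern syntax), and an ambiguity code in the sequence is simply counted as a mismatch unless the pattern position allows it as a plain base.

```python
import re

# IUPAC nucleotide codes mapped to the plain bases they stand for
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t", "u": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def expand_pattern(pattern):
    """Expand fixed repeat counts only, e.g. 'c(3)' -> 'ccc' (hypothetical helper)."""
    return re.sub(
        r"([a-z])\((\d+)\)",
        lambda m: m.group(1) * int(m.group(2)),
        pattern.lower(),
    )


def mismatch_positions(pattern, match, start=1):
    """Return sequence positions where the match disagrees with the pattern.

    'start' is the reported start position of the hit (1-based).
    """
    expanded = expand_pattern(pattern)
    hit = match.lower().replace("u", "t")  # treat U as T
    if len(expanded) != len(hit):
        raise ValueError("pattern and match differ in length")
    return [
        start + i
        for i, (p, s) in enumerate(zip(expanded, hit))
        if s not in IUPAC[p]
    ]


# Values taken from the example output above (hit reported at 605..624)
print(mismatch_positions("cg(2)c(3)taaccctagc(3)ta",
                         "cggccctaaccctaacccta", start=605))  # [619]
```

For the example hit this gives position 619 in the sequence, i.e. offset 15 within the match, where the pattern has g and the sequence has a.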


Kind regards,
Bernd

On Wed, Nov 2, 2011 at 6:37 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
> Dear Bernd,
>
> On 02/11/2011 15:12, Bernd Web wrote:
>
>> Thanks! It would indeed be great to have the option to search on the
>> ambiguity codes directly. Probably, I'd prefer the escape option, but
>> you mean to implement both escaping and expansion to subsets?
>
> Yes, we will implement both. Escaping is needed to find any ambiguity codes
> in a sequence. Expansion allows S to find G, C and S.
>
>> It might be good to report the pattern that was used in the matching.
>> Would the (very high) speed of fuzznuc be affected by always exploding
>> the codes to their subsets? For example, "N" would become "ACTGUMRWSYKVHDB".
>
> N is not a problem - it matches anything. The 2-letter ambiguity codes only
> expand to one extra letter, and 3-letter codes (B, D, H, V) are only very
> rarely used.
>
> regards,
>
> Peter Rice
> EMBOSS Team
>
>
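
One possible reading of the expansion described above ("S" finds G, C and S) is that a pattern code matches every sequence code whose set of bases is contained in its own. A small Python sketch of that reading follows. It is an assumption about the behaviour, not the EMBOSS implementation; the table BASES and the function expansion are names invented for this illustration, and U is left out to keep it DNA-only.

```python
# IUPAC DNA codes mapped to the bases they represent (U omitted on purpose)
BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expansion(code):
    """Codes whose base set is a subset of the given code's base set."""
    target = set(BASES[code.upper()])
    return sorted(c for c, b in BASES.items() if set(b) <= target)


print(expansion("S"))  # ['C', 'G', 'S']
print(expansion("B"))  # ['B', 'C', 'G', 'K', 'S', 'T', 'Y']
print(len(expansion("N")))  # 15
```

Under this reading a 2-letter code adds only itself to the two bases it already matches, and N covers every code in the table.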


