[EMBOSS] fuzznuc repetition issue
bernd.web at gmail.com
Tue Apr 9 15:12:45 UTC 2013
I tried using a repetition count in a fuzznuc pattern. It seems that when a
range starts with 0, e.g. (0,1), the pattern is not found if it is located at
the end of the sequence and the repeated character occurs 0 times. This only
happens when more than one match is possible.
The following example shows this. It contains the pattern, once with 4
fuzznuc -pmismatch 4 -sequence test.fa -outfile test.fuzznuc -pattern
results in the expected output:
Start End Strand Pattern Mismatch Sequence
29 48 + pattern:GAGGAAGGAGGGGATGACGT 4
61 80 + pattern:GAGGAAGGAGGGGATGACGT .
However, fuzznuc -pmismatch 4 -sequence test.fa -outfile test.fuzznuc
only finds the first pattern at pos 29, with the match-any nucleotide
repeated 0, 1, 2 and 3 times (so 4 matches in total), but not the one at 61-80.
Now, if I request 0 mismatches (-pmismatch 0), then this last pattern is
reported (from 61 to 80). When requesting e.g. 3 mismatches, no hit is
found. The first has 4 mismatches, but now the last one, with 0 mismatches,
is also not reported. It only seems to be reported when I ask for 0
mismatches.
However, when allowing 4 mismatches I'd expect 5 hits in total: 4 starting
at 29 (with 4 mismatches) and one starting at 61.
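To make that expectation concrete, here is a minimal brute-force sketch in
Python. It is not EMBOSS code and says nothing about how fuzznuc works
internally. The function names (expand, brute_force_hits), the toy sequence
and the toy pattern are all made up for illustration and are not the contents
of test.fa. The sketch assumes a pattern is a list of (symbol, min, max)
repeat elements with "N" matching any nucleotide, expands every allowed repeat
count, and counts mismatches per window. Under that assumption a hit found
with 0 mismatches should still be reported when more mismatches are allowed.

```python
from itertools import product


def expand(elements):
    """Yield every fixed-length pattern allowed by the repeat ranges.

    elements is a list of (symbol, min_repeat, max_repeat) tuples,
    a hypothetical stand-in for a pattern such as ACGN(0,1).
    """
    choices = [[sym * n for n in range(lo, hi + 1)] for sym, lo, hi in elements]
    for combo in product(*choices):
        yield "".join(combo)


def brute_force_hits(seq, elements, max_mismatch):
    """Return sorted (start, end, mismatches) tuples, 1-based inclusive."""
    hits = set()
    for pat in expand(elements):
        if not pat:
            continue  # ignore the empty expansion, it cannot be a real hit
        for start in range(len(seq) - len(pat) + 1):
            window = seq[start:start + len(pat)]
            # "N" matches any nucleotide, so it never counts as a mismatch
            mism = sum(1 for p, s in zip(pat, window) if p != "N" and p != s)
            if mism <= max_mismatch:
                hits.add((start + 1, start + len(pat), mism))
    return sorted(hits)


if __name__ == "__main__":
    # Toy data: ACG followed by N(0,1); the second ACG ends the sequence,
    # so it can only match with the N repeated 0 times.
    toy_seq = "ACGTTTTACG"
    toy_pattern = [("A", 1, 1), ("C", 1, 1), ("G", 1, 1), ("N", 0, 1)]
    for allowed in (0, 1):
        print(allowed, brute_force_hits(toy_seq, toy_pattern, allowed))
```

With this reference behaviour, the set of hits can only grow as the allowed
number of mismatches goes up, which is what I expected from fuzznuc.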
This occurred in EMBOSS 6.3.1 and 6.5.7.
Is this a wrong expectation, or is something not going entirely right?