[EMBOSS] fuzznuc repetition issue
Bernd W
bernd.web at gmail.com
Tue Apr 9 15:12:45 UTC 2013
Hi,
I tried using a repetition count in a fuzznuc pattern. It seems that when
the range starts at 0, e.g. (0,1), a match at the very end of the sequence,
where the repeated character can only occur 0 times, is not found. This only
happens when other matches are possible as well.
The following example shows this. The sequence contains the pattern twice:
once with 4 mismatches (positions 29-48) and once exactly (positions 61-80).
>test
ACTACTACATACATACACATATACACATGAGGTTTTAGGGGATGACGTAAGGGGGNNNNNGAGGAAGGAGGGGATGACGT
fuzznuc -pmismatch 4 -sequence test.fa -outfile test.fuzznuc -pattern 'GAGGAAGGAGGGGATGACGT'
results in the expected output:
Start  End  Strand  Pattern                       Mismatch  Sequence
   29   48  +       pattern:GAGGAAGGAGGGGATGACGT  4         GAGGTTTTAGGGGATGACGT
   61   80  +       pattern:GAGGAAGGAGGGGATGACGT  .         GAGGAAGGAGGGGATGACGT
However,
fuzznuc -pmismatch 4 -sequence test.fa -outfile test.fuzznuc -pattern 'GAGGAAGGAGGGGATGACGTn(0,3)'
only finds the first occurrence at position 29, extended by 0, 1, 2 and 3
arbitrary nucleotides (so 4 hits in total), but not the one at 61-80.
If I request 0 mismatches (-pmismatch 0), the last occurrence (61 to 80) is
reported. When requesting e.g. 3 mismatches, no hit is found at all: the
first occurrence is correctly excluded because it has 4 mismatches, but the
last one, with 0 mismatches, is not reported either. It only seems to be
reported when I ask for exactly 0 mismatches.
When allowing 4 mismatches, I would expect 5 hits in total: 4 starting at 29
(each with 4 mismatches) and one starting at 61.
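This expectation can be checked with a small brute-force sketch in Python.
It is not fuzznuc's algorithm, only a naive model of the pattern: the fixed
20-base core with at most N mismatches, followed by 0 to 3 arbitrary bases.
The function name expected_hits and its parameters are made up for this
illustration, and sequence characters are compared literally (so an N in the
sequence counts as a mismatch against the core), which is an assumption.

```python
# Brute-force sketch (NOT fuzznuc's algorithm): list every window that
# matches CORE followed by tail_min..tail_max arbitrary bases, allowing at
# most max_mismatch mismatches inside the core.
SEQ = ("ACTACTACATACATACACATATACACATGAGGTTTTAGGGGATGACGT"
       "AAGGGGGNNNNNGAGGAAGGAGGGGATGACGT")
CORE = "GAGGAAGGAGGGGATGACGT"


def expected_hits(seq, core, max_mismatch, tail_min=0, tail_max=3):
    """Return (start, end, mismatches) tuples, 1-based inclusive coordinates."""
    hits = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        # Literal comparison: assumption, ambiguity codes are not expanded.
        mismatches = sum(a != b for a, b in zip(window, core))
        if mismatches > max_mismatch:
            continue
        for tail in range(tail_min, tail_max + 1):
            # 0-based exclusive end equals the 1-based inclusive end.
            end = i + len(core) + tail
            if end <= len(seq):  # the tail must still fit in the sequence
                hits.append((i + 1, end, mismatches))
    return hits


if __name__ == "__main__":
    for start, end, mm in expected_hits(SEQ, CORE, 4):
        print(start, end, mm)
```

Under this model the core at 29 can be extended by 0 to 3 bases, while the
core at 61 ends at the last base of the sequence, so only the zero-length
tail fits there. With a limit of 3 mismatches, the same model still reports
the exact occurrence at 61-80.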
This occurred in EMBOSS 6.3.1 and 6.5.7.
Is this a wrong expectation, or is something not going entirely right?
Kind regards,
Bernd