[EMBOSS] look for 'Ns' in a sequence using fuzznuc?

Fernan Aguero fernan.aguero at gmail.com
Fri Jun 14 19:02:13 UTC 2013


Thanks!

--
fernan


On Fri, Jun 14, 2013 at 4:01 PM, Peter Rice <ricepeterm at yahoo.co.uk> wrote:

> Hi Fernan,
>
>
> On 14/06/2013 18:09, Fernan Aguero wrote:
>
>> I guess I came across a problem ... I'm trying to rapidly find runs of Ns
>> in a nucleotide sequence, and produce the corresponding 'assembly_gap'
>> annotations in GFF format. This is all derived from scaffolded contigs.
>>
>> I've tried fuzznuc first because it's easy to specify a pattern, and get a
>> list of locations in GFF format. However, fuzznuc uses N to mean any base.
>>
>> Is there a way to subvert fuzznuc to use another character for this
>> purpose?
>>
>
> Already subverted for the next release in July. EMBOSS 6.6 will let you
> escape the N with a backslash in a pattern file (or two backslashes on the
> command line) to cancel the conversion of N to any base.
>
>
>> Or maybe there's another emboss program to do this?
>>
>
> dreg uses regular expressions and so will find the Ns (but I see a bug in
> some of the reported positions if you use wildcards in the pattern ... to
> be fixed in the next release!)
>
> regards,
>
> Peter Rice
> EMBOSS team
>
>
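
For anyone who needs the gap coordinates before the release mentioned above, the task described in the thread (locating runs of N and writing them out as 'assembly_gap' features) can be sketched in plain Python. This is only an illustration and is not part of fuzznuc or dreg. The function name n_runs_to_gff, the min_len parameter, the source label "n_run_scan", and the choice to leave the score, strand, phase and attributes columns as "." are all assumptions made for the example.

```python
import re


def n_runs_to_gff(seq_id, sequence, min_len=1, source="n_run_scan"):
    """Yield one tab-separated GFF-style line per run of N/n.

    seq_id   : name to put in the first GFF column (hypothetical input)
    sequence : the nucleotide sequence as a plain string
    min_len  : shortest run of Ns to report (assumed parameter)
    source   : label for the second GFF column (assumed value)
    """
    # Match a literal N or n repeated at least min_len times.
    # In a regular expression, N is just a character, not "any base".
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    for match in pattern.finditer(sequence):
        start = match.start() + 1  # GFF coordinates are 1-based
        end = match.end()          # inclusive end equals the 0-based exclusive end
        yield "\t".join([
            seq_id, source, "assembly_gap",
            str(start), str(end),
            ".", ".", ".", ".",    # score, strand, phase, attributes left undefined
        ])


if __name__ == "__main__":
    # Toy sequence, made up for illustration only.
    for line in n_runs_to_gff("scaffold_1", "ACGTNNNNACGNNT"):
        print(line)
```

Because the pattern is an ordinary regular expression, N is matched literally here, which is the same reason dreg can find the Ns. Reading the sequence from a FASTA file and adding a GFF header line are left out to keep the sketch short.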