[EMBOSS] patmatdb

Peter Rice pmr at ebi.ac.uk
Fri Feb 3 17:31:39 UTC 2006


Hi Stefan,

> what exactly does the flag -snucleotide1 toggle in patmatdb?

-snucleotide (and -sprotein) are available for all sequence inputs. They are 
used for programs that can read DNA or protein sequences, where you have a 
sequence that can be both types (a short sequence in FASTA format, for example).
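
To illustrate why the type can be ambiguous, here is a rough Python sketch. It 
is not how EMBOSS detects sequence types; the two alphabets and the function 
name possible_types are assumptions made up for this example.

```python
# Hypothetical helper, not EMBOSS code: report which sequence types a
# string of residue codes could be read as.
DNA_CODES = set("ACGTN")                      # assumed minimal nucleotide alphabet
PROTEIN_CODES = set("ACDEFGHIKLMNPQRSTVWY")   # the 20 standard amino acid codes

def possible_types(seq):
    """Return the list of types whose alphabet contains every letter of seq."""
    letters = set(seq.upper())
    types = []
    if letters <= DNA_CODES:
        types.append("nucleotide")
    if letters <= PROTEIN_CODES:
        types.append("protein")
    return types

print(possible_types("ACGTTGCA"))   # every letter is valid in both alphabets
print(possible_types("MKVLAT"))     # M, K, V and L are not nucleotide codes
```

A short sequence made only of A, C, G and T passes both checks, which is the 
case where telling the program the type explicitly is useful.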

> From the doc I was thinking it would enable searching against nucleotide
> acid sequences instead of proteins. However, execution aborts with an
> error message saying that the sequence is not protein.

The sequence type defined for patmatdb's input tells it to accept only protein 
sequences.

Thinking about this ... we can change the -help output (and the program 
documentation) to describe the sequence type much better than the current 
"sequence database USA". We can, for example, say whether the sequence can be 
DNA or protein and whether gaps, stops, and other characters are used. All we 
need is a short description (which we have) for each sequence type. This is 
something we should have added long ago :-)

The pattern syntax was defined for the PROSITE database ... but we can also 
allow it to search nucleotide data. It is only a small change to the program. 
Does anyone need to search a nucleotide database with patmatdb?
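
To show why the syntax itself is not tied to protein, here is a minimal Python 
sketch that turns a simple PROSITE-style pattern into a regular expression and 
runs it on both kinds of sequence. It is not patmatdb's implementation; the 
function name prosite_to_regex and the test sequences are invented for the 
example, and it only handles the basic elements (x, [..], {..}, repeat counts, 
and the < and > anchors).

```python
import re

def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex (sketch)."""
    pattern = pattern.rstrip(".")          # a trailing period is optional
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):        # anchor at the start of the sequence
            prefix, element = "^", element[1:]
        if element.endswith(">"):          # anchor at the end of the sequence
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, count = m.group(1), m.group(2)
        if core == "x":                    # x matches any residue
            core = "."
        elif core.startswith("{"):         # {ABC} means "anything except A, B, C"
            core = "[^" + core[1:-1] + "]"
        repeat = "{" + count + "}" if count else ""
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

# A protein-style pattern and a nucleotide-style pattern use the same syntax.
protein_hit = re.search(prosite_to_regex("N-{P}-[ST]-{P}"), "MKNGSAA")
dna_hit = re.search(prosite_to_regex("G-A-x(2)-T-C"), "TTGAATTCAA")
print(protein_hit.group(), protein_hit.start())
print(dna_hit.group(), dna_hit.start())
```

The translation step never looks at which alphabet the letters come from, so 
in this sketch the only thing that limits the search to protein is the type 
check on the input sequence.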

I suspect patmatdb is rather redundant ... you can get the same results from 
fuzzpro (for protein) or fuzznuc (for nucleotide) ... -rformat dbmotif will 
give you the same output format as patmatdb.

Any preferences?

Peter



