Patterns, patternlists and nucleus/embpat.h
Henrikki Almusa
henrikki.almusa at helsinki.fi
Fri Aug 6 07:09:18 UTC 2004
Hello,
I've been looking further into how I could make the patterns and patternlist
system (some patches were sent a month ago) work with fuzz(nuc|pro|tran). The
only way I could think of is to add a new structure in 'nucleus/embpat.h',
which would hold almost all the variables needed to run embPatGetType(),
embPatCompile() and embPatFuzzSearch().
So here's the proposal:
/* @data EmbPPatComp *********************************************************
**
** NUCLEUS data structure that holds all the data needed for compiling and
** searching, except the mismatch number.
**
** @attr pattern [AjPStr] Prosite pattern string
** @attr type [ajint] Prosite pattern compile type
** @attr plen [ajint] Prosite pattern length
** @attr buf [ajint*] Buffer for BMH search
** @attr off [struct EmbSPatBYPNode] Offset buffer for B-Y/P search
** @attr sotable [ajuint*] Buffer for SHIFT-OR
** @attr solimit [ajuint] Limit for SHIFT-OR search
** @attr m [ajint] Real length of pattern (from embPatGetType)
** @attr regex [AjPStr] PCRE regexp string
** @attr skipm [ajint**] Skip buffer for Tarhio-Ukkonen
** @attr amino [AjPBool] Must match at the left (start) end
** @attr carboxyl [AjPBool] Must match at the right end
** @@
******************************************************************************/
typedef struct EmbSPatComp
{
    AjPStr pattern;
    ajint type;
    ajint plen;
    ajint* buf;
    EmbOPatBYPNode off[AJALPHA];
    ajuint* sotable;
    ajuint solimit;
    ajint m;
    AjPStr regex;
    ajint** skipm;
    AjPBool amino;
    AjPBool carboxyl;
} EmbOPatComp;
#define EmbPPatComp EmbOPatComp*
And the functions for this would be:
void embPatCompileII (EmbPPatComp thys);
void embPatFuzzSearchII (EmbPPatComp thys, ajint begin, const AjPStr name,
                         const AjPStr text, AjPList l, ajint mismatch,
                         void** tidy);
void embPatGetTypeII (EmbPPatComp thys, const AjPStr pattern, ajint mismatch,
                      AjPBool protein);
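
To illustrate the intended call order (get type, then compile, then search),
here is a rough, self-contained sketch in plain C. It does not use the EMBOSS
headers: PatComp, pat_get_type(), pat_compile(), pat_search() and pat_free()
are hypothetical stand-ins for EmbOPatComp and the three proposed functions,
ALPHA stands in for AJALPHA, and only the BMH (Boyer-Moore-Horspool) fields
are modelled, with exact matching and no mismatches.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ALPHA 256                 /* stand-in for AJALPHA (assumption) */

/* Hypothetical cut-down version of the proposed structure: BMH fields only */
typedef struct PatComp
{
    char *pattern;                /* copy of the pattern string */
    int   type;                   /* compile type; 1 means BMH in this sketch */
    int   plen;                   /* pattern length */
    int  *buf;                    /* BMH skip table, filled by pat_compile() */
} PatComp;

/* Role of embPatGetTypeII: store the pattern and decide the compile type */
static void pat_get_type(PatComp *thys, const char *pattern)
{
    thys->plen = (int) strlen(pattern);
    thys->pattern = malloc((size_t) thys->plen + 1);
    assert(thys->pattern != NULL);
    strcpy(thys->pattern, pattern);
    thys->type = 1;               /* always BMH here */
    thys->buf = NULL;
}

/* Role of embPatCompileII: build whatever tables the chosen type needs */
static void pat_compile(PatComp *thys)
{
    int i;

    thys->buf = malloc(ALPHA * sizeof(int));
    assert(thys->buf != NULL);

    /* Default shift is the whole pattern length */
    for (i = 0; i < ALPHA; i++)
        thys->buf[i] = thys->plen;

    /* Characters inside the pattern (except the last) allow shorter shifts */
    for (i = 0; i < thys->plen - 1; i++)
        thys->buf[(unsigned char) thys->pattern[i]] = thys->plen - 1 - i;
}

/* Role of embPatFuzzSearchII: here it only counts exact hits */
static int pat_search(const PatComp *thys, const char *text)
{
    int n = (int) strlen(text);
    int pos = 0;
    int hits = 0;

    if (thys->plen == 0)
        return 0;

    while (pos + thys->plen <= n)
    {
        if (memcmp(text + pos, thys->pattern, (size_t) thys->plen) == 0)
            hits++;
        /* Shift by the table entry for the last character of the window */
        pos += thys->buf[(unsigned char) text[pos + thys->plen - 1]];
    }

    return hits;
}

/* Release everything owned by the structure */
static void pat_free(PatComp *thys)
{
    free(thys->pattern);
    free(thys->buf);
}
```

Because all per-pattern state lives in one structure, a program could keep a
list of these, compile each one once and then run every pattern over each
sequence, which is what the patternlist system needs.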
Any comments?
Thanks,
--
Henrikki Almusa