Pattern lists and fuzz(nuc|pro|tran) and [pd]reg

Henrikki Almusa henrikki.almusa at helsinki.fi
Wed Jun 16 10:30:27 UTC 2004


On Monday 14 June 2004 12:26, Gary Williams, Tel 01223 494522 wrote:
> Should the file of patterns allow each pattern to have its own allowed
> number of mismatches?
>
> >pat1 <mismatch=1>
>
> ggataata[ac]{2}gg
>
> >pat2 <mismatch=2>
>
> gcggcatgtagc[gc]{3}att

No reason why not.
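
To make the header format above concrete, here is a minimal sketch in plain C 
of how one ">name <mismatch=N>" line could be split into a name and a mismatch 
count. PatHeader, pat_parse_header and the default_mismatch fallback are 
hypothetical names used only for illustration; they are not existing AJAX 
calls, and a real version would presumably use AjPStr instead of char buffers.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define PAT_NAME_MAX 64

/* Hypothetical plain-C stand-in for a parsed header line; not an AJAX type. */
typedef struct {
    char name[PAT_NAME_MAX];
    int  mismatch;
} PatHeader;

/*
 * Parse a header such as ">pat1 <mismatch=1>".
 * Returns 1 on success, 0 if the line is not a header or the tag is malformed.
 * When no <mismatch=N> tag is present, default_mismatch is used instead.
 */
static int pat_parse_header(const char *line, int default_mismatch,
                            PatHeader *out)
{
    static const char tagname[] = "<mismatch=";
    const char *p;
    const char *tag;
    size_t len;

    assert(line != NULL && out != NULL);

    if (line[0] != '>')
        return 0;

    /* The name runs from after '>' up to whitespace or the start of a tag. */
    p = line + 1;
    len = strcspn(p, " \t\r\n<");
    if (len == 0 || len >= PAT_NAME_MAX)
        return 0;
    memcpy(out->name, p, len);
    out->name[len] = '\0';

    out->mismatch = default_mismatch;
    tag = strstr(p + len, tagname);
    if (tag != NULL) {
        const char *digits = tag + strlen(tagname);
        char *end;
        long value = strtol(digits, &end, 10);

        /* Require at least one digit, a closing '>' and a sane range. */
        if (end == digits || *end != '>' || value < 0 || value > INT_MAX)
            return 0;
        out->mismatch = (int) value;
    }
    return 1;
}
```

Lines that do not start with '>' are rejected, so a caller could treat them as 
pattern text belonging to the most recent header.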

Now the coding itself. Since reading that file is fairly low-level, it should 
probably live in the "ajax/" directory. My obj c abilities are perhaps not that 
good. Is anyone willing to help on the .c side?

Below is what might be needed in the .h file (names can be changed). It is 
mainly for using the patterns in a program. This is just what I could come up 
with so far, so I may be completely off here :).

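typedef struct AjSPattern {
        AjPStr name;            /* pattern name */
        AjPStr opropat;         /* original prosite pattern */
        AjPStr propat;          /* prosite pattern, NULL for a regex */
        AjPRegex regexpat;      /* regular expression pattern */
        ajint mismatch;         /* allowed number of mismatches */
} AjOPattern;
#define AjPPattern AjOPattern*

typedef struct AjSPatlist {
        AjPList patlist;        /* list of patterns */
        ajint type;             /* 0 regex, 1 prosite */
} AjOPatlist;
#define AjPPatlist AjOPatlist*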

AjBool ajPatlistGetNext (AjPPatlist patlist, AjPPattern* pattern);
void ajPatlistRewind (AjPPatlist patlist);
ajint ajPatlistGetType (AjPPatlist patlist);

AjPStr ajPatternGetName (AjPPattern pattern);
ajint ajPatternGetType (AjPPattern pattern);
        /* checking whether propat is not NULL in the struct should work */
ajint ajPatternGetMismatch (AjPPattern pattern);
AjPStr ajPatternGetPro (AjPPattern pattern);
AjPStr ajPatternGetOrigPro (AjPPattern pattern);
AjPRegex ajPatternGetRegex (AjPPattern pattern);
AjPStr ajPatternGetPattern (AjPPattern pattern);
        /* should return the string representation of the pattern it has */
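
To show how a program might walk such a list, here is a rough sketch of the 
GetNext/Rewind idea in plain C. Pattern, Patlist and the functions below are 
hypothetical stand-ins for illustration, not the proposed AJAX types: a fixed 
array with a cursor replaces AjPList, and plain C strings replace AjPStr and 
AjPRegex.

```c
#include <assert.h>
#include <stddef.h>

enum { PAT_TYPE_REGEX = 0, PAT_TYPE_PROSITE = 1 };

/* Hypothetical stand-in for AjOPattern, using plain C strings. */
typedef struct {
    const char *name;
    const char *propat;     /* NULL when the pattern is a regex */
    const char *regexpat;   /* regex source text, not a compiled regex */
    int mismatch;
} Pattern;

/* Hypothetical stand-in for AjOPatlist: an array plus a read cursor. */
typedef struct {
    const Pattern *items;
    size_t count;
    size_t cursor;
    int type;               /* 0 regex, 1 prosite */
} Patlist;

/* Fetch the next pattern; returns 0 and sets *pattern to NULL at the end. */
static int patlist_get_next(Patlist *list, const Pattern **pattern)
{
    assert(list != NULL && pattern != NULL);
    if (list->cursor >= list->count) {
        *pattern = NULL;
        return 0;
    }
    *pattern = &list->items[list->cursor++];
    return 1;
}

/* Move the cursor back to the first pattern. */
static void patlist_rewind(Patlist *list)
{
    assert(list != NULL);
    list->cursor = 0;
}

/* Derive the type from whether propat is set, as suggested above. */
static int pattern_get_type(const Pattern *pattern)
{
    assert(pattern != NULL);
    return pattern->propat != NULL ? PAT_TYPE_PROSITE : PAT_TYPE_REGEX;
}

/* Example consumer: one full pass that sums the allowed mismatches. */
static int patlist_total_mismatch(Patlist *list)
{
    const Pattern *pat;
    int total = 0;

    patlist_rewind(list);
    while (patlist_get_next(list, &pat))
        total += pat->mismatch;
    return total;
}
```

Keeping the cursor inside the list keeps the calling loop short, at the cost 
that two loops over the same list cannot be interleaved.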

ACD would probably need a new file type, patlist. Whether the patterns are 
regex or prosite should be defined in the program's acd file. This would allow 
the reading (and compiling) of the patterns to happen in the acd command.

acdGetPatlist();

Comments? 
-- 
Henrikki Almusa


