[Biojava-dev] new seq searching classes

Schreiber, Mark mark.schreiber at agresearch.co.nz
Wed Sep 3 18:32:59 EDT 2003


Hey,

That sounds really cool.

-----Original Message-----
From: Matthew Pocock [mailto:matthew.pocock at ncl.ac.uk] 
Sent: Wednesday, 3 September 2003 4:10 a.m.
To: biojava-dev at biojava.org
Subject: [Biojava-dev] new seq searching classes


Hi,

I've added a couple of classes in org.biojava.bio.search for finding 
regions of sequence content. They are SeqContentPattern and 
SeqContentMatcher - the API is loosely based upon KMPSearch and the 1.4 
regex libs. These classes aren't javadocked yet.

SeqContentPattern encapsulates the rules about what regions to select - 
the length, and the minimum and maximum number of occurrences for each 
nucleotide.

SeqContentMatcher is a cursor produced by scp.matcher(SymbolList) and 
can be used to find the next match, get the matching sub-sequence and to 
discover the offset of that match.

E.g. to find regions of length 10 with at least 8 As, no G or T and at 
most 2 Cs, you could do something like:

SeqContentPattern scp = new SeqContentPattern(DNATools.getDNA());
scp.setLength(10);
scp.setMinCounts(DNATools.a(), 8);
scp.setMaxCounts(DNATools.g(), 0);
scp.setMaxCounts(DNATools.c(), 2);
scp.setMaxCounts(DNATools.t(), 0);

Then to search with this you'd do something like:

SeqContentMatcher scm = scp.matcher(symList);

while(scm.find()) {
  System.out.println("Hit at: " + scm.pos());
}
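
To see the idea without BioJava on the classpath, here is a minimal plain-Java sketch of the same kind of windowed content search. It is not the SeqContentPattern/SeqContentMatcher implementation: the class name ContentWindowSearch and the method findWindows are made up for illustration, the sequence is a plain String rather than a SymbolList, and the reported offsets are 0-based, which may differ from what scm.pos() returns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of org.biojava.bio.search.
final class ContentWindowSearch {

    // Returns the 0-based start offset of every window of the given length
    // whose per-symbol counts satisfy all minimum and maximum bounds.
    static List<Integer> findWindows(String seq, int length,
                                     Map<Character, Integer> minCounts,
                                     Map<Character, Integer> maxCounts) {
        List<Integer> hits = new ArrayList<>();
        if (length <= 0 || seq.length() < length) {
            return hits;
        }
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < seq.length(); i++) {
            // Slide the window: add the symbol entering on the right...
            counts.merge(seq.charAt(i), 1, Integer::sum);
            // ...and drop the symbol leaving on the left.
            if (i >= length) {
                counts.merge(seq.charAt(i - length), -1, Integer::sum);
            }
            // Only test once the window is full.
            if (i >= length - 1 && accepts(counts, minCounts, maxCounts)) {
                hits.add(i - length + 1);
            }
        }
        return hits;
    }

    // Checks the current window counts against both sets of bounds.
    private static boolean accepts(Map<Character, Integer> counts,
                                   Map<Character, Integer> minCounts,
                                   Map<Character, Integer> maxCounts) {
        for (Map.Entry<Character, Integer> e : minCounts.entrySet()) {
            if (counts.getOrDefault(e.getKey(), 0) < e.getValue()) {
                return false;
            }
        }
        for (Map.Entry<Character, Integer> e : maxCounts.entrySet()) {
            if (counts.getOrDefault(e.getKey(), 0) > e.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same rules as the example above: length 10, at least 8 As,
        // no G or T, at most 2 Cs.
        List<Integer> hits = findWindows("TTAAAACAAAAACGG", 10,
                Map.of('A', 8),
                Map.of('G', 0, 'T', 0, 'C', 2));
        for (int offset : hits) {
            System.out.println("Hit at: " + offset);
        }
    }
}
```

The counts are updated incrementally as the window slides, so each position costs a constant amount of work plus one check per configured bound. Overlapping windows are all reported; whether the real matcher does the same is not stated here.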

Anybody think this is useful?

Matthew

_______________________________________________
biojava-dev mailing list
biojava-dev at biojava.org http://biojava.org/mailman/listinfo/biojava-dev