[Biojava-dev] Near Matches

Matthew Pocock matthew_pocock at yahoo.co.uk
Fri Sep 12 05:07:17 EDT 2003


Hi John,

I commited code that does a slightly different job - accpets or rejects 
regions based upon sequence content. However, it will be easy to write 
an aproximate matcher that does at most n miss-matches. The algorithms 
that sudgest themselves to me are all O(mn) - not sure if we can do 
better without a math genius, or a good string algorithms book.

Matthew

Schreiber, Mark wrote:

>Hi -
>
>I think Matthew checked in something that does what you want a few days back. Have a look at the list archives, or cvs records for that last few weeks.
>
>- Mark
>
>
>-----Original Message-----
>From: Osborne, John [mailto:jko1 at cdc.gov] 
>Sent: Friday, 12 September 2003 6:20 a.m.
>To: biojava-dev at biojava.org
>Subject: [Biojava-dev] Near Matches
>
>
>Hi,
>
>I am looking for a way in Biojava to iterate quickly through a list of DNA N-mers for sequences that are almost an exact match, like 23 of 25 bases.  The mismatches can occur in ANY position in a sequence.  Other than iterating through a SymbolList and keeping track of the number of mismatches, is there a better (read faster) way to do this?  I was thinking maybe the SuffixTree class, but since sequence order is unimportant it doesn't see like the right tool for the job.
>
>Right now it is going to be a little bit ugly, since I am putting this into a O(n^2) function with a big n...
>
> -John
>_______________________________________________
>biojava-dev mailing list
>biojava-dev at biojava.org http://biojava.org/mailman/listinfo/biojava-dev
>=======================================================================
>Attention: The information contained in this message and/or attachments
>from AgResearch Limited is intended only for the persons or entities
>to which it is addressed and may contain confidential and/or privileged
>material. Any review, retransmission, dissemination or other use of, or
>taking of any action in reliance upon, this information by persons or
>entities other than the intended recipients is prohibited by AgResearch
>Limited. If you have received this message in error, please notify the
>sender immediately.
>=======================================================================
>
>_______________________________________________
>biojava-dev mailing list
>biojava-dev at biojava.org
>http://biojava.org/mailman/listinfo/biojava-dev
>
>  
>




More information about the biojava-dev mailing list