[Biojava-l] Biojava-l Digest, Vol 131, Issue 3

Peter S peters337 at yahoo.co.uk
Fri Jan 17 18:15:05 UTC 2014


Thanks Andreas, 

I am switching from Python/Perl, so my Java is not great, but with the implementation you mention, would I need to pass in each sequence and run the alignments one by one? SSEARCH is also 'slow' (SW), but it has a lot of optimization in place, so in the end it does not take that long to run. It is written in C, though.

Peter
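
Running a pairwise aligner "one by one" over a local FASTA file amounts to a loop over its records. The following is a minimal sketch of that loop's input side, not BioJava's own FASTA reader: the class name `FastaSketch` and its method are hypothetical, and it assumes the file's lines have already been read (for example with `java.nio.file.Files.readAllLines`). A very large database would more likely be streamed record by record.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects FASTA records into an id -> sequence map. */
final class FastaSketch {
    static Map<String, String> parse(Iterable<String> lines) {
        Map<String, String> records = new LinkedHashMap<>();
        String id = null;
        StringBuilder seq = new StringBuilder();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                // A header line closes the previous record, if any.
                if (id != null) {
                    records.put(id, seq.toString());
                }
                // Assumption: the identifier is the first token after '>'.
                id = line.substring(1).trim().split("\\s+")[0];
                seq.setLength(0);
            } else if (id != null) {
                seq.append(line.toUpperCase());
            }
        }
        if (id != null) {
            records.put(id, seq.toString());
        }
        return records;
    }
}
```

Each entry of the returned map can then be handed to whichever aligner is chosen, one target at a time.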



On Friday, 17 January 2014, 18:09, Andreas Prlic <andreas at sdsc.edu> wrote:
 
We do have a Smith-Waterman implementation in BioJava. However, the algorithm is based on dynamic programming, so its running time grows with the product of the two sequence lengths. That makes it "slow", but it gives you the optimal alignment...

http://biojava.org/wiki/BioJava:CookBook3:PSA#Local_alignment


Andreas
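
To illustrate why dynamic programming is "slow" but optimal, here is a minimal sketch of the Smith-Waterman recurrence that computes only the best local score. It is not the BioJava implementation linked above: the class name and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions chosen for illustration. Every cell of a query-by-target matrix is visited once, which is where the cost comes from.

```java
/** Hypothetical sketch: Smith-Waterman best local score, linear gap penalty. */
final class SmithWatermanSketch {
    // Illustrative scoring values (assumptions, not BioJava defaults).
    static final int MATCH = 2;
    static final int MISMATCH = -1;
    static final int GAP = -2;

    static int bestLocalScore(String query, String target) {
        // Only two rows of the matrix are kept, since each cell
        // depends on the previous row and the current row.
        int[] prev = new int[target.length() + 1];
        int[] curr = new int[target.length() + 1];
        int best = 0;
        for (int i = 1; i <= query.length(); i++) {
            curr[0] = 0;
            for (int j = 1; j <= target.length(); j++) {
                int sub = query.charAt(i - 1) == target.charAt(j - 1) ? MATCH : MISMATCH;
                int score = Math.max(0, prev[j - 1] + sub);   // diagonal, floored at 0
                score = Math.max(score, prev[j] + GAP);       // gap in target
                score = Math.max(score, curr[j - 1] + GAP);   // gap in query
                curr[j] = score;
                best = Math.max(best, score);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return best;
    }
}
```

Recovering the alignment itself (and therefore mismatch and gap positions) would require keeping the full matrix or traceback pointers, which this sketch omits.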







On Fri, Jan 17, 2014 at 9:50 AM, Peter S <peters337 at yahoo.co.uk> wrote:

>Thanks, I will give it a try.
>
>Does that mean there is no fast implementation of SW in Java that I can use?
>
>Best,
>Peter
>
>
>
>
>On Friday, 17 January 2014, 17:45, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
>
>Hi Peter,
>
>Give Levenshtein distance a try. You can use StringUtils from Apache Commons Lang; it has a getLevenshteinDistance method.
>
>best,
>
>Khalil
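
For readers who want to see what the suggested edit distance computes, here is a minimal standalone sketch. It does not call Apache Commons Lang; the class and method names are hypothetical, and the code is a plain two-row dynamic-programming Levenshtein distance so that it runs without any dependency. Note that edit distance counts insertions and deletions as well as substitutions, so a distance of 3 is not the same thing as 3 mismatches.

```java
/** Hypothetical sketch: Levenshtein (edit) distance with two rolling rows. */
final class EditDistanceSketch {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(deletion, insertion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```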
>
>
>
>On 17 Jan 2014, at 18:37, Peter S <peters337 at yahoo.co.uk> wrote:
>
>>Hi Khalil,
>>
>>
>>By short sequences I mean 12-18 nt long. I need to align them against the entire transcriptome and detect matches with up to 3 mismatches. This is why I need something that is quite fast but also sensitive.
>>
>>
>>Many thanks,
>>Peter
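
The requirement stated here (a 12-18 nt query, at most 3 mismatches) can be sketched as a brute-force sliding-window scan. This is only an illustration under stated assumptions: it handles substitutions only, with no gaps, it is not Smith-Waterman, and the class and method names are hypothetical. Each window is abandoned as soon as it exceeds the mismatch budget.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: ungapped search allowing a bounded number of mismatches. */
final class MismatchScanSketch {
    /** Start offsets in target where query matches with at most maxMismatches substitutions. */
    static List<Integer> findHits(String query, String target, int maxMismatches) {
        List<Integer> hits = new ArrayList<>();
        int m = query.length();
        for (int start = 0; start + m <= target.length(); start++) {
            int mismatches = 0;
            // Stop comparing this window once the budget is exceeded.
            for (int k = 0; k < m && mismatches <= maxMismatches; k++) {
                if (query.charAt(k) != target.charAt(start + k)) {
                    mismatches++;
                }
            }
            if (mismatches <= maxMismatches) {
                hits.add(start);
            }
        }
        return hits;
    }

    /** Zero-based positions within the query that differ from the target window at start. */
    static List<Integer> mismatchPositions(String query, String target, int start) {
        List<Integer> positions = new ArrayList<>();
        for (int k = 0; k < query.length(); k++) {
            if (query.charAt(k) != target.charAt(start + k)) {
                positions.add(k);
            }
        }
        return positions;
    }
}
```

Calling `findHits` with `maxMismatches` set to 3 on each transcript, then `mismatchPositions` on each hit, would give output that is easy to parse. Whether this is fast enough for a whole transcriptome is not something the sketch establishes.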
>>
>>
>>
>>On Friday, 17 January 2014, 17:26, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
>>
>>Hi,
>>
>>What do you mean by short sequences? Nucleotide (NT) or amino acid (AA)?
>>
>>Best
>>
>>Khalil
>>
>>On 17 Jan 2014, at 18:00, biojava-l-request at lists.open-bio.org wrote:
>>
>>>
>>> Today's Topics:
>>>
>>>   1. Database search with Smith and Waterman (Peter S)
>>>
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Fri, 17 Jan 2014 13:27:17 +0000 (GMT)
>>> From: Peter S <peters337 at yahoo.co.uk>
>>> Subject: [Biojava-l] Database search with Smith and Waterman
>>>
>>> To: "biojava-l at lists.open-bio.org" <biojava-l at lists.open-bio.org>
>>> Message-ID:
>>>     <1389965237.13315.YahooMailNeo at web172703.mail.ir2.yahoo.com>
>>> Content-Type: text/plain; charset=iso-8859-1
>>>
>>> Dear All,
>>>
>>> I'm looking for an implementation of the Smith-Waterman algorithm to use in a Java desktop application I want to develop.
>>>
>>> I did find some information on pairwise aligners, but what I would ideally want is something similar to the SSEARCH package, which can perform alignments against very large databases saved locally in FASTA format. Speed is quite important, and ideally I would need output that I can easily parse to identify mismatch/gap positions etc.
>>>
>>> Any suggestions for a Java implementation that would fit this description? I will be working on short sequences, so sensitivity is crucial.
>>>
>>> Thanks very much for your help,
>>> Peter
>>>
>>>
>>> ------------------------------
>>>
>>> _______________________________________________
>>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>>>
>>> End of Biojava-l Digest, Vol 131, Issue 3
>>>
>>> *****************************************
>>
>>
>>
>>
>>
>>-----
>>
>>Confidentiality Notice: This e-mail and any files transmitted with it are private and confidential and are solely for the use of the addressee. It may contain material which is legally privileged. If you are not the addressee or the person responsible for delivering to the addressee, please notify that you have received this e-mail in error and that any use of it is strictly prohibited. It would be helpful if you could notify the author by replying to it.
>
>_______________________________________________
>Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>http://lists.open-bio.org/mailman/listinfo/biojava-l
>

