[Biojava-l] Biojava-l Digest, Vol 125, Issue 2

Khalil El Mazouari khalil.elmazouari at gmail.com
Tue Jun 11 16:10:27 UTC 2013


Hi Andreas,

Thanks for the feedback.

Increasing gop and gep (the gap opening and gap extension penalties) was not sufficient. I had to use a modified version of the nuc-4_2 scoring matrix in which I set the mismatch value very low. This solved the problem.
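
For anyone wondering why the very low mismatch value does the job, here is a minimal, self-contained sketch in plain Java (not BioJava's aligner). It assumes a linear gap penalty and made-up scores; the class name StrictLocalAligner and the test sequences are hypothetical. When the mismatch and gap penalties are larger in magnitude than any score a run of matches can reach, the Smith-Waterman recurrence resets to zero at every mismatch, so the best local alignment is just the longest exact common stretch.

```java
/** Illustrative Smith-Waterman with a linear gap penalty; not the BioJava implementation. */
final class StrictLocalAligner {

    /** Best local alignment summary: score, aligned columns, identical columns. */
    static final class Result {
        final int score;
        final int columns;
        final int matches;

        Result(int score, int columns, int matches) {
            this.score = score;
            this.columns = columns;
            this.matches = matches;
        }

        double identityPercent() {
            return columns == 0 ? 0.0 : 100.0 * matches / columns;
        }
    }

    static Result align(String q, String t, int match, int mismatch, int gap) {
        int n = q.length();
        int m = t.length();
        int[][] h = new int[n + 1][m + 1];
        int best = 0, bi = 0, bj = 0;

        // Fill the score matrix; a cell never drops below zero (local alignment).
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int s = q.charAt(i - 1) == t.charAt(j - 1) ? match : mismatch;
                int v = Math.max(0, h[i - 1][j - 1] + s);
                v = Math.max(v, h[i - 1][j] + gap);
                v = Math.max(v, h[i][j - 1] + gap);
                h[i][j] = v;
                if (v > best) {
                    best = v;
                    bi = i;
                    bj = j;
                }
            }
        }

        // Trace back from the best cell until the score reaches zero.
        int i = bi, j = bj, columns = 0, matches = 0;
        while (i > 0 && j > 0 && h[i][j] > 0) {
            boolean same = q.charAt(i - 1) == t.charAt(j - 1);
            int s = same ? match : mismatch;
            if (h[i][j] == h[i - 1][j - 1] + s) {
                if (same) {
                    matches++;
                }
                i--;
                j--;
            } else if (h[i][j] == h[i - 1][j] + gap) {
                i--; // gap in the target
            } else {
                j--; // gap in the query
            }
            columns++;
        }
        return new Result(best, columns, matches);
    }

    public static void main(String[] args) {
        String query = "ACGTACGTAACCGGTT";  // differs from target at one position
        String target = "ACGTACGTTACCGGTT";
        Result lenient = align(query, target, 5, -4, -5);
        Result strict = align(query, target, 5, -1000, -1000);
        System.out.printf("lenient: %d columns, %.2f%% identity%n",
                lenient.columns, lenient.identityPercent());
        System.out.printf("strict:  %d columns, %.2f%% identity%n",
                strict.columns, strict.identityPercent());
    }
}
```

With the lenient scores the alignment runs across the single mismatch (16 columns, 93.75% identity); with the harsh scores only the exact 8-base stretch survives (100% identity).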

Best

khalil




-----

Confidentiality Notice: This e-mail and any files transmitted with it are private and confidential and are solely for the use of the addressee. It may contain material which is legally privileged. If you are not the addressee or the person responsible for delivering to the addressee, please notify that you have received this e-mail in error and that any use of it is strictly prohibited. It would be helpful if you could notify the author by replying to it.



On 11 Jun 2013, at 18:00, biojava-l-request at lists.open-bio.org wrote:

> 
> Today's Topics:
> 
>   1. Re: Local aln - contig assembly (Andreas Prlic)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Mon, 10 Jun 2013 17:47:41 -0700
> From: Andreas Prlic <andreas at sdsc.edu>
> Subject: Re: [Biojava-l] Local aln - contig assembly
> To: Khalil El Mazouari <khalil.elmazouari at gmail.com>
> Cc: "Biojava-l at lists.open-bio.org" <biojava-l at lists.open-bio.org>
> Message-ID:
> 	<CALthepyva-+rAoP=8yH=OmDeA1S7i9Ov4js5eAYk-BL4r1Xang at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> Hi Khalil,
> 
> Whether you can get 100% sequence identity depends on your sequences. You
> can try to enforce a stricter alignment by increasing the gap penalties
> significantly (try doubling or tripling gap opening and extension).
> 
> A
> 
> 
> On Sun, Jun 9, 2013 at 12:32 PM, Khalil El Mazouari <
> khalil.elmazouari at gmail.com> wrote:
> 
>> Hi,
>> 
>> I am trying to assemble overlapping sequences (direct & reverse) via local
>> alignment. I am only looking for local alignments with 100% identity.
>> 
>> Which parameters, matrix, etc. should I use in order to get a local
>> alignment with 100% identity?
>> 
>> Any other suggestion for assembling overlapping seq (in Java) is welcome.
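
On the question of other ways to assemble overlapping reads in Java: if only 100% identical overlaps are acceptable, one option is to skip alignment and look for an exact suffix/prefix overlap, trying the second read both as given and reverse-complemented. A rough sketch under those assumptions follows. OverlapMerger, minLen and the example reads are illustrative names, not BioJava API, and the sketch only checks the end of the first read against the start of the second.

```java
/** Illustrative exact-overlap merge of two reads; names and reads are hypothetical. */
final class OverlapMerger {

    /** Reverse complement of an A/C/G/T string; any other symbol becomes N. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (dna.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:  sb.append('N'); break;
            }
        }
        return sb.toString();
    }

    /** Longest suffix of a that equals a prefix of b, if at least minLen long; otherwise 0. */
    static int suffixPrefixOverlap(String a, String b, int minLen) {
        int max = Math.min(a.length(), b.length());
        for (int len = max; len >= minLen && len > 0; len--) {
            if (a.regionMatches(a.length() - len, b, 0, len)) {
                return len;
            }
        }
        return 0;
    }

    /** Merges b onto the end of a, trying b as given and reverse-complemented; null if no overlap. */
    static String merge(String a, String b, int minLen) {
        for (String candidate : new String[] { b, reverseComplement(b) }) {
            int len = suffixPrefixOverlap(a, candidate, minLen);
            if (len > 0) {
                return a + candidate.substring(len);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String forward = "ACGTTGCAAGG";
        String reverseRead = "TAAACCTTG"; // reverse complement of CAAGGTTTA
        System.out.println(merge(forward, reverseRead, 4));
    }
}
```

A minimum overlap length guards against merging on a chance match of a few bases; a sensible value depends on the read length and is left as a parameter here.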
>> 
>> Thanks
>> 
>> khalil
>> 
>> 
>> 
>>   // Default nuc-4_2 nucleotide substitution matrix
>>   SubstitutionMatrix<NucleotideCompound> matrix =
>>       SubstitutionMatrixHelper.getNuc4_2();
>>   // Gap penalties: opening 5, extension 1
>>   SimpleGapPenalty gapP = new SimpleGapPenalty();
>>   gapP.setOpenPenalty((short) 5);
>>   gapP.setExtensionPenalty((short) 1);
>>   // Local pairwise alignment of query against target
>>   SequencePair<DNASequence, NucleotideCompound> psa =
>>       Alignments.getPairwiseAlignment(query, target,
>>           PairwiseSequenceAlignerType.LOCAL, gapP, matrix);
>> 
>> 
>> 
>> 
>> ========
>> 
>> Local Alignment Identity: 97.84688995215312%
>> 
>> query     GGGGAAAACACGAAAGGCCCTTGGTGGAGGCGCTTGAGACGGTGACAAGGGTTCCCTGGC  68
>>          |||||| || |||  ||||||||||||||||||||||||||||||| |||||||||||||
>> target    GGGGAAGAC-CGATGGGCCCTTGGTGGAGGCGCTTGAGACGGTGACCAGGGTTCCCTGGC 417
>> 
>> query     CCCAGTAGTCAAAGGTCCGTGAGGAGCTCCACTTGTGTGCACAGTAATATGTGGCTGAGT 128
>>          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> target    CCCAGTAGTCAAAGGTCCGTGAGGAGCTCCACTTGTGTGCACAGTAATATGTGGCTGAGT 477
>> 
>> query     CCACAGGGTCCATGTTGGTCATTGTAAGGACCACCTGGTCTTTGGAGGTGTCCTTGGTGA 188
>>          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> target    CCACAGGGTCCATGTTGGTCATTGTAAGGACCACCTGGTCTTTGGAGGTGTCCTTGGTGA 537
>> 
>> query     TGGTGAGCCTGCTCTTCAGAGATGGGCTGTAGCGCTTATCATCATTCCAATAAATGAGTG 248
>>          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> target    TGGTGAGCCTGCTCTTCAGAGATGGGCTGTAGCGCTTATCATCATTCCAATAAATGAGTG 597
>> 
>> query     CAAGCCACTCCAGGGCCTTTCCTGGGGGCTGACGGATCCAGCCCACACCCACTCCACTAG 308
>>          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> target    CAAGCCACTCCAGGGCCTTTCCTGGGGGCTGACGGATCCAGCCCACACCCACTCCACTAG 657
>> 
>> query     TGCTGAGTGAGAACCCAGAGAAGGTGCAGGTCAGCGTGAGGGTCTGTGTGGGTTTCACCA 368
>>          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> target    TGCTGAGTGAGAACCCAGAGAAGGTGCAGGTCAGCGTGAGGGTCTGTGTGGGTTTCACCA 717
>> 
>> query     GCGTAGGACCAGACTCCTTCAAGGTGATCTGGGCCATGGCCGGCTGGGCCGCGAGTAA 426
>>          |||||||||||||||||||||||||| ||||||||| |||||||||| |||| |||||
>> target    GCGTAGGACCAGACTCCTTCAAGGTG-TCTGGGCCA-GGCCGGCTGG-CCGCAAGTAA 772
>> 
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>> 
> 
> 
> ------------------------------
> 
> 
> End of Biojava-l Digest, Vol 125, Issue 2
> *****************************************




