[Biojava-l] Local aln - contig assembly

Andreas Prlic andreas at sdsc.edu
Tue Jun 11 00:47:41 UTC 2013


Hi Khalil,

Whether you can get 100% sequence identity depends on your sequences. You
can try to enforce a stricter alignment by increasing the gap penalties
significantly (try doubling or tripling the gap opening and extension
penalties).
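
If only 100% identical stretches matter, a possible alternative to tuning
penalties is to search for the longest exact shared substring, which by
construction contains no gaps or mismatches. What follows is a minimal
plain-Java sketch under stated assumptions, not BioJava API: the class
ExactOverlap, its Hit type and both method names are hypothetical, sequences
are assumed to be upper-case A/C/G/T strings, and the O(n*m) dynamic
programming is only sensible for read-sized inputs.

```java
/** Hypothetical helper for illustration only; not part of BioJava. */
final class ExactOverlap {

    /** Longest exact match: 0-based start in each sequence plus its length. */
    static final class Hit {
        final int queryStart;
        final int targetStart;
        final int length;

        Hit(int queryStart, int targetStart, int length) {
            this.queryStart = queryStart;
            this.targetStart = targetStart;
            this.length = length;
        }
    }

    private ExactOverlap() { }

    /**
     * Longest common substring via dynamic programming.
     * curr[j] holds the length of the exact match ending at query[i-1] and
     * target[j-1]; any mismatch resets it to zero, so gaps never occur.
     */
    static Hit longestExactMatch(String query, String target) {
        int[] prev = new int[target.length() + 1];
        int[] curr = new int[target.length() + 1];
        int best = 0;
        int bestQuery = 0;
        int bestTarget = 0;
        for (int i = 1; i <= query.length(); i++) {
            for (int j = 1; j <= target.length(); j++) {
                if (query.charAt(i - 1) == target.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                    if (curr[j] > best) {
                        best = curr[j];
                        bestQuery = i - best;
                        bestTarget = j - best;
                    }
                } else {
                    curr[j] = 0;
                }
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return new Hit(bestQuery, bestTarget, best);
    }

    /** Reverse complement, needed before comparing a reverse read. */
    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            sb.append(complement(seq.charAt(i)));
        }
        return sb.toString();
    }

    private static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            default:  return 'N'; // assumption: anything else is unknown
        }
    }
}
```

For a reverse read, one would also run the search against
reverseComplement(read) and keep the longer hit. A hit that reaches the end
of one sequence and the start of the other is a candidate overlap for
merging; a minimum length cutoff would still be needed to rule out chance
matches.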

A


On Sun, Jun 9, 2013 at 12:32 PM, Khalil El Mazouari <
khalil.elmazouari at gmail.com> wrote:

> Hi,
>
> I am trying to assemble overlapping sequences (direct & reverse) via local
> alignment. I am only looking for local alignments with 100% identity.
>
> Which parameters, matrix, etc. should I use in order to get a local
> alignment with 100% identity?
>
> Any other suggestions for assembling overlapping sequences (in Java) are
> welcome.
>
> Thanks
>
> khalil
>
>
>
>    SubstitutionMatrix<NucleotideCompound> matrix =
>        SubstitutionMatrixHelper.getNuc4_2();
>    SimpleGapPenalty gapP = new SimpleGapPenalty();
>    gapP.setOpenPenalty((short) 5);
>    gapP.setExtensionPenalty((short) 1);
>    SequencePair<DNASequence, NucleotideCompound> psa =
>        Alignments.getPairwiseAlignment(query, target,
>            PairwiseSequenceAlignerType.LOCAL, gapP, matrix);
>
> ========
>
> Local Alignment Identity: 97.84688995215312%
>
> query     GGGGAAAACACGAAAGGCCCTTGGTGGAGGCGCTTGAGACGGTGACAAGGGTTCCCTGGC  68
>           |||||| || |||  ||||||||||||||||||||||||||||||| |||||||||||||
> target    GGGGAAGAC-CGATGGGCCCTTGGTGGAGGCGCTTGAGACGGTGACCAGGGTTCCCTGGC 417
>
> query     CCCAGTAGTCAAAGGTCCGTGAGGAGCTCCACTTGTGTGCACAGTAATATGTGGCTGAGT 128
>           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target    CCCAGTAGTCAAAGGTCCGTGAGGAGCTCCACTTGTGTGCACAGTAATATGTGGCTGAGT 477
>
> query     CCACAGGGTCCATGTTGGTCATTGTAAGGACCACCTGGTCTTTGGAGGTGTCCTTGGTGA 188
>           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target    CCACAGGGTCCATGTTGGTCATTGTAAGGACCACCTGGTCTTTGGAGGTGTCCTTGGTGA 537
>
> query     TGGTGAGCCTGCTCTTCAGAGATGGGCTGTAGCGCTTATCATCATTCCAATAAATGAGTG 248
>           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target    TGGTGAGCCTGCTCTTCAGAGATGGGCTGTAGCGCTTATCATCATTCCAATAAATGAGTG 597
>
> query     CAAGCCACTCCAGGGCCTTTCCTGGGGGCTGACGGATCCAGCCCACACCCACTCCACTAG 308
>           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target    CAAGCCACTCCAGGGCCTTTCCTGGGGGCTGACGGATCCAGCCCACACCCACTCCACTAG 657
>
> query     TGCTGAGTGAGAACCCAGAGAAGGTGCAGGTCAGCGTGAGGGTCTGTGTGGGTTTCACCA 368
>           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target    TGCTGAGTGAGAACCCAGAGAAGGTGCAGGTCAGCGTGAGGGTCTGTGTGGGTTTCACCA 717
>
> query     GCGTAGGACCAGACTCCTTCAAGGTGATCTGGGCCATGGCCGGCTGGGCCGCGAGTAA 426
>           |||||||||||||||||||||||||| ||||||||| |||||||||| |||| |||||
> target    GCGTAGGACCAGACTCCTTCAAGGTG-TCTGGGCCA-GGCCGGCTGG-CCGCAAGTAA 772
>