[Biojava-l] SimpleGapPenalty defaults

Andreas Prlic andreas at sdsc.edu
Wed Jan 19 21:35:57 UTC 2011


Even if I use the global alignment to align this sequence against
itself, it aligns 100% and I don't see the strange gap. What are the
two sequences you are aligning? Otherwise I can't reproduce the
behaviour that you describe.

Andreas



On Wed, Jan 19, 2011 at 1:04 PM, Khalil El Mazouari
<khalil.elmazouari at gmail.com> wrote:
> Thanks Andreas,
>
> These 2 sequences (s1 and s2) are exactly the same. Indeed, it works for 100% identical sequences.
>
> I used the same code as below, except that I used .GLOBAL. I am not interested in local alignment.
>
> Regards,
>
> Khalil
>
>
> On 19 Jan 2011, at 16:07, Andreas Prlic wrote:
>
>> Hi Khalil,
>>
>> can you send the code snippet that you are running? I just re-ran
>> the cookbook example and it works for me. Also this behaves fine:
>>
>> ProteinSequence s1 = new ProteinSequence("QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSSQVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR");
>> ProteinSequence s2 = new ProteinSequence("QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSSQVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR");
>>
>> SubstitutionMatrix<AminoAcidCompound> matrix = new SimpleSubstitutionMatrix<AminoAcidCompound>();
>> // Local pairwise alignment with the default gap penalties
>> SequencePair<ProteinSequence, AminoAcidCompound> pair = Alignments.getPairwiseAlignment(s1, s2,
>>         PairwiseSequenceAlignerType.LOCAL, new SimpleGapPenalty(), matrix);
>> // Print both accessions followed by the aligned pair
>> System.out.printf("%n%s vs %s%n%s", pair.getQuery().getAccession(),
>>         pair.getTarget().getAccession(), pair);
>>
>> System.out.println("Identicals:" + pair.getNumIdenticals());
>> System.out.println("Similars:" + pair.getNumSimilars());
>>
>> Andreas
>>
>>
>>
>> On Wed, Jan 19, 2011 at 2:39 AM, Khalil El Mazouari
>> <khalil.elmazouari at gmail.com> wrote:
>>> Hi all,
>>>
>>> While doing PSA or MSA with the default gop and gep values, I obtained the following alignment:
>>>
>>> QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>>> QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCA---------------------R
>>>
>>> The expected PSA should be at least:
>>> QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>>> QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCA-----R----------------
>>>
>>> This expected alignment was obtained with gop=1 and gep=100.
>>>
>>> I can't understand why the PSA algorithm with default values always adds many gaps at the end of the alignment to end up with S:R, while it is obvious that with fewer gaps we could obtain a better SequencePair with R:R.
>>>
>>> Finally, how can I get a score for a PSA that reflects the number of identical residues, similar residues, and gaps?
>>>
>>> Many thanks.
>>>
>>> Khalil
>>>
>>>
>>>
>>> _______________________________________________
>>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>
>
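
A possible reading of the gap placement asked about in the original post, offered as a sketch rather than a confirmed diagnosis: both alignments shown there appear to contain the same number of gap columns (21), so under an affine penalty the extension term cancels, and they differ only in one extra gap opening versus turning the final S:R column into an R:R column. The snippet below works through that arithmetic. It assumes BLOSUM62 scores (R:R = 5, S:R = -1), a default gop of 10, and that each gap run costs gop + n * gep; none of these is confirmed in the thread, and GapSplitTradeoff and splitGain are hypothetical names, not BioJava API.

```java
final class GapSplitTradeoff {

    /**
     * Score change when one gap run is split into two (one extra gap opening)
     * so that a mismatch column becomes a match column. The number of gap
     * columns is unchanged, so the extension penalty does not appear.
     */
    static int splitGain(int matchScore, int mismatchScore, int gop) {
        return (matchScore - mismatchScore) - gop;
    }

    public static void main(String[] args) {
        int rr = 5;   // assumed BLOSUM62 score for R:R
        int sr = -1;  // assumed BLOSUM62 score for S:R

        // 10 is an assumed default gop; 1 is the value reported in the post.
        for (int gop : new int[] {10, 1}) {
            int gain = splitGain(rr, sr, gop);
            String verdict = gain > 0
                    ? "split gap scores higher"
                    : "single gap run scores at least as high";
            System.out.println("gop=" + gop + " gain=" + gain + " -> " + verdict);
        }
    }
}
```

Under these assumptions, splitting the gap only pays off when gop is below 6, which would be consistent with the expected alignment appearing at gop=1 and not with a larger default.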

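On the question of a score that reflects identical residues and gaps, one rough approach is to summarise the two aligned rows directly. The following is a minimal sketch, not BioJava API: AlignmentStats, its fields, and the flat match/mismatch scoring are assumptions made for illustration. A real score would take per-pair values from the substitution matrix, and similar residues are not counted here. The penalty convention (each gap run costs gop + n * gep) is likewise assumed.

```java
final class AlignmentStats {
    final int identicals;   // columns where both residues are equal
    final int mismatches;   // columns with two different residues
    final int gapOpens;     // number of separate gap runs
    final int gapColumns;   // total columns containing a gap

    private AlignmentStats(int identicals, int mismatches, int gapOpens, int gapColumns) {
        this.identicals = identicals;
        this.mismatches = mismatches;
        this.gapOpens = gapOpens;
        this.gapColumns = gapColumns;
    }

    /** Summarises two aligned rows of equal length that use '-' for gaps. */
    static AlignmentStats of(String query, String target) {
        if (query.length() != target.length()) {
            throw new IllegalArgumentException("aligned rows must have equal length");
        }
        int id = 0, mis = 0, opens = 0, gaps = 0;
        char prevGapRow = ' '; // 'q' or 't' while inside a gap run, ' ' otherwise
        for (int i = 0; i < query.length(); i++) {
            char a = query.charAt(i);
            char b = target.charAt(i);
            if (a == '-' && b == '-') {
                throw new IllegalArgumentException("column " + i + " has a gap in both rows");
            }
            if (a == '-' || b == '-') {
                char row = (a == '-') ? 'q' : 't';
                if (row != prevGapRow) {
                    opens++; // a new run starts when the gapped row changes
                }
                gaps++;
                prevGapRow = row;
            } else {
                if (a == b) id++; else mis++;
                prevGapRow = ' ';
            }
        }
        return new AlignmentStats(id, mis, opens, gaps);
    }

    /** Flat scoring: matches and mismatches minus affine gap penalties. */
    int score(int match, int mismatch, int gop, int gep) {
        return identicals * match + mismatches * mismatch
                - (gapOpens * gop + gapColumns * gep);
    }

    public static void main(String[] args) {
        // Toy rows, not the sequences from the thread.
        AlignmentStats s = AlignmentStats.of("ACGTTA", "AC--TA");
        System.out.println("identicals=" + s.identicals + " gapOpens=" + s.gapOpens
                + " gapColumns=" + s.gapColumns + " score=" + s.score(1, -1, 10, 1));
    }
}
```

The aligned rows could be taken from the printed SequencePair; keeping gap openings and gap columns as separate counts makes it easy to see how a change in gop or gep shifts the total.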


