[Biojava-l] SimpleGapPenalty defaults
Khalil El Mazouari
khalil.elmazouari at gmail.com
Thu Jan 20 08:42:11 UTC 2011
Please try with the following two sequences:
>seq1
QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>seq2
QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR
Thanks,
Khalil
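
One possible reading of the two alignments quoted in the original post below: both pair the same residues up to ...YYCA and differ only in how the 21 gap positions are arranged. The reported alignment uses a single run of 21 gaps and ends with S:R. The expected alignment uses two runs of 5 and 16 gaps around an R:R pair. The Java sketch below compares the two under assumptions that are not confirmed anywhere in this thread: a gap of length n costs gop + n * gep, the substitution scores are the BLOSUM62 values R:R = 5 and S:R = -1, and the SimpleGapPenalty defaults are gop=10 and gep=1. The class and method names are hypothetical.

```java
// Sketch only: compares the two alignments from the original post under an
// affine gap model. Assumptions (not confirmed in this thread): a gap of
// length n costs gop + n * gep, and substitution scores come from BLOSUM62.
class GapCostComparison {

    /** Cost of one gap run of the given length under the assumed affine model. */
    static int gapCost(int gop, int gep, int length) {
        return gop + gep * length;
    }

    /**
     * Score of the two-gap alignment minus the score of the one-gap alignment.
     * A positive value means splitting the gap to pair R with R scores higher.
     * All other columns are identical in both alignments, so they cancel out.
     */
    static int splitAdvantage(int gop, int gep) {
        int scoreRR = 5;   // BLOSUM62 score for R aligned with R
        int scoreSR = -1;  // BLOSUM62 score for S aligned with R
        int oneGap = scoreSR - gapCost(gop, gep, 21);
        int twoGaps = scoreRR - gapCost(gop, gep, 5) - gapCost(gop, gep, 16);
        return twoGaps - oneGap;
    }

    public static void main(String[] args) {
        // Assumed defaults (gop=10, gep=1) versus the values used in the post.
        System.out.println("gop=10, gep=1:  " + splitAdvantage(10, 1));
        System.out.println("gop=1, gep=100: " + splitAdvantage(1, 100));
    }
}
```

Under these assumptions the difference reduces to 6 - gop, so the split alignment only scores higher when gop is below 6. That would be consistent with the single long gap at the defaults and the R:R pairing at gop=1.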
On 19 Jan 2011, at 22:35, Andreas Prlic wrote:
> Even if I use the global alignment to align this sequence against
> itself, it aligns 100% and I don't see the strange gap. Which two
> sequences are you aligning? Otherwise I can't reproduce the
> behaviour that you describe.
>
> Andreas
>
>
>
> On Wed, Jan 19, 2011 at 1:04 PM, Khalil El Mazouari
> <khalil.elmazouari at gmail.com> wrote:
>> Thanks Andreas,
>>
>> These two sequences (s1 and s2) are exactly the same. Indeed, it works for 100% identical sequences.
>>
>> I used the same code as below, except that I used PairwiseSequenceAlignerType.GLOBAL, since I am not interested in local alignment.
>>
>> Regards,
>>
>> Khalil
>>
>>
>> On 19 Jan 2011, at 16:07, Andreas Prlic wrote:
>>
>>> Hi Khalil,
>>>
>>> Can you send the code snippet that you are running? I just re-ran
>>> the cookbook example and it works for me. This also behaves fine:
>>>
>>> ProteinSequence s1 = new ProteinSequence("QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSSQVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR");
>>> ProteinSequence s2 = new ProteinSequence("QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSSQVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR");
>>>
>>> SubstitutionMatrix<AminoAcidCompound> matrix =
>>>     new SimpleSubstitutionMatrix<AminoAcidCompound>();
>>> SequencePair<ProteinSequence, AminoAcidCompound> pair =
>>>     Alignments.getPairwiseAlignment(s1, s2,
>>>         PairwiseSequenceAlignerType.LOCAL, new SimpleGapPenalty(), matrix);
>>> System.out.printf("%n%s vs %s%n%s", pair.getQuery().getAccession(),
>>>     pair.getTarget().getAccession(), pair);
>>>
>>> System.out.println("Identicals:" + pair.getNumIdenticals());
>>> System.out.println("Similars:" + pair.getNumSimilars());
>>>
>>> Andreas
>>>
>>>
>>>
>>> On Wed, Jan 19, 2011 at 2:39 AM, Khalil El Mazouari
>>> <khalil.elmazouari at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> While running a pairwise (PSA) or multiple (MSA) sequence alignment with the default gop and gep values, I obtained the following alignment:
>>>>
>>>> QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>>>> QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCA---------------------R
>>>>
>>>> The expected PSA should be at least:
>>>> QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>>>> QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCA-----R----------------
>>>>
>>>> This expected alignment was obtained with gop=1 and gep=100.
>>>>
>>>> I can't understand why the PSA algorithm with default values always adds many gaps at the end of the alignment to end up with S:R, when it is obvious that with fewer gaps we could obtain a better SequencePair with R:R.
>>>>
>>>> Finally, how can I get a score for a PSA that reflects the number of identical residues, similar residues, and gaps?
>>>>
>>>> Many thanks.
>>>>
>>>> Khalil
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>
>>
>>
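
On the last question in the original post (a summary of identical residues and gaps), the quoted code already prints getNumIdenticals() and getNumSimilars() from the SequencePair. As a library-independent illustration, the Java sketch below derives identity and gap counts directly from two gapped alignment rows. AlignmentStats and its methods are hypothetical names, not BioJava API, and similar residues are left out because counting them needs a substitution matrix.

```java
// Sketch only: summarises a pairwise alignment given as two gapped rows of
// equal length. AlignmentStats and its methods are hypothetical names.
class AlignmentStats {

    /** Columns where both rows hold the same non-gap residue. */
    static int identicals(String query, String target) {
        requireSameLength(query, target);
        int count = 0;
        for (int i = 0; i < query.length(); i++) {
            char q = query.charAt(i);
            if (q != '-' && q == target.charAt(i)) {
                count++;
            }
        }
        return count;
    }

    /** Columns where at least one row holds a gap character. */
    static int gapColumns(String query, String target) {
        requireSameLength(query, target);
        int count = 0;
        for (int i = 0; i < query.length(); i++) {
            if (query.charAt(i) == '-' || target.charAt(i) == '-') {
                count++;
            }
        }
        return count;
    }

    /** Number of separate gap runs (gap openings) in a single row. */
    static int gapRuns(String row) {
        int runs = 0;
        for (int i = 0; i < row.length(); i++) {
            boolean startsRun = row.charAt(i) == '-'
                    && (i == 0 || row.charAt(i - 1) != '-');
            if (startsRun) {
                runs++;
            }
        }
        return runs;
    }

    private static void requireSameLength(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("aligned rows must have equal length");
        }
    }

    public static void main(String[] args) {
        // Tail of the expected alignment from the original post.
        String query  = "YYCATYYFGRSFFDFWGQGTTLTVSS";
        String target = "YYCA-----R----------------";
        System.out.println("Identicals:  " + identicals(query, target));
        System.out.println("Gap columns: " + gapColumns(query, target));
        System.out.println("Gap runs:    " + gapRuns(target));
    }
}
```

A single combined score could then weight these counts, for example by subtracting a penalty per gap run and per gap column, but the weights are a design choice and not something this thread specifies.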