[Biojava-l] SimpleGapPenalty defaults

Khalil El Mazouari khalil.elmazouari at gmail.com
Thu Jan 20 08:42:11 UTC 2011


Please try with the following sequences:
>seq1
QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>seq2
QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR

thanks,

khalil
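
The question quoted further down in this thread asks how to summarise a pairwise alignment by its identities and gaps. As a rough illustration only, here is a minimal, library-independent Java sketch (Java 11+ for `String.repeat`) that counts identical columns, gap characters and gap runs from two aligned rows. The class `AlignmentStats` and its methods are hypothetical helpers, not part of BioJava, and the values gop=10, gep=1 are example numbers rather than confirmed library defaults. The cost convention used (each gap run costs gop plus gep per gap character) is also an assumption; aligners differ on this.

```java
final class AlignmentStats {

    /** Columns where both rows carry the same residue; gap columns never count. */
    static int countIdenticals(String query, String target) {
        requireSameLength(query, target);
        int n = 0;
        for (int i = 0; i < query.length(); i++) {
            char q = query.charAt(i);
            if (q != '-' && q == target.charAt(i)) {
                n++;
            }
        }
        return n;
    }

    /** Total number of '-' characters in one aligned row. */
    static int countGapChars(String row) {
        int n = 0;
        for (int i = 0; i < row.length(); i++) {
            if (row.charAt(i) == '-') {
                n++;
            }
        }
        return n;
    }

    /** Number of contiguous gap runs in one aligned row, i.e. gap openings. */
    static int countGapRuns(String row) {
        int runs = 0;
        boolean inGap = false;
        for (int i = 0; i < row.length(); i++) {
            boolean gap = row.charAt(i) == '-';
            if (gap && !inGap) {
                runs++;
            }
            inGap = gap;
        }
        return runs;
    }

    /** Assumed affine convention: every run pays gop once, every gap character pays gep. */
    static int affineGapCost(String row, int gop, int gep) {
        return countGapRuns(row) * gop + countGapChars(row) * gep;
    }

    private static void requireSameLength(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("aligned rows must have equal length");
        }
    }

    public static void main(String[] args) {
        // Only the C-terminal tails of the two alignments from this thread.
        String queryTail = "CATYYFGRSFFDFWGQGTTLTVSS";
        String reported = "CA" + "-".repeat(21) + "R";                 // ends with S:R
        String expected = "CA" + "-".repeat(5) + "R" + "-".repeat(16); // pairs R:R

        int gop = 10; // example value only
        int gep = 1;  // example value only
        for (String target : new String[] {reported, expected}) {
            System.out.printf("identicals=%d gapChars=%d gapRuns=%d gapCost=%d%n",
                    countIdenticals(queryTail, target),
                    countGapChars(target),
                    countGapRuns(target),
                    affineGapCost(target, gop, gep));
        }
    }
}
```

On these tails both placements contain 21 gap characters. The reported one has a single gap run and 2 identities, while the expected one has two gap runs and 3 identities. Under an affine penalty the expected placement therefore pays one extra gap opening in exchange for pairing R:R instead of S:R, so which of the two scores higher depends on how gop compares with the substitution-score gain.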

On 19 Jan 2011, at 22:35, Andreas Prlic wrote:

> Even if I use global alignment to align this sequence against
> itself, it aligns 100% and I don't see the strange gap. What are the
> two sequences you are aligning? Otherwise I can't reproduce the
> behaviour that you describe.
> 
> Andreas
> 
> 
> 
> On Wed, Jan 19, 2011 at 1:04 PM, Khalil El Mazouari
> <khalil.elmazouari at gmail.com> wrote:
>> Thanks Andreas,
>> 
>> These two sequences (s1 and s2) are exactly the same. Indeed, it works for 100% identical sequences.
>> 
>> I used the same code as below, except that I used .GLOBAL; I am not interested in local alignment.
>> 
>> Regards,
>> 
>> Khalil
>> 
>> 
>> On 19 Jan 2011, at 16:07, Andreas Prlic wrote:
>> 
>>> Hi Khalil,
>>> 
>>> can you send the code snippet that you are running? I just re-ran
>>> the cookbook example and it works for me. Also this behaves fine:
>>> 
>>> ProteinSequence s1 = new
>>> ProteinSequence("QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSSQVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR");
>>> ProteinSequence s2 = new
>>> ProteinSequence("QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSSQVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCAR");
>>> 
>>> SubstitutionMatrix<AminoAcidCompound> matrix =
>>>         new SimpleSubstitutionMatrix<AminoAcidCompound>();
>>> SequencePair<ProteinSequence, AminoAcidCompound> pair =
>>>         Alignments.getPairwiseAlignment(s1, s2,
>>>                 PairwiseSequenceAlignerType.LOCAL, new SimpleGapPenalty(), matrix);
>>> System.out.printf("%n%s vs %s%n%s", pair.getQuery().getAccession(),
>>>         pair.getTarget().getAccession(), pair);
>>> 
>>> System.out.println("Identicals:" + pair.getNumIdenticals());
>>> System.out.println("Similars:" + pair.getNumSimilars());
>>> 
>>> Andreas
>>> 
>>> 
>>> 
>>> On Wed, Jan 19, 2011 at 2:39 AM, Khalil El Mazouari
>>> <khalil.elmazouari at gmail.com> wrote:
>>>> Hi all,
>>>> 
>>>> While running a pairwise (PSA) or multiple (MSA) sequence alignment with the default gap open (gop) and gap extension (gep) penalties, I obtained the following alignment:
>>>> 
>>>> QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>>>> QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCA---------------------R
>>>> 
>>>> The expected PSA should be at least:
>>>> QVQLQQPGSELVKPGASVKLSCKASGYTFTNYLIHWVRQRPGRGLEWIGRIDPNSGGTKYSEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCATYYFGRSFFDFWGQGTTLTVSS
>>>> QVQLQQPGAELVKPGASVKLSCKASGYTFTSYWMHWVKQRPGRGLEWIGRIDPNSGGTKYNEKFKSKATLTVDKPSSTAYMQLSSLTSEDSAVYYCA-----R----------------
>>>> 
>>>> This expected alignment was obtained with gop=1 and gep=100.
>>>> 
>>>> I can't understand why the PSA algorithm with default values always adds many gaps at the end of the alignment, ending with S:R, when it is obvious that with fewer gaps we could obtain a better SequencePair with R:R.
>>>> 
>>>> Finally, how can I get a score for a PSA that reflects the number of identical residues, similar residues, and gaps?
>>>> 
>>>> Many thanks.
>>>> 
>>>> Khalil
>>>> 
>>>> 
>>>> 