[Bioperl-l] question about positioning peptide in a full protein sequence

Mon Feb 21 02:57:44 UTC 2011

If this is a direct string match (no ambiguity), just use Perl's built-in index function:

       index STR,SUBSTR,POSITION
       index STR,SUBSTR
               The index function searches for one string within another, but
               without the wildcard-like behavior of a full regular-expression
               pattern match.  It returns the position of the first occurrence
               of SUBSTR in STR at or after POSITION.  If POSITION is omitted,
               starts searching from the beginning of the string.  POSITION
               before the beginning of the string or after its end is treated
               as if it were the beginning or the end, respectively.  POSITION
               and the return value are based at 0 (or whatever you've set the
               $[ variable to--but don't do that).  If the substring is not
               found, "index" returns one less than the base, ordinarily "-1".
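
For the peptide case, a minimal sketch might look like the following. The
sequences are made up, and I'm assuming a single '*' placed right after the
modified residue (your marker convention may differ, so adjust accordingly).
It strips the marker, finds every exact occurrence of the peptide with index,
and converts the 0-based offset into a 1-based protein coordinate:

```perl
use strict;
use warnings;

# Hypothetical example data: '*' immediately follows the modified residue.
my $protein = 'MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ';
my $peptide = 'QIS*FVK';

# 0-based offset of the marked residue within the peptide.
my $marker_at = index($peptide, '*');
die "no usable marker in peptide\n" if $marker_at < 1;
my $site_offset = $marker_at - 1;

# Remove the marker to get the plain peptide sequence.
(my $plain = $peptide) =~ s/\*//g;

# Collect the site coordinate for every exact occurrence of the peptide.
my @sites;
my $pos = index($protein, $plain);
while ($pos >= 0) {
    push @sites, $pos + $site_offset + 1;    # convert to 1-based coordinate
    $pos = index($protein, $plain, $pos + 1);
}

print "modified residue at protein position(s): @sites\n";
```

With these example sequences the peptide matches once and the marked S is
residue 13 of the protein. If a peptide can occur more than once, @sites
will hold one coordinate per occurrence, and you will have to decide how to
treat those.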

Also see here:

http://perlmeme.org/howtos/perlfunc/index_function.html

chris

On Feb 20, 2011, at 4:28 PM, Mingwei Min wrote:

> Hi Dave,
> 
> Thank you for your suggestion. When I said "too complex for this simple
> job", I just thought that there might be some particular module that
> could do this straightforwardly. I'll give BLAST a try anyway.
> Thank you.
> 
> Mingwei
> 
> 2011/2/20 Dave Messina <David.Messina at sbc.su.se>:
>> Hi Mingwei,
>> Please remember to "reply all" so others on the mailing list can follow the
>> conversation.
>> Unless you have some other way of mapping the coordinates of the
>> sequence with the post-translational sites to the coordinates of the full
>> sequence, I think you'll have to do a similarity search of some form.
>> BLAST may not be best for this, given that it's sloppy with the ends of an
>> alignment, but there are plenty of options for BLAST that may improve your
>> results. Again, you'll need to be specific about your problem for us to
>> help. I don't know what "too complex for this simple job" means. Is it too slow?
>> Are you getting too many hits?
>> 
>> 
>> Dave
>> 
>> 
>> On Sun, Feb 20, 2011 at 22:35, Mingwei Min <mm809 at cam.ac.uk> wrote:
>>> 
>>> Hi Dave,
>>> 
>>> Sorry for not making it clear. Yes, I just want to get the coordinates
>>> of the post-translational sites out of a protein sequence. What I
>>> have now is the peptide sequence with a marker on the post-translationally
>>> modified residue. What should I do to map the peptides to the whole protein
>>> sequence and get the coordinates? The only way I could come up with is BLAST,
>>> but it seems to be too complex for this simple job.
>>> 
>>> Many thanks,
>>> 
>>> Mingwei
>>> 
>>> 2011/2/20 Dave Messina <David.Messina at sbc.su.se>:
>>>> Hi Mingwei,
>>>> I'm not sure what you mean by "positioning" here. Do you want to get the
>>>> coordinates of the post-translational sites out of a protein sequence
>>>> database record? Or do you want to draw the post-translational sites on
>>>> a picture of the protein sequence? Or something else entirely?
>>>> 
>>>> Dave
>>>> 
>>>> 
>>>> 
>>>> On Sat, Feb 19, 2011 at 15:53, Mingwei Min <mm809 at cam.ac.uk> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am trying to position some post-translational modification sites,
>>>>> which are marked in peptides, in a full-length protein sequence. Would
>>>>> anyone be kind enough to tell me which module I could use for this?
>>>>> 
>>>>> Many thanks
>>>>> 
>>>>> Mingwei
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>> 
>>>> 
>> 
>> 
> 
> 
> 
> -- 
> Mingwei Min  PhD student
> University of Cambridge
> Department of Genetics
> Downing St
> CB2 3EH
> UK
> 