[Biojava-dev] Doing pattern matching on Proteins

Andreas Prlic andreas at sdsc.edu
Wed Nov 24 01:41:05 UTC 2010


What about converting the sequence to a string and then using
standard regular expressions?
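
A minimal sketch of that idea, using only java.util.regex and assuming the
sequence is already available as a plain String (for example from
seqString(), which is used in the quoted message below). The class name
StringMotifSearch and the helper findMotif are made up for illustration and
are not part of BioJava:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of BioJava.
class StringMotifSearch {

    // Returns the 0-based start index of every non-overlapping match
    // of motifRegex in the sequence string.
    public static List<Integer> findMotif(String sequence, String motifRegex) {
        Pattern pattern = Pattern.compile(motifRegex);
        Matcher matcher = pattern.matcher(sequence);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Same toy sequence and motif as in the original question.
        String sequence = "CANLSTFA";
        for (int start : findMotif(sequence, "FA")) {
            System.out.println("FA found at 0-based index " + start);
        }
    }
}
```

Note that Matcher.find() reports non-overlapping matches only, so overlapping
occurrences of a motif would need a different loop.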

Andreas

On Tue, Nov 23, 2010 at 4:42 AM, Uday Kamath <kamathuday at gmail.com> wrote:
> Can anyone answer this query? I am stuck on this in my research work,
> so I would appreciate any response.
> Uday
>
> On Fri, Nov 19, 2010 at 1:48 PM, Uday Kamath <kamathuday at gmail.com> wrote:
>
>>
>>
>> Andreas
>> Thanks for your reply. I looked at it; here are the problems I faced:
>> 1. Matcher matcher = p.matcher(seq.seqString());
>> Pattern doesn't have a method to match on the sequence string, only on the
>> Sequence, at least in my version of BioJava.
>>
>> 2. Pattern p = Pattern.compile( MotifTools.createRegex(motif) );
>> doesn't work, so I need to create a pattern factory with an alphabet and
>> use it to compile the pattern. So I changed the MotifLister in the example to
>> have
>>       FiniteAlphabet alphabets = ProteinTools.getTAlphabet();
>>     Pattern p = PatternFactory.makeFactory(alphabets).compile(target);
>>
>> 3. When I use the same code with this modification, that is, calling
>> matcher() on sequences, it goes into recursion and throws an out-of-memory
>> exception.
>>
>> I have attached my minor modification and my input. I don't know what I am
>> doing wrong.
>> JVM args were
>> protein C:\Research\HIV-Protease\SampleProtein.fasta an 3
>>
>> Thanks for your reply; I would really appreciate it if you could give more
>> insight into the problem.
>> Uday Kamath
>>
>>
>> On Fri, Nov 19, 2010 at 12:49 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
>>
>>> Hi Uday,
>>>
>>> have you seen this cookbook page?
>>>
>>> http://www.biojava.org/wiki/BioJava:Cookbook:Sequence:Regex
>>>
>>>
>>> Andreas
>>>
>>> On Thu, Nov 18, 2010 at 7:49 PM, Uday Kamath <kamathuday at gmail.com>
>>> wrote:
>>> > Anyone? Any help? Is there a motif search example, or what am I doing
>>> > wrong below?
>>> > Thanks a ton!
>>> > Uday
>>> >
>>> > On Thu, Nov 18, 2010 at 9:54 AM, Uday Kamath <kamathuday at gmail.com> wrote:
>>> >
>>> >> Hello
>>> >> A simple question,
>>> >>
>>> >> To search for a motif in a protein I used the following code. Is my way
>>> >> of creating the pattern factory right? I ask because the matcher goes into
>>> >> infinite recursion. Can someone suggest the right usage? Thanks a ton.
>>> >>
>>> >> //sample
>>> >> FiniteAlphabet alphabet = ProteinTools.getAlphabet();
>>> >> factory = PatternFactory.makeFactory(alphabet);
>>> >> SymbolList proteinSequence = ProteinTools.createProtein("CANLSTFA");
>>> >> //in the sequence find the match
>>> >> SymbolList motif = ProteinTools.createProtein("FA");
>>> >> Pattern p = HivProteaseProblem.factory.compile(
>>> >> MotifTools.createRegex(motif));
>>> >> Matcher occurrences = p.matcher(proteinSequence);
>>> >>
>>> > _______________________________________________
>>> > biojava-dev mailing list
>>> > biojava-dev at lists.open-bio.org
>>> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>> >
>>>
>>>
>>>
>>> --
>>> -----------------------------------------------------------------------
>>> Dr. Andreas Prlic
>>> Senior Scientist, RCSB PDB Protein Data Bank
>>> University of California, San Diego
>>> (+1) 858.246.0526
>>> -----------------------------------------------------------------------
>>>
>>
>>
>>



