<div dir="ltr"><div>See also this discussion for proteins:<br></div><div><br></div><div><a href="https://github.com/biopython/biopython/issues/3636">https://github.com/biopython/biopython/issues/3636</a></div><div><br></div><div>Having the code handle RNA or DNA looks very straightforward in comparison (e.g. treating U and u the same as T and t in the C code).<br></div><div><br></div><div>Peter<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Dec 15, 2023 at 10:33 AM Váczy-Földi Máté <<a href="mailto:vaczy.foldi.mate@semmelweis.hu">vaczy.foldi.mate@semmelweis.hu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
<p>Dear Peter,</p>
<p><br>
</p>
<p>Thank you very much for the suggestion! I will go ahead
with this method.</p>
<p><br>
</p>
<p>Best wishes,</p>
<p>Máté<br>
</p>
<div>On 15 Dec 2023 at 10:16, Peter
Cock wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hello Máté,</div>
<div><br>
</div>
<div>I see you are referring to the
PositionSpecificScoringMatrix class</div>
<div>defined in Bio/motifs/matrix.py which does indeed appear to
be DNA</div>
<div>only. I can't comment on any drawbacks in generalizing that
code</div>
<div>(I can see how I would attempt this), but your first idea
is what I would</div>
<div>have suggested in the short term - map any U to T in your
motifs and</div>
<div>sequences to be searched (i.e. treat as DNA).<br>
</div>
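<div>A minimal sketch of the short-term workaround described above, using only the Python standard library. The motif instances and target sequence are made-up example data, and <code>rna_to_dna</code> is a hypothetical helper name, not part of Biopython.</div>

```python
# Translation table mapping RNA uracil to DNA thymine, preserving case.
RNA_TO_DNA = str.maketrans("Uu", "Tt")


def rna_to_dna(sequence):
    """Return the sequence with every U/u replaced by T/t."""
    return sequence.translate(RNA_TO_DNA)


# Hypothetical example data: motif instances and a target, all in RNA letters.
rna_instances = ["UGUAAAUA", "UGUAUAUA", "UGUACAUA"]
rna_target = "GGAUGUAAAUACCUGUAUAUAGG"

# Convert both the motif instances and the sequence to be searched.
dna_instances = [rna_to_dna(s) for s in rna_instances]
dna_target = rna_to_dna(rna_target)

print(dna_instances)
print(dna_target)
```

<div>Because the substitution is one letter for one letter, sequence lengths are unchanged, so any hit positions found in the converted sequence should apply directly to the original RNA sequence.</div>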
<div><br>
</div>
<div>Peter<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Dec 14, 2023 at
4:18 PM Váczy-Földi Máté <<a href="mailto:vaczy.foldi.mate@semmelweis.hu" target="_blank">vaczy.foldi.mate@semmelweis.hu</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Dear Mailing List Members,</p>
<p><br>
</p>
<p>I would like to ask a question related to the Bio.motifs
package.<br>
</p>
<p>I am currently working on a project where I need to find RNA
motifs in RNA sequences. After consideration we have
decided to search for the motif occurrences using PSSMs,
and I would like to implement this using Biopython. I
looked at the relevant code in the matrix.py file
and saw that the PSSM calculate method is hard-coded
to work only with DNA. There is also a note saying
"the sequence can only be a DNA sequence". <br>
</p>
<p>My questions are:</p>
<ol>
<li>Would it be safe to replace all Us with Ts in the
sequences/PSSMs and run the search that way? (I have
seen one example of someone doing this while searching.)<br>
</li>
<li>Or would it be possible for me to modify the code to
work with RNA by replacing the Ts with Us in the code
(or in a more sophisticated way providing an option for
both)?</li>
</ol>
<p>For the latter, I understand that I would have to modify the
_pwm.c code too. I am not experienced in C, but from what I
gathered by looking at that code, it should not be a big
problem.</p>
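<p>To illustrate the second option, here is a rough pure-Python sketch of a scorer in which U and u share the T column. It is not the actual _pwm.c logic: the matrix values, the names <code>PSSM</code>, <code>LETTER_TO_COLUMN</code> and <code>score_windows</code>, and the choice to return NaN for unknown letters are all assumptions made for the example.</p>

```python
import math

# Hypothetical log-odds PSSM: one dict per motif position, keyed by DNA letter.
# The values are made up for illustration only.
PSSM = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.5},
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
]

# Both alphabets and both cases map onto the same four columns.
LETTER_TO_COLUMN = {
    "A": "A", "a": "A",
    "C": "C", "c": "C",
    "G": "G", "g": "G",
    "T": "T", "t": "T",
    "U": "T", "u": "T",  # RNA uracil shares the thymine column
}


def score_windows(sequence, pssm=PSSM):
    """Score every window of len(pssm); windows with unknown letters get NaN."""
    width = len(pssm)
    scores = []
    for start in range(len(sequence) - width + 1):
        total = 0.0
        for offset in range(width):
            column = LETTER_TO_COLUMN.get(sequence[start + offset])
            if column is None:
                # Assumption of this sketch: unknown letters invalidate the window.
                total = math.nan
                break
            total += pssm[offset][column]
        scores.append(total)
    return scores


# RNA and DNA spellings of the same window receive the same score.
print(score_windows("UGA"), score_windows("TGA"))
```

<p>With a lookup like this, the matrix itself stays keyed by the four DNA letters, and only the letter-to-column step needs to know about RNA.</p>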
<p>I am just looking for some confirmation that I am not
overlooking some computational or biology-related reason
why the above-mentioned solutions are not possible.</p>
<p><br>
</p>
<p>Thank you in advance for your kind help!</p>
<p><br>
</p>
<p>Best wishes,</p>
<p>Máté Váczy-Földi<br>
</p>
</div>
_______________________________________________<br>
Biopython mailing list - <a href="mailto:Biopython@biopython.org" target="_blank">Biopython@biopython.org</a><br>
<a href="https://mailman.open-bio.org/mailman/listinfo/biopython" rel="noreferrer" target="_blank">https://mailman.open-bio.org/mailman/listinfo/biopython</a><br>
</blockquote>
</div>
</blockquote>
</div>
</blockquote></div>