<div dir="ltr"><div>Hello Máté,</div><div><br></div><div>I see you are referring to the PositionSpecificScoringMatrix class</div><div>defined in Bio/motifs/matrix.py, which does indeed appear to be DNA</div><div>only. I can't comment on any drawbacks in generalizing that code</div><div>(I can see how I would attempt this), but your first idea is what I would</div><div>have suggested in the short term - map any U to T in your motifs and</div><div>sequences to be searched (i.e. treat as DNA).<br></div><div><br></div><div>Peter<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 14, 2023 at 4:18 PM Váczy-Földi Máté &lt;<a href="mailto:vaczy.foldi.mate@semmelweis.hu">vaczy.foldi.mate@semmelweis.hu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Dear Mailing List Members,</p>
<p><br>
</p>
<p>I would like to ask a question related to the Bio.motifs package.<br>
</p>
<p>I am currently working on a project where I need to find RNA motifs
in RNA sequences. After consideration, we have decided to search
for the motif occurrences using PSSMs, and I would like to
implement this using Biopython. I looked at the relevant code in
the matrix.py file and saw that the PSSM calculate
method is hard-coded to work only with DNA. There is also a note
saying "the sequence can only be a DNA sequence". <br>
</p>
<p>My questions are:</p>
<ol>
<li>Would it be safe to replace all Us with Ts in the
sequences/PSSMs and run the search that way? (I have seen one
example of someone doing this while searching.)<br>
</li>
<li>Or would it be possible for me to modify the code to work with
RNA by replacing the Ts with Us in the code (or in a more
sophisticated way providing an option for both)?</li>
</ol>
<p>For the latter, I understand that I would have to modify the _pwm.c code
too. I am not experienced in C, but from what I gathered by looking at
that code, it should not be a big problem.</p>
<p>I am just looking for some confirmation that I am not overlooking
some computational or biology-related reason why the above-mentioned
solutions are not possible.</p>
<p><br>
</p>
<p>Thank you in advance for your kind help!</p>
<p><br>
</p>
<p>Best wishes,</p>
<p>Máté Váczy-Földi<br>
</p>
</div>
_______________________________________________<br>
Biopython mailing list - <a href="mailto:Biopython@biopython.org" target="_blank">Biopython@biopython.org</a><br>
<a href="https://mailman.open-bio.org/mailman/listinfo/biopython" rel="noreferrer" target="_blank">https://mailman.open-bio.org/mailman/listinfo/biopython</a><br>
</blockquote></div>
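<div><br></div>
<div>For reference, the short-term workaround discussed in this thread (map every U to T in both the motif instances and the sequences to be searched, then treat everything as DNA) could look roughly like the sketch below. It uses only the Python standard library. The helper name rna_to_dna and the example sequences are hypothetical and chosen purely for illustration; the converted strings would then be passed to the existing DNA-only PSSM code in Bio/motifs/matrix.py.</div>

```python
# Translation table mapping RNA uracil to DNA thymine, preserving case.
_RNA_TO_DNA = str.maketrans("Uu", "Tt")


def rna_to_dna(sequence):
    """Return the sequence with every U/u replaced by T/t (hypothetical helper)."""
    return sequence.translate(_RNA_TO_DNA)


# Hypothetical example data: motif instances and a target sequence, all RNA.
rna_instances = ["UGUAA", "UGUAU", "UGUAC"]
rna_target = "GGAUGUAAUCCUGUAUAG"

# Convert both the motif instances and the sequence to be searched.
dna_instances = [rna_to_dna(s) for s in rna_instances]
dna_target = rna_to_dna(rna_target)

print(dna_instances)
print(dna_target)
```

<div>Because the mapping is one-to-one and keeps the sequence length unchanged, any hit position reported on the converted sequence refers to the same position in the original RNA. For Seq objects, Biopython's own Seq.back_transcribe() method performs the same U-to-T conversion.</div>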