[Biojava-dev] OutOfMemory when using a big Weight Matrix to find Motifs in 1.3.1 but not in 1.3

mark.schreiber at group.novartis.com mark.schreiber at group.novartis.com
Wed Jan 28 01:51:44 EST 2004


Thanks Michael, that would be great.





Michael Heuer <heuermh at acm.org>
Sent by: Michael Heuer <heuermh at shell3.shore.net>
01/28/2004 02:47 PM

 
        To:     Mark Schreiber/GP/Novartis at PH
        cc:     biodev at ebiointel.com, <biojava-dev at biojava.org>
        Subject:        Re: [Biojava-dev] OutOfMemory when using a big Weight Matrix to find Motifs in 1.3.1 but not in 1.3


Hello Mark,

It looks like that change was already made on the main branch, in version
1.45 dated 22 Dec 2003.  Should I commit this to the release-1_3-branch?

   michael


On Wed, 28 Jan 2004 mark.schreiber at group.novartis.com wrote:

> Hi Again,
>
> I've found the problem.
>
> The code starting at line 153 in DP needs changing from
>
>      for (int c = 0; c < cols; c++) {
>        score += scoreType.calculateScore(matrix.getColumn(c),
>                                          symList.symbolAt(c + start));
>      }
>
> to
>
>      for (int c = 0; c < cols; c++) {
>        score += Math.log(scoreType.calculateScore(matrix.getColumn(c),
>                                                   symList.symbolAt(c + start)));
>      }
>
> so it will be consistent with the scoreWeightMatrix() method that doesn't
> use a ScoreType. Actually, changing it to a log will prevent underflow
> errors on large WeightMatrices. Interestingly, the WeightMatrixAnnotator
> converts it back to a normal probability with a Math.exp() operation
> before annotation. I'm sure it doesn't need to be this convoluted?
>
> Can someone add that fix to CVS? I'm having trouble with CVS just now, so
> I can't.
>
> Mark Schreiber
> Principal Scientist (Bioinformatics)
>
> Novartis Institute for Tropical Diseases (NITD)
> 1 Science Park Road
> #04-14 The Capricorn
> Singapore 117528
>
> phone +65 6722 2973
> fax  +65 6722 2910
>
>
>
>
>
> Bruno Aranda - e-BioIntel <elmosca at terra.es>
> Sent by: biojava-dev-bounces at portal.open-bio.org
> 01/27/2004 07:30 PM
> Please respond to biodev
>
>
>         To:     biojava-dev at biojava.org
>         cc:
>         Subject:        [Biojava-dev] OutOfMemory when using a big Weight Matrix to find Motifs in 1.3.1 but not in 1.3
>
>
> Hi Mark,
>
> I've tried to increase the memory heap to 512 MB but my little Linux
> almost died... However, I've found the origin of the problem. The class I
> tested followed the steps of your wonderful tutorial, and I used the low
> score threshold of "0.1". With the new ScoreType system I got too many
> results for my motif (every base in the sequence), so too many features
> were created and the OutOfMemoryError was raised.
> Now, for instance, I can put a threshold of 4000 (?) and I get some
> results (some of them with a probability higher than 5000 (?)), but I
> don't understand why probability scores are that high. Well, I will send
> a beer truck to your home if you can explain which probability is used
> for these score matrices ;-). Thanks,
>
> Bruno Aranda
> ebioIntel
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at biojava.org
> http://biojava.org/mailman/listinfo/biojava-dev
>
>
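
The underflow point in the quoted message can be illustrated with a minimal
sketch. This is not BioJava code: the class and method names (LogScoreSketch,
productScore, logScore) and the flat per-column probability of 0.25 are
assumptions made up for illustration. It only compares multiplying many small
probabilities against summing their logs.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in BioJava.
class LogScoreSketch {

    // Multiply per-column probabilities directly. With enough columns the
    // product drops below the smallest representable double and becomes 0.0.
    static double productScore(double[] columnProbs) {
        double score = 1.0;
        for (double p : columnProbs) {
            score *= p;
        }
        return score;
    }

    // Sum the logs of the per-column probabilities instead. The result is a
    // moderately sized negative number, so nothing underflows.
    static double logScore(double[] columnProbs) {
        double score = 0.0;
        for (double p : columnProbs) {
            score += Math.log(p);
        }
        return score;
    }

    public static void main(String[] args) {
        // Assumed input: 2000 columns, each contributing a probability of 0.25.
        double[] probs = new double[2000];
        Arrays.fill(probs, 0.25);

        System.out.println("product:     " + productScore(probs)); // prints 0.0
        System.out.println("sum of logs: " + logScore(probs));     // roughly -2772.6
    }
}
```

Note that applying Math.exp() to a log score this negative would underflow
again, so in this sketch any threshold comparison would have to stay in log
space.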





