[Biojava-l] issue in class Distribution

Schreiber, Mark mark.schreiber at agresearch.co.nz
Wed Dec 3 16:03:59 EST 2003


Hi -

Below is the method that is being called (copied from DistributionTools). I cannot see anything obvious, such as a memory leak, but there could be one. Can anybody else spot anything?

Maybe someone with a profiling utility could try it and let us know where the slowdown is occurring.
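
Short of a full profiler, a rough first check is to time the calls in batches and watch the heap alongside. The sketch below is only an illustration: BatchTimer, averageNanosPerBatch and usedHeapBytes are hypothetical names, it uses current Java syntax, and the workload in main is a stand-in that deliberately grows and scans a list. It is not BioJava code and says nothing about where the real slowdown is.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: average time per call, measured batch by batch. */
public class BatchTimer {

    /** Runs the task batches * batchSize times and returns the mean nanoseconds per call for each batch. */
    public static List<Double> averageNanosPerBatch(Runnable task, int batches, int batchSize) {
        List<Double> averages = new ArrayList<>();
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int i = 0; i < batchSize; i++) {
                task.run();
            }
            long elapsed = System.nanoTime() - start;
            averages.add((double) elapsed / batchSize);
        }
        return averages;
    }

    /** Approximate heap currently in use; only a coarse indicator, since GC timing affects it. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption for illustration): every call adds to a
        // long-lived list and then scans it, so the cost per call grows over time.
        List<Object> retained = new ArrayList<>();
        Runnable task = () -> {
            retained.add(new Object());
            retained.contains(new Object()); // linear scan that never finds a match
        };

        long heapBefore = usedHeapBytes();
        List<Double> averages = averageNanosPerBatch(task, 10, 250);
        for (int b = 0; b < averages.size(); b++) {
            System.out.println("batch " + b + ": " + averages.get(b) + " ns/call");
        }
        System.out.println("heap growth (bytes, approximate): " + (usedHeapBytes() - heapBefore));
    }
}
```

To apply it to the real case, replace the stand-in task with the body of the loop from the test program quoted below. Averaging over batches smooths out the coarse resolution of System.currentTimeMillis() in per-iteration timings.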

- Mark



  public static final Distribution[] distOverAlignment(Alignment a,
                                                 boolean countGaps,
                                                 double nullWeight)
  throws IllegalAlphabetException {

    List seqs = a.getLabels();

    FiniteAlphabet alpha = (FiniteAlphabet)((SymbolList)a.symbolListForLabel(seqs.get(0))).getAlphabet();
    for(int i = 1; i < seqs.size();i++){
        FiniteAlphabet test = (FiniteAlphabet)((SymbolList)a.symbolListForLabel(seqs.get(i))).getAlphabet();
        if(test != alpha){
          throw new IllegalAlphabetException("Cannot Calculate distOverAlignment() for alignments with " +
          "mixed alphabets");
        }
    }

    Distribution[] pos = new Distribution[a.length()];
    DistributionTrainerContext dtc = new SimpleDistributionTrainerContext();
    dtc.setNullModelWeight(nullWeight);
    try{
      for(int i = 0; i < a.length(); i++){// For each position
        pos[i] = DistributionFactory.DEFAULT.createDistribution(alpha);
        dtc.registerDistribution(pos[i]);

        for(Iterator j = seqs.iterator(); j.hasNext();){// of each sequence
          Object seqLabel = j.next();
          Symbol s = a.symbolAt(seqLabel,i + 1);

          /*If this is working over a flexible alignment there is a possibility
          that s could be null if this Sequence is not really present in this
          region of the Alignment. In this case it will be skipped*/
          if(s == null)
            continue;

          Symbol gap = alpha.getGapSymbol();
          if(!countGaps && s.equals(gap)){
            //do nothing, not counting gaps
          }else{
            dtc.addCount(pos[i],s,1.0);// count the symbol
          }
        }
      }

      dtc.train();
    }catch(Exception e){
      e.printStackTrace(System.err);
    }
    return pos;
  }
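
For anyone reading without the BioJava sources at hand, the per-column counting that the method performs can be sketched in plain Java. This is not the BioJava implementation: the class ColumnFrequencies, the use of '-' as the gap character, the fixed DNA alphabet, and above all the treatment of nullWeight as a pseudo-count spread evenly over the symbols are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plain-Java analogue of counting symbols per alignment column. */
public class ColumnFrequencies {
    private static final char GAP = '-';                      // assumed gap character
    private static final char[] ALPHABET = {'a', 'c', 'g', 't'};

    /** Rows must all have the same length; returns one symbol-to-frequency map per column. */
    public static List<Map<Character, Double>> frequencies(List<String> rows,
                                                           boolean countGaps,
                                                           double nullWeight) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("alignment has no rows");
        }
        int length = rows.get(0).length();
        List<Map<Character, Double>> columns = new ArrayList<>();
        for (int col = 0; col < length; col++) {              // for each position
            Map<Character, Double> counts = new LinkedHashMap<>();
            for (char c : ALPHABET) {
                counts.put(c, 0.0);
            }
            if (countGaps) {
                counts.put(GAP, 0.0);
            }
            for (String row : rows) {                         // of each sequence
                char c = row.charAt(col);
                if (c == GAP && !countGaps) {
                    continue;                                 // not counting gaps
                }
                counts.merge(c, 1.0, Double::sum);            // count the symbol
            }
            // Assumption: the null-model weight is shared evenly as a pseudo-count.
            double pseudo = nullWeight / counts.size();
            double total = 0.0;
            for (double v : counts.values()) {
                total += v + pseudo;
            }
            Map<Character, Double> freqs = new LinkedHashMap<>();
            for (Map.Entry<Character, Double> e : counts.entrySet()) {
                freqs.put(e.getKey(), total == 0.0 ? 0.0 : (e.getValue() + pseudo) / total);
            }
            columns.add(freqs);
        }
        return columns;
    }

    public static void main(String[] args) {
        // Same four sequences as in the test program quoted below.
        List<String> rows = Arrays.asList("aggag", "aggaa", "aggag", "aagag");
        List<Map<Character, Double>> columns = frequencies(rows, false, 0.01);
        for (int i = 0; i < columns.size(); i++) {
            System.out.println("column " + (i + 1) + ": " + columns.get(i));
        }
    }
}
```

Nothing in this sketch retains state between calls, so it cannot reproduce the slowdown. It is only meant to make clear what a single call is expected to compute.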

> -----Original Message-----
> From: Alberto Ambesi [mailto:ambesi at tigem.it] 
> Sent: Wednesday, 3 December 2003 11:01 p.m.
> To: biojava-l at biojava.org
> Subject: Re: [Biojava-l] issue in class Distribution
> 
> 
> I apologize, the line:
> Distribution[] dists = distOverAlignment2(align, false, 0.01);
> 
> should actually be:
> Distribution[] dists = DistributionTools.distOverAlignment(align, false, 0.01);
> 
> Thank you.
> 
> Alberto Ambesi
> 
> 
> On 3 Dec 2003, at 09:55, David Huen wrote:
> 
> > On Tuesday 02 Dec 2003 4:49 pm, Alberto Ambesi wrote:
> >> Hi, I found this bug when using Distributions iteratively many times.
> >> The problem is that when creating Distribution objects iteratively,
> >> the computation time for each iteration increases with time.
> >>
> >> This is a piece of code that demonstrates the issue:
> >>
> >> public class DistributionTest {
> >>      public static void main(String[] args) throws Exception{
> >>          long timePoint = System.currentTimeMillis();
> >>          for (int i=0; i<2500; i++) {
> >>              Map map = new HashMap();
> >>              map.put("seq0", DNATools.createDNA("aggag"));
> >>              map.put("seq1", DNATools.createDNA("aggaa"));
> >>              map.put("seq2", DNATools.createDNA("aggag"));
> >>              map.put("seq3", DNATools.createDNA("aagag"));
> >>              Alignment align = new SimpleAlignment(map);
> >>              Distribution[] dists = distOverAlignment2(align, false, 0.01);
> >>              long previousPoint = timePoint;
> >>              timePoint = System.currentTimeMillis();
> >>              System.out.println(timePoint - previousPoint);
> >>          }
> >>      }
> >> }
> >>
> >
> > Could you provide the source to distOverAlignment2 so it becomes clear
> > what it is doing?
> >
> > Thanks,
> > David Huen
> >
> 