[Biojava-l] Analysis output on sequences (calculating properties of SymbolLists)

Gerald Loeffler Gerald.Loeffler@vienna.at
Tue, 16 May 2000 00:09:10 +0200


interesting! allow me to interpret your idea and add something:

whenever we talk about "calculating something", the Strategy pattern
springs to mind, where a specific "algorithm" is represented as an
object, and the type of that object specifies the method through which
all these different algorithms are invoked - that's what my original
suggestion tried to do:

interface SymbolListPropertyCalculator {
  Object calculateSymbolListProperty(SymbolList sl);
}

I.e. I defined exactly one algorithm type that works on a SymbolList and
returns "anything". As always, when instantiating a concrete
implementation you can pass any algorithm-specific information (to the
constructor), and so the method arguments can be minimal and should just
mirror the essence of the algorithm.
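
A minimal sketch of what such a concrete implementation could look like.
Everything apart from the SymbolListPropertyCalculator interface itself
is an assumption made for illustration: the SymbolList here is a
one-method stand-in (not the real BioJava interface), and
SymbolFractionCalculator and StrategyDemo are hypothetical names.

```java
import java.util.Locale;

// Hypothetical stand-in for a SymbolList: it only exposes the sequence as text.
interface SymbolList {
    String seqString();
}

// The catch-all strategy interface from the text above.
interface SymbolListPropertyCalculator {
    Object calculateSymbolListProperty(SymbolList sl);
}

// Concrete strategy: fraction of positions whose symbol is in a configured set.
// The set is algorithm-specific, so it is passed to the constructor and the
// method signature stays minimal.
final class SymbolFractionCalculator implements SymbolListPropertyCalculator {
    private final String counted;

    SymbolFractionCalculator(String counted) {
        this.counted = counted.toUpperCase(Locale.ROOT);
    }

    @Override
    public Object calculateSymbolListProperty(SymbolList sl) {
        String seq = sl.seqString().toUpperCase(Locale.ROOT);
        if (seq.isEmpty()) {
            return Double.valueOf(0.0);
        }
        int hits = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (counted.indexOf(seq.charAt(i)) >= 0) {
                hits++;
            }
        }
        return Double.valueOf((double) hits / seq.length());
    }
}

class StrategyDemo {
    public static void main(String[] args) {
        SymbolList dna = () -> "ATGCGC";
        SymbolListPropertyCalculator gc = new SymbolFractionCalculator("GC");
        // The caller has to know that this particular calculator yields a Double.
        Double fraction = (Double) gc.calculateSymbolListProperty(dna);
        System.out.println("GC fraction: " + fraction);
    }
}
```

Note the cast in the demo: with a return type of Object, the caller must
already know what the concrete calculator produces.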

Matthew on the other hand favoured a specific algorithm type that
perfectly matched the examples of algorithms I needed, namely an
algorithm type that returns a double, i.e. a more concrete Strategy:

interface SymbolListDoublePropertyCalculator {
  double calculateSymbolListDoubleProperty(SymbolList sl);
}

What you are saying in part is that what we should really do is define a
set of algorithm-types, where we should categorise on the return type
and/or argument types of the algorithm, e.g.

interface SingleValueAnalysis {
  double calculateSingleValue(SymbolList sl);
}

interface ContinuousValueAnalysis {
  Scalar1DFunction calculateContinuousValue(SymbolList sl);
}

...

Thus instead of a "catch-all" algorithm type (as in my original
proposal) we would have a set of more concisely defined algorithm types
(as also in Matthew's proposal).
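
To make the two categories concrete, here is a rough sketch using GC
content once for the whole sequence and once over a sliding window. The
shape of Scalar1DFunction (a single valueAt method) is an assumption,
since only its name is given above; SymbolList is again a one-method
stand-in, and GcContent and WindowedGcContent are hypothetical names.

```java
// Hypothetical stand-in for a SymbolList: it only exposes the sequence as text.
interface SymbolList {
    String seqString();
}

interface SingleValueAnalysis {
    double calculateSingleValue(SymbolList sl);
}

// Assumed shape: one value per 0-based sequence position.
interface Scalar1DFunction {
    double valueAt(int position);
}

interface ContinuousValueAnalysis {
    Scalar1DFunction calculateContinuousValue(SymbolList sl);
}

// Whole-sequence GC content: a single double, no parameters needed.
final class GcContent implements SingleValueAnalysis {
    // GC fraction of seq in the half-open range [from, to).
    static double gcFraction(String seq, int from, int to) {
        int gc = 0;
        for (int i = from; i < to; i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (c == 'G' || c == 'C') {
                gc++;
            }
        }
        return to > from ? (double) gc / (to - from) : 0.0;
    }

    @Override
    public double calculateSingleValue(SymbolList sl) {
        String s = sl.seqString();
        return gcFraction(s, 0, s.length());
    }
}

// Sliding-window GC content: the window size is a constructor parameter.
final class WindowedGcContent implements ContinuousValueAnalysis {
    private final int window;

    WindowedGcContent(int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be >= 1");
        }
        this.window = window;
    }

    @Override
    public Scalar1DFunction calculateContinuousValue(SymbolList sl) {
        String s = sl.seqString();
        // The value at a position is the GC fraction of the window starting
        // there; windows are truncated at the end of the sequence.
        return position -> {
            if (position < 0 || position >= s.length()) {
                throw new IndexOutOfBoundsException("position " + position);
            }
            int end = Math.min(s.length(), position + window);
            return GcContent.gcFraction(s, position, end);
        };
    }
}
```

With these types a caller never has to cast: the interface it holds
already says whether it will get one number or one number per position.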

Additionally, you are saying that details on how exactly the algorithm
was invoked (which parameters were used) should be returned as well. I
don't see a need for this, because when constructing the Strategy object
you must know which parameters to pass to its constructor. Anyway, the
only possible type for such an informative object would be Object, so
maybe we could derive the analysis interfaces from a common
super-interface that defines a method for getting this
parameters-object:

interface Analysis {
  Object getAnalysisParameters();
}

But anybody interested in the parameters of a specific algorithm would
need to know the type of object returned here...
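
The following sketch illustrates that drawback. Only the Analysis
interface is taken from the text above; SymbolSetParameters,
SymbolFraction and ParametersDemo are hypothetical names, and SymbolList
is a one-method stand-in rather than the real BioJava interface.

```java
// Hypothetical stand-in for a SymbolList: it only exposes the sequence as text.
interface SymbolList {
    String seqString();
}

// Common super-interface as proposed above.
interface Analysis {
    Object getAnalysisParameters();
}

interface SingleValueAnalysis extends Analysis {
    double calculateSingleValue(SymbolList sl);
}

// Hypothetical parameter holder for one specific algorithm.
final class SymbolSetParameters {
    final String symbols;

    SymbolSetParameters(String symbols) {
        this.symbols = symbols;
    }
}

// Fraction of positions whose symbol is in the configured set.
final class SymbolFraction implements SingleValueAnalysis {
    private final SymbolSetParameters params;

    SymbolFraction(String symbols) {
        this.params = new SymbolSetParameters(symbols);
    }

    @Override
    public Object getAnalysisParameters() {
        return params;
    }

    @Override
    public double calculateSingleValue(SymbolList sl) {
        String seq = sl.seqString();
        int hits = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (params.symbols.indexOf(seq.charAt(i)) >= 0) {
                hits++;
            }
        }
        return seq.isEmpty() ? 0.0 : (double) hits / seq.length();
    }
}

class ParametersDemo {
    // Generic client code only sees Object and must test for each known type.
    static String describe(Analysis analysis) {
        Object p = analysis.getAnalysisParameters();
        if (p instanceof SymbolSetParameters) {
            return "symbols=" + ((SymbolSetParameters) p).symbols;
        }
        return "unknown parameters: " + p;
    }

    public static void main(String[] args) {
        System.out.println(describe(new SymbolFraction("GC")));
    }
}
```

Generic code can carry the parameters object around, but doing anything
useful with it requires an instanceof test or a cast per algorithm.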

	cheers,
	gerald

David Martin wrote:
> 
> Some while ago I started a project that is now on the back burner that was
> designed to take generic analysis output and map it onto sequences.
> 
> There are a number of different aspects of Gerald's requests to consider:
> 
> A single value for a calculation is fine (eg gribskov stat, gc content, aa
> content etc.). That can be represented quite easily by a generic 'property
> value' object interface.
> 
> When you have other properties that relate to a sequence, such as AA
> composition calculated in a sliding window over the sequence then you run
> into problems. It is not a property of the whole sequence but a property
> of a subsequence, often much larger than a single position in the
> sequence.
> 
> One would probably want a heavier weight object than just a single
> analysis. GC content for the whole sequence is a double and there isn't
> much else one can add.
> GC content over a sliding window has a minimum of two parameters, one of
> which varies over the sequence length.
> 
> If there was to be a generic interface for an analysis it should
> probably return
> some generic analysis object and then we start to head towards something
> that looks like the analysis section of the OMG CORBA spec for
> Biomolecular Sequence Analysis.
> 
> I would want an analysis to carry with it suitable information on the
> program, parameters and so on used to create the result. These can easily
> be bundled into a fairly distinct set of analysis types (about 4 or 5)
> that can be treated generically with the program parameters as a
> Collection.
> 
> So we have a generic
> SequenceAnalysis interface (probably really a result factory)
> 
> From which we derive a variety of subtypes depending on the input sequence
> type and return type
> 
> SingleValueAnalysis
> takes a sequence and returns an analysis result with two components:
> A parameter object of some sort and a value object of some sort.
> 
> ContinuousValueAnalysis
> returns a result object that can give a value for every point in the
> sequence, as well as holding its parameters.
> 
> and so on.
> Probably a bit heavier weight than Gerald had in mind.
> 
> Sorry to be so vague but it is late here, and I am adding a note from home
> before I forget.
> 
> ..d
> 
> ---------------------------------------------------------------------
> *  Dr. David Martin                  Biotechnology Centre of Oslo   *
> *  Node Manager                      Gaustadalleen 21               *
> *  The Norwegian EMBNet Node         P.O. box 1125 Blindern         *
> *  tel +47 22 95 87 56               N-0317 Oslo                    *
> *  fax +47 22 69 41 30               Norway                         *
> ---------------------------------------------------------------------
> 

-- 
   Gerald.Loeffler@vienna.at _________________ Software Architect
   http://www.imp.univie.ac.at ____ http://www.daemonstration.com
   OOA&D, Java, J2EE, JSP, Servlets, JavaBeans, ODBMS, RDBMS, XML