[Biojava-l] calculating properties of SymbolLists

Gerald Loeffler Gerald.Loeffler@vienna.at
Mon, 15 May 2000 15:29:56 +0200


Matthew Pocock wrote:
> 
> Dear Gerald,
> 
> Gerald Loeffler wrote:
> 
> > Hi!
> >
> > I need to calculate some properties (isoelectric point, molecular
> > weight, ...) of a protein sequence (which is of course a SymbolList) and
> > thought about introducing a general interface for this purpose:
> >
> > public interface SymbolListPropertyCalculator {
> >   /**
> >    * Calculate a property of the given symbol list and return it as an Object.
> >    * @param sl the symbol list whose property should be calculated (may not be
> >    *           null, otherwise an IllegalArgumentException is thrown)
> >    * @return the calculated property. Never returns null.
> >    * @exception BioException if calculating the property was not possible due
> >    *                         to misuse of the method, e.g. because the
> >    *                         implementation requires that the symbol list be
> >    *                         over a specific Alphabet but sl does not fulfill
> >    *                         this precondition.
> >    */
> >   Object calculateProperty(SymbolList sl) throws BioException;
> > }
> >
> 
> Under what circumstances would you return something that needed to be an
> Object? The properties that you have outlined are both double values. If
> this interface is to be useful for things like GUIs or configuration
> scripts, then maybe:
> 
> /**
>  * @throws IllegalAlphabetException if sl is over the wrong alphabet for this
>  *         metric
>  * @throws BioException if for any reason the metric could not be calculated
>  */
> double calculateProperty(SymbolList sl)
>     throws IllegalAlphabetException, BioException;
> 
> would be more appropriate.

I had in mind that there are probably many properties of a sequence that
are of a more complex nature and could hence not be represented as a
simple floating point value, e.g. amino acid distribution (symbol
distribution), hydrophobicity plot (as a function of symbol position),
secondary structure class of the whole protein (with only 3 or 4 allowed
values), etc. The question is whether it makes sense to provide one
interface to deal with all of these different kinds of properties.
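
To make the question concrete, here is a minimal sketch of how one
interface might cover both a scalar property and a structured one. It
rests on several assumptions: it uses Java generics, a plain String of
one-letter codes stands in for SymbolList, and IllegalArgumentException
stands in for the BioJava exceptions. The names PropertyCalculator,
ChargedFractionCalculator and CompositionCalculator are hypothetical and
not part of BioJava.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not BioJava API: a String of one-letter codes
// stands in for SymbolList.
interface PropertyCalculator<T> {
    // Returns the computed property; rejects invalid input with
    // IllegalArgumentException.
    T calculateProperty(String sequence);
}

// Scalar property: fraction of residues that are D, E, K or R.
class ChargedFractionCalculator implements PropertyCalculator<Double> {
    @Override
    public Double calculateProperty(String sequence) {
        if (sequence == null || sequence.isEmpty()) {
            throw new IllegalArgumentException("sequence must be non-empty");
        }
        int charged = 0;
        for (char c : sequence.toCharArray()) {
            if ("DEKR".indexOf(c) >= 0) {
                charged++;
            }
        }
        return (double) charged / sequence.length();
    }
}

// Structured property: how often each symbol occurs (a symbol distribution).
class CompositionCalculator implements PropertyCalculator<Map<Character, Integer>> {
    @Override
    public Map<Character, Integer> calculateProperty(String sequence) {
        if (sequence == null) {
            throw new IllegalArgumentException("sequence must not be null");
        }
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : sequence.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }
}
```

With a type parameter the caller gets a Double or a Map without casting,
which is one possible answer to the Object-versus-double question, though
a GUI that wants to treat all calculators uniformly would still have to
deal with the differing result types.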

> 
> >
> > Then one could add implementations like
> >
> >   MolecularWeightCalculator (returns the MW in kD as a Double)
> >   IsoelectricPointCalculator (requires the symbol list to be over the
> > protein alphabet, returns the pI as a Double)
> >
> > etc.
> >
> > 1) is this needed or is there another way of doing it in BioJava?
> 
> There is definitely going to be a call for this functionality from the protein
> people.
> 
> >
> > 2) does this make sense to you?
> > 3) in which package should this stuff go?
> 
> I would make the interface public in org.biojava.bio.sequence. These two
> implementations could become public classes in sequence, but I would prefer
> instances of them to be public static final properties of ProteinTools. I may
> be off the wall here :-)

So we would have both: public classes and, since the classes are
essentially singletons, a static final member of ProteinTools for each
of the classes?
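
A minimal sketch of what that combination could look like, under stated
assumptions: a String stands in for SymbolList, all names
(DoublePropertyCalculator, RoughMolecularWeightCalculator,
ProteinToolsSketch) are hypothetical, and the weight is only the
rule-of-thumb estimate of about 110 Da per residue rather than a real
molecular weight calculation.

```java
// Hypothetical sketch, not BioJava API: a String stands in for SymbolList.
interface DoublePropertyCalculator {
    double calculateProperty(String sequence);
}

// Stateless calculator class; the private constructor plus INSTANCE
// makes it a singleton.
final class RoughMolecularWeightCalculator implements DoublePropertyCalculator {
    static final RoughMolecularWeightCalculator INSTANCE =
            new RoughMolecularWeightCalculator();

    // Assumption: crude average of roughly 110 Da (0.110 kDa) per residue.
    private static final double AVERAGE_RESIDUE_KDA = 0.110;

    private RoughMolecularWeightCalculator() {
    }

    @Override
    public double calculateProperty(String sequence) {
        if (sequence == null) {
            throw new IllegalArgumentException("sequence must not be null");
        }
        return sequence.length() * AVERAGE_RESIDUE_KDA;
    }
}

// Stand-in for ProteinTools: exposes the shared instance as a static
// final member so callers need not know the concrete class.
final class ProteinToolsSketch {
    static final DoublePropertyCalculator MOLECULAR_WEIGHT =
            RoughMolecularWeightCalculator.INSTANCE;

    private ProteinToolsSketch() {
    }
}
```

A caller would then write
ProteinToolsSketch.MOLECULAR_WEIGHT.calculateProperty(seq) and get the
same shared instance everywhere, while the class itself stays available
for code that wants to refer to the concrete type.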

	cheers,
	gerald

> 
> Matthew
> 
> >
> >
> >         cheers,
> >         gerald
> >
> > --
> >    Gerald.Loeffler@vienna.at _________________ Software Architect
> >    http://www.imp.univie.ac.at ____ http://www.daemonstration.com
> >    OOA&D, Java, J2EE, JSP, Servlets, JavaBeans, ODBMS, RDBMS, XML
> >
> 
> --
> Joon: You're out of your tree
> Sam:  It wasn't my tree
>                                                  (Benny & Joon)

-- 
   Gerald.Loeffler@vienna.at _________________ Software Architect
   http://www.imp.univie.ac.at ____ http://www.daemonstration.com
   OOA&D, Java, J2EE, JSP, Servlets, JavaBeans, ODBMS, RDBMS, XML