[Biojava-l] SVM

Matthew Pocock mrp@sanger.ac.uk
Mon, 14 Feb 2000 13:05:38 +0000


Dear all,

I have made some changes to the SVM training and classification API
which may break some other code under org.biojava.stats.svm - please be
patient with me. On the upside, it is now much more intuitive to use.

I have moved the stringy methods of sequence down to residue list - this
helps for the CORBA interfaces, and is probably where these methods
should have been all along.

Feature creation has been changed to be friendlier. The
FeatureFactory interface is used by SimpleFeature so that you can plug
in your own favorite type of feature (genes, exons & so on), but it is
not part of the basic sequence/feature model.

At the moment features have no strand information. This is good from the
point of view of re-use (e.g. strand makes no sense in Protein) but is bad
for day-to-day DNA stuff. Strandedness could go into the Location
objects, or a sub-interface of Feature, or somewhere else. The sequence
a feature contains depends on both direction & alphabet: we shouldn't
reverse-complement proteins even though we may wish to indicate
directionality for things like beta-strands in beta-sheets. What do
people think?
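
To make the sub-interface option a little more concrete, here is a rough
sketch of how it might look. Every name in it (StrandSketch, BasicFeature,
DirectedFeature, Alphabet, containedSequence) is hypothetical and is not
part of the current org.biojava code; sequences are plain Strings only to
keep the example self-contained. The idea is that direction lives on a
sub-interface, and reverse-complementing happens only when the feature is
directed, points backwards, and the alphabet is DNA.

```java
// Hypothetical sketch only: none of these types exist in BioJava.
class StrandSketch {

    /** Stand-in for the alphabet of the parent sequence. */
    enum Alphabet { DNA, PROTEIN }

    /** Minimal stand-in for a feature: a 1-based, inclusive span. */
    interface BasicFeature {
        int getMin();
        int getMax();
    }

    /** Sub-interface that adds direction: +1 is forward, -1 is reverse. */
    interface DirectedFeature extends BasicFeature {
        int getDirection();
    }

    /** Builds a feature with no direction information. */
    static BasicFeature plain(final int min, final int max) {
        return new BasicFeature() {
            public int getMin() { return min; }
            public int getMax() { return max; }
        };
    }

    /** Builds a feature that also carries a direction. */
    static DirectedFeature directed(final int min, final int max, final int direction) {
        return new DirectedFeature() {
            public int getMin() { return min; }
            public int getMax() { return max; }
            public int getDirection() { return direction; }
        };
    }

    /** Complements one upper-case DNA symbol; anything else is returned unchanged. */
    static char complement(char c) {
        switch (c) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'G': return 'C';
            case 'C': return 'G';
            default:  return c;
        }
    }

    /** Reverses the string and complements each symbol. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            sb.append(complement(dna.charAt(i)));
        }
        return sb.toString();
    }

    /**
     * Returns the symbols covered by the feature. Only a reverse-direction
     * feature on a DNA sequence is reverse-complemented; a directed protein
     * feature (say, a beta-strand) keeps its direction as annotation only.
     */
    static String containedSequence(String parent, Alphabet alphabet, BasicFeature f) {
        String span = parent.substring(f.getMin() - 1, f.getMax());
        boolean reversed = f instanceof DirectedFeature
                && ((DirectedFeature) f).getDirection() < 0;
        if (reversed && alphabet == Alphabet.DNA) {
            return reverseComplement(span);
        }
        return span;
    }
}
```

With this layout, protein code can ignore DirectedFeature entirely, while
DNA code can test for it where it matters. Putting the direction into
Location instead would avoid the instanceof check, at the cost of every
Location carrying a field that is meaningless for many uses.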

All ideas welcome - from the stupid to the sublime (but hopefully some
we can implement).

Matthew