[Biojava-l] Re: the current discussion

Mike Marsh mm692227@bcm.tmc.edu
Sun, 23 Jan 2000 23:17:46 -0600 (CST)


My code:
I have just started writing.  I currently have working classes for
ProteinSequence, DNASequence, DNASequenceList, and others.  My objects are
pretty smart.  For example DNASequence has a method transcribe() that
returns an RNASequence.  DNASequenceList has a great static method that
takes a FASTA file of the genome as an argument and returns a
DNASequenceList encapsulating all DNASequences (genes) for that genome. 
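
A rough sketch of the two methods described above, for anyone who wants
something concrete to argue with.  This is not the actual code: the field
names, the getBases() accessor and the fromFastaLines() helper are made up
for illustration.  It assumes the stored strand is the coding strand (so
transcription is just T -> U), and it takes the lines of a FASTA file
rather than the file itself so that it runs without any I/O.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; names and signatures are assumptions.
class RNASequence {
    private final String bases;
    RNASequence(String bases) { this.bases = bases; }
    String getBases() { return bases; }
}

class DNASequence {
    private final String name;
    private final String bases;

    DNASequence(String name, String bases) {
        this.name = name;
        this.bases = bases.toUpperCase();
    }

    String getName() { return name; }
    String getBases() { return bases; }

    // Assumes the stored strand is the coding strand: replace T with U.
    RNASequence transcribe() {
        return new RNASequence(bases.replace('T', 'U'));
    }
}

class DNASequenceList {
    private final List<DNASequence> sequences = new ArrayList<>();

    int size() { return sequences.size(); }
    DNASequence get(int i) { return sequences.get(i); }

    // Builds one DNASequence per ">" header line; sequence lines that
    // follow a header are concatenated until the next header.
    static DNASequenceList fromFastaLines(List<String> lines) {
        DNASequenceList list = new DNASequenceList();
        String name = null;
        StringBuilder bases = new StringBuilder();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            if (line.startsWith(">")) {
                if (name != null) {
                    list.sequences.add(new DNASequence(name, bases.toString()));
                }
                name = line.substring(1).trim();
                bases.setLength(0);
            } else if (name != null) {
                bases.append(line);
            }
        }
        if (name != null) {
            list.sequences.add(new DNASequence(name, bases.toString()));
        }
        return list;
    }
}
```

Reading a real file would just mean feeding the file's lines to
fromFastaLines(); that part is left out here.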

My code is not yet javadoc'd, and won't be before next weekend.  But you
can have a look at a UML diagram of my protein classes.  It communicates a
lot.
condor.bcm.tmc.edu/~mm692227/biosequence/Protein.html   //scaled to fit
condor.bcm.tmc.edu/~mm692227/biosequence/Protein.gif    //full size
condor.bcm.tmc.edu/~mm692227/biosequence/Protein.ps     //printable




On GUIs:  I agree with most of what's been said.  We should definitely keep
the GUI isolated from the implementation of the model, in accordance with
the Model-View-Controller paradigm.
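
To make the separation concrete, here is a minimal sketch.  All of the
names (SequenceModel, SequenceListener, TextSequenceView) are invented for
this example.  The point is only that the model knows nothing about any
GUI toolkit; views register as listeners and get told when the data
changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch of the idea, not a proposal
// for the real API.
interface SequenceListener {
    void sequenceChanged(String newResidues);
}

// The model: holds data and notifies listeners, with no GUI imports.
class SequenceModel {
    private String residues = "";
    private final List<SequenceListener> listeners = new ArrayList<>();

    void addListener(SequenceListener l) { listeners.add(l); }
    String getResidues() { return residues; }

    void setResidues(String residues) {
        this.residues = residues;
        for (SequenceListener l : listeners) {
            l.sequenceChanged(residues);
        }
    }
}

// A trivial "view" that renders to a string; a Swing or AWT view could
// implement the same interface without the model changing at all.
class TextSequenceView implements SequenceListener {
    private String lastRendered = "";

    public void sequenceChanged(String newResidues) {
        lastRendered = "Sequence: " + newResidues;
    }

    String getLastRendered() { return lastRendered; }
}
```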



On licenses:  Without a doubt, open source for academic use.  But I have
no idea what those acronyms stand for.  GPL = GNU General Public License;
LGPL = ??? ; MPL = ???.




On String implementation of Sequences:
Ewan says that the Sequence class should implement the internal data as a
string.  I really have to disagree with this.  It makes much more sense to
model the data structure like the real thing.  For example,
ProteinSequence is a linear sequence of amino acids.  In my
implementation, I do exactly this.  ProteinSequence is a linear list of
Objects which implement the ProteinChar interface.  The ProteinChar
interface defines all of the state properties we have for amino acids
(e.g. charge, aromaticity).
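
Roughly, the shape is something like the sketch below.  Only the names
ProteinChar, ProteinSequence, getLength(), getCharAt() and isCharged()
come from this post; everything else (the AminoAcid enum, getSymbol(),
getCharge(), isAromatic(), fromSymbol(), countAromaticResidues()) is
assumed for illustration.  Only a handful of residues are listed, and the
charges are the usual side-chain charges near neutral pH.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch under stated assumptions; not the real class definitions.
interface ProteinChar {
    char getSymbol();
    int getCharge();
    boolean isAromatic();

    default boolean isCharged() {
        return getCharge() != 0;
    }
}

// A deliberately incomplete set of residues, enough to show the idea.
enum AminoAcid implements ProteinChar {
    ALA('A', 0, false),
    ASP('D', -1, false),
    GLU('E', -1, false),
    LYS('K', 1, false),
    ARG('R', 1, false),
    PHE('F', 0, true),
    TRP('W', 0, true);

    private final char symbol;
    private final int charge;
    private final boolean aromatic;

    AminoAcid(char symbol, int charge, boolean aromatic) {
        this.symbol = symbol;
        this.charge = charge;
        this.aromatic = aromatic;
    }

    public char getSymbol() { return symbol; }
    public int getCharge() { return charge; }
    public boolean isAromatic() { return aromatic; }

    static AminoAcid fromSymbol(char c) {
        for (AminoAcid aa : values()) {
            if (aa.symbol == c) return aa;
        }
        throw new IllegalArgumentException("Unsupported residue: " + c);
    }
}

class ProteinSequence {
    private final List<ProteinChar> residues = new ArrayList<>();

    ProteinSequence(String symbols) {
        for (char c : symbols.toCharArray()) {
            residues.add(AminoAcid.fromSymbol(c));
        }
    }

    int getLength() { return residues.size(); }
    ProteinChar getCharAt(int i) { return residues.get(i); }

    // Each residue answers the question itself; no lookup table needed.
    int countAromaticResidues() {
        int count = 0;
        for (ProteinChar residue : residues) {
            if (residue.isAromatic()) count++;
        }
        return count;
    }
}
```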

Because all of my ProteinChar objects are smart (i.e. they know their
internal state), I can write some simple methods really easily.

For example, the ProteinSequence class can include such methods as:

// Counts the residues in this sequence that report themselves as charged.
public int countChargedResidues()
{
  int chargedCount = 0;

  for (int i = 0; i < this.getLength(); i++)
    if (this.getCharAt(i).isCharged())
      chargedCount++;

  return chargedCount;
}

See how many lines it takes to do that if your Sequence is a string.  You
can do it, but you need to develop a hashtable for every property
(ChargedHashTable, AromaticHashTable, etc.)
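
For comparison, a String-backed version might look like the sketch below.
The class name StringProteinSequence and the CHARGED table are invented
for this example, and the table lists only D, E, K and R as charged.  One
property is not much code, but every further property (aromatic, polar,
and so on) needs another table like CHARGED kept alongside the string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical String-based alternative, for comparison only.
class StringProteinSequence {
    // One lookup table per property; this one covers charge.
    private static final Set<Character> CHARGED =
        new HashSet<>(Arrays.asList('D', 'E', 'K', 'R'));

    private final String residues;

    StringProteinSequence(String residues) {
        this.residues = residues.toUpperCase();
    }

    int getLength() { return residues.length(); }

    int countChargedResidues() {
        int chargedCount = 0;
        for (int i = 0; i < residues.length(); i++) {
            if (CHARGED.contains(residues.charAt(i))) {
                chargedCount++;
            }
        }
        return chargedCount;
    }
}
```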

This discussion is great.
Don't let it die!

Cheers,
mike

-------------------------------------------------------------
Mike Marsh
Graduate Student in Structural and Computational Biology
Baylor College of Medicine.  Houston, TX

FON: 713/798-6034
Permanent Email:  mikemarsh@bigfoot.com
-------------------------------------------------------------