[Biojava-dev] The future of BioJava

george waldon gwaldon at geneinfinity.org
Sat Sep 22 16:03:15 UTC 2007


Thank you Mark for making the point so clearly. 

I could see String being used internally in SymbolList, but what is really the point of rewriting logic that is already present in the current code? Again, a rewrite only appears easier.
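
To make "String used internally" concrete, here is a minimal sketch of a list that stores raw characters and only interprets a position when asked. StringBackedDna and basesAt are hypothetical names invented for this illustration, not BioJava classes or methods, and only a, c, g, t and the ambiguity code w are handled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class for illustration; not part of BioJava.
final class StringBackedDna {
    // Raw residues kept as characters; nothing is parsed up front.
    private final char[] residues;

    StringBackedDna(String sequence) {
        this.residues = sequence.toLowerCase(Locale.ROOT).toCharArray();
    }

    int length() {
        return residues.length;
    }

    // Cheap path for format conversion: no per-symbol work at all.
    String seqString() {
        return new String(residues);
    }

    // Resolves one position on demand (1-based index, an assumption of this sketch).
    // Only a, c, g, t and the ambiguity code w (a or t) are handled here.
    Set<Character> basesAt(int index) {
        char c = residues[index - 1];
        switch (c) {
            case 'a':
            case 'c':
            case 'g':
            case 't':
                return Collections.singleton(Character.valueOf(c));
            case 'w':
                return new LinkedHashSet<>(Arrays.asList('a', 't'));
            default:
                throw new IllegalArgumentException("Unsupported residue: " + c);
        }
    }
}
```

A caller that only needs the text never pays for interpretation; a caller that asks about a single position pays for that position only.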

Nothing prevents us from writing this with the current code:

SymbolList sl = new DNASymbolList();
sl.setName("AB123456");
sl.setSequence("aAaA-aAaA"); //a polyadenine

OK, setName and setSequence are not part of the current SymbolList interface, but we can add a SymbolListEx to complete it, or put a new SymbolList in the biojavax domain or the bj3 domain, or even create SymbolString (//cool!) in the current biojava domain.

The point is that old and new interfaces can coexist in the same overall project, and we can swap from one to the other module by module without ever breaking anything.
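
A rough sketch of that coexistence, under stated assumptions: LegacySymbolList is a two-method stand-in for the existing interface (not the real SymbolList), and SymbolListEx, DnaSymbolString and LegacyClient are hypothetical names used only for illustration.

```java
import java.util.Locale;

// Stand-in for the existing interface, reduced to two methods for this sketch.
interface LegacySymbolList {
    int length();
    String seqString();
}

// Hypothetical extension: adds bean-style accessors without touching the old interface.
interface SymbolListEx extends LegacySymbolList {
    String getName();
    void setName(String name);
    void setSequence(String sequence);
}

// Hypothetical bean-like implementation with a no-argument constructor.
final class DnaSymbolString implements SymbolListEx {
    private String name = "";
    private String sequence = "";

    @Override
    public String getName() {
        return name;
    }

    @Override
    public void setName(String name) {
        this.name = name;
    }

    @Override
    public void setSequence(String sequence) {
        // Normalised to lower case; no alphabet validation in this sketch.
        this.sequence = sequence.toLowerCase(Locale.ROOT);
    }

    @Override
    public int length() {
        return sequence.length();
    }

    @Override
    public String seqString() {
        return sequence;
    }
}

// Old-style code that only knows the old interface keeps working unchanged.
final class LegacyClient {
    static String describe(LegacySymbolList sl) {
        return sl.length() + " symbols: " + sl.seqString();
    }
}
```

New modules can program against SymbolListEx while older modules keep receiving the same objects through the old interface, so each module can be migrated on its own schedule.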

- George


> -----Original Message-----
> From: Mark Schreiber [mailto:markjschreiber at gmail.com]
> Sent: Friday, September 21, 2007 12:24 AM
> To: george waldon
> Cc: biojava-dev at biojava.org
> Subject: Re: [Biojava-dev] The future of BioJava
> 
> Hello -
> 
> Just to clarify my opinion on Strings vs Symbols.
> 
> I generally prefer Symbols and SymbolLists to Strings because
> SymbolLists are smart and Strings are dumb. The classic case is ambiguity
> symbols like 'W'. BioJava knows that, in the context of DNA, this is A or T.
> However, I think things would be vastly simpler if there were
> getters and setters for SymbolLists that exposed Strings in a
> friendlier manner.
> 
> I also think there is a case for SymbolLists that are backed by
> Strings (more likely a char[]) instead of Symbol arrays and only do
> the conversion when required (i.e., when the user calls
> symbolAt()). These would be ideal for the case where someone is
> converting GenBank to Fasta and there is no need to go through the
> Symbol parsing.
> 
> Finally, I think SymbolLists (or whatever they get called) should
> implement more of the methods found in String to make them look more
> like Strings.  Ideally we should think about implementing some of the
> methods that Groovy likes to use for operator overloading. If we do
> this it would be possible to concatenate two sequences in Groovy by
> doing this (I may have the syntax wrong):
> 
> Seq3 = Seq1 + Seq2
> 
> The other issue with SymbolLists is that they are not intuitive to
> construct because they are not so bean like. This is not just a
> problem for newbies but also a major hindrance to the use of JEE,
> Spring, JAXB and other important frameworks. It should be possible to
> do this:
> 
> SymbolList sl = new SymbolList();
> sl.setName("AB123456");
> sl.setSequence(seqString);
> 
> The final hindrance to the use of JEE is serialization. If we keep
> Symbols flyweight (singleton) we need to make this bulletproof from
> the start. It is also practically impossible to make something a bean
> and make it a Singleton, so some careful thought is required. If we keep
> symbols behind the scenes they may not need to be so bean like.
> 
> - Mark
> 


