[Biojava-dev] The future of BioJava
Andy Yates
ayates at ebi.ac.uk
Fri Sep 21 09:20:18 UTC 2007
>
> Finally, I think SymbolLists (or whatever they get called) should
> implement more of the methods found in String to make them look more
> like Strings. Ideally we should think about implementing some of the
> methods that Groovy likes to use for operator overloading. If we do
> this it would be possible to concatenate two sequences in Groovy by
> doing this (I may have the syntax wrong).
>
> Seq3 = Seq1 + Seq2
Yup that seems about right. It's one of the nice things about Groovy that
you can overload the operators and create something which approaches an
in-language DSL (can't really call it a true DSL since it's constrained
by the Groovy language). But anyway you can start mucking around with
the operators to get things like:
fasta = new Fasta('id','AAAAAA')
fasta_output = new FastaWriter('some_location');
fasta_output << fasta
Assuming that the Fasta class would represent a Fasta record & the
FastaWriter is just that; you can begin to write some very nice & tight
code which just looks nice to use :).
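As a rough sketch of what the Java side of this could look like: Groovy dispatches `a + b` to `a.plus(b)` and `a << b` to `a.leftShift(b)`, so plain Java classes that expose methods with those names pick up the operators when used from Groovy. The `Fasta` and `FastaWriter` classes below are hypothetical stand-ins, not existing BioJava classes, and the writer wraps a `java.io.Writer` rather than taking a location string.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical Fasta record; not an existing BioJava class.
final class Fasta {
    private final String id;
    private final String sequence;

    Fasta(String id, String sequence) {
        this.id = id;
        this.sequence = sequence;
    }

    String getId() { return id; }

    String getSequence() { return sequence; }

    // Groovy maps "seq1 + seq2" onto seq1.plus(seq2).
    // Joining the ids with "+" is an arbitrary choice for this sketch.
    Fasta plus(Fasta other) {
        return new Fasta(id + "+" + other.id, sequence + other.sequence);
    }

    // Renders the record as a header line followed by the sequence.
    String format() {
        return ">" + id + "\n" + sequence + "\n";
    }
}

// Hypothetical writer that appends records to any java.io.Writer.
final class FastaWriter {
    private final Writer out;

    FastaWriter(Writer out) {
        this.out = out;
    }

    // Groovy maps "writer << record" onto writer.leftShift(record).
    FastaWriter leftShift(Fasta record) {
        try {
            out.write(record.format());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this; // returning this allows chained << calls
    }
}
```

From Java these are ordinary method calls (`writer.leftShift(a.plus(b))`); only Groovy callers get the operator syntax.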
>
> The other issue with SymbolLists is that they are not intuitive to
> construct because they are not so bean like. This is not just a
> problem for newbies but also a major hindrance to the use of JEE,
> Spring, JAXB and other important frameworks. It should be possible to
> do this:
>
> SymbolList sl = new SymbolList();
> sl.setName("AB123456");
> sl.setSequence(seqString);
Yup I'll agree with that.
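A minimal sketch of what "bean like" means here: a no-argument constructor plus getter/setter pairs, which is the shape that Spring, JAXB and similar frameworks expect. The class name `SequenceBean` is hypothetical and it stores the sequence as a plain String; it is not the existing SymbolList API.

```java
import java.io.Serializable;

// Hypothetical bean-style sequence holder (illustration only).
// A real bean would normally be a public class in its own file.
class SequenceBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;
    private String sequence = "";

    // No-arg constructor so frameworks can instantiate it reflectively.
    public SequenceBean() {
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getSequence() {
        return sequence;
    }

    public void setSequence(String sequence) {
        // Treat null as an empty sequence so length() never fails.
        this.sequence = (sequence == null) ? "" : sequence;
    }

    public int length() {
        return sequence.length();
    }
}
```

With this shape the three-line construction quoted above works as written, apart from the class name.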
>
> The final hindrance to the use of JEE is serialization. If we keep
> Symbols flyweight (singleton) we need to make this bullet proof from
> the start. It is also practically impossible to make something a bean
> and make it a Singleton, so some careful thought is required. If we keep
> symbols behind the scenes they may not need to be so bean like.
I think we may need a bit of both. I would suggest something like a
Symbol interface, with the collections of symbols that implement it
actually being enums, e.g.
public interface Symbol {
    String toString();
}

public enum DNA implements Symbol, java.io.Serializable {
    A,
    C,
    G,
    T;

    public String toString() {
        return this.name().toLowerCase();
    }

    // Map a deserialized instance back onto the canonical constant.
    private Object readResolve() throws java.io.ObjectStreamException {
        DNA symbol = null;
        for (DNA dna : values()) {
            if (dna.toString().equals(this.toString())) {
                symbol = dna;
                break;
            }
        }
        return symbol;
    }
}
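The readResolve is there to make sure this is bullet proof to
serialization. The situation to avoid is one where you serialize a
symbol, deserialize it & end up with an instance that is not equal
(using ==) to the statically available one. Note that for a true Java
enum the serialization mechanism already resolves constants by name and
ignores any readResolve declared on the enum, so the method only does
real work if the symbols are ordinary flyweight classes.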
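A minimal sketch of a round-trip check for the == concern, assuming a cut-down `DNA` enum with no readResolve at all. The `SerializationCheck` helper is a hypothetical name introduced for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

interface Symbol {
    String toString();
}

// Every enum is Serializable via java.lang.Enum; constants are
// written by name and resolved back to the existing instance.
enum DNA implements Symbol {
    A, C, G, T;

    @Override
    public String toString() {
        return name().toLowerCase();
    }
}

// Hypothetical helper for this sketch.
final class SerializationCheck {
    private SerializationCheck() {
    }

    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream src = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return src.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Comparing `SerializationCheck.roundTrip(DNA.G)` with `DNA.G` using == is the test that matters for flyweight symbols.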
From what I've done previously, using enums is a very nice way of
working with static constants. However, they are very hard to extend, so
they're fine for known constants like DNA (don't think we're going to
stumble onto a new nucleotide). The Symbol interface does mean that
people can extend the symbol concept if need be.
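One way the extension point could be used, sketched under assumptions: a non-enum class implementing Symbol that keeps its instances in a pool and uses readResolve to stay flyweight across serialization. `CustomSymbol`, its `of` factory and the `RoundTrip` helper are hypothetical names, not part of BioJava.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface Symbol {
    String toString();
}

// Hypothetical user-defined symbol: one shared instance per token.
final class CustomSymbol implements Symbol, Serializable {
    private static final long serialVersionUID = 1L;
    private static final Map<String, CustomSymbol> POOL = new ConcurrentHashMap<>();

    private final String token;

    private CustomSymbol(String token) {
        this.token = token;
    }

    // Returns the pooled instance for a token, creating it on first use.
    static CustomSymbol of(String token) {
        return POOL.computeIfAbsent(token, CustomSymbol::new);
    }

    @Override
    public String toString() {
        return token;
    }

    // Swap the freshly deserialized copy for the pooled instance.
    private Object readResolve() {
        return of(token);
    }
}

// Hypothetical helper: copies an object through Java serialization.
final class RoundTrip {
    private RoundTrip() {
    }

    static Object copy(Object in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream src = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return src.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Unlike an enum, a class like this gets no identity guarantee from the serialization mechanism, so the readResolve is what keeps == comparisons valid.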