[Biojava-l] creating a new alphabet?

Thomas Down td2@sanger.ac.uk
Fri, 22 Nov 2002 23:09:10 +0000


Hi...

I think you're right in blaming the 'n' symbol, or in fact
any ambiguity symbol, for this problem.  All ambiguity symbols
have an internal Alphabet object representing the set of atomic
symbols which they match, and it's this which isn't serializing
sensibly.

BioJava jumps through a number of hoops to make Symbol and
Alphabet objects properly serializable (and therefore usable in RMI).
However, this only seems to apply to atomic symbols, and not
ambiguities.  This obviously hasn't had a great deal of testing
(actually, a number of people *do* use Symbol serialization, but
usually to serialize Distribution objects [which don't rely on
ambiguity symbols] rather than SymbolLists).
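
To make the failure mode concrete, here is a rough, self-contained
sketch of the general "resolve by name on deserialization" pattern.
It is a toy model only: ToyAlphabet, REGISTRY and roundTrip are made-up
names for illustration, not BioJava classes, and the real
AlphabetManager code may well differ.  The point is just that an object
registered under a name survives a round trip, while an anonymous one
(like the alphabet inside an ambiguity symbol) has nothing to resolve to.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class ReadResolveSketch {
    // Hypothetical registry of named alphabets (a stand-in, not AlphabetManager).
    static final Map<String, ToyAlphabet> REGISTRY = new HashMap<>();

    // Hypothetical alphabet that is identified purely by its name.
    static class ToyAlphabet implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        ToyAlphabet(String name) {
            this.name = name;
        }

        // After deserialization, swap the fresh copy for the registered
        // instance so that identity (==) is preserved across the wire.
        private Object readResolve() throws ObjectStreamException {
            ToyAlphabet canonical = REGISTRY.get(name);
            if (canonical == null) {
                // This toy fails loudly; an unchecked lookup at this point
                // could instead surface as a NullPointerException.
                throw new InvalidObjectException("no registered alphabet named " + name);
            }
            return canonical;
        }
    }

    // Serialize to a byte array and read it back, roughly what RMI does
    // when it marshals an argument or return value.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A registered, named alphabet resolves back to the same instance.
        ToyAlphabet dna = new ToyAlphabet("DNA");
        REGISTRY.put(dna.name, dna);
        System.out.println("named alphabet preserved: " + (roundTrip(dna) == dna));

        // An anonymous alphabet, never registered, cannot be resolved.
        ToyAlphabet matches = new ToyAlphabet("{a,c,g,t}");
        try {
            roundTrip(matches);
            System.out.println("anonymous alphabet unexpectedly resolved");
        } catch (InvalidObjectException e) {
            System.out.println("anonymous alphabet failed: " + e.getMessage());
        }
    }
}
```

In this model the fix would be to give unregistered alphabets some
other way to rebuild themselves on the receiving side, which is the
kind of extension I have in mind.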


I'll see what can be done to extend the existing mechanisms
to work with ambiguity symbols as well.

      Thomas.

On Fri, Nov 22, 2002 at 04:46:24PM -0500, Dave Barkan wrote:
> Hi, I have been running into a problem.  I have long strings representing
> sequences, some of which contain the character 'n'.  I am working with
> them using a client/server over RMI.  When the server attempts to create a
> new SymbolList using the following method:
> 
>  SymbolList dna = DNATools.createDNA(sequence); // sequence contains 'n'
> 
> I get this exception on the Client side:
> 
> java.lang.NullPointerException
> 	at org.biojava.bio.symbol.AlphabetManager.alphabetForName(AlphabetManager.java:148)
> 	at org.biojava.bio.symbol.AbstractAlphabet.readResolve(AbstractAlphabet.java:76)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	...lots more RMI exceptions
> 
> I am assuming this happens because 'n' is in the sequence, since it works
> fine when 'n' is not in the sequence, everything else being equal.  (If I'm
> wrong, any other ideas as to why this might be occurring?)
> 
> I'm considering trying to create my own alphabet, SymbolTokenization, etc.
> to get the functionality of everything I get using AlphabetManager
> methods.  However, I tried this a while ago and as I recall it
> was a little complicated.  I think I got stuck when I tried to create a
> SymbolTokenization: the API for the Alphabet said I 'need a symbol
> parser under the name "token" and one under the name "name"' and I wasn't
> sure how to set names for SymbolTokenizations.
> 
> So I guess my options are
> 1) Continue trying to create my own alphabet or
> 2) An easier way which hopefully someone can suggest?