[Biojava-dev] Serialization problems, "-" turns to "n" after serializing sequence

Kalle Näslund kalle.naslund at genpat.uu.se
Tue Oct 18 14:04:39 EDT 2005


Hi!

I seem to be stuck with a serialization issue somewhere deep in the 
alphabet code. The problem is that "-" turns into "n" once a sequence 
has been serialized and read back. This happens both with fairly new 
CVS code and with the 1.4 release code.

The code I am using is the following:

import java.util.*;
import java.io.*;

import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;
import org.biojava.utils.*;
import org.biojava.bio.*;

/**
 * Temp class, just to check out some serialization issues I am having.
 *
 * @author kalle
 */
public class AlignmentSerializationTest {

    public void run() throws Exception {
        Sequence dnaSeq1 =
            DNATools.createDNASequence( "---ATGC---ATGC---", "seq1" );

        dumpInfoAboutSequence( dnaSeq1 );

        System.out.println("Writing alignment to disk");

        File file = new File("/tmp/ali.obj");
        FileOutputStream fOS = new FileOutputStream( file );
        ObjectOutputStream oOS = new ObjectOutputStream( fOS );

        oOS.writeObject( dnaSeq1 );

        oOS.close();
        fOS.close();

        System.out.println( "Loading alignment from disk" );
        FileInputStream     fIS = new FileInputStream( file );
        ObjectInputStream   oIS = new ObjectInputStream( fIS );

        Sequence  serSeq  = ( Sequence )oIS.readObject();

        // Close the input streams as well, mirroring the output side.
        oIS.close();
        fIS.close();

        dumpInfoAboutSequence( serSeq );
    }

    public static void main( String[] flags ) throws Exception {
        AlignmentSerializationTest myAST = new AlignmentSerializationTest();
        myAST.run();
    }

    private void dumpInfoAboutSequence( Sequence sequence ) throws Exception {
        System.out.println("Name      :" + sequence.getName() );
        System.out.println("Alphabet  :" + sequence.getAlphabet() );
        System.out.println("GapSymbol :" +
            sequence.getAlphabet().getGapSymbol() );
        System.out.println("Sequence  :" + sequence.seqString() );
        System.out.println("Tokeniz   :" +
            sequence.getAlphabet().getTokenization( "token" ) );
    }
}


And the output I get is:

Name      :seq1
Alphabet  :org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@1bc887b
GapSymbol :org.biojava.bio.symbol.SimpleBasisSymbol: []
Sequence  :---atgc---atgc---
Tokeniz   :org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper@120cc56

Writing alignment to disk

Loading alignment from disk

Name      :seq1
Alphabet  :org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper@1bc887b
GapSymbol :org.biojava.bio.symbol.SimpleBasisSymbol: []
Sequence  :nnnatgcnnnatgcnnn
Tokeniz   :org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper@120cc56


I have spent some time stepping through the BioJava code in a debugger, 
but realised that tracking this down will most likely take me a long 
time. I was hoping that some of you with more experience of the 
alphabet code could at least point me in the right direction, if not 
outright recognize the bug =)
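
In case it helps anyone reproduce this without touching /tmp, below is 
a minimal sketch of an in-memory round trip that uses only JDK classes. 
The class name RoundTrip and the method roundTrip are made up for this 
illustration and are not part of BioJava, and the try-with-resources 
syntax assumes Java 7 or later. It only does the write/read step; it 
does not explain or fix the "-" to "n" change.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper, not part of BioJava.
public class RoundTrip {

    /** Serializes obj into a byte array and deserializes it again. */
    public static Object roundTrip(Object obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        // Write the object; closing the stream flushes it into 'bytes'.
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }

        // Read the object back from the same bytes.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A plain String stands in for the sequence here, so the
        // example runs without BioJava on the classpath.
        String before = "---ATGC---ATGC---";
        Object after = roundTrip(before);
        System.out.println("Unchanged: " + before.equals(after));
    }
}
```

With BioJava on the classpath, the same helper could presumably be 
called as ( Sequence )RoundTrip.roundTrip( dnaSeq1 ), and seqString() 
compared before and after.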

Kind regards, Kalle

