[Biojava-l] handling gap symbols

Wim De Smet Wim.DeSmet at UGent.be
Thu May 20 14:58:55 UTC 2010


Hello all,

I'm trying to find the positions of gap symbols in an alignment, but I 
keep running into trouble determining what counts as a gap symbol. 
Apparently there are two different gap symbols, and both can appear in 
the same alignment?

An example might make this clearer. Suppose I perform the following 
alignment (matrix is the EDNA matrix):
SequenceAlignment aligner = new NeedlemanWunsch((short) 0, (short) 3,
        (short) 10, (short) 10, (short) 1, matrix);
Sequence first = DNATools.createDNASequence("ACT", "query");
Sequence second = DNATools.createDNASequence("AACTA", "target");
Alignment alignment = aligner.getAlignment(first, second);

When I then obtain the symbol list for "query", which should look like 
"-ACT-", I get the following symbols:
AlphabetManager$GapSymbol
AlphabetManager$WellKnownAtomicSymbol
AlphabetManager$WellKnownAtomicSymbol
AlphabetManager$WellKnownAtomicSymbol
AlphabetManager$WellKnownGapSymbol

AlphabetManager.getGapSymbol() returns AlphabetManager$GapSymbol, while 
symbolList.getAlphabet().getGapSymbol() returns 
AlphabetManager$WellKnownGapSymbol. Am I supposed to test against both 
or is there a bug here somewhere? I'm using BioJava 1.7.1.
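
The only workaround I can think of for now is to compare every symbol 
against both gap objects. Below is a minimal sketch of that check, not 
anything taken from BioJava itself: the helper gapPositions and the 
placeholder objects are made up for illustration, and I'm assuming that 
identity comparison (==) is the right test for symbols. In real code the 
two candidates would be the results of AlphabetManager.getGapSymbol() and 
symbolList.getAlphabet().getGapSymbol().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GapPositions {

    /**
     * Returns the 1-based positions whose element is the same object (==)
     * as any of the given gap candidates. Hypothetical helper, not a
     * BioJava method.
     */
    static List<Integer> gapPositions(List<?> symbols, Object... gapCandidates) {
        List<Integer> positions = new ArrayList<Integer>();
        for (int i = 0; i < symbols.size(); i++) {
            for (Object gap : gapCandidates) {
                if (symbols.get(i) == gap) {
                    positions.add(i + 1); // report 1-based positions
                    break;
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Placeholders standing in for the two distinct gap symbol objects.
        Object managerGap = new Object();   // e.g. AlphabetManager.getGapSymbol()
        Object alphabetGap = new Object();  // e.g. symbolList.getAlphabet().getGapSymbol()
        Object a = new Object(), c = new Object(), t = new Object();

        // Mirrors the "-ACT-" row, with a different gap object at each end.
        List<Object> query = Arrays.asList(managerGap, a, c, t, alphabetGap);

        System.out.println(gapPositions(query, managerGap, alphabetGap)); // prints [1, 5]
    }
}
```

Testing against only one of the two candidates would miss one end of the 
row above, which is exactly the situation I'm running into.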

regards,
Wim
-- 
Wim De Smet
http://www.straininfo.net/


