[Biojava-l] Gap Problem

Thomas Down td2@sanger.ac.uk
Thu, 28 Jun 2001 21:37:11 +0100


On Thu, Jun 28, 2001 at 12:58:09PM -0700, Emig, Robin wrote:
> 
> Is there any reason why the following line should return false if the
> current symbol is the gap symbol?
> 
> (sl.symbolAt(i) == sl.getAlphabet().getGapSymbol())
> 
> I tried this simple code and s=AlphabetManager$GapSymbol, but
> sl1.getAlphabet().getGapSymbol() == a simple basis symbol. What gives, why did
> the parser put an AlphabetManager$GapSymbol in the symbol list and not its
> appropriate alphabet.getGapSymbol()?

Yep, it's a bug.  For anyone interested, it can be reproduced
using:

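	SymbolList dna = DNATools.createDNA("-gataca");
	// gap symbol that the alphabet itself reports
	System.out.println(dna.getAlphabet().getGapSymbol());
	// symbol the parser stored for the leading '-' (positions are 1-based)
	System.out.println(dna.symbolAt(1));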

As far as I can tell, the problem arises because the AlphabetManager.xml
resource file still contains an explicit reference to the gap
symbol.  This just isn't needed any more -- a gap symbol is an
implied member of every alphabet.  If you remove this entry from
your AlphabetManager.xml, your code should then run fine.
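
To make the mismatch concrete, here is a toy model in plain Java.  It is
only a sketch, not BioJava's actual code: Sym, ToyAlphabet, IMPLICIT_GAP,
parse and firstIsGap are all made-up names.  The assumption is simply that
the parser always emits one shared gap object, while an alphabet built from
a file with an explicit gap entry hands back a different object.

```java
import java.util.ArrayList;
import java.util.List;

public class GapIdentityDemo {
    /** Hypothetical stand-in for a symbol; only object identity matters here. */
    public static final class Sym {
        private final String name;
        public Sym(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Assumed: the one shared gap object that the parser emits for '-'. */
    public static final Sym IMPLICIT_GAP = new Sym("implicit gap");

    /** Hypothetical alphabet; a non-null explicitGap models the stale XML entry. */
    public static final class ToyAlphabet {
        private final Sym gap;
        public ToyAlphabet(Sym explicitGap) {
            this.gap = (explicitGap != null) ? explicitGap : IMPLICIT_GAP;
        }
        public Sym getGapSymbol() { return gap; }
    }

    /** Toy parser: '-' always becomes IMPLICIT_GAP, whatever the alphabet says. */
    public static List<Sym> parse(String seq) {
        List<Sym> out = new ArrayList<>();
        for (char c : seq.toCharArray()) {
            out.add(c == '-' ? IMPLICIT_GAP : new Sym(String.valueOf(c)));
        }
        return out;
    }

    /** The identity test from the original question, applied to the first symbol. */
    public static boolean firstIsGap(ToyAlphabet alphabet, String seq) {
        return parse(seq).get(0) == alphabet.getGapSymbol();
    }

    public static void main(String[] args) {
        ToyAlphabet stale = new ToyAlphabet(new Sym("explicit gap"));
        ToyAlphabet fixed = new ToyAlphabet(null);
        System.out.println(firstIsGap(stale, "-gataca"));  // false
        System.out.println(firstIsGap(fixed, "-gataca"));  // true
    }
}
```

Under those assumptions the == test fails only for the alphabet that
carries its own explicit gap entry, which matches the behaviour reported
above.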

I suspect there's also, strictly speaking, a bug in the AlphabetManager
parser that means it still accepts `old' alphabets with an explicit
gap symbol.  That's a side issue, though.

I'd like to get a quick 2nd opinion from Matt before checking this
in, since he's more familiar with the guts of symbol creation.  But
this fix does seem to work, anyway...

Hope this helps,
    Thomas.