[Biojava-l] Gap: Basis Symbol vs Symbol

Matthew Pocock mrp@sanger.ac.uk
Fri, 09 Feb 2001 11:05:40 +0000


Emig, Robin wrote:

> 	I have some code that uses codons with ambiguous bases using
> BasisSymbol. The problem is that I also try to deal with gap symbols at the
> same time. I thought the idea behind the gap symbol was that it would be
> universal, i.e. gap or gap x gap x gap would be the same symbol. However, I
> can't use my current code like that because I need to call getSymbols on
> standard codons. This comes from BasisSymbol. Since gap only implements
> Symbol, my code blows up when a gap is thrown in.
> 	We could have gap implement BasisSymbol or AtomicSymbol; any ideas
> why not?
> 	The workaround is that I will create a BasisSymbol which is
> gap x gap x gap and try to deal with it differently elsewhere (I really liked
> being able to just say AlphabetManager.getGapSymbol()) to deal with gaps.
> 	-Robin

Hi Robin,

Could you post the use case with the codons that is causing you 
trouble? There may be a quick fix, or it may be showing up a fatal flaw 
with gaps.

A column of gaps in an alignment (an empty column) is not the same as a gap 
in the whole alignment. The first case means that every sequence happens 
to be gapped at that position, and is modeled correctly by gap^n, easily 
obtainable by passing a list of gap symbols into the alphabet's 
getSymbol(list) method - this will return a BasisSymbol that spans a 
null sub-space of the cross-product alphabet. The second case is modeled 
correctly by the alignment being behind a GappedSymbolList that inserts a 
gap at the required position.
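
To make the distinction concrete, here is a minimal sketch in plain Java. It
does not use the BioJava classes: ToySymbol, GAP, gapPower and isAllGapColumn
are hypothetical names invented for illustration, and the real Symbol and
BasisSymbol interfaces carry much more than this. It only shows that a
product of n gaps has components you can list, while the universal gap has
none and is a different object.

```java
import java.util.Collections;
import java.util.List;

// Toy model only: none of these types are part of BioJava.
class GapSketch {
    /** Toy symbol: a name plus the per-sequence symbols it is built from. */
    static final class ToySymbol {
        final String name;
        final List<ToySymbol> components; // empty when the symbol is not a product

        ToySymbol(String name, List<ToySymbol> components) {
            this.name = name;
            this.components = components;
        }
    }

    /** The single universal gap (assumed analogue of AlphabetManager.getGapSymbol()). */
    static final ToySymbol GAP =
        new ToySymbol("gap", Collections.<ToySymbol>emptyList());

    /** Builds gap^n: one gap per sequence in an n-way cross-product. */
    static ToySymbol gapPower(int n) {
        List<ToySymbol> parts = Collections.nCopies(n, GAP);
        String name = "(" + String.join(" x ", Collections.nCopies(n, "gap")) + ")";
        return new ToySymbol(name, parts);
    }

    /** True when the symbol is a product whose every component is the gap. */
    static boolean isAllGapColumn(ToySymbol s) {
        if (s.components.isEmpty()) {
            return false; // the universal gap is not a column of gaps
        }
        for (ToySymbol c : s.components) {
            if (c != GAP) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ToySymbol column = gapPower(3);
        System.out.println(column.name);            // (gap x gap x gap)
        System.out.println(column == GAP);          // false
        System.out.println(isAllGapColumn(column)); // true
        System.out.println(isAllGapColumn(GAP));    // false
    }
}
```

In this toy model, code that needs to walk the components of a codon can do
so for gapPower(3) but finds nothing to walk for GAP, which mirrors the
failure described in the quoted message.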

I will double-check the documentation before release time, but the 
algebra for symbols definitely requires gap^n to be distinct from gap to 
make everything work out (it is clearly a differently shaped null 
sub-space).

Matthew