[Biojava-l] Re: [Biojava-dev] Tutorial 1 and ambiguous symbols

Matthew Pocock matthew_pocock@yahoo.co.uk
Mon, 21 Oct 2002 17:05:36 +0100 (BST)

Hi Andy,

You can check if a Symbol is non-ambiguous by seeing
if it is an instance of AtomicSymbol. If not, you can
use the getMatches() method to get back an alphabet
over all the AtomicSymbol instances it could match. So
something like this:

Symbol sym = ...;

if(sym instanceof AtomicSymbol) {
  // add counts as normal
} else {
  FiniteAlphabet matches = (FiniteAlphabet) sym.getMatches();
  for(Iterator si = matches.iterator(); si.hasNext(); ) {
    AtomicSymbol as = (AtomicSymbol) si.next();
    // do stuff with this potential match
  }
}

Alternatively, you could replace the == in the demo
with an Alphabet.contains() test: make one ambiguity
symbol from the pair (G,C) and one from (A,T) using
the getSymbol(Set) method, and call getMatches() on
each to get the alphabets to test against.
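
To make the containment idea concrete, here is a rough, library-free
sketch. It does not use BioJava at all: IUPAC letter codes stand in for
Symbol objects, a plain Set stands in for the alphabet returned by
getMatches(), and the class name GcByContainment and its lookup table
are made up for illustration. It also assumes that a symbol should
count towards a pot only when every base it can match lies in that pot.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the BioJava classes discussed above.
class GcByContainment {
    // Each IUPAC code maps to the set of bases it could match.
    static final Map<Character, Set<Character>> MATCHES = new HashMap<>();

    static {
        put('A', "A"); put('C', "C"); put('G', "G"); put('T', "T");
        put('S', "CG");   // G or C
        put('W', "AT");   // A or T
        put('R', "AG");
        put('Y', "CT");
        put('N', "ACGT");
    }

    private static void put(char code, String bases) {
        Set<Character> s = new HashSet<>();
        for (char b : bases.toCharArray()) {
            s.add(b);
        }
        MATCHES.put(code, s);
    }

    // The two "pots", built from the (G,C) and (A,T) ambiguity codes.
    static final Set<Character> GC = MATCHES.get('S');
    static final Set<Character> AT = MATCHES.get('W');

    // Returns {gcCount, atCount, undecidedCount} for the sequence.
    static int[] count(String seq) {
        int gc = 0, at = 0, undecided = 0;
        for (char c : seq.toUpperCase().toCharArray()) {
            Set<Character> m = MATCHES.get(c);
            if (m == null) {
                throw new IllegalArgumentException("unknown code: " + c);
            }
            if (GC.containsAll(m)) {
                gc++;            // every possible match is G or C
            } else if (AT.containsAll(m)) {
                at++;            // every possible match is A or T
            } else {
                undecided++;     // matches straddle both pots
            }
        }
        return new int[] {gc, at, undecided};
    }
}
```

Calling GcByContainment.count("GCSATWRN") puts G, C and S in the GC
pot, A, T and W in the AT pot, and leaves R and N undecided.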

You could try both and compare the relative speed of
each. Also, what should you be doing in the case where
the ambiguity matches both symbols in (A,T) and (G,C)?
Do you want to add whole counts to both pots, or part
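
The two policies in that question, whole counts versus part counts, can
be compared with a small sketch. Again this is plain Java rather than
BioJava: the class AmbiguityWeights, the matches() helper and the
choice to split a symbol's single count evenly across its possible
bases are all assumptions made for illustration.

```java
// Hypothetical helper, not part of BioJava.
class AmbiguityWeights {
    // Bases that an IUPAC code could match.
    static String matches(char code) {
        switch (code) {
            case 'A': case 'C': case 'G': case 'T':
                return String.valueOf(code);
            case 'S': return "CG";
            case 'W': return "AT";
            case 'R': return "AG";
            case 'Y': return "CT";
            case 'K': return "GT";
            case 'M': return "AC";
            case 'N': return "ACGT";
            default:
                throw new IllegalArgumentException("unknown code: " + code);
        }
    }

    // Returns {gc, at}.
    // fractional == true : each symbol adds a total of 1, split evenly
    //                      over the bases it could match.
    // fractional == false: each symbol adds a whole 1 to every pot that
    //                      at least one of its matches falls into.
    static double[] count(String seq, boolean fractional) {
        double gc = 0, at = 0;
        for (char c : seq.toCharArray()) {
            String m = matches(c);
            int gcHits = 0;
            for (char b : m.toCharArray()) {
                if (b == 'G' || b == 'C') {
                    gcHits++;
                }
            }
            int atHits = m.length() - gcHits;
            if (fractional) {
                gc += (double) gcHits / m.length();
                at += (double) atHits / m.length();
            } else {
                if (gcHits > 0) gc += 1;
                if (atHits > 0) at += 1;
            }
        }
        return new double[] {gc, at};
    }
}
```

For the sequence "GAR", part counts give 1.5 in each pot (R is split
half and half), while whole counts give 2 in each pot, so the totals no
longer add up to the sequence length.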


 --- andy hammer <ahammer@genetics.utah.edu> wrote:
> Hello!
> I just discovered biojava a few days ago and am very
> excited to start using
> it in my code.
> The tutorials and demos are very helpful.
> I still have a question on how to handle ambiguous
> symbols.
> The last sentence in Tutorial 1 suggests modifying
> GCContent.java to ignore
> any ambiguous symbols.
> When I run GCContent it already ignores any
> ambiguous symbols.
> I am interested in counting the ambiguous symbols.
> Any ideas on how I can include ambiguous symbols in
> the count?
> Is there a demo or something I could look at?
> Andy Hammer
> University of Utah
> Human Genetics