[Biojava-l] SimpleGappedSymbolList from a String

Don Naki d.naki at cox.net
Fri May 14 11:37:43 EDT 2004


Thanks for the quick reply,

I suppose I can use a GappedSequence instead of a GappedSymbolList; it's
just that I don't need features and annotations.
Thanks for addressing the ProteinTools.createGappedProteinSequence bug.
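
For reference, a minimal plain-Java sketch of splitting a gapped string such
as "-AQSD--VP-" into its ungapped residues and the positions of its gaps.
GappedStringParser and its methods are hypothetical names, not part of
BioJava, and '-' is assumed to be the only gap character. The 1-based
positions are an assumption chosen to match a 1-based symbol list index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; this class is not part of BioJava.
final class GappedStringParser {
    // Assumption: '-' is the only character used to mark a gap.
    static final char GAP = '-';

    /** Returns the input with every gap character removed. */
    static String ungapped(String gapped) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < gapped.length(); i++) {
            char c = gapped.charAt(i);
            if (c != GAP) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns the 1-based positions of the gap characters in the gapped string. */
    static List<Integer> gapPositions(String gapped) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < gapped.length(); i++) {
            if (gapped.charAt(i) == GAP) {
                positions.add(i + 1);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String gapped = "-AQSD--VP-";
        System.out.println(ungapped(gapped));      // prints AQSDVP
        System.out.println(gapPositions(gapped));  // prints [1, 6, 7, 10]
    }
}
```

The ungapped string could then be turned into a SymbolList with ProteinTools
and wrapped in a SimpleGappedSymbolList, with a gap added at each recorded
position; the exact calls for that step should be checked against the API docs.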

P.S. The web site looks great; very clean.

> Don Naki wrote:
>
> >Hi all,
> >I have a couple of 'novice' questions...
> >
> >I can't seem to figure out how to create a SimpleGappedSymbolList from a
> >String. I want to parse "-AQSD--VP-" and create a SimpleGappedSymbolList
> >from it.
> >ProteinTools has methods to return a SymbolList, Sequence, and
> >GappedSequence from a String, but not a GappedSymbolList. I understand
> >GappedSequence extends GappedSymbolList, but I want just the
> >GappedSymbolList. Alternatively, is there a way to get a GappedSymbolList
> >from a GappedSequence?
> >
> >
> We could add a utility method to do this. Why do you /have/ to have a
> GappedSymbolList that is not a GappedSequence? Is there a specific
> memory constraint?
>
> >A second question is that
> >ProteinTools.createGappedProteinSequence("-AQSD--VP-").seqString() results
> >in the String "XAQSD--VPX". The first and last '-' characters are now
> >represented by 'X'. Is this a special kind of gap symbol? If so, how can I
> >distinguish between '-' and 'X' Symbols?
> >
> >
> This is a tokenization bug - the leading/trailing gaps are not being
> recognised by the tokenizer, and then replaced by X. It's probably in
> CharacterTokenization - needs a special-case for
> AlphabetManager.getGapSymbol() - could someone look at this?
>
> >Thanks in advance,
> >Don
> >
> >
> Matthew
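
The kind of special case suggested above can be illustrated with a toy
tokenizer in plain Java. This is only a sketch: ToyTokenizer is a
hypothetical class, not BioJava's CharacterTokenization, and it does not
reproduce the reported behaviour exactly (here every unrecognised '-' becomes
'X', not only the leading and trailing ones). It just shows how a gap
character with no explicit mapping can fall through to the "unknown" symbol,
and how one extra check avoids that.

```java
import java.util.HashSet;
import java.util.Set;

// Toy illustration only; this is not BioJava's CharacterTokenization.
final class ToyTokenizer {
    static final char GAP = '-';      // assumed gap character
    static final char UNKNOWN = 'X';  // assumed fallback for unrecognised input

    // The twenty standard amino-acid one-letter codes.
    private static final Set<Character> RESIDUES = new HashSet<>();
    static {
        for (char c : "ACDEFGHIKLMNPQRSTVWY".toCharArray()) {
            RESIDUES.add(c);
        }
    }

    /**
     * Maps each input character to an output character.
     * Known residues map to themselves. When gapAware is true the gap
     * character is kept as a gap; otherwise it is treated like any other
     * unrecognised character and becomes UNKNOWN.
     */
    static String normalize(String input, boolean gapAware) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (RESIDUES.contains(c)) {
                out.append(c);
            } else if (gapAware && c == GAP) {
                out.append(GAP);      // the special case for the gap symbol
            } else {
                out.append(UNKNOWN);  // everything else falls back to 'X'
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("-AQSD--VP-", false)); // prints XAQSDXXVPX
        System.out.println(normalize("-AQSD--VP-", true));  // prints -AQSD--VP-
    }
}
```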


