[Biojava-l] Editing SymbolLists

Thomas Down td2@sanger.ac.uk
Tue, 28 Aug 2001 14:33:46 +0100


On Fri, Aug 24, 2001 at 06:09:57PM -0700, David Waring wrote:
> Perhaps I am missing something big here, but I can not figure out how in the
> world I can edit a Sequence or a SymbolList.
> 
> There is a class Edit, and there seems to be strong support for handling
> changes and passing them to anything that is listening, but SimpleSymbolList
> is immutable (even the addSymbol method is deprecated). SimpleSequence wraps
> a SimpleSymbolList so that is immutable (at least the SymbolList is). The
> other subclasses of SymbolList don't seem to be appropriate. As best I
> can tell, if I wanted to be able to read a fasta file and make a change to
> the sequence, I would have to:
> 
> Create an EditableSymbolList.
> Create a FastaSequenceBuilder that used this.
> Write a new method or two in SeqIOTools to read my file into my new Sequence
> that has an EditableSymbolList.
> 
> Is this correct? Does no one else have any desire to edit a SymbolList?

The API for editing has existed on the SymbolList interface for
a long time, but up until now, there seems to have been very
little interest, hence no mutable implementation.

It would be really good to have an EditableSymbolList (better
still, replace SimpleSymbolList with an editable implementation).
If you feel like coding this up, we'd be really interested to
hear about it!

As you say, you'll probably then also want a special SequenceBuilder
implementation which creates these editable implementations.  An
alternative, rather simpler, approach would be to just make sure
that the editable SymbolList has a copy-constructor, just as
the current SimpleSymbolList does.  Then you can do:

  SequenceIterator si = SeqIOTools.readFastaDNA(stream);
  Sequence seq = si.nextSequence();

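  // EditableSymbolList is the proposed class; it doesn't exist yet.
  SymbolList editableSeq = new EditableSymbolList(seq); // mutable copy
  // Remove 10 symbols starting at position 20 (replace them with nothing).
  editableSeq.edit(new Edit(20, 10, SymbolList.EMPTY_LIST));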

Or something similar...
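
To make the copy-constructor idea a bit more concrete, here is a rough,
self-contained sketch of what the core of an editable list might look like.
Everything in it is an assumption for illustration only: the class name
EditableSymbolListSketch is hypothetical, it does not implement the real
SymbolList interface, it stores symbols as plain chars rather than Symbol
objects, and it fires no change events. It only shows the copy-then-edit
pattern with 1-based coordinates.

```java
// Hypothetical sketch, NOT part of BioJava: a mutable symbol list built
// from a copy of its source, with a 1-based edit operation.
class EditableSymbolListSketch {
    // Assumption: symbols are modelled as chars to keep this self-contained.
    private final StringBuilder symbols;

    // Copy constructor: snapshots the source, so the original is untouched.
    EditableSymbolListSketch(CharSequence source) {
        this.symbols = new StringBuilder(source);
    }

    int length() {
        return symbols.length();
    }

    // Replace `length` symbols starting at 1-based `pos` with `replacement`.
    // An empty replacement deletes; length == 0 inserts before `pos`.
    void edit(int pos, int length, CharSequence replacement) {
        if (pos < 1 || length < 0 || pos + length - 1 > symbols.length()) {
            throw new IndexOutOfBoundsException(
                "edit(" + pos + ", " + length + ") outside 1.." + symbols.length());
        }
        int start = pos - 1; // convert to 0-based, end-exclusive range
        symbols.replace(start, start + length, replacement.toString());
    }

    @Override
    public String toString() {
        return symbols.toString();
    }

    public static void main(String[] args) {
        String original = "ACGTACGT";
        EditableSymbolListSketch editable = new EditableSymbolListSketch(original);
        editable.edit(3, 2, "");   // delete the symbols at positions 3 and 4
        editable.edit(1, 0, "TT"); // insert two symbols at the front
        System.out.println(original + " -> " + editable);
    }
}
```

A real implementation would also need to notify ChangeListeners and check
that the replacement uses a compatible alphabet, which this sketch leaves out.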

Does this make sense?
 

> PS: ChunkedSymbolList does not appear in the javadoc anywhere, what gives?

ChunkedSymbolList is a private implementation -- the only way
you can get hold of one is via the ChunkedSymbolListBuilder.

ChunkedSymbolLists are special, optimized implementations,
designed for the case of loading large sequences from files
or other streams -- the hope was that the majority of users
wouldn't need to know about them.


   Thomas.