[Biojava-l] Editing a RichSequence

Ian Yi-Feng Chang cif077 at gmail.com
Thu Apr 23 01:39:14 UTC 2009


Thanks for your detailed explanation.
I got it now.

On Wed, Apr 22, 2009 at 11:50 PM, Richard Holland <holland at eaglegenomics.com> wrote:

> I forgot to mention - ChunkedSymbolListFactory is currently the only
> SymbolListFactory implementation in BioJava which can accept 'streamed'
> data rather than taking the whole sequence at once. So, the other
> alternative to changing CHUNK_SIZE is to create a new SymbolListFactory
> implementation which can accept 'streamed' data and use it to replace
> the reference to ChunkedSymbolListFactory in SimpleRichSequenceBuilder.
>
> Richard.
>
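
For illustration only, the 'streamed' alternative described above could look
roughly like the sketch below. StreamedSymbolBuffer is a hypothetical name and
not a BioJava class. To stay self-contained it models symbols as plain chars
and returns a StringBuilder as a stand-in for a mutable symbol list. A real
replacement would have to implement BioJava's SymbolListFactory contract,
which this sketch does not attempt.

```java
import java.util.Arrays;

/**
 * Hypothetical stand-in for a "streamed" symbol list factory.
 * NOT a BioJava class: symbols are modelled as plain chars.
 */
final class StreamedSymbolBuffer {
    private char[] symbols = new char[16];
    private int length = 0;

    /** Accepts one slice of streamed data, loosely mirroring addSymbols(alpha, syms, start, length). */
    void addSymbols(char[] syms, int start, int count) {
        if (start < 0 || count < 0 || start + count > syms.length) {
            throw new IndexOutOfBoundsException("bad slice: start=" + start + ", count=" + count);
        }
        if (length + count > symbols.length) {
            // Grow geometrically so that many small additions stay cheap.
            symbols = Arrays.copyOf(symbols, Math.max(symbols.length * 2, length + count));
        }
        System.arraycopy(syms, start, symbols, length, count);
        length += count;
    }

    /** Returns everything received so far as one contiguous, editable buffer. */
    StringBuilder makeSymbolList() {
        return new StringBuilder(length).append(symbols, 0, length);
    }
}
```

The point of the design is that the data arrives in slices but ends up in a
single mutable list rather than in immutable chunks, so later edits are not
vetoed.
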
> Richard Holland wrote:
> > The problem lies in SimpleRichSequenceBuilder:
> >
> >     public void addSymbols(Alphabet alpha, Symbol[] syms, int start, int length)
> >             throws IllegalAlphabetException {
> >         if (this.symbols==null) {
> >             if (threshold<=0) {
> >                 this.symbols = new ChunkedSymbolListFactory(this.factory);
> >             } else {
> >                 this.symbols = new ChunkedSymbolListFactory(this.factory,threshold);
> >             }
> >         }
> >         this.symbols.addSymbols(alpha, syms, start, length);
> >     }
> >
> > The references to ChunkedSymbolListFactory are causing the problem.
> > ChunkedSymbolListFactory is supposed to perform the threshold
> > checking/factory selection. However it is also applying a further layer
> > of abstraction which forces all symbol lists for sequences over 16k
> > (1<<14) long to be ChunkedSymbolLists, regardless of the factory
> > specified - the factory only specifies what the constituent sequences
> > are within the ChunkedSymbolList. ChunkedSymbolList is immutable so will
> > not allow edits even if its constituents are mutable. However if your
> > sequence is less than 16k long, it behaves properly and you will get the
> > type of sequence you asked for (SimpleSymbolList below the threshold,
> > whatever you specify above it - SimpleSymbolList also happens to be the
> > only SymbolList implementation in BioJava that is actually mutable at
> > present.)
> >
> > As the older thread describes, ChunkedSymbolList and its Factory are
> > very embedded into the core of BioJava and are hard to change - it could
> > break all kinds of things. Therefore the only real solution for now is
> > to temporarily modify your local copy so that inside ChunkedSymbolList,
> > you change the CHUNK_SIZE to something much larger than 1<<14.
> >
> > thanks,
> > Richard
> >
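
The behaviour described above can be summarised with a toy model. This is a
sketch under stated assumptions, not BioJava code. ChunkingModel and
"OtherSymbolList" are hypothetical names, and list types are represented only
by their names. The exact handling at the boundaries (a length exactly equal to
the threshold or to CHUNK_SIZE) and for threshold<=0 is assumed here, not taken
from the BioJava source.

```java
/**
 * Toy model of the list-type selection described in this thread.
 * NOT BioJava code: list types are represented by their names only.
 */
final class ChunkingModel {
    /** Chunk size quoted in the thread: 1<<14 = 16384 symbols. */
    static final int CHUNK_SIZE = 1 << 14;

    /** Name of the list type a sequence of the given length would end up with. */
    static String resultingListType(int seqLength, int threshold, String requestedType) {
        if (seqLength > CHUNK_SIZE) {
            // Over the chunk size the requested factory only builds the chunks.
            return "ChunkedSymbolList";
        }
        if (seqLength < threshold) {
            return "SimpleSymbolList";
        }
        // Assumption: with threshold<=0 the requested factory is always used.
        return requestedType;
    }

    /** Per the thread, only SimpleSymbolList is mutable at present. */
    static boolean isEditable(String listType) {
        return "SimpleSymbolList".equals(listType);
    }

    public static void main(String[] args) {
        int[] lengths = {1000, 10000, 20000};
        for (int len : lengths) {
            String type = resultingListType(len, 5000, "OtherSymbolList");
            System.out.println(len + " symbols: " + type + ", editable=" + isEditable(type));
        }
    }
}
```

In this model, even a very large threshold cannot make a 20000-symbol sequence
editable, which matches the explanation that the JavaDoc approach only helps
for sequences shorter than the chunk size.
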
> > Ian Yi-Feng Chang wrote:
> >> Dear All,
> >> I have a problem while editing a RichSequence
> >> and got this exception:
> >> Exception in thread "main" org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
> >>      at org.biojava.bio.symbol.AbstractSymbolList.edit(AbstractSymbolList.java:113)
> >>      at org.biojavax.bio.seq.DummyRichSequenceHandler.edit(DummyRichSequenceHandler.java:31)
> >>      at org.biojavax.bio.seq.ThinRichSequence.edit(ThinRichSequence.java:163)
> >>      at gizmo.tools.GBKCurator.main(GBKCurator.java:176)
> >>
> >> I traced this problem on this mailing list and found the most recent
> >> thread, dated Wed Feb 20 21:33:39 EST 2008.
> >>
> >> However, I still have no idea how to
> >>
> >> Here is the solution (from the JavaDoc)
> >>
> >>
> >>  SimpleRichSequenceBuilderFactory
> >>
> >>  public SimpleRichSequenceBuilderFactory(SymbolListFactory fact, int threshold)
> >>
> >>  Creates a new instance of SimpleRichSequenceBuilderFactory that uses
> >>  a specified factory for SymbolLists longer than a specified length.
> >>  Before that a SimpleSymbolListFactory is used.
> >>
> >>  Parameters:
> >>  fact - the factory to use when building the SymbolList.
> >>  threshold - the threshold to exceed before using this factory
> >>
> >> Could you please demonstrate how to use this solution to edit a
> >> RichSequence?
> >>
> >> Thank you so much.
> >>
> >> ian chang
> >> _______________________________________________
> >> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>
> >
>
> --
> Richard Holland, BSc MBCS
> Finance Director, Eagle Genomics Ltd
> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/
>
