[Biojava-l] Editing a RichSequence
Richard Holland
holland at eaglegenomics.com
Wed Apr 22 15:50:17 UTC 2009
I forgot to mention - ChunkedSymbolListFactory is currently the only
SymbolListFactory implementation in BioJava which can accept 'streamed'
data rather than taking the whole sequence at once. So, the other
alternative to changing CHUNK_SIZE is to create a new SymbolListFactory
implementation which can accept 'streamed' data and use it to replace
the reference to ChunkedSymbolListFactory in SimpleRichSequenceBuilder.
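
For illustration only, here is a rough, self-contained sketch of the shape such a 'streamed' factory could take: it accepts symbols piece by piece and hands back a single list that still allows edits. Every type and method name here (StreamingFactorySketch, StreamingFactory, MutableSymbols, makeSymbolList, seqString, edit) is a hypothetical stand-in and not the BioJava API; a real replacement would have to implement BioJava's own interfaces and use its Alphabet and Symbol types instead of chars.

```java
// Toy model only: these types are hypothetical stand-ins, NOT the BioJava API.
final class StreamingFactorySketch {

    /** Hypothetical stand-in for a mutable symbol list. */
    static final class MutableSymbols {
        private final StringBuilder symbols = new StringBuilder();

        int length() {
            return symbols.length();
        }

        String seqString() {
            return symbols.toString();
        }

        /** Replaces `length` symbols starting at 1-based position `pos`. */
        void edit(int pos, int length, String replacement) {
            symbols.replace(pos - 1, pos - 1 + length, replacement);
        }
    }

    /** Hypothetical streaming factory: takes symbols in pieces, builds one mutable list. */
    static final class StreamingFactory {
        private final MutableSymbols target = new MutableSymbols();

        /** Appends syms[start .. start+length) to the growing list. */
        void addSymbols(char[] syms, int start, int length) {
            target.symbols.append(syms, start, length);
        }

        MutableSymbols makeSymbolList() {
            return target;
        }
    }

    public static void main(String[] args) {
        StreamingFactory factory = new StreamingFactory();
        char[] buffer = "acgtacgt".toCharArray();
        // Feed the sequence in two pieces, the way a parser streams data to a builder.
        factory.addSymbols(buffer, 0, 4);
        factory.addSymbols(buffer, 4, 4);
        MutableSymbols list = factory.makeSymbolList();
        // The result is one plain mutable list, so the edit is not vetoed.
        list.edit(1, 2, "tt");
        System.out.println(list.seqString()); // prints "ttgtacgt"
    }
}
```

The point of the sketch is the design choice: because the factory appends into a single mutable container rather than wrapping fixed-size chunks, the list it returns stays editable regardless of sequence length.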
Richard.
Richard Holland wrote:
> The problem lies in SimpleRichSequenceBuilder:
>
> public void addSymbols(Alphabet alpha, Symbol[] syms, int start, int length)
>         throws IllegalAlphabetException {
>     if (this.symbols == null) {
>         if (threshold <= 0) {
>             this.symbols = new ChunkedSymbolListFactory(this.factory);
>         } else {
>             this.symbols = new ChunkedSymbolListFactory(this.factory, threshold);
>         }
>     }
>     this.symbols.addSymbols(alpha, syms, start, length);
> }
>
> The references to ChunkedSymbolListFactory are causing the problem.
> ChunkedSymbolListFactory is supposed to perform the threshold
> checking/factory selection. However, it also applies a further layer
> of abstraction which forces all symbol lists for sequences over 16k
> (1<<14) long to be ChunkedSymbolLists, regardless of the factory
> specified - the factory only specifies what the constituent sequences
> are within the ChunkedSymbolList. ChunkedSymbolList is immutable, so it
> will not allow edits even if its constituents are mutable. However, if
> your sequence is less than 16k long, it behaves properly and you will
> get the type of sequence you asked for: SimpleSymbolList below the
> threshold, and whatever you specify above it. (SimpleSymbolList also
> happens to be the only SymbolList implementation in BioJava that is
> actually mutable at present.)
>
> As the older thread describes, ChunkedSymbolList and its factory are
> deeply embedded in the core of BioJava and are hard to change - it could
> break all kinds of things. Therefore the only real solution for now is
> to temporarily modify your local copy of ChunkedSymbolList so that
> CHUNK_SIZE is something much larger than 1<<14.
>
> thanks,
> Richard
>
> Ian Yi-Feng Chang wrote:
>> Dear All,
>> I have a problem while editing a RichSequence, and got this exception:
>> Exception in thread "main" org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
>>     at org.biojava.bio.symbol.AbstractSymbolList.edit(AbstractSymbolList.java:113)
>>     at org.biojavax.bio.seq.DummyRichSequenceHandler.edit(DummyRichSequenceHandler.java:31)
>>     at org.biojavax.bio.seq.ThinRichSequence.edit(ThinRichSequence.java:163)
>>     at gizmo.tools.GBKCurator.main(GBKCurator.java:176)
>>
>> I traced this problem on this mailing list and found the latest thread,
>> dated Wed Feb 20 21:33:39 EST 2008.
>>
>> However, I still have no idea how to
>>
>> Here is the solution (from the JavaDoc)
>>
>>
>> SimpleRichSequenceBuilderFactory
>>
>> public SimpleRichSequenceBuilderFactory(SymbolListFactory fact, int threshold)
>>
>> Creates a new instance of SimpleRichSequenceBuilderFactory that uses
>> a specified factory for SymbolLists longer than a specified length.
>> Before that a SimpleSymbolListFactory is used.
>>
>> Parameters:
>> fact - the factory to use when building the SymbolList.
>> threshold - the threshold to exceed before using this factory
>>
>> However, could you please help to demonstrate how to use this solution
>> to edit a richsequence?
>>
>> Thank you so much.
>>
>> ian chang
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/