[Biojava-l] Editing a RichSequence

Richard Holland holland at eaglegenomics.com
Wed Apr 22 15:50:17 UTC 2009


I forgot to mention - ChunkedSymbolListFactory is currently the only
SymbolListFactory implementation in BioJava which can accept 'streamed'
data rather than taking the whole sequence at once. So, the other
alternative to changing CHUNK_SIZE is to create a new SymbolListFactory
implementation which can accept 'streamed' data and use it to replace
the reference to ChunkedSymbolListFactory in SimpleRichSequenceBuilder.
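
A minimal sketch of that second alternative, in plain Java with no
BioJava dependency. StreamedSymbolBuffer and its methods are hypothetical
names, and char stands in for Symbol; a real replacement would have to
implement BioJava's SymbolListFactory and be checked against the BioJava
source. It only shows the shape of the idea: accept slices as they are
streamed in, and keep them in one growable, editable list instead of
fixed-size chunks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT a BioJava class: it only mimics the
// addSymbols(...)-style calls a sequence builder makes while parsing.
final class StreamedSymbolBuffer {
    // One growable backing list, so the result is never split into chunks.
    private final List<Character> symbols = new ArrayList<>();

    // Accept one streamed slice: 'length' symbols of 'syms' from 'start'.
    void addSymbols(char[] syms, int start, int length) {
        if (start < 0 || length < 0 || length > syms.length - start) {
            throw new IndexOutOfBoundsException(
                "bad slice: start=" + start + ", length=" + length);
        }
        for (int i = start; i < start + length; i++) {
            symbols.add(syms[i]);
        }
    }

    // Return everything received so far as a single editable list (a copy).
    List<Character> makeSymbolList() {
        return new ArrayList<>(symbols);
    }

    // Helper for inspecting a result.
    static String asString(List<Character> list) {
        StringBuilder sb = new StringBuilder(list.size());
        for (char c : list) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For example, after addSymbols("ACGT".toCharArray(), 0, 4) followed by
addSymbols("TT".toCharArray(), 0, 2), makeSymbolList() returns a
six-element list that accepts set(...) calls.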

Richard.

Richard Holland wrote:
> The problem lies in SimpleRichSequenceBuilder:
> 
>     public void addSymbols(Alphabet alpha, Symbol[] syms, int start, int length)
>             throws IllegalAlphabetException {
>         if (this.symbols==null) {
>             if (threshold<=0) {
>                 this.symbols = new ChunkedSymbolListFactory(this.factory);
>             } else {
>                 this.symbols = new ChunkedSymbolListFactory(this.factory,threshold);
>             }
>         }
>         this.symbols.addSymbols(alpha, syms, start, length);
>     }
> 
> The references to ChunkedSymbolListFactory are causing the problem.
> ChunkedSymbolListFactory is supposed to perform the threshold
> checking/factory selection. However it is also applying a further layer
> of abstraction which forces all symbol lists for sequences over 16k
> (1<<14) long to be ChunkedSymbolLists, regardless of the factory
> specified - the factory only specifies what the constituent sequences
> are within the ChunkedSymbolList. ChunkedSymbolList is immutable so will
> not allow edits even if its constituents are mutable. However if your
> sequence is less than 16k long, it behaves properly and you will get the
> type of sequence you asked for (SimpleSymbolList below the threshold,
> whatever you specify above it - SimpleSymbolList also happens to be the
> only SymbolList implementation in BioJava that is actually mutable at
> present.)
> 
> As the older thread describes, ChunkedSymbolList and its factory are
> deeply embedded in the core of BioJava and are hard to change - doing
> so could break all kinds of things. Therefore the only real solution
> for now is to temporarily modify your local copy of ChunkedSymbolList,
> changing CHUNK_SIZE to something much larger than 1<<14.
> 
> thanks,
> Richard
> 
> Ian Yi-Feng Chang wrote:
>> Dear All,
>> I have a problem editing a RichSequence, and got this exception:
>> Exception in thread "main" org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
>> 	at org.biojava.bio.symbol.AbstractSymbolList.edit(AbstractSymbolList.java:113)
>> 	at org.biojavax.bio.seq.DummyRichSequenceHandler.edit(DummyRichSequenceHandler.java:31)
>> 	at org.biojavax.bio.seq.ThinRichSequence.edit(ThinRichSequence.java:163)
>> 	at gizmo.tools.GBKCurator.main(GBKCurator.java:176)
>>
>> I traced this problem on this mailing list and found the latest
>> thread about it, from Wed Feb 20 21:33:39 EST 2008.
>>
>> However, I still have no idea how to
>>
>> Here is the solution (from the JavaDoc)
>>
>>
>> SimpleRichSequenceBuilderFactory
>>
>> public SimpleRichSequenceBuilderFactory(SymbolListFactory fact, int threshold)
>>
>> Creates a new instance of SimpleRichSequenceBuilderFactory that uses
>> a specified factory for SymbolLists longer than a specified length.
>> Before that a SimpleSymbolListFactory is used.
>>
>> Parameters:
>>   fact - the factory to use when building the SymbolList.
>>   threshold - the threshold to exceed before using this factory
>>
>> However, could you please help to demonstrate how to use this solution
>> to edit a richsequence?
>>
>> Thank you so much.
>>
>> ian chang
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
> 
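
To illustrate the behaviour described in the quoted message, here is a
toy model in plain Java. It is not BioJava code: ChunkingDemo, build and
isEditable are made-up names, char stands in for Symbol, and
java.util.AbstractList plays the role of a read-only wrapper (its set
method throws UnsupportedOperationException unless overridden). Only the
CHUNK_SIZE value of 1<<14 comes from the thread.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT BioJava code.
final class ChunkingDemo {
    static final int CHUNK_SIZE = 1 << 14; // 16384, the limit quoted in the thread

    // Up to CHUNK_SIZE symbols: one plain mutable list.
    // More than that: mutable chunks hidden behind a read-only wrapper.
    static List<Character> build(char[] syms) {
        if (syms.length <= CHUNK_SIZE) {
            List<Character> plain = new ArrayList<>(syms.length);
            for (char c : syms) {
                plain.add(c);
            }
            return plain;
        }
        final List<List<Character>> chunks = new ArrayList<>();
        for (int from = 0; from < syms.length; from += CHUNK_SIZE) {
            int to = Math.min(from + CHUNK_SIZE, syms.length);
            List<Character> chunk = new ArrayList<>(to - from);
            for (int i = from; i < to; i++) {
                chunk.add(syms[i]);
            }
            chunks.add(chunk);
        }
        final int total = syms.length;
        // Only get() and size() are provided, so set() is rejected even
        // though every underlying chunk is an ordinary mutable ArrayList.
        return new AbstractList<Character>() {
            @Override
            public Character get(int index) {
                if (index < 0 || index >= total) {
                    throw new IndexOutOfBoundsException("index " + index);
                }
                return chunks.get(index / CHUNK_SIZE).get(index % CHUNK_SIZE);
            }

            @Override
            public int size() {
                return total;
            }
        };
    }

    // True if the (non-empty) list accepts an in-place write.
    static boolean isEditable(List<Character> list) {
        try {
            list.set(0, list.get(0));
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }
}
```

In this model, isEditable(build(x)) is true for an array of exactly
CHUNK_SIZE symbols and false for one symbol more, which mirrors why
raising CHUNK_SIZE moves the point at which edits start to be vetoed.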

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


