<html><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px"><div id="yiv5364475655"><div id="yui_3_16_0_1_1423091317507_17841"><div id="yui_3_16_0_1_1423091317507_17840" style="color:#000;background-color:#fff;font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;"><div id="yiv5364475655"><div id="yiv5364475655yui_3_16_0_1_1423091317507_3683"><div id="yiv5364475655yui_3_16_0_1_1423091317507_3682" style="color:#000;background-color:#fff;font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;"><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1130482"><span id="yiv5364475655yui_3_16_0_1_1423091317507_5624">Hi Andreas,</span></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1130687"><br clear="none"><span></span></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1130689"><span class="yiv5364475655" id="yiv5364475655yui_3_16_0_1_1422406308895_1140284" style="">Yes, I took a look at </span>FastaWriterHelper as well as GenbankWriter, and they only seem to implement writing the name and sequence as fasta. They also do not allow reading/writing a mixed array of protein and DNA sequences. I asked myself what the point is of constructing a complicated sequence with annotations, features and links if I can only write fasta. <br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1137443"><br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1134142">This led me to check why one of the most basic classes of biojava, the sequence (i.e. AbstractSequence), is not serializable. 
<br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1132801"><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_11822">(Isn't it like String for java?)<br clear="none"></div><div id="yiv5364475655yui_3_16_0_1_1423091317507_11823"><br clear="none"></div></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1131198">The first thing I noticed is that for some reason every sequence has a proxyloader. As far as I understand, the proxy is implemented in order not to load the entire sequence in case it is very big. Sure, then you can load sequences which have Gigabase length. But in my 25 years of biochemistry I have never actually worked with a single sequence of > 1GB. While there are some plant chromosomes which might fit this description, I would argue that the vast majority of biological sequences are much smaller and thus do not need a proxy for a single sequence. Thus, I would conclude that only a small subset of ChromosomeSequence might need a proxyreader implementation.</div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1134143">And thus it should be implemented there and not in the most basic class?</div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1138801"><br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1138803"><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_6912">The first class which prevents serialization is, as you mentioned, NucleotideCompound. I lack the biojava experience to say what is essential in NucleotideCompound and why it does not allow an empty constructor. But I saw, for example, in biojava 3.1 that compounds are allowed to have flexible name length, which I have never seen in actual sequence data, where it is always one or three characters. Is it not a better strategy to keep basic classes such as Sequence and Compound more basic in order to allow serialization? 
Implementation of more complex features could then be moved to classes which extend the basic classes? <br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_6913"><br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_6937">In my humble opinion one could instantiate a compound without a 'base' name, and once this compound is added to the compound set, the set could check that it actually has a base name?<br clear="none"></div></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1134146"><div id="yiv5364475655yui_3_16_0_1_1423091317507_8746"><br clear="none"></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_8744">I do not want to sound like a know-it-all and I am not trying to reinvent biojava. However, to be honest, the (unsuccessful) effort of trying to serialize an ArrayList<Sequence<?>>, whether to send it around over TCP/IP, to JSON or to disk, has been so frustrating and time consuming that I am actually considering changing to jython/biopython or biojavaX, or writing my own implementation.</div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_8745"><br clear="none"></div></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1134147"><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_8135">Cheers</div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_9984">Stefan</div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1423091317507_9985"><br clear="none"></div><div id="yiv5364475655yui_3_16_0_1_1423091317507_8136"><br clear="none"></div><div id="yiv5364475655yui_3_16_0_1_1423091317507_9986"><br clear="none"></div></div><div dir="ltr" id="yiv5364475655yui_3_16_0_1_1422406308895_1134796"><br clear="none"></div> <div class="yiv5364475655qtdSeparateBR" id="yiv5364475655yui_3_16_0_1_1423091317507_3692"><br clear="none"><br clear="none"></div><div class="yiv5364475655yahoo_quoted" style="display: block;"> <div style="font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, 
Lucida Grande, sans-serif;font-size:16px;"> <div style="font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;"> <div dir="ltr"> <font face="Arial" size="2"> Andreas Prlic <andreas@sdsc.edu> wrote at 4:32 on Thursday, 5 February 2015:<br clear="none"> </font> </div> <br clear="none"><br clear="none"> <div class="yiv5364475655qtdSeparateBR"><br clear="none"><br clear="none"></div><div class="yiv5364475655yqt7223084620" id="yiv5364475655yqt41591"><div class="yiv5364475655yqt7637293266" id="yiv5364475655yqt91153"><div class="yiv5364475655y_msg_container"><div id="yiv5364475655"><div><div dir="ltr">Hi Stefan,<div><br clear="none"></div><div>Just another quick follow-up. You took a look at FastaWriterHelper and it was not useful, right? You need to serialize some header information as well, or what was the problem with it?</div><div><br clear="none"></div><div><a rel="nofollow" shape="rect" target="_blank" href="http://www.biojava.org/docs/api/org/biojava/nbio/core/sequence/io/FastaWriterHelper.html">http://www.biojava.org/docs/api/org/biojava/nbio/core/sequence/io/FastaWriterHelper.html</a><br clear="none"></div><div><br clear="none"></div><div>Thanks,</div><div><br clear="none"></div><div>Andreas</div><div><br clear="none"></div></div><div class="yiv5364475655gmail_extra"><br clear="none"><div class="yiv5364475655yqt7988422787" id="yiv5364475655yqt64151"><div class="yiv5364475655gmail_quote">On Wed, Feb 4, 2015 at 7:13 AM, Andreas Prlic <span dir="ltr"><<a rel="nofollow" shape="rect" ymailto="mailto:andreas@sdsc.edu" target="_blank" href="mailto:andreas@sdsc.edu">andreas@sdsc.edu</a>></span> wrote:<br clear="none"><blockquote class="yiv5364475655gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div dir="ltr">Thanks for pointing this out, Stefan. The problem is that the NucleotideCompound class does not have a zero-args constructor. That means you need to tweak Kryo a bit. 
Kryo can be configured to use an InstantiatorStrategy to handle creating instances of a class. <a rel="nofollow" shape="rect" target="_blank" href="https://github.com/EsotericSoftware/kryo/blob/master/README.md">https://github.com/EsotericSoftware/kryo/blob/master/README.md</a><div><br clear="none"></div><div>Having said that, we need to improve the API and make something like this easier. </div><span class="yiv5364475655HOEnZb"><font color="#888888"></font></span><div><br clear="none"></div><div>Andreas</div><div><div class="yiv5364475655h5"><div><br clear="none"><div><br clear="none"></div></div><div class="yiv5364475655gmail_extra"><br clear="none"><div class="yiv5364475655gmail_quote">On Wed, Feb 4, 2015 at 2:54 AM, stefan harjes <span dir="ltr"><<a rel="nofollow" shape="rect" ymailto="mailto:stefanharjes@yahoo.de" target="_blank" href="mailto:stefanharjes@yahoo.de">stefanharjes@yahoo.de</a>></span> wrote:<br clear="none"><blockquote class="yiv5364475655gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><div style="color:#000;background-color:#fff;font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;"><div dir="ltr"><span>I finally had some time to try the serialization/deserialization library (Kryo) you mentioned, but I do not seem to get it to work. 
I can not even save a DNASequence:</span></div><div dir="ltr"><br clear="none"><span></span></div><div dir="ltr"><span>void test() {<br clear="none"> Kryo kryo = new Kryo();<br clear="none"> DNASequence dna=null;<br clear="none"> try {<br clear="none"> dna = new DNASequence("AGCT");<br clear="none"> } catch (CompoundNotFoundException e1) {<br clear="none"> // TODO Auto-generated catch block<br clear="none"> e1.printStackTrace();<br clear="none"> }<br clear="none"> try {<br clear="none"> Output output = new Output(new FileOutputStream("test.ser"));<br clear="none"> kryo.writeObject(output, dna);<br clear="none"> output.close(); <br clear="none"> } catch (FileNotFoundException e) {<br clear="none"> // TODO Auto-generated catch block<br clear="none"> e.printStackTrace();<br clear="none"> }<br clear="none"> try {<br clear="none"> Input input = new Input(new FileInputStream("test.ser"));<br clear="none"> dna = kryo.readObject(input, DNASequence.class);<br clear="none"> input.close();<br clear="none"> } catch (FileNotFoundException e) {<br clear="none"> // TODO Auto-generated catch block<br clear="none"> System.out.println("file not found");<br clear="none"> e.printStackTrace();<br clear="none"> }<br clear="none">}<br clear="none"></span></div><div dir="ltr"><span>I tried several calls of Kryo and also registration, but I can not get it to work.... 
Any ideas?</span></div><div dir="ltr"><span><br clear="none"></span></div><div dir="ltr"><br clear="none"><span></span></div><div dir="ltr"><span>Cheers</span></div><div dir="ltr"><span>Stefan</span></div><div dir="ltr"><span></span></div> <div><br clear="none"><br clear="none"></div><div style="display:block;"> <div style="font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;"> <div style="font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;"> <div dir="ltr"> <font face="Arial"> Andreas Prlic <<a rel="nofollow" shape="rect" ymailto="mailto:andreas@sdsc.edu" target="_blank" href="mailto:andreas@sdsc.edu">andreas@sdsc.edu</a>> wrote at 3:47 on Saturday, 31 January 2015:<br clear="none"> </font> </div><div><div> <br clear="none"><br clear="none"> <div><div><div><div dir="ltr">Hi Stefan,<div><br clear="none"></div><div>For your use case (save and load at server start/stop) I'd recommend the Kryo library. It will store your data in a binary format. It should take only two lines of code each to persist and load the data. 
<a rel="nofollow" shape="rect" target="_blank" href="https://github.com/EsotericSoftware/kryo">https://github.com/EsotericSoftware/kryo</a></div><div><br clear="none"></div><div>You are right, writing is not very well developed, but then there are so many utility libraries in Java that can be used for efficient serialization/deserialization in many ways, once you have an object in memory.</div><div><br clear="none"></div><div>Andreas</div><div><br clear="none"></div><div><br clear="none"></div><div><div><br clear="none"><div>On Fri, Jan 30, 2015 at 3:01 AM, stefan harjes <span dir="ltr"><<a rel="nofollow" shape="rect" ymailto="mailto:stefanharjes@yahoo.de" target="_blank" href="mailto:stefanharjes@yahoo.de">stefanharjes@yahoo.de</a>></span> wrote:<br clear="none"><blockquote style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex;"><div style="color:rgb(0,0,0);font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;font-size:16px;background-color:rgb(255,255,255);">Hi biojava-l<div><br clear="none"><br clear="none"></div><div style="display:block;"><div style="font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;font-size:16px;"><div style="font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;font-size:16px;"><div><div><div><div><div style="color:rgb(0,0,0);font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;font-size:16px;background-color:rgb(255,255,255);"><div style="display:block;"><div style="font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;font-size:16px;"><div style="font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif;font-size:16px;"><div><div><div><div style="color:rgb(0,0,0);font-family:HelveticaNeue, 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', 
sans-serif;font-size:16px;background-color:rgb(255,255,255);"><div><br clear="none"></div><div dir="ltr">I have a huge number of small sequences in a list (ArrayList<Sequence<?>>) which I would like to store on disk at server stop and reload at server start. Unfortunately Sequence is not serializable, so I searched and found that GenbankWriterHelper.writeSequences(OutputStream os, Collection<Sequence<?>> seqs) should be able to do the job. <br clear="none"></div><div dir="ltr"><div>However, when looking at GenbankReaderHelper, there are no methods which correspond to the above writer method. Am I on the wrong track completely? <br clear="none"></div><div><br clear="none"></div><div dir="ltr">When looking at the writer/reader helpers, I think I remember reading that they are rudimentary and save only the sequence (fasta)? I would expect that in such an advanced version of biojava (4.0 is being prepared?) there must be a standard way to serialize rich sequences, or arrays of them, in order to send them around on streams, as JSON, etc.?<br clear="none"></div><div><br clear="none"></div><div>Any help would be appreciated</div></div><div dir="ltr"><br clear="none"></div><div dir="ltr">Cheers</div><span><font color="#888888"></font></span><div dir="ltr">Stefan</div><br clear="none"></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></blockquote></div><div><br clear="none"></div>
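<div dir="ltr">For the save-at-stop, load-at-start use case described in the question above, plain Java serialization may already be enough, provided the sequence type is Serializable. The following is only a minimal sketch under that assumption: SimpleSequence and SequenceStore are hypothetical names made up for illustration and are not part of BioJava, and the sketch keeps only a name, the residues and a DNA/protein flag rather than annotations or features.</div>

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

public class SequenceStore {

    /** Hypothetical minimal sequence type; NOT a BioJava class. */
    static final class SimpleSequence implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;      // e.g. the fasta header
        final String residues;  // one-letter codes, DNA or protein
        final boolean protein;  // lets DNA and protein share one array

        SimpleSequence(String name, String residues, boolean protein) {
            this.name = name;
            this.residues = residues;
            this.protein = protein;
        }
    }

    /** Writes the whole array to disk in one call (server stop). */
    static void save(Path file, SimpleSequence[] seqs) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(seqs);
        }
    }

    /** Reads the array back from disk (server start). */
    static SimpleSequence[] load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (SimpleSequence[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A mixed array of DNA and protein sequences.
        SimpleSequence[] seqs = {
            new SimpleSequence("dna1", "AGCT", false),
            new SimpleSequence("prot1", "MKV", true)
        };
        Path file = Files.createTempFile("sequences", ".ser");
        save(file, seqs);
        SimpleSequence[] restored = load(file);
        System.out.println(restored.length + " " + restored[0].residues + " " + restored[1].name);
        Files.delete(file);
    }
}
```

<div dir="ltr">Running main should print "2 AGCT prot1". Note that Java's built-in serialization does not require a zero-args constructor on the Serializable class itself, which is one reason it sidesteps the problem seen with Kryo's default instantiation.</div>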
</div></div></div></div></div><br clear="none"><br clear="none"></div> </div></div></div> </div> </div> </div></div></blockquote></div><br clear="none"><div><br clear="none"></div>
</div></div></div></div>
</blockquote></div></div><br clear="none"><br clear="all"><div><br clear="none"></div>-- <br clear="none"><div class="yiv5364475655gmail_signature"><div dir="ltr">-----------------------------------------------------------------------<br clear="none">Dr. Andreas Prlic<br clear="none">RCSB PDB Protein Data Bank<br clear="none">University of California, San Diego<div><br clear="none"></div><div>Editor Software Section <br clear="none"><div>PLOS Computational Biology<div><div><div><br clear="none"></div><div>BioJava Project Lead<br clear="none">-----------------------------------------------------------------------<br clear="none"></div></div></div></div></div></div></div>
</div></div></div><br clear="none"><br clear="none"></div></div></div> </div> </div> </div> </div></div></div></div></div></div></div></body></html>