<div dir="ltr">Thank you for your answers. I've looked through the forester sources and found some parsers (org/forester/io/parsers/<div tabindex="1" id=":bh" class="">nexus). At first sight they could be what you talked about, but I'm not sure.<br><br></div><div tabindex="1" id=":bh" class="">Cheers,<br></div><div tabindex="1" id=":bh" class="">Pola<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-11-06 13:49 GMT+01:00 Ben Stöver <span dir="ltr"><<a href="mailto:benstoever@uni-muenster.de" target="_blank">benstoever@uni-muenster.de</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
<br>
Spencer Bliven schrieb am 2014-11-06:<br>
<span class="">> Ben,<br>
<br>
> This sounds like a great idea and a really useful addition to biojava! I<br>
> would lean towards only parsing the consensus tree, as the other formats<br>
> are pretty specific use cases. We're sure forester doesn't provide Nexus<br>
> parsing, right? The documentation isn't particularly complete, but it's<br>
> already a phylo dependency so we should avoid duplicating any features.<br>
<br>
</span>No, I'm personally not 100% sure whether any Nexus features are implemented in<br>
forester, but I assumed they are not, because otherwise there would have been<br>
no need for a Nexus parsing system in BioJava 1.x.<br>
<span class=""><br>
<br>
> As to your second suggestion, it sounds very similar to how FastaReader<br>
> currently works, with the user providing a SequenceCreator which<br>
> instantiates whatever Sequence implementation you want to use. Mutable<br>
> sequences can lead to a host of additional problems, which is why the<br>
> sequences are currently generated atomically. Or am I misunderstanding your<br>
> suggestion?<br>
<br>
</span>I just looked at the code<br>
(<a href="https://github.com/biojava/biojava/blob/master/biojava3-core/src/main/java/org/biojava3/core/sequence/io/FastaReader.java" target="_blank">https://github.com/biojava/biojava/blob/master/biojava3-core/src/main/java/org/biojava3/core/sequence/io/FastaReader.java</a><br>
) and SequenceCreator does not do exactly what I meant: in the process()<br>
method of FastaReader, the whole sequence is first loaded into a StringBuilder<br>
and only afterwards passed to sequenceCreator, which means there is no compression<br>
during loading. So SequenceCreator does part of what I was thinking of, but<br>
it would not work for very large sequences. (Although I can't find it now, I<br>
think I read a similar statement somewhere in the JavaDocs of the compressed<br>
Sequence implementation.)<br>
<br>
The main benefits I still see in the idea are, first, the abstract<br>
strategy pattern for alignment parsers, which would allow writing code<br>
independent of the format used (which is not possible e.g. with the current<br>
FASTA reader), and second, editable sequences, which would of course be usable in<br>
cases you cannot really solve with the current sequence model (e.g. as<br>
the data backend for an alignment editor or the GUI components I have in<br>
LibrAlign).<br>
<br>
I'm not sure which problems you think would arise from having mutable<br>
sequences (remember: the idea was not to replace the current implementations of<br>
the Sequence interface, but to add additional mutable versions). Maybe you<br>
could give some examples? (Are you thinking about the need for change listeners or<br>
similar things?)<br>
<br>
Anyway, it was only an idea for discussion; I'm really not saying that we<br>
definitely need to go in that direction. (For my own projects I already have a<br>
mutable sequence model with bridges to the current BioJava model, so I would<br>
be fine there.) Maybe there really are problems with this idea that I<br>
currently do not see? In that case we could of course also think about just<br>
adding an interface for sequence parsers that allows them to be used in an<br>
abstract strategy pattern. (That would then really be a slight API change if<br>
the existing readers and writers implemented such an interface, but it<br>
might be possible, since a version 4 is coming anyway?)<br>
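<br>
To make the parser-interface idea a bit more concrete, here is a minimal Java<br>
sketch. All names in it (AlignmentParser, SequenceConsumer, SimpleFastaParser,<br>
LengthCounter, ParserSketch) are hypothetical and not part of BioJava, and the<br>
FASTA handling is deliberately simplified. The point is only that the calling<br>
code depends on the interface, not on the format, and that residues are handed<br>
over line by line instead of being collected into one large buffer first.<br>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical calling code: it only knows the AlignmentParser interface. */
class ParserSketch {
    /** Runs any parser strategy over the text and reports "name=length;" per sequence. */
    static String summarize(AlignmentParser parser, String text) {
        LengthCounter counter = new LengthCounter();
        try {
            parser.parse(new StringReader(text), counter);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.summary.toString();
    }

    public static void main(String[] args) {
        System.out.println(summarize(new SimpleFastaParser(), ">a\nACGT\nAC\n>b\nGG\n"));
    }
}

/** Hypothetical callback that receives sequence data piece by piece. */
interface SequenceConsumer {
    void startSequence(String name);
    void appendResidues(String chunk);
    void endSequence();
}

/** Hypothetical format-independent parser interface (the "strategy"). */
interface AlignmentParser {
    void parse(Reader source, SequenceConsumer consumer) throws IOException;
}

/** One possible strategy: a simplified FASTA parser that forwards each line as it is read. */
class SimpleFastaParser implements AlignmentParser {
    @Override
    public void parse(Reader source, SequenceConsumer consumer) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        boolean open = false;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // A new header closes the previous sequence, if there was one.
                if (open) {
                    consumer.endSequence();
                }
                consumer.startSequence(line.substring(1).trim());
                open = true;
            } else if (open) {
                // Residues are passed on immediately; a consumer could compress or edit them.
                consumer.appendResidues(line);
            }
        }
        if (open) {
            consumer.endSequence();
        }
    }
}

/** Example consumer that never stores residues, only counts them. */
class LengthCounter implements SequenceConsumer {
    final StringBuilder summary = new StringBuilder();
    private int length;

    @Override
    public void startSequence(String name) {
        summary.append(name);
        length = 0;
    }

    @Override
    public void appendResidues(String chunk) {
        length += chunk.length();
    }

    @Override
    public void endSequence() {
        summary.append('=').append(length).append(';');
    }
}
```

A Nexus or other alignment parser would implement the same AlignmentParser<br>
interface, and a mutable or compressing sequence implementation would be just<br>
another SequenceConsumer, so summarize() would not need to change.<br>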
<br>
Best<br>
<span class="HOEnZb"><font color="#888888">Ben<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
> It would be fantastic to have some additional development of multiple<br>
> alignments and the phylo package! Thanks for the offer to contribute!<br>
<br>
> -Spencer<br>
<br>
> On Thu, Nov 6, 2014 at 12:19 PM, Jose Manuel Duarte<br>
> <<a href="mailto:jose.duarte@psi.ch">jose.duarte@psi.ch</a>><br>
> wrote:<br>
<br>
> > Hi Ben<br>
<br>
> > Thanks a lot for all the insights. I am really not the most appropriate<br>
> > person to comment on all the biojava phylogeny and sequence related things<br>
> > but anyway below are some of my opinions.<br>
<br>
<br>
> > On 05/11/14 17:22, Ben Stöver wrote:<br>
<br>
<br>
<br>
> >> The more interesting/urgent thing, though, might be parsing the consensus<br>
> >> tree, which is in Nexus format (or writing the input files for MrBayes).<br>
> >> Although the Nexus format is not really state of the art anymore, and<br>
> >> replacements such as NeXML (<a href="http://nexml.org/" target="_blank">http://nexml.org/</a> ), which overcome its<br>
> >> limitations, should be preferred when implementing new software, the Nexus<br>
> >> format is still widely used, and supporting it in BioJava 3 (or 4) would<br>
> >> surely be a good idea. There was an extensible Nexus parser in BioJava 1.x<br>
> >> (<a href="http://www.biojava.org/docs/api1.9.1/org/biojavax/bio/" target="_blank">http://www.biojava.org/docs/api1.9.1/org/biojavax/bio/</a><br>
> >> phylo/io/nexus/package-summary.html<br>
> >> ) which could be ported to BioJava 3 (4). (This has never been done until<br>
> >> now, has it?)<br>
<br>
<br>
> > If I understand correctly, they have not been ported to 3 yet because of<br>
> > lack of time, so I think porting the nexus stuff would be a great thing.<br>
> > +1 to that.<br>
<br>
<br>
<br>
> >> Therefore I would offer to implement such functionality for BioJava, but<br>
> >> before making a pull request or anything, I wanted to ask for the opinion<br>
> >> of the community on that idea, and also whether I might have missed<br>
> >> concepts in BioJava that would already allow something similar.<br>
<br>
<br>
> > To me the whole idea sounds great. Especially if it can be made compatible<br>
> > with the existing Biojava interfaces. If I understand what you propose, you<br>
> > would only introduce a new way of parsing things which could even live<br>
> > alongside the current parsers. It could even go to its own package<br>
> > (sequence.nio ?). For me this is a +1 too.<br>
<br>
> > Cheers<br>
<br>
> > Jose<br>
<br>
> > _______________________________________________<br>
> > Biojava-l mailing list - <a href="mailto:Biojava-l@mailman.open-bio.org">Biojava-l@mailman.open-bio.org</a><br>
> > <a href="http://mailman.open-bio.org/mailman/listinfo/biojava-l" target="_blank">http://mailman.open-bio.org/mailman/listinfo/biojava-l</a><br>
<br>
<br>
</div></div></blockquote></div><br></div>