<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Paolo<br>
<br>
Sorry for the late reply. I have only now been able to take a quick
look at your new SearchIO module, and I think it looks fantastic. In
my opinion this is a great and much-needed feature, and I would
definitely want to use it in my own projects as soon as possible.
Some comments inline below.<br>
<br>
<div class="moz-cite-prefix">On 22.08.2015 19:26, Paolo Pavan wrote:<br>
</div>
<blockquote
cite="mid:5886D938-09A4-43D8-B7E6-8EE078485382@gmail.com"
type="cite">
<div>
<div dir="ltr">
<div dir="auto">
<div>
<div>
<div><br>
<span>Also note that this is a required part of
another module I have written that could potentially
be of interest to the community: a biojava-run module, to
name it after something you may already have heard of.
This module aims to be a generic way to run
an analysis performed by an external program. In my
case I needed an NCBI BLAST search. So the API was
written to declare a database of BioJava Sequence
objects, pass a collection of query sequences and
retrieve as output Result objects of the SearchIO
module. </span><br>
<span>I know from previous attempts discussed on the
mailing list that the direction of the project was
to reimplement the BLAST algorithm in pure Java, and
I agree that it would be a great idea. But as far as I
know that effort has not been completed yet, so I solved
the platform portability issue by including
binaries for all the platforms (well, the major ones),
packaged together in one jar file, relying on
this great Java facility. </span><br>
<span>Anyway, all this came later. </span><br>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
The biojava-run module sounds interesting too. Do you have it
anywhere on GitHub so that we can have a look at it?<br>
<br>
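In the meantime, here is a rough sketch of how I imagine the bundled-binaries approach could work: choose a resource folder from the os.name system property and copy the binary out of the jar into a temporary file. The class name, the /binaries/ resource layout and the folder names are all hypothetical, chosen only for illustration; they are not taken from your module:<br>
<pre>
```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

public class BundledBinarySketch {

    /** Maps an os.name value to a hypothetical resource folder inside the jar. */
    static String platformFolder(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Check "mac"/"darwin" before "win", because "darwin" contains "win".
        if (os.contains("mac") || os.contains("darwin")) {
            return "macosx";
        }
        if (os.contains("win")) {
            return "windows";
        }
        if (os.contains("linux")) {
            return "linux";
        }
        throw new UnsupportedOperationException("No bundled binary for: " + osName);
    }

    /** Copies a bundled binary from the classpath to a temporary executable file. */
    static Path extract(String binaryName) throws IOException {
        // Assumed layout: /binaries/PLATFORM/BINARY inside the jar.
        String resource = "/binaries/"
                + platformFolder(System.getProperty("os.name")) + "/" + binaryName;
        try (InputStream in = BundledBinarySketch.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resource);
            }
            Path target = Files.createTempFile(binaryName, "");
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            target.toFile().setExecutable(true);
            target.toFile().deleteOnExit();
            return target;
        }
    }

    public static void main(String[] args) {
        // Only reports the folder that would be used on the current machine.
        System.out.println(platformFolder(System.getProperty("os.name")));
    }
}
```
</pre>
The extracted path could then be handed to a ProcessBuilder. On Windows the temporary file would probably also need an .exe suffix.<br>
<br>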
<blockquote
cite="mid:5886D938-09A4-43D8-B7E6-8EE078485382@gmail.com"
type="cite">
<div>
<div dir="ltr">
<div dir="auto">
<div>
<div>
<div><span></span><br>
<span>Just a few technical comments on the
SearchIO module:</span><br>
<span>- it is included in the core module since it defines a new
base data structure</span><br>
<span>- it includes a dependency on biojava-alignment.
This is not compulsory; it is there because the
alignment data structure is included in that
package. In my opinion, moving this important data
structure into core will solve this and avoid similar
problems in the future. This is also the reason why
I chose to add the newly implemented Hits/Hsp etc.
directly in core; after all, search is one of the
most important tasks in bioinformatics.</span><br>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
I agree that it only makes sense to have it in core right now. However,
the dependency on biojava-alignment is not ideal. I agree with you
that, in principle, the best solution would be to move the alignment
data structure to core. For the time being, another possibility would
be to put all the new SearchIO code in biojava-alignment, but that
doesn't really improve the structure of things. I'd vote for moving
the alignment data structure, as the sounder long-term
solution. <br>
<br>
This would surely require a new minor version, so all the new
code would be released as BioJava 4.2.<br>
<br>
<br>
<blockquote
cite="mid:5886D938-09A4-43D8-B7E6-8EE078485382@gmail.com"
type="cite">
<div>
<div dir="ltr">
<div dir="auto">
<div>
<div>
<div><span>- The BlastXML parser is implemented in the
BlastXMLQuery class. Maybe this name is not very
meaningful; it comes from the original class, which is
still there in BioJava even though it does not seem to be
used much, and which I initially started to improve while
trying to stay close to the original project. This is
also why I use the XMLHelper class and added some
deprecated tags. From the old thread I
understood that BioJava has no "elective choice"
for XML parsing, but the job was
already done with the XMLHelper module, and so this
class came to new life.</span></div>
<div><span>- it was designed to be easy to extend: </span>to add
support for a new file format, a developer just has to
write a single class that implements the ResultFactory
interface (<span
style="background-color:rgba(255,255,255,0)">I have
also implemented a BLAST tabular parser to demonstrate it)</span>.
The API for the BioJava user does not change; it is just:</div>
<div><span style="background-color:rgba(255,255,255,0)"> <span>SearchIO</span>
reader <span>=</span> <span>new</span> <span>SearchIO</span>(<span>new</span>
<span>File</span>(<span><span>"BlastReport.blastxml</span><span>"</span></span>),
blastResultFactory);</span></div>
<div><br>
<span>- it is possible to auto-recognise file formats
based on the standard file extension. Just use a
different constructor:</span></div>
<div><span style="background-color:rgba(255,255,255,0)">
<span>SearchIO</span> reader <span>=</span> <span>new</span> <span>SearchIO</span>(<span>new</span> <span>File</span>("BlastReport.blastxml<span><span>"</span></span>));</span><br>
<span></span><span></span></div>
<div><span><br>
</span></div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
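To check my understanding of the extension mechanism, here is a minimal sketch of how a reader could pick a factory from the file extension. All names in it (the simplified ResultFactory interface, BlastXmlFactory, BlastTabularFactory, guessFactory and the "blasttabular" extension) are assumptions made for illustration, not the actual API of your module:<br>
<pre>
```java
import java.io.File;
import java.util.Locale;

public class SearchFormatSketch {

    /** Hypothetical, simplified stand-in for a per-format factory interface. */
    interface ResultFactory {
        /** File extensions (without the dot) that this factory can parse. */
        String[] getFileExtensions();
    }

    /** Hypothetical factory for BLAST XML reports. */
    static class BlastXmlFactory implements ResultFactory {
        public String[] getFileExtensions() {
            return new String[] {"blastxml"};
        }
    }

    /** Hypothetical factory for BLAST tabular reports (extension is assumed). */
    static class BlastTabularFactory implements ResultFactory {
        public String[] getFileExtensions() {
            return new String[] {"blasttabular"};
        }
    }

    /** Returns the lower-cased extension of the file name, or "" if there is none. */
    static String extensionOf(File file) {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        return dot == -1 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Picks the first registered factory that claims the file's extension. */
    static ResultFactory guessFactory(File file, ResultFactory[] registered) {
        String extension = extensionOf(file);
        for (ResultFactory factory : registered) {
            for (String supported : factory.getFileExtensions()) {
                if (supported.equals(extension)) {
                    return factory;
                }
            }
        }
        throw new IllegalArgumentException(
                "No factory registered for extension: " + extension);
    }

    public static void main(String[] args) {
        ResultFactory[] registered = {new BlastXmlFactory(), new BlastTabularFactory()};
        ResultFactory chosen = guessFactory(new File("BlastReport.blastxml"), registered);
        System.out.println(chosen.getClass().getSimpleName()); // prints BlastXmlFactory
    }
}
```
</pre>
With something like this, supporting a new format only means registering one more factory, and existing callers stay unchanged.<br>
<br>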
How about the different BLAST XML formats? Would it work with all
the latest ones? BLAST+ 2.2.31 introduced some modifications to
the format (see <a class="moz-txt-link-freetext" href="http://www.ncbi.nlm.nih.gov/books/NBK131777/">http://www.ncbi.nlm.nih.gov/books/NBK131777/</a>).<br>
<br>
<blockquote
cite="mid:5886D938-09A4-43D8-B7E6-8EE078485382@gmail.com"
type="cite">
<div>
<div dir="ltr">
<div dir="auto">
<div>
<div>
<div><span>If you agree that this feature would be
interesting for the project, I can send a pull
request for the SearchIO part and then also push the
run module to my GitHub. </span><br>
<span></span><br>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
I think it is a very nice addition. In my opinion you should go
ahead with the pull request, which will make it easier for everyone
to check it out and review it.<br>
<br>
Jose<br>
<br>
</body>
</html>