<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
To be honest I don't really know gmap, because I'm not into
sequencing data at all. My use case is protein database search,
which is why something like lambda, which can search both DNA and protein, is
so important for me. If I understand correctly, gmap wouldn't be able to do
protein searches, would it?<br>
<br>
In the lambda publication
(<a class="moz-txt-link-freetext" href="http://bioinformatics.oxfordjournals.org/content/30/17/i349.full">http://bioinformatics.oxfordjournals.org/content/30/17/i349.full</a>)
the authors compare it to a few other methods (blast, pauda,
rapsearch2, ublast), but not to gmap.<br>
<br>
SANSparallel is another new method which is apparently also very
fast. In their publication
(<a class="moz-txt-link-freetext" href="http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.long">http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.long</a>)
they compare it to some other methods, but again not to gmap.<br>
<br>
Jose<br>
<br>
<br>
<br>
<div class="moz-cite-prefix">On 11.05.2015 11:30, Erik McKee wrote:<br>
</div>
<blockquote
cite="mid:CABCu8hh7HOMpLMtCfr-94CZVm7KAsn6W-4Wy3eohcUwZmrMTrw@mail.gmail.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<p dir="ltr">How does gmap compare to these? </p>
<div class="gmail_quote">On May 11, 2015 5:26 AM, "Jose Manuel
Duarte" <<a moz-do-not-send="true"
href="mailto:jose.duarte@psi.ch">jose.duarte@psi.ch</a>>
wrote:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Just one more comment
regarding alternatives to blast. Recently I've come across
one that is not as sensitive as blast but a
lot faster. It's called lambda:<br>
<br>
<a moz-do-not-send="true"
href="http://www.seqan.de/projects/lambda/"
target="_blank">http://www.seqan.de/projects/lambda/</a><br>
<br>
I've tried it out and I'm very impressed with the results:
it can do full UniRef100 searches in a fraction of a second.
There are still some issues to iron out, especially in the
indexing, which is very memory and disk hungry. But all in
all it does seem to be a real alternative to blast.<br>
<br>
Their output is blast compatible: they can do either classic
pairwise output (-m 0) or tabular output (-m 8). No XML
output yet though.<br>
<br>
So this would support the case for having some kind of
framework that can deal with the results of a sequence
homology search. The actual parsers would then be
implemented on a per-case basis. <br>
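<br>
As an illustration of such a per-case parser, here is a minimal sketch that reads BLAST-style tabular output (-m 8) into simple objects. It assumes the usual 12 tab-separated columns of that format; whether lambda's tabular output matches this layout exactly is an assumption. The names TabularHitParser and TabularHit are hypothetical and not existing BioJava classes.<br>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: read BLAST-style tabular lines (12 default columns) into simple objects. */
public class TabularHitParser {

    /** Hypothetical holder for one tabular line; not an existing BioJava class. */
    public static final class TabularHit {
        public final String queryId;
        public final String subjectId;
        public final double percentIdentity;
        public final int alignmentLength;
        public final double evalue;
        public final double bitScore;

        TabularHit(String queryId, String subjectId, double percentIdentity,
                   int alignmentLength, double evalue, double bitScore) {
            this.queryId = queryId;
            this.subjectId = subjectId;
            this.percentIdentity = percentIdentity;
            this.alignmentLength = alignmentLength;
            this.evalue = evalue;
            this.bitScore = bitScore;
        }
    }

    /** Parses tab-separated lines, skipping blank lines and '#' comment lines. */
    public static List<TabularHit> parse(List<String> lines) {
        List<TabularHit> hits = new ArrayList<TabularHit>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] f = line.split("\t");
            if (f.length < 12) {
                throw new IllegalArgumentException(
                        "Expected 12 columns but got " + f.length + ": " + line);
            }
            // Assumed column order: query, subject, % identity, alignment length,
            // mismatches, gap openings, q.start, q.end, s.start, s.end, e-value, bit score
            hits.add(new TabularHit(f[0], f[1],
                    Double.parseDouble(f[2].trim()),
                    Integer.parseInt(f[3].trim()),
                    Double.parseDouble(f[10].trim()),
                    Double.parseDouble(f[11].trim())));
        }
        return hits;
    }

    public static void main(String[] args) {
        // The input line below is made up for illustration.
        List<TabularHit> hits = parse(Arrays.asList(
                "# made-up example line",
                "query1\tsubjectA\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380.0"));
        for (TabularHit h : hits) {
            System.out.println(h.queryId + " -> " + h.subjectId + " (e-value " + h.evalue + ")");
        }
    }
}
```

Only six of the twelve columns are kept here to keep the sketch short; the alignment coordinates could be added in the same way.<br>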
<br>
Jose<br>
<br>
<br>
<br>
<div>On 10.05.2015 14:04, Paolo Pavan wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>Hello!<br>
</div>
I obviously share the opinion of Peter and Jose.
Moreover, as already written, I have used this new
feature in a second project that I could also
describe and submit to biojava, if there is any
interest.<br>
<br>
About Andreas' questions:<br>
</div>
"Does your module support psiblast, rpsblast,
tblastx and blast+ and what versions?": At present, it
supports blastn, blastp, blastx, tblastn and
tblastx, version 2.2.29. I'm not sure about
psiblast and rpsblast; I should test them. <br>
But it has been designed so that updating a single
parser, or adding a new search program while
staying within the designed framework, requires
writing just a single class. This
keeps the code simple and neat, which is very important
in my opinion for future developers.<br>
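<br>
The "one class per parser" idea can be sketched roughly as follows. This is not the actual module's API: the interface ResultParser, the class TabularParser and the registry are hypothetical names chosen for illustration, and the parser is reduced to returning hit identifiers only.<br>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: a registry where supporting a new output format means adding one class. */
public class SearchParserRegistry {

    /** Hypothetical contract: one implementing class per program or output format. */
    public interface ResultParser {
        /** Returns the subject (hit) identifiers found in the given output lines. */
        List<String> parseHitIds(List<String> lines);
    }

    /** Example implementation for tab-separated output with the hit id in column 2. */
    public static final class TabularParser implements ResultParser {
        public List<String> parseHitIds(List<String> lines) {
            List<String> ids = new ArrayList<String>();
            for (String line : lines) {
                String[] f = line.split("\t");
                if (f.length >= 2 && !line.startsWith("#")) {
                    ids.add(f[1]);
                }
            }
            return ids;
        }
    }

    private final Map<String, ResultParser> parsers = new HashMap<String, ResultParser>();

    /** Registers a parser under a format name chosen by the caller. */
    public void register(String format, ResultParser parser) {
        parsers.put(format, parser);
    }

    /** Delegates to the parser registered for the format, or fails if there is none. */
    public List<String> parse(String format, List<String> lines) {
        ResultParser parser = parsers.get(format);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for format: " + format);
        }
        return parser.parseHitIds(lines);
    }

    public static void main(String[] args) {
        SearchParserRegistry registry = new SearchParserRegistry();
        registry.register("tabular", new TabularParser());
        // Made-up input lines for illustration.
        System.out.println(registry.parse("tabular",
                Arrays.asList("q1\tsubjectA\t99.0", "q1\tsubjectB\t80.0")));
    }
}
```

Client code only talks to the registry, so a parser for another program could be added by writing one more ResultParser implementation and registering it.<br>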
<br>
"the disadvantage is that you constantly need to
update them to the variant of blast plus version of
the output file format": unfortunately this is a
problem that every one of us has to face if we want to
use new ncbi programs. It happened for legacy-blast,
it happened many times for the genbank format, and it is
happening for blast+. I just hope that they will
have the kindness to state the format version explicitly inside
the xml, or else to name the program itself
differently, for example blast3 or blast++,
to avoid confusion. We can't do anything about that;
we can just try to make things simple and easy
to reuse.<br>
<br>
</div>
Just to express my opinion, I think that every bio
project should first of all address these "base
level" problems before others, to allow developers
to focus on higher-level abstractions. I'm sure that
this will be appreciated by the community, increasing
the user base of biojava.<br>
<br>
</div>
Paolo<br>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2015-05-06 12:15
GMT+02:00 Jose Manuel Duarte <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:jose.duarte@psi.ch"
target="_blank">jose.duarte@psi.ch</a>></span>:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">I'd say
that having some common data structure to
model the output of a sequence homology search
would be beneficial. For instance, a blast
alternative might appear one day (I'm eagerly
awaiting it!). The common data structure
should be able to model the outputs of any of
the different programs.<br>
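<br>
A minimal sketch of what such a common data structure could look like, following the usual Result / Hit / HSP nesting. All class and field names here are hypothetical and reduced to a few fields; a real model would carry more (alignment coordinates, aligned sequences, and so on).<br>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: a program-independent model of a sequence homology search result. */
public class SearchResultModel {

    /** One high-scoring segment pair (local alignment) between query and hit. */
    public static final class Hsp {
        public final double evalue;
        public final double bitScore;

        public Hsp(double evalue, double bitScore) {
            this.evalue = evalue;
            this.bitScore = bitScore;
        }
    }

    /** One database sequence matched by the query, with one or more HSPs. */
    public static final class Hit {
        public final String id;
        private final List<Hsp> hsps = new ArrayList<Hsp>();

        public Hit(String id) {
            this.id = id;
        }

        public void addHsp(Hsp hsp) {
            hsps.add(hsp);
        }

        public List<Hsp> getHsps() {
            return Collections.unmodifiableList(hsps);
        }

        /** Lowest e-value over all HSPs, or NaN if the hit has no HSPs. */
        public double bestEvalue() {
            double best = Double.NaN;
            for (Hsp h : hsps) {
                if (Double.isNaN(best) || h.evalue < best) {
                    best = h.evalue;
                }
            }
            return best;
        }
    }

    /** All hits for one query; 'program' records which tool produced them. */
    public static final class Result {
        public final String program;
        public final String queryId;
        private final List<Hit> hits = new ArrayList<Hit>();

        public Result(String program, String queryId) {
            this.program = program;
            this.queryId = queryId;
        }

        public void addHit(Hit hit) {
            hits.add(hit);
        }

        public List<Hit> getHits() {
            return Collections.unmodifiableList(hits);
        }
    }

    public static void main(String[] args) {
        // Made-up values for illustration.
        Result result = new Result("blastp", "query1");
        Hit hit = new Hit("subjectA");
        hit.addHsp(new Hsp(1e-10, 60.0));
        hit.addHsp(new Hsp(1e-30, 120.0));
        result.addHit(hit);
        System.out.println(result.program + ": " + result.getHits().size() + " hit(s)");
    }
}
```

Nothing in these classes is specific to blast, so output from another search program could be mapped onto the same objects by its own parser.<br>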
<br>
There are already some alternatives to blast:<br>
<br>
SANS and SANSparallel by Liisa Holm (<a
moz-do-not-send="true"
href="http://www.ncbi.nlm.nih.gov/pubmed/22962464"
target="_blank">http://www.ncbi.nlm.nih.gov/pubmed/22962464</a>,
<a moz-do-not-send="true"
href="http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.full"
target="_blank">http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.full</a>)<br>
USEARCH (commercial) (<a
moz-do-not-send="true"
href="http://drive5.com/usearch/"
target="_blank">http://drive5.com/usearch/</a>)<br>
BLAT (<a moz-do-not-send="true"
href="https://genome.ucsc.edu/FAQ/FAQblat.html#blat3"
target="_blank">https://genome.ucsc.edu/FAQ/FAQblat.html#blat3</a>)<br>
<br>
In fact SANSparallel looks very promising,
it's incredibly fast though less sensitive
than blast.<br>
<br>
Cheers<span><font color="#888888"><br>
<br>
Jose</font></span>
<div>
<div><br>
<br>
<br>
<br>
On 06.05.2015 10:47, Peter Cock wrote:<br>
</div>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div> On Wed, May 6, 2015 at 6:02 AM,
Andreas Prlic <<a
moz-do-not-send="true"
href="mailto:andreas@sdsc.edu"
target="_blank">andreas@sdsc.edu</a>>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> On
Tue, May 5, 2015 at 1:18 PM, Paolo
Pavan <<a moz-do-not-send="true"
href="mailto:paolo.pavan@gmail.com"
target="_blank">paolo.pavan@gmail.com</a>>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
As seen in other Bio projects, alongside
the Sequence IO and Alignment IO<br>
procedures it could also have a Search
result IO.<br>
</blockquote>
I never understood why other Bio*
projects have special Blast modules.<br>
Perhaps XML parsing is not as easy as
it is in Java? Please see the code at<br>
the bottom of this message.<br>
</blockquote>
Python at least has a range of XML
parsing libraries which are up to the<br>
task. However, as Paolo wrote:<br>
<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
The advantage is to define common
data structures that model Hsp,
Hits,<br>
Results without having to care about (i.e.
abstracting away) the
underlying<br>
search program.<br>
</blockquote>
</blockquote>
This is the big advantage of BioPerl and
Biopython's SearchIO module.<br>
You can at least in theory switch
between parsing BLAST XML, BLAST<br>
tabular, BLAST plain text (shudder), or
another related format without<br>
major changes to your code.<br>
<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
and the disadvantage is that you
constantly need to update them to the<br>
variant of blast plus version of the
output file format.<br>
</blockquote>
I think it is much better to have this
housekeeping done once centrally in<br>
a Bio* library than re-invented by
anyone parsing the BLAST output.<br>
However, the NCBI BLAST XML output has
been fairly stable, and the<br>
new output has a formal schema so should
be even more dependable.<br>
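<br>
For reference, a minimal sketch of reading hit identifiers from BLAST XML with only the standard Java XML API. It is not the code referred to earlier in this thread, which is not included in this message. It assumes the element names Hit and Hit_id of the older BlastOutput XML format, and the class name BlastXmlHitIds is hypothetical.<br>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Sketch: extract hit identifiers from BLAST XML using the JDK's DOM parser. */
public class BlastXmlHitIds {

    /** Returns the text of the first Hit_id element inside every Hit element. */
    public static List<String> hitIds(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Note: real BLAST XML files may carry a DOCTYPE pointing at an NCBI DTD,
            // which the parser may try to load; the sample below has none.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList hits = doc.getElementsByTagName("Hit");
            List<String> ids = new ArrayList<String>();
            for (int i = 0; i < hits.getLength(); i++) {
                Element hit = (Element) hits.item(i);
                NodeList idNodes = hit.getElementsByTagName("Hit_id");
                if (idNodes.getLength() > 0) {
                    ids.add(idNodes.item(0).getTextContent().trim());
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up, heavily shortened document using the assumed element names.
        String xml = "<BlastOutput><BlastOutput_iterations><Iteration><Iteration_hits>"
                + "<Hit><Hit_id>subjectA</Hit_id></Hit>"
                + "<Hit><Hit_id>subjectB</Hit_id></Hit>"
                + "</Iteration_hits></Iteration></BlastOutput_iterations></BlastOutput>";
        System.out.println(hitIds(xml));
    }
}
```

A DOM parser loads the whole document into memory, so for very large result files a streaming API such as StAX would probably be a better fit.<br>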
<br>
Peter<br>
</div>
</div>
<span>
_______________________________________________<br>
biojava-dev mailing list<br>
<a moz-do-not-send="true"
href="mailto:biojava-dev@mailman.open-bio.org"
target="_blank">biojava-dev@mailman.open-bio.org</a><br>
<a moz-do-not-send="true"
href="http://mailman.open-bio.org/mailman/listinfo/biojava-dev"
target="_blank">http://mailman.open-bio.org/mailman/listinfo/biojava-dev</a><br>
</span></blockquote>
<div>
<div> <br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
<br>
</blockquote>
</div>
</blockquote>
<br>
</body>
</html>