<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    To be honest, I don't really know gmap, because I'm not into

    sequencing data at all. My use case is protein database search,

    which is why something like lambda that can do both DNA and protein is

    so important for me. If I understand correctly, gmap wouldn't be able to do

    protein searches, would it?<br>

    <br>

    In the lambda publication

    (<a class="moz-txt-link-freetext" href="http://bioinformatics.oxfordjournals.org/content/30/17/i349.full">http://bioinformatics.oxfordjournals.org/content/30/17/i349.full</a>)

    the authors compare it to a few other methods (blast, pauda,

    rapsearch2, ublast), but not to gmap.<br>

    <br>

    SANSparallel is another new method which is apparently also very

    fast. In their publication

    (<a class="moz-txt-link-freetext" href="http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.long">http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.long</a>)

    they compare it to some others, but again not to gmap.<br>

    <br>

    Jose<br>

    <br>

    <br>

    <br>

    <div class="moz-cite-prefix">On 11.05.2015 11:30, Erik McKee wrote:<br>

    </div>

    <blockquote

cite="mid:CABCu8hh7HOMpLMtCfr-94CZVm7KAsn6W-4Wy3eohcUwZmrMTrw@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <p dir="ltr">How does gmap compare to these? </p>

      <div class="gmail_quote">On May 11, 2015 5:26 AM, "Jose Manuel

        Duarte" &lt;<a moz-do-not-send="true"

          href="mailto:jose.duarte@psi.ch">jose.duarte@psi.ch</a>&gt;

        wrote:<br type="attribution">

        <blockquote class="gmail_quote" style="margin:0 0 0

          .8ex;border-left:1px #ccc solid;padding-left:1ex">

          <div bgcolor="#FFFFFF" text="#000000"> Just one more comment

            regarding alternatives to blast. Recently I've come across

            such an alternative that is not as sensitive as blast but a

            lot faster. It's called lambda:<br>

            <br>

            <a moz-do-not-send="true"

              href="http://www.seqan.de/projects/lambda/"

              target="_blank">http://www.seqan.de/projects/lambda/</a><br>

            <br>

            I've tried it out and I'm very impressed with the results:

            it can do full UniRef100 searches in a fraction of a second.

            There are still some issues to iron out, especially in the

            indexing, which is very memory- and disk-hungry. But all in

            all it does seem to be a real alternative to blast.<br>

            <br>

            Their output is blast compatible: they can do either classic

            pairwise output (-m 0) or tabular output (-m 8). No XML

            output yet though.<br>

            <br>

            So this would support the case for having some kind of

            framework that can deal with the results of a sequence

            homology search. The actual parsers would then be

            implemented on a per-case basis. <br>

            <br>
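            A minimal sketch of that idea, in Python for brevity and
            under stated assumptions: the names Hsp, parse_blast_tabular,
            PARSERS and read_search_results are hypothetical, and the
            tabular input is assumed to follow the standard 12-column
            BLAST layout (whether lambda's -m 8 output matches it column
            for column would need checking). The example record is made
            up.<br>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hsp:
    """One high-scoring pair, independent of the program that produced it."""
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float


def parse_blast_tabular(text):
    """Parse 12-column BLAST-style tabular output into a list of Hsp objects.

    Assumed column order: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hsps = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 12:
            raise ValueError("expected 12 tab-separated columns: " + line)
        hsps.append(Hsp(
            query_id=fields[0],
            subject_id=fields[1],
            percent_identity=float(fields[2]),
            alignment_length=int(fields[3]),
            evalue=float(fields[10]),
            bit_score=float(fields[11]),
        ))
    return hsps


# Hypothetical registry: one parser function per output format.
PARSERS = {"blast-tab": parse_blast_tabular}


def read_search_results(text, fmt):
    """Dispatch to the parser registered for the given format name."""
    return PARSERS[fmt](text)


# Made-up example record, for illustration only.
example = "query1\tsubjectA\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t350.0\n"
hits = read_search_results(example, "blast-tab")
print(hits[0].subject_id, hits[0].evalue)
```
</pre>
            Supporting another program would then mean registering one
            more parser function, while the calling code stays the same.<br>
            <br>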

            Jose<br>

            <br>

            <br>

            <br>

            <div>On 10.05.2015 14:04, Paolo Pavan wrote:<br>

            </div>

            <blockquote type="cite">

              <div dir="ltr">

                <div>

                  <div>

                    <div>

                      <div>Hello!<br>

                      </div>

                      I obviously share the opinion of Peter and Jose.

                      Moreover, as I already wrote, I have used this new

                      feature in a second piece of work that I could also

                      describe and submit to biojava, if that is of any

                      interest.<br>

                      <br>

                      About Andreas' questions:<br>

                    </div>

                    "Does your module support psiblast, rpsblast,

                    tblastx and blast+ and what versions?": Right now, it

                    supports blastn, blastp, blastx, tblastn and

                    tblastx, version 2.2.29. I'm not sure about

                    psiblast and rpsblast; I should test them. <br>

                    But it has been designed so that updating a single

                    parser (or adding a new search program while

                    still remaining within the designed framework)

                    requires writing just a single class. This

                    will keep the code simple and neat, which is very important

                    in my opinion for future developers.<br>

                    <br>

                    "the disadvantage is that you constantly need to

                    update them to the variant of blast plus version of

                    the output file format": unfortunately, this is a

                    problem that every one of us has to face if we want to

                    use new NCBI programs. It happened for legacy-blast,

                    it happened many times for the GenBank format, and it is

                    happening for blast+. I just hope that they will

                    have the kindness to state the format version explicitly inside

                    the XML, or else to name the program itself

                    differently, for example blast3 or blast++,

                    to avoid confusion. We can't do anything about that;

                    we can only try to make things simple and easy

                    to reuse.<br>

                    <br>

                  </div>

                  Just to express my opinion, I think that every bio

                  project should first of all address these "base

                  level" problems before others, to allow developers

                  to focus on higher-level abstractions. I'm sure that

                  this will be appreciated by the community, increasing

                  the user base of biojava.<br>

                  <br>

                </div>

                Paolo<br>

                <div>

                  <div>

                    <div class="gmail_extra"><br>

                      <div class="gmail_quote">2015-05-06 12:15

                        GMT+02:00 Jose Manuel Duarte <span dir="ltr">&lt;<a

                            moz-do-not-send="true"

                            href="mailto:jose.duarte@psi.ch"

                            target="_blank">jose.duarte@psi.ch</a>&gt;</span>:<br>

                        <blockquote class="gmail_quote"

                          style="margin:0px 0px 0px

                          0.8ex;border-left:1px solid

                          rgb(204,204,204);padding-left:1ex">I'd say

                          that having some common data structure to

                          model the output of a sequence homology search

                          should be beneficial. For instance, a blast

                          alternative might appear one day (I'm eagerly

                          awaiting it!). The common data structure

                          should be able to model the outputs of any of

                          the different programs.<br>

                          <br>

                          There are already some alternatives to blast:<br>

                          <br>

                          SANS and SANSparallel by Liisa Holm (<a

                            moz-do-not-send="true"

                            href="http://www.ncbi.nlm.nih.gov/pubmed/22962464"

                            target="_blank">http://www.ncbi.nlm.nih.gov/pubmed/22962464</a>,

                          <a moz-do-not-send="true"

href="http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.full"

                            target="_blank">http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.full</a>)<br>

                          USEARCH (commercial) (<a

                            moz-do-not-send="true"

                            href="http://drive5.com/usearch/"

                            target="_blank">http://drive5.com/usearch/</a>)<br>

                          BLAT (<a moz-do-not-send="true"

                            href="https://genome.ucsc.edu/FAQ/FAQblat.html#blat3"

                            target="_blank">https://genome.ucsc.edu/FAQ/FAQblat.html#blat3</a>)<br>

                          <br>

                          In fact, SANSparallel looks very promising:

                          it's incredibly fast, though less sensitive

                          than blast.<br>

                          <br>

                          Cheers<span><font color="#888888"><br>

                              <br>

                              Jose</font></span>

                          <div>

                            <div><br>

                              <br>

                              <br>

                              <br>

                              On 06.05.2015 10:47, Peter Cock wrote:<br>

                            </div>

                          </div>

                          <blockquote class="gmail_quote"

                            style="margin:0px 0px 0px

                            0.8ex;border-left:1px solid

                            rgb(204,204,204);padding-left:1ex">

                            <div>

                              <div> On Wed, May 6, 2015 at 6:02 AM,

                                Andreas Prlic &lt;<a

                                  moz-do-not-send="true"

                                  href="mailto:andreas@sdsc.edu"

                                  target="_blank">andreas@sdsc.edu</a>&gt;

                                wrote:<br>

                                <blockquote class="gmail_quote"

                                  style="margin:0px 0px 0px

                                  0.8ex;border-left:1px solid

                                  rgb(204,204,204);padding-left:1ex"> On

                                  Tue, May 5, 2015 at 1:18 PM, Paolo

                                  Pavan &lt;<a moz-do-not-send="true"

                                    href="mailto:paolo.pavan@gmail.com"

                                    target="_blank">paolo.pavan@gmail.com</a>&gt;

                                  wrote:<br>

                                  <blockquote class="gmail_quote"

                                    style="margin:0px 0px 0px

                                    0.8ex;border-left:1px solid

                                    rgb(204,204,204);padding-left:1ex">

                                    As seen in other Bio projects, alongside

                                    the Sequence IO and Alignment IO<br>

                                    procedures it could also have a Search

                                    result IO.<br>

                                  </blockquote>

                                  I never understood why other Bio*

                                  projects have special Blast modules.<br>

                                  Perhaps XML parsing there is not as easy as

                                  it is in Java? Please see the code at<br>

                                  the bottom of this message.<br>

                                </blockquote>

                                Python at least has a range of XML

                                parsing libraries which are up to the<br>

                                task. However, as Paolo wrote:<br>

                                <br>

                                <blockquote class="gmail_quote"

                                  style="margin:0px 0px 0px

                                  0.8ex;border-left:1px solid

                                  rgb(204,204,204);padding-left:1ex">

                                  <blockquote class="gmail_quote"

                                    style="margin:0px 0px 0px

                                    0.8ex;border-left:1px solid

                                    rgb(204,204,204);padding-left:1ex">

                                    The advantage is to define common

                                    data structures that model Hsp,

                                    Hits,<br>

                                    Results without having to care about (i.e.

                                    abstracting away) the

                                    underlying<br>

                                    search program.<br>

                                  </blockquote>

                                </blockquote>

                                This is the big advantage of BioPerl and

                                Biopython's SearchIO module.<br>

                                You can at least in theory switch

                                between parsing BLAST XML, BLAST<br>

                                tabular, BLAST plain text (shudder), or

                                another related format without<br>

                                major changes to your code.<br>

                                <br>

                                <blockquote class="gmail_quote"

                                  style="margin:0px 0px 0px

                                  0.8ex;border-left:1px solid

                                  rgb(204,204,204);padding-left:1ex">

                                  and the disadvantage is that you

                                  constantly need to update them to the<br>

                                  variant of blast plus version of the

                                  output file format.<br>

                                </blockquote>

                                I think it is much better to have this

                                housekeeping done once centrally in<br>

                                a Bio* library than re-invented by

                                anyone parsing the BLAST output.<br>

                                However, the NCBI BLAST XML output has

                                been fairly stable, and the<br>

                                new output has a formal schema so should

                                be even more dependable.<br>

                                <br>

                                Peter<br>

                              </div>

                            </div>

                            <span>

                              _______________________________________________<br>

                              biojava-dev mailing list<br>

                              <a moz-do-not-send="true"

                                href="mailto:biojava-dev@mailman.open-bio.org"

                                target="_blank">biojava-dev@mailman.open-bio.org</a><br>

                              <a moz-do-not-send="true"

                                href="http://mailman.open-bio.org/mailman/listinfo/biojava-dev"

                                target="_blank">http://mailman.open-bio.org/mailman/listinfo/biojava-dev</a><br>

                            </span></blockquote>

                          <div>

                            <div> <br>


                            </div>

                          </div>

                        </blockquote>

                      </div>

                      <br>

                    </div>

                  </div>

                </div>

              </div>

            </blockquote>

            <br>

          </div>

          <br>


        </blockquote>

      </div>

    </blockquote>

    <br>

  </body>

</html>