<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    To be honest, I don't really know gmap, because I don't work with
    sequencing data at all. My use case is protein database search,
    which is why a tool like lambda that can handle both DNA and protein
    is so important for me. If I understand correctly, gmap wouldn't be
    able to do protein searches, would it?<br>
    <br>
    In the lambda publication
    (<a class="moz-txt-link-freetext" href="http://bioinformatics.oxfordjournals.org/content/30/17/i349.full">http://bioinformatics.oxfordjournals.org/content/30/17/i349.full</a>)
    the authors compare it to a few other methods (blast, pauda,
    rapsearch2, ublast), but not to gmap.<br>
    <br>
    SANSparallel is another new method which is apparently also very
    fast. In their publication
    (<a class="moz-txt-link-freetext" href="http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.long">http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.long</a>)
    they compare it to some other tools, but again not to gmap.<br>
    <br>
    Jose<br>
    <br>
    <br>
    <br>
    <div class="moz-cite-prefix">On 11.05.2015 11:30, Erik McKee wrote:<br>
    </div>
    <blockquote
cite="mid:CABCu8hh7HOMpLMtCfr-94CZVm7KAsn6W-4Wy3eohcUwZmrMTrw@mail.gmail.com"
      type="cite">
      <p dir="ltr">How does gmap compare to these? </p>
      <div class="gmail_quote">On May 11, 2015 5:26 AM, "Jose Manuel
        Duarte" &lt;<a moz-do-not-send="true"
          href="mailto:jose.duarte@psi.ch">jose.duarte@psi.ch</a>&gt;
        wrote:<br type="attribution">
        <blockquote class="gmail_quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div bgcolor="#FFFFFF" text="#000000"> Just one more comment
            regarding alternatives to blast. I recently came across
            one that is not as sensitive as blast but a lot faster. It's
            called lambda:<br>
            <br>
            <a moz-do-not-send="true"
              href="http://www.seqan.de/projects/lambda/"
              target="_blank">http://www.seqan.de/projects/lambda/</a><br>
            <br>
            I've tried it out and I'm very impressed with the results:
            it can do full UniRef100 searches in a split second.
            There are still some issues to iron out, especially in the
            indexing, which is very memory and disk hungry. But all in
            all it does seem to be a real alternative to blast.<br>
            <br>
            Their output is blast compatible: they can do either classic
            pairwise output (-m 0) or tabular output (-m 8). No XML
            output yet though.<br>
            <br>
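            As a minimal sketch of what reading the tabular output could look
            like in Java: this assumes the output follows the 12 standard
            BLAST tabular columns (query id, subject id, % identity,
            alignment length, mismatches, gap opens, query start/end,
            subject start/end, e-value, bit score). The class names
            TabularHit and TabularParser are hypothetical and not part of
            biojava or lambda.<br>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class: the fields most code needs from one tabular row.
final class TabularHit {
    final String queryId;
    final String subjectId;
    final double percentIdentity;
    final int alignmentLength;
    final double evalue;
    final double bitScore;

    TabularHit(String queryId, String subjectId, double percentIdentity,
               int alignmentLength, double evalue, double bitScore) {
        this.queryId = queryId;
        this.subjectId = subjectId;
        this.percentIdentity = percentIdentity;
        this.alignmentLength = alignmentLength;
        this.evalue = evalue;
        this.bitScore = bitScore;
    }
}

// Hypothetical parser; assumes 12 tab-separated columns per data line.
final class TabularParser {
    static List<TabularHit> parse(List<String> lines) {
        List<TabularHit> hits = new ArrayList<>();
        for (String line : lines) {
            // Skip blank lines and comment lines.
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] f = line.split("\t");
            if (f.length < 12) {
                throw new IllegalArgumentException(
                        "Expected 12 columns but got " + f.length + ": " + line);
            }
            hits.add(new TabularHit(
                    f[0],                            // column 1: query id
                    f[1],                            // column 2: subject id
                    Double.parseDouble(f[2]),        // column 3: percent identity
                    Integer.parseInt(f[3].trim()),   // column 4: alignment length
                    Double.parseDouble(f[10]),       // column 11: e-value
                    Double.parseDouble(f[11])));     // column 12: bit score
        }
        return hits;
    }
}
```

            Calling TabularParser.parse on the lines of an output file would
            give one TabularHit per row, regardless of which program wrote
            the file.<br>
            <br>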
            So this supports the case for having some kind of
            framework that can deal with the results of a sequence
            homology search. The actual parsers would then be
            implemented on a per-case basis. <br>
            <br>
            Jose<br>
            <br>
            <br>
            <br>
            <div>On 10.05.2015 14:04, Paolo Pavan wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>
                  <div>
                    <div>
                      <div>Hello!<br>
                      </div>
                      I obviously share the opinion of Peter and Jose.
                      Moreover, as I already wrote, I have used this new
                      feature in a second project that I could also
                      describe and submit to biojava, if it is of any
                      interest.<br>
                      <br>
                      About Andreas' questions:<br>
                    </div>
                    "Does your module support psiblast, rpsblast,
                    tblastx and blast+ and what versions?": At present, it
                    supports blastn, blastp, blastx, tblastn and
                    tblastx, version 2.2.29. I'm not sure about
                    psiblast and rpsblast; I would have to test them. <br>
                    But it has been designed so that updating a single
                    parser (or adding a new search program while
                    staying within the designed framework) requires
                    writing just a single class. This
                    keeps the code simple and neat, which in my opinion
                    is very important for future developers.<br>
                    <br>
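                    To illustrate the "one class per parser" idea, here is a
                    rough sketch. It is not the module's actual code: the
                    names HitIdParser, TabularHitIdParser and ParserRegistry
                    are hypothetical, and the parser only extracts subject
                    ids from tab-separated lines to keep the example short.<br>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical contract: each output format is handled by exactly one class.
interface HitIdParser {
    String formatName();
    List<String> parseHitIds(List<String> lines);
}

// One concrete parser: reads the subject id (second column) of tab-separated rows.
final class TabularHitIdParser implements HitIdParser {
    @Override
    public String formatName() {
        return "tabular";
    }

    @Override
    public List<String> parseHitIds(List<String> lines) {
        List<String> ids = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue; // skip blank and comment lines
            }
            String[] fields = line.split("\t");
            if (fields.length >= 2) {
                ids.add(fields[1]);
            }
        }
        return ids;
    }
}

// The framework side: looks parsers up by format name, knows nothing format-specific.
final class ParserRegistry {
    private final Map<String, HitIdParser> parsers = new LinkedHashMap<>();

    void register(HitIdParser parser) {
        parsers.put(parser.formatName(), parser);
    }

    List<String> parse(String format, List<String> lines) {
        HitIdParser parser = parsers.get(format);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for format: " + format);
        }
        return parser.parseHitIds(lines);
    }
}
```

                    Supporting a new program or a changed output format would
                    then mean writing one more HitIdParser implementation and
                    registering it, without touching the calling code.<br>
                    <br>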
                    "the disadvantage is that you constantly need to
                    update them to the variant of blast plus version of
                    the output file format": unfortunately this is a
                    problem that every one of us has to face if we want to
                    use new ncbi programs. It happened with legacy-blast,
                    it happened many times with the genbank format, and it is
                    happening with blast+. I just hope that they will
                    have the kindness to state the format version explicitly
                    inside the xml, or else to name the program itself
                    differently, for example blast3 or blast++,
                    to avoid confusion. We can't do anything about that;
                    we can only try to make things simple and easy
                    to reuse.<br>
                    <br>
                  </div>
                  Just to express my opinion, I think that every bio
                  project should first of all address these "base
                  level" problems, ahead of others, so that developers
                  can focus on higher-level abstractions. I'm sure that
                  this will be appreciated by the community and will
                  increase biojava's user base.<br>
                  <br>
                </div>
                Paolo<br>
                <div>
                  <div>
                    <div class="gmail_extra"><br>
                      <div class="gmail_quote">2015-05-06 12:15
                        GMT+02:00 Jose Manuel Duarte <span dir="ltr">&lt;<a
                            moz-do-not-send="true"
                            href="mailto:jose.duarte@psi.ch"
                            target="_blank">jose.duarte@psi.ch</a>&gt;</span>:<br>
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
                          0.8ex;border-left:1px solid
                          rgb(204,204,204);padding-left:1ex">I'd say
                          that having some common data structure to
                          model the output of a sequence homology search
                          would be beneficial. For instance, a blast
                          alternative might appear one day (I'm eagerly
                          awaiting it!). The common data structure
                          should be able to model the output of any of
                          the different programs.<br>
                          <br>
                          There are already some alternatives to blast:<br>
                          <br>
                          SANS and SANSparallel by Liisa Holm (<a
                            moz-do-not-send="true"
                            href="http://www.ncbi.nlm.nih.gov/pubmed/22962464"
                            target="_blank">http://www.ncbi.nlm.nih.gov/pubmed/22962464</a>,
                          <a moz-do-not-send="true"
href="http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.full"
                            target="_blank">http://nar.oxfordjournals.org/content/early/2015/04/08/nar.gkv317.full</a>)<br>
                          USEARCH (commercial) (<a
                            moz-do-not-send="true"
                            href="http://drive5.com/usearch/"
                            target="_blank">http://drive5.com/usearch/</a>)<br>
                          BLAT (<a moz-do-not-send="true"
                            href="https://genome.ucsc.edu/FAQ/FAQblat.html#blat3"
                            target="_blank">https://genome.ucsc.edu/FAQ/FAQblat.html#blat3</a>)<br>
                          <br>
                          In fact, SANSparallel looks very promising:
                          it's incredibly fast, though less sensitive
                          than blast.<br>
                          <br>
                          Cheers<span><font color="#888888"><br>
                              <br>
                              Jose</font></span>
                          <div>
                            <div><br>
                              <br>
                              <br>
                              <br>
                              On 06.05.2015 10:47, Peter Cock wrote:<br>
                            </div>
                          </div>
                          <blockquote class="gmail_quote"
                            style="margin:0px 0px 0px
                            0.8ex;border-left:1px solid
                            rgb(204,204,204);padding-left:1ex">
                            <div>
                              <div> On Wed, May 6, 2015 at 6:02 AM,
                                Andreas Prlic &lt;<a
                                  moz-do-not-send="true"
                                  href="mailto:andreas@sdsc.edu"
                                  target="_blank">andreas@sdsc.edu</a>&gt;

                                wrote:<br>
                                <blockquote class="gmail_quote"
                                  style="margin:0px 0px 0px
                                  0.8ex;border-left:1px solid
                                  rgb(204,204,204);padding-left:1ex"> On
                                  Tue, May 5, 2015 at 1:18 PM, Paolo
                                  Pavan &lt;<a moz-do-not-send="true"
                                    href="mailto:paolo.pavan@gmail.com"
                                    target="_blank">paolo.pavan@gmail.com</a>&gt;

                                  wrote:<br>
                                  <blockquote class="gmail_quote"
                                    style="margin:0px 0px 0px
                                    0.8ex;border-left:1px solid
                                    rgb(204,204,204);padding-left:1ex">
                                    As seen in other Bio projects, alongside
                                    the Sequence IO and Alignment IO<br>
                                    procedures there could also be a Search
                                    result IO.<br>
                                  </blockquote>
                                  I never understood why other Bio*
                                  projects have special Blast modules.<br>
                                  Perhaps XML parsing is not as easy as
                                  it is in Java? Please see the code at<br>
                                  the bottom of this message.<br>
                                </blockquote>
                                Python at least has a range of XML
                                parsing libraries which are up to the<br>
                                task. However, as Paolo wrote:<br>
                                <br>
                                <blockquote class="gmail_quote"
                                  style="margin:0px 0px 0px
                                  0.8ex;border-left:1px solid
                                  rgb(204,204,204);padding-left:1ex">
                                  <blockquote class="gmail_quote"
                                    style="margin:0px 0px 0px
                                    0.8ex;border-left:1px solid
                                    rgb(204,204,204);padding-left:1ex">
                                    The advantage is to define common
                                    data structures that model Hsps,
                                    Hits and<br>
                                    Results without having to care about
                                    (i.e. abstracting away) the
                                    underlying<br>
                                    search program.<br>
                                  </blockquote>
                                </blockquote>
                                This is the big advantage of BioPerl and
                                Biopython's SearchIO module.<br>
                                You can at least in theory switch
                                between parsing BLAST XML, BLAST<br>
                                tabular, BLAST plain text (shudder), or
                                another related format without<br>
                                major changes to your code.<br>
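                                <br>
                                As a rough Java sketch of such a common
                                model (this is not the BioPerl or Biopython
                                API; the classes Hsp, Hit and SearchResult
                                and their fields are hypothetical and kept
                                deliberately small), downstream code only
                                sees these types, whichever format was
                                parsed:<br>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model: one high-scoring segment pair.
final class Hsp {
    final double evalue;
    final double bitScore;

    Hsp(double evalue, double bitScore) {
        this.evalue = evalue;
        this.bitScore = bitScore;
    }
}

// Hypothetical model: one database sequence matched by the query, with its HSPs.
final class Hit {
    final String id;
    final List<Hsp> hsps;

    Hit(String id, List<Hsp> hsps) {
        this.id = id;
        this.hsps = Collections.unmodifiableList(new ArrayList<>(hsps));
    }

    // Lowest e-value among this hit's HSPs; positive infinity if there are none.
    double bestEvalue() {
        double best = Double.POSITIVE_INFINITY;
        for (Hsp hsp : hsps) {
            best = Math.min(best, hsp.evalue);
        }
        return best;
    }
}

// Hypothetical model: all hits for one query, independent of the search program.
final class SearchResult {
    final String queryId;
    final List<Hit> hits;

    SearchResult(String queryId, List<Hit> hits) {
        this.queryId = queryId;
        this.hits = Collections.unmodifiableList(new ArrayList<>(hits));
    }

    // Ids of hits whose best e-value is at or below the cutoff, in input order.
    List<String> hitIdsWithEvalueAtMost(double cutoff) {
        List<String> ids = new ArrayList<>();
        for (Hit hit : hits) {
            if (hit.bestEvalue() <= cutoff) {
                ids.add(hit.id);
            }
        }
        return ids;
    }
}
```

                                A filter like hitIdsWithEvalueAtMost is
                                written once against the model and does not
                                change when the input switches from XML to
                                tabular.<br>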
                                <br>
                                <blockquote class="gmail_quote"
                                  style="margin:0px 0px 0px
                                  0.8ex;border-left:1px solid
                                  rgb(204,204,204);padding-left:1ex">
                                  and the disadvantage is that you
                                  constantly need to update them to the<br>
                                  variant of blast plus version of the
                                  output file format.<br>
                                </blockquote>
                                I think it is much better to have this
                                housekeeping done once centrally in<br>
                                a Bio* library than re-invented by
                                everyone parsing the BLAST output.<br>
                                However, the NCBI BLAST XML output has
                                been fairly stable, and the<br>
                                new output has a formal schema so should
                                be even more dependable.<br>
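                                <br>
                                For comparison, a minimal Java sketch of
                                reading hit identifiers from BLAST XML with
                                the standard StAX API. It assumes the older
                                XML output, where each hit carries a Hit_id
                                element; the class name BlastXmlHitIds is
                                hypothetical and the element names in the
                                newer schema-based output may differ.<br>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects the text of every <Hit_id> element.
final class BlastXmlHitIds {
    static List<String> read(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not try to resolve a DOCTYPE, so parsing works offline.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "Hit_id".equals(reader.getLocalName())) {
                        // getElementText() consumes up to the matching end tag.
                        ids.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        return ids;
    }
}
```

                                The XML handling itself is short; the
                                housekeeping is in knowing which element
                                names to expect for a given output version.<br>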
                                <br>
                                Peter<br>
                              </div>
                            </div>
                            <span>
                              _______________________________________________<br>
                              biojava-dev mailing list<br>
                              <a moz-do-not-send="true"
                                href="mailto:biojava-dev@mailman.open-bio.org"
                                target="_blank">biojava-dev@mailman.open-bio.org</a><br>
                              <a moz-do-not-send="true"
                                href="http://mailman.open-bio.org/mailman/listinfo/biojava-dev"
                                target="_blank">http://mailman.open-bio.org/mailman/listinfo/biojava-dev</a><br>
                            </span></blockquote>
                        </blockquote>
                      </div>
                      <br>
                    </div>
                  </div>
                </div>
              </div>
            </blockquote>
            <br>
          </div>
          <br>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>