<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi Andreas, thanks very much. I've put together some (working) code to
    illustrate how I think this should work. The artificial sample FASTA
    file contains only one sequence:<br>

    <br>

    <br>

    <br>

    ---------------<br>

    &gt;test test<br>

    PEPTIDEK<br>

    <br>

    ---------------<br>

    If you use a larger FASTA file, the file is parsed correctly at first,
    but once the end of the file is reached, the loop just continues. I'm
    probably doing something wrong in my code, but it's not clear to me
    how to do it correctly, and that's basically my question.<br>

    <br>

    The code below loops forever; the output keeps repeating this:<br>

    <br>

    --------------<br>

    11:18:56 [main] WARN  org.biojava.nbio.core.sequence.io.FastaReader

    - Can't parse sequence 12. Got sequence of length 0!<br>

    11:18:56 [main] WARN  org.biojava.nbio.core.sequence.io.FastaReader

    - header: test test<br>

    test test<br>

    ---------------<br>

    <br>

    package nl.hecklab.bioinformatics.fastafilereaderexample;<br>

    <br>

    import java.io.IOException;<br>

    import java.io.InputStream;<br>

    import java.util.LinkedHashMap;<br>

    import java.util.logging.Level;<br>

    import java.util.logging.Logger;<br>

    import org.biojava.nbio.core.sequence.ProteinSequence;<br>

    import org.biojava.nbio.core.sequence.compound.AminoAcidCompound;<br>

    import org.biojava.nbio.core.sequence.compound.AminoAcidCompoundSet;<br>

    import org.biojava.nbio.core.sequence.io.FastaReader;<br>

    import org.biojava.nbio.core.sequence.io.GenericFastaHeaderParser;<br>

    import org.biojava.nbio.core.sequence.io.ProteinSequenceCreator;<br>

    <br>

    /**<br>

     *<br>

     * @author toorn101<br>

     */<br>

    public class App {<br>

    <br>

        public App() {<br>

            try {<br>

                InputStream inStream =

    this.getClass().getResourceAsStream("/test.fasta");<br>

                FastaReader&lt;ProteinSequence, AminoAcidCompound&gt;

    fastaReader = new FastaReader&lt;&gt;(<br>

                        inStream,<br>

                        new GenericFastaHeaderParser&lt;ProteinSequence,

    AminoAcidCompound&gt;(),<br>

                        new

    ProteinSequenceCreator(AminoAcidCompoundSet.getAminoAcidCompoundSet()));<br>

                LinkedHashMap&lt;String, ProteinSequence&gt; b;<br>

                while ((b = fastaReader.process(10)) != null) {<br>

                    for (String seq : b.keySet()) {<br>

                        System.out.println(seq);<br>

                    }<br>

                }<br>

            } catch (IOException ex) {<br>

                Logger.getLogger(App.class.getName()).log(Level.SEVERE,

    null, ex);<br>

            }<br>

        }<br>

    <br>

        public static void main(String[] args) {<br>

            new App();<br>

        }<br>

    <br>

    }<br>

    <br>
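    For illustration, here is a minimal stand-alone sketch of the
    end-of-input behaviour I was expecting: a batch call that returns an
    empty map once the input is exhausted, so the calling loop can stop.
    It does not use BioJava at all; the class BatchFastaReader and its
    method nextBatch are hypothetical names made up for this example, and
    the sketch makes no claim about how FastaReader.process is
    implemented:<br>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-alone batch reader; NOT part of BioJava. */
class BatchFastaReader {

    private final BufferedReader in;
    // Header line that was read while closing the previous batch.
    private String pendingHeader;

    BatchFastaReader(BufferedReader in) {
        this.in = in;
    }

    /**
     * Reads up to max records (header mapped to sequence string).
     * Returns an empty map once the input is exhausted.
     */
    Map<String, String> nextBatch(int max) throws IOException {
        if (max < 1) {
            throw new IllegalArgumentException("max must be at least 1");
        }
        Map<String, String> batch = new LinkedHashMap<>();
        StringBuilder seq = new StringBuilder();
        String header = pendingHeader;
        pendingHeader = null;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                if (header != null) {
                    // A new header closes the record collected so far.
                    batch.put(header, seq.toString());
                    seq.setLength(0);
                }
                if (batch.size() >= max) {
                    // Batch is full: remember this header for the next call.
                    pendingHeader = line.substring(1);
                    return batch;
                }
                header = line.substring(1);
            } else if (header != null) {
                seq.append(line);
            }
        }
        // End of input: flush the last record, if there is one.
        if (header != null) {
            batch.put(header, seq.toString());
        }
        return batch;
    }

    public static void main(String[] args) throws IOException {
        // Same content as the one-sequence sample file above.
        String fasta = ">test test\nPEPTIDEK\n\n";
        BatchFastaReader reader =
                new BatchFastaReader(new BufferedReader(new StringReader(fasta)));
        Map<String, String> batch;
        // The loop ends as soon as a call yields no records.
        while (!(batch = reader.nextBatch(10)).isEmpty()) {
            for (String header : batch.keySet()) {
                System.out.println(header); // prints "test test" once
            }
        }
    }
}
```

    With this contract the loop condition checks for an empty map rather
    than for null, so the end of the file is signalled without an
    exception or a special return value.<br>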

    <br>

    <div class="moz-cite-prefix">On 6/17/2015 7:04 AM, Andreas Prlic

      wrote:<br>

    </div>

    <blockquote

cite="mid:CALthepzgUpRr-39z9mrci84h1Tq+viqiiyGoDAUqEdLU0s2+FQ@mail.gmail.com"

      type="cite">


      <div dir="ltr">Hi Henk,

        <div><br>

        </div>

        <div>Do you want to share some code-snippets so we can help you

          debug?</div>

        <div><br>

        </div>

        <div>Thanks,</div>

        <div><br>

        </div>

        <div>Andreas</div>

        <div><br>

        </div>

        <div><br>

        </div>

      </div>

      <div class="gmail_extra"><br>

        <div class="gmail_quote">On Mon, Jun 15, 2015 at 1:58 AM, Toorn,

          H.W.P. van den (Henk) <span dir="ltr">&lt;<a

              moz-do-not-send="true"

              href="mailto:h.w.p.vandentoorn@uu.nl" target="_blank">h.w.p.vandentoorn@uu.nl</a>&gt;</span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear List,<br>

            <br>

            I've just started using BioJava 4.0.0 in my projects, and

            wanted to ask a question about parsing large Fasta files.

            There is the option to read parts of the fasta file.<br>

            <br>

            FastaReader.process(number)<br>

            <br>

            The problem I have is that it's not documented what happens

            if the file is read in its entirety. I was expecting a null

            or an empty map, or even some exception, but none happened

            and the parser kept on producing (empty) sequences.<br>

            <br>

            Could anyone enlighten me? I'm probably missing the point

            here. Maybe there is a better way to do this (there used to

            be the SequenceIterator if I remember correctly, but I can't

            find that in version 4.0).<br>

            <br>

            <br>

            <br>

            Regards, Henk<br>

            <br>

            My setup: windows 7 64-bit, java 1.8.0_45 64 bit, BioJava

            4.0.0 via Maven.<br>


          </blockquote>

        </div>

        <br>

        <br clear="all">

        <div><br>

        </div>

        -- <br>

        <div class="gmail_signature">

          <div dir="ltr">

            <div>

              <div dir="ltr">-----------------------------------------------------------------------<br>

                Dr. Andreas Prlic<br>

                RCSB PDB Protein Data Bank</div>

              <div>Technical &amp; Scientific Team Lead</div>

              <div dir="ltr">University of California, San Diego

                <div><br>

                </div>

                <div>Editor Software Section <br>

                  <div>PLOS Computational Biology

                    <div>

                      <div>

                        <div><br>

                        </div>

                        <div>BioJava Project Lead<br>

-----------------------------------------------------------------------<br>

                        </div>

                      </div>

                    </div>

                  </div>

                </div>

              </div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>


  </body>

</html>