[Biojava-l] readFasta problem

xyz mitlox at op.pl
Sun Apr 25 05:19:25 UTC 2010


On Wed, 21 Apr 2010 12:29:57 +0100
Richard Holland wrote:

> > Q1:
> > Does RichSequenceIterator read the complete file in memory and then
> > I retrieve each read from memory? Or does it read the file line by
> > line and I get each read?
> 
> 
> Line by line.

That saves memory.

> > Q2:
> > Why am I not able to retrieve the header from the following fasta
> > file:
> >> 1
> > atccccc
> >> 2
> > atccccctttttt
> >> 3
> > atccccccccccccccccctttt
> >> 4
> > tttttttccccccccccccccccccccccc
> >> 5
> > tttttttcccccccccccccccccccccca
> 
> Try the other methods on RichSequence - getName() for instance.

Thank you, getName() works.

I tried to write a FASTA file line by line with IOTools, but I got the
following error:
Exception in thread "main" java.lang.RuntimeException: Uncompilable
source code 1
        at SortFasta.main(SortFasta.java:31)
atccccc
Java Result: 1

Here is the complete code:

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.FileReader;
import org.biojava.bio.BioException;
import org.biojava.bio.seq.io.SymbolTokenization;
import org.biojava.bio.symbol.AlphabetManager;
import org.biojavax.bio.seq.RichSequence;
import org.biojavax.bio.seq.RichSequenceIterator;

public class SortFasta {

  public static void main(String[] args) throws FileNotFoundException,
  BioException {


    BufferedReader br = new BufferedReader(
            new FileReader("sortFasta.fasta"));
    String type = "DNA";
    SymbolTokenization toke = AlphabetManager.alphabetForName(type)
            .getTokenization("token");

    FileOutputStream outputFasta = new FileOutputStream("test.fasta");

    RichSequenceIterator rsi = RichSequence.IOTools.readFasta(br, toke, null);

    while (rsi.hasNext()) {
      RichSequence rs = rsi.nextRichSequence();
      System.out.println(rs.getName());
      System.out.println(rs.seqString());

      RichSequence.IOTools.writeFasta(outputFasta, rs.seqString(), null,
              rs.getName() + "1");
    }
  }
}

How can I write a FASTA file line by line?
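
For illustration only, here is a minimal sketch of writing FASTA records one
at a time using plain java.io instead of the IOTools writer. It is not the
BioJava API: the class FastaRecordWriter, the method writeRecord, and the
line width of 60 are all hypothetical names and values chosen for this
example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of BioJava: writes FASTA records one by one.
final class FastaRecordWriter {

    // Writes a single record: a ">name" header line followed by the
    // sequence, wrapped so that no sequence line exceeds lineWidth characters.
    static void writeRecord(Writer out, String name, String seq, int lineWidth)
            throws IOException {
        if (lineWidth <= 0) {
            throw new IllegalArgumentException("lineWidth must be positive");
        }
        out.write(">" + name + "\n");
        for (int i = 0; i < seq.length(); i += lineWidth) {
            out.write(seq, i, Math.min(lineWidth, seq.length() - i));
            out.write("\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // A StringWriter stands in for a file here; a FileWriter wrapped in a
        // BufferedWriter would be used the same way.
        StringWriter sw = new StringWriter();
        writeRecord(sw, "1", "atccccc", 60);
        writeRecord(sw, "2", "atccccctttttt", 60);
        System.out.print(sw);
    }
}
```

Inside the while loop above, a call such as
writeRecord(writer, rs.getName(), rs.seqString(), 60) would emit each
sequence as soon as it is read, assuming writer is opened before the loop
and closed after it.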
