[Biojava-l] Reading and writing Fastq files

xyz mitlox at op.pl
Thu Apr 8 11:30:13 UTC 2010


On Wed, 31 Mar 2010 23:56:42 -0400 (EDT)
Michael Heuer wrote:

> import static ...RichSequence.Tools.*;
> import static ...RichSequence.IOTools.*;
> 
> Fastq fastq = ...;
> Namespace namespace = ...;
> RichSequence richSequence = createRichSequence(
>   namespace,
>   fastq.getDescription(),
>   fastq.getSequence(),
>   DNATools.getDNA());
> 
> writeFasta(outputStream, richSequence, namespace);

I have tried this but I got this error:
Fastq2Fasta.java:52: cannot find symbol
symbol  : method createRichSequence(org.biojavax.SimpleNamespace,java.lang.String,java.lang.String,org.biojava.bio.symbol.FiniteAlphabet)
location: class Fastq2Fasta
      RichSequence richSequence = createRichSequence(ns,
1 error

The complete code now looks like this:

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import org.biojava.bio.program.fastq.Fastq;
import org.biojava.bio.program.fastq.FastqBuilder;
import org.biojava.bio.program.fastq.FastqReader;
import org.biojava.bio.program.fastq.FastqVariant;
import org.biojava.bio.program.fastq.FastqWriter;
import org.biojava.bio.program.fastq.IlluminaFastqReader;
import org.biojava.bio.program.fastq.IlluminaFastqWriter;
import org.biojava.bio.seq.DNATools;
import org.biojavax.SimpleNamespace;
import org.biojavax.bio.seq.RichSequence;


public class Fastq2Fasta {

  public static void main(String[] args) throws FileNotFoundException,
  IOException {

    FileInputStream inputFastq = new FileInputStream("fastq2fasta.fastq"); 
    FastqReader qReader = new IlluminaFastqReader();

    FileOutputStream outputFastq = new FileOutputStream("fastq2fastaTrim.fastq"); 
    FastqWriter qWriter = new IlluminaFastqWriter();

    //SimpleNamespace ns = new SimpleNamespace("biojava");

    FileOutputStream outputFasta = new FileOutputStream("fastq2fastaTrim.fasta");


    for (Fastq fastq : qReader.read(inputFastq)) {
      System.out.println(fastq.getDescription());
      System.out.println(fastq.getSequence());
      String trimSeq = fastq.getSequence().substring(0,
          fastq.getSequence().length() - 6);
      System.out.println(trimSeq);
      System.out.println(fastq.getQuality());
      String trimQual = fastq.getQuality().substring(0,
          fastq.getQuality().length() - 6);
      System.out.println(trimQual);

      FastqBuilder trimFastq = new FastqBuilder();
      trimFastq.withVariant(FastqVariant.FASTQ_ILLUMINA);
      trimFastq.withDescription(fastq.getDescription());
      trimFastq.appendSequence(trimSeq);
      trimFastq.appendQuality(trimQual);

      qWriter.write(outputFastq, trimFastq.build());


      SimpleNamespace ns = new SimpleNamespace("biojava");
      RichSequence richSequence = createRichSequence(ns,
              fastq.getDescription(), trimSeq, DNATools.getDNA());
      RichSequence.IOTools.writeFasta(outputFasta, richSequence, ns);
    }
  }
}

What did I do wrong?
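
A possible cause, judging only from the error text: the compiler looks for
createRichSequence in class Fastq2Fasta itself, which is what happens when a
static method of a nested class is called unqualified and no static import
brings it into scope. The quoted snippet relies on
"import static ...RichSequence.Tools.*;", and the complete code has no such
import. The sketch below reproduces that situation with stand-in types. Seq,
Seq.Tools and QualifiedCallDemo are hypothetical names for illustration and
are not BioJava classes. In the real program the corresponding fix would
presumably be to call RichSequence.Tools.createRichSequence(...) or to add the
static import; whether that call also requires handling a checked exception
is not confirmed here.

```java
// Stand-in for an interface with a nested static helper class, mirroring
// the shape of RichSequence and RichSequence.Tools.
// Seq and Seq.Tools are hypothetical names, not part of BioJava.
interface Seq {
    String name();
    String residues();

    final class Tools {
        private Tools() {}

        // Static factory method, analogous in shape to createRichSequence.
        static Seq create(final String name, final String residues) {
            return new Seq() {
                public String name() { return name; }
                public String residues() { return residues; }
            };
        }
    }
}

public class QualifiedCallDemo {
    static Seq build(String name, String residues) {
        // An unqualified create(name, residues) would not compile here:
        // "cannot find symbol ... location: class QualifiedCallDemo",
        // because nothing brings Seq.Tools.create into scope.
        // Qualifying the call with the nested class name resolves it.
        return Seq.Tools.create(name, residues);
    }

    public static void main(String[] args) {
        Seq s = build("HWI-EAS406:5:1:0:1390#0/1", "GGGTGATGGCCGCTGCCGATGGCG");
        // Print the record in FASTA layout.
        System.out.println(">" + s.name());
        System.out.println(s.residues());
    }
}
```

Qualifying the call keeps the origin of the method visible at the call site;
a static import gives shorter code at the cost of hiding where the method
comes from.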


> 
> > Suggestions:
> > 1)
> > After I trimmed the fastq files the header information for quality
> > is empty
> >
> > @HWI-EAS406:5:1:0:1390#0/1
> > GGGTGATGGCCGCTGCCGATGGCGTCAAAA
> > +
> > OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
> >
> > this reduced the size of the files but is it compatible with
> > SOAP and TopHat?
> 
> Sorry, not sure what you are asking here.
> 
Usually the @-header and the +-header are identical, e.g.
@HWI-EAS406:5:1:0:1390#0/1
+HWI-EAS406:5:1:0:1390#0/1
but after trimming and writing to a FASTQ file I get this:
@HWI-EAS406:5:1:0:1390#0/1
+
The +-header is empty. Is this OK, and is it compatible with the standard?
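
For reference, the FASTQ format as commonly described lets the "+" line
either repeat the "@" title or stand alone, and leaving out the repeat is
what makes the files smaller. Whether SOAP and TopHat accept the short form
is not confirmed here. A minimal sketch of a record check that accepts both
forms follows. It assumes four-line records with an unwrapped sequence, and
PlusLineCheck and isValidRecord are hypothetical names, not part of BioJava.

```java
import java.util.Arrays;
import java.util.List;

public class PlusLineCheck {
    // Returns true if the four lines form a plausible FASTQ record whose
    // '+' line is either bare or repeats the '@' title exactly.
    static boolean isValidRecord(List<String> lines) {
        if (lines.size() != 4) {
            return false;
        }
        String title = lines.get(0);
        String sequence = lines.get(1);
        String plus = lines.get(2);
        String quality = lines.get(3);

        if (!title.startsWith("@") || !plus.startsWith("+")) {
            return false;
        }
        // Sequence and quality strings must have the same length.
        if (sequence.length() != quality.length()) {
            return false;
        }
        // Text after '+' is optional; if present it must match the title.
        String repeated = plus.substring(1);
        return repeated.isEmpty() || repeated.equals(title.substring(1));
    }

    public static void main(String[] args) {
        List<String> bare = Arrays.asList(
            "@HWI-EAS406:5:1:0:1390#0/1", "ACGT", "+", "IIII");
        List<String> repeated = Arrays.asList(
            "@HWI-EAS406:5:1:0:1390#0/1", "ACGT",
            "+HWI-EAS406:5:1:0:1390#0/1", "IIII");
        System.out.println("bare '+' line accepted: " + isValidRecord(bare));
        System.out.println("repeated title accepted: " + isValidRecord(repeated));
    }
}
```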

Best regards,


