[Biojava-l] Reading and writing Fastq files

Richard Holland holland at eaglegenomics.com
Thu Apr 8 11:36:36 UTC 2010


You haven't included the two import static lines in your code, so the compiler cannot resolve the unqualified call to createRichSequence. See the first two lines of Michael's example code (expanding the ellipses to the full package name).
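
As a rough illustration of the same situation using only JDK classes (not BioJava): Map.Entry is nested inside Map much as Tools is nested inside RichSequence, and its static method comparingByKey can only be called unqualified when a static import is present. Going by the "import org.biojavax.bio.seq.RichSequence;" line in the quoted code, the expanded imports would presumably start with org.biojavax.bio.seq, but that is an assumption to check against your BioJava version. The class name StaticImportDemo and the method sortedKeys are made up for this sketch.

```java
import static java.util.Map.Entry.comparingByKey;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaticImportDemo {

    // Returns the keys of the map in ascending order.
    static List<String> sortedKeys(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Unqualified call: this compiles only because of the static import above.
        // Without it, javac reports "cannot find symbol" for comparingByKey(),
        // and the call would have to be written as Map.Entry.comparingByKey().
        entries.sort(comparingByKey());
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            keys.add(entry.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("read2", 30);
        counts.put("read1", 24);
        System.out.println(sortedKeys(counts));
    }
}
```

The alternative to the static import is to qualify the call with the nested class name, which is what the quoted code already does for RichSequence.IOTools.writeFasta.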

On 8 Apr 2010, at 12:30, xyz wrote:

> On Wed, 31 Mar 2010 23:56:42 -0400 (EDT)
> Michael Heuer wrote:
> 
>> import static ...RichSequence.Tools.*;
>> import static ...RichSequence.IOTools.*;
>> 
>> Fastq fastq = ...;
>> Namespace namespace = ...;
>> RichSequence richSequence = createRichSequence(
>>  namespace,
>>  fastq.getDescription(),
>>  fastq.getSequence(),
>>  DNATools.getDNA());
>> 
>> writeFasta(outputStream, richSequence, namespace);
> 
> I have tried this but I got this error:
> Fastq2Fasta.java:52: cannot find symbol
> symbol  : method createRichSequence(org.biojavax.SimpleNamespace,java.lang.String,java.lang.String,org.biojava.bio.symbol.FiniteAlphabet)
> location: class Fastq2Fasta
> RichSequence richSequence = createRichSequence(ns,
> 1 error
> 
> The complete code now looks like this:
> 
> import java.io.FileInputStream;
> import java.io.FileNotFoundException;
> import java.io.FileOutputStream;
> import java.io.IOException;
> import org.biojava.bio.program.fastq.Fastq;
> import org.biojava.bio.program.fastq.FastqBuilder;
> import org.biojava.bio.program.fastq.FastqReader;
> import org.biojava.bio.program.fastq.FastqVariant;
> import org.biojava.bio.program.fastq.FastqWriter;
> import org.biojava.bio.program.fastq.IlluminaFastqReader;
> import org.biojava.bio.program.fastq.IlluminaFastqWriter;
> import org.biojava.bio.seq.DNATools;
> import org.biojavax.SimpleNamespace;
> import org.biojavax.bio.seq.RichSequence;
> 
> 
> public class Fastq2Fasta {
> 
>  public static void main(String[] args) throws FileNotFoundException,
>  IOException {
> 
>    FileInputStream inputFastq = new FileInputStream("fastq2fasta.fastq"); 
>    FastqReader qReader = new IlluminaFastqReader();
> 
>    FileOutputStream outputFastq = new FileOutputStream("fastq2fastaTrim.fastq"); 
>    FastqWriter qWriter = new IlluminaFastqWriter();
> 
>    //SimpleNamespace ns = new SimpleNamespace("biojava");
> 
>    FileOutputStream outputFasta = new FileOutputStream("fastq2fastaTrim.fasta");
> 
> 
>    for (Fastq fastq : qReader.read(inputFastq)) {
>      System.out.println(fastq.getDescription());
>      System.out.println(fastq.getSequence());
>      String trimSeq = fastq.getSequence().substring(0,
>          fastq.getSequence().length() - 6);
>      System.out.println(trimSeq);
>      System.out.println(fastq.getQuality());
>      String trimQual = fastq.getQuality().substring(0,
>          fastq.getQuality().length() - 6);
>      System.out.println(trimQual);
> 
>      FastqBuilder trimFastq = new FastqBuilder();
>      trimFastq.withVariant(FastqVariant.FASTQ_ILLUMINA);
>      trimFastq.withDescription(fastq.getDescription());
>      trimFastq.appendSequence(trimSeq);
>      trimFastq.appendQuality(trimQual);
> 
>      qWriter.write(outputFastq, trimFastq.build());
> 
> 
>      SimpleNamespace ns = new SimpleNamespace("biojava");
>      RichSequence richSequence = createRichSequence(ns,
>              fastq.getDescription(), trimSeq, DNATools.getDNA());
>      RichSequence.IOTools.writeFasta(outputFasta, richSequence, ns);
>    }
>  }
> }
> 
> What did I do wrong?
> 
> 
>> 
>>> Suggestions:
>>> 1)
>>> After I trimmed the fastq files the header information for quality
>>> is empty
>>> 
>>> @HWI-EAS406:5:1:0:1390#0/1
>>> GGGTGATGGCCGCTGCCGATGGCGTCAAAA
>>> +
>>> OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>> 
>>> This reduced the size of the files, but is it compatible with
>>> SOAP and TopHat?
>> 
>> Sorry, not sure what you are asking here.
>> 
> Usually the @-header and the +-header are equal, e.g.
> @HWI-EAS406:5:1:0:1390#0/1
> +HWI-EAS406:5:1:0:1390#0/1
> but after trimming and writing to a fastq file I got this:
> @HWI-EAS406:5:1:0:1390#0/1
> +
> The +-header is empty. Is this OK, and is it compatible with the standard?
> 
> Best regards,
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
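
On the empty +-header: as far as I know, the FASTQ format allows the line starting with '+' either to stand alone or to repeat the identifier from the '@' line, so both forms should be acceptable to tools that follow the format. Whether SOAP and TopHat accept the short form is something to test rather than assume. A minimal sketch of that check in plain Java, not using the BioJava reader; the class PlusLineCheck and the method isValidRecord are hypothetical names for this example.

```java
public class PlusLineCheck {

    // Checks one four-line FASTQ record: '@' header, sequence, '+' line, quality.
    // The '+' line is accepted when it is bare or repeats the '@' identifier.
    static boolean isValidRecord(String header, String sequence, String plus, String quality) {
        if (!header.startsWith("@") || !plus.startsWith("+")) {
            return false;
        }
        String id = header.substring(1);
        String repeated = plus.substring(1);
        boolean plusOk = repeated.isEmpty() || repeated.equals(id);
        // Sequence and quality strings must have the same length.
        return plusOk && sequence.length() == quality.length();
    }

    public static void main(String[] args) {
        String id = "HWI-EAS406:5:1:0:1390#0/1";
        String seq = "GGGTGATGGCCGCTGCCGATGGCGTCAAAA";
        String qual = "OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO";
        // Short form, as written after trimming.
        System.out.println(isValidRecord("@" + id, seq, "+", qual));
        // Long form, with the identifier repeated on the '+' line.
        System.out.println(isValidRecord("@" + id, seq, "+" + id, qual));
    }
}
```

The check treats a '+' line with a different identifier as invalid, which is a design choice of this sketch and stricter than some readers may be.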

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/