[Biojava-l] converting fastq format

Peter Cock p.j.a.cock at googlemail.com
Wed Sep 16 07:22:54 UTC 2015


Hi Daniel,

I can't help you on the BioJava specifics, but do your input files really
have Illumina's old FASTQ quality encoding? Can you show us the first
couple of records? That is usually enough to guess.

http://dx.doi.org/10.1093/nar/gkp1137
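
As a rough illustration of the kind of guess meant here, the sketch below looks at the smallest quality character in a few records. It assumes the ASCII ranges from the paper above (fastq-sanger 33-126, fastq-solexa 59-126, fastq-illumina 64-126). The class and method names are made up for this example and are not part of BioJava, and the result is only a heuristic: high-quality Sanger data can look the same as offset-64 data.

```java
import java.util.List;

/** Hypothetical helper: heuristic guess at a FASTQ quality encoding. */
class FastqEncodingGuess {

    /** Returns a coarse label based on the smallest quality character seen. */
    static String guess(List<String> qualityLines) {
        int min = Integer.MAX_VALUE;
        for (String line : qualityLines) {
            for (int i = 0; i < line.length(); i++) {
                min = Math.min(min, line.charAt(i));
            }
        }
        if (min == Integer.MAX_VALUE) {
            return "unknown";
        }
        if (min < 59) {
            // Characters below ';' (ASCII 59) cannot occur with an offset of 64.
            return "fastq-sanger";
        }
        if (min < 64) {
            // ';' to '?' are valid Solexa scores (-5 to -1) but are below
            // the fastq-illumina range, which starts at '@' (ASCII 64).
            return "fastq-solexa or fastq-sanger";
        }
        // Nothing below '@': consistent with offset 64, but not proof of it.
        return "probably fastq-illumina";
    }

    public static void main(String[] args) {
        // Quality lines would normally come from the first few records of the file.
        System.out.println(guess(List.of("!''*((((***+", "IIIIIIIII")));
    }
}
```

A quality line containing characters such as '!' or '+' therefore points strongly at fastq-sanger, which is why a couple of records is usually enough.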

Peter

On Wed, Sep 16, 2015 at 4:28 AM, Daniel Katzel <dkatzel at gmail.com> wrote:
> Sorry if this has been asked many times, but I couldn't find it when
> searching the mailing list or popular forums. When I follow the BioJava
> cookbook to convert a Sanger fastq file into an illumina fastq I get
> validation errors.
>
> The cookbook
>
> http://biojava.org/wiki/BioJava:CookBook3:FASTQ#Convert_between_FASTQ_variants_using_streaming_API
>
> says this will work:
>
> FastqReader fastqReader = new IlluminaFastqReader();
> final FastqWriter fastqWriter = new SangerFastqWriter();
> final FileWriter fileWriter = new FileWriter(new File("sanger.fastq"));
> InputStream in = ...
>
> fastqReader.stream(in, new StreamListener()
>   {
>     @Override
>     public void fastq(final Fastq fastq)
>     {
>       fastqWriter.append(fileWriter, fastq);
>     }
>   });
>
>
> But instead it throws this error:
>
> Caused by: java.io.IOException: sequence SRR062634.1 HWI-EAS110_103327062:6:1:1092:8469/1 not fastq-illumina format, was fastq-sanger
>         at org.biojava.nbio.sequencing.io.fastq.IlluminaFastqWriter.validate(IlluminaFastqWriter.java:43)
>         at org.biojava.nbio.sequencing.io.fastq.AbstractFastqWriter.append(AbstractFastqWriter.java:62)
>         at org.biojava.nbio.sequencing.io.fastq.AbstractFastqWriter.append(AbstractFastqWriter.java:46)
>
>
> My workaround was to create a new Fastq instance inside the
> StreamListener#fastq() method to manually convert the quality chars
>
>
> char[] oldQual = fastq.getQuality().toCharArray();
> char[] newQual = new char[oldQual.length];
> for (int i = 0; i < oldQual.length; i++) {
>     newQual[i] = FastqVariant.FASTQ_ILLUMINA.quality(
>             FastqVariant.FASTQ_SANGER.qualityScore(oldQual[i]));
> }
>
> Fastq newFastq = new FastqBuilder()
>         .withDescription(fastq.getDescription())
>         .withSequence(fastq.getSequence())
>         .withQuality(new String(newQual))
>         .withVariant(FastqVariant.FASTQ_ILLUMINA)
>         .build();
> try {
>     fastqWriter.append(writer, newFastq);
> } catch (IOException e) {
>     throw new UncheckedIOException(e);
> }
>
>
> Is that the correct way to do it? Is there a better way?
>
> Thanks
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biojava-l
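
For reference, the per-character conversion in the quoted workaround comes down to an offset shift between the two encodings. The sketch below shows that arithmetic in plain Java without BioJava, assuming fastq-sanger uses PHRED scores 0-93 at ASCII offset 33 and fastq-illumina uses PHRED scores 0-62 at ASCII offset 64. The class name, method name, and the choice to cap scores above 62 are assumptions for illustration, not a description of what BioJava does.

```java
/** Hypothetical helper: shifts quality characters from offset 33 to offset 64. */
class PhredOffsetShift {
    static final int SANGER_OFFSET = 33;
    static final int SANGER_MAX_SCORE = 93;
    static final int ILLUMINA_OFFSET = 64;
    static final int ILLUMINA_MAX_SCORE = 62;

    static String sangerToIllumina(String sangerQuality) {
        StringBuilder out = new StringBuilder(sangerQuality.length());
        for (int i = 0; i < sangerQuality.length(); i++) {
            int score = sangerQuality.charAt(i) - SANGER_OFFSET;
            if (score < 0 || score > SANGER_MAX_SCORE) {
                throw new IllegalArgumentException(
                        "not a fastq-sanger quality character at index " + i);
            }
            // Assumption: scores the target encoding cannot hold are capped at 62.
            int capped = Math.min(score, ILLUMINA_MAX_SCORE);
            out.append((char) (capped + ILLUMINA_OFFSET));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // '!' is score 0 and 'I' is score 40 in fastq-sanger.
        System.out.println(sangerToIllumina("!I"));
    }
}
```

Because both variants store PHRED scores, no rescaling is needed; only conversions involving fastq-solexa require mapping between the Solexa and PHRED scales.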

