[Biojava-l] converting fastq format

Michael Heuer heuermh at gmail.com
Wed Sep 16 15:26:41 UTC 2015


Hello Daniel,

The whole point of the FASTQ APIs in biojava is to support conversions as
you are trying to do, so you shouldn't have to do it yourself.
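
For context, the conversion itself comes down to re-offsetting each quality
character: fastq-sanger stores Phred scores at ASCII offset 33 and
fastq-illumina stores them at offset 64. Here is a minimal plain-Java sketch
of that arithmetic only. It is not the biojava implementation, the names
QualityReoffset and sangerToIllumina are made up for illustration, and
clamping scores above 62 (the top of the fastq-illumina range) is an
assumption about how out-of-range values should be handled.

```java
// Hypothetical helper, not part of biojava: re-offsets a fastq-sanger
// quality string (Phred+33) into a fastq-illumina one (Phred+64).
final class QualityReoffset {
    static final int SANGER_OFFSET = 33;
    static final int SANGER_MAX_SCORE = 93;
    static final int ILLUMINA_OFFSET = 64;
    static final int ILLUMINA_MAX_SCORE = 62;

    static String sangerToIllumina(String sangerQuality) {
        StringBuilder out = new StringBuilder(sangerQuality.length());
        for (int i = 0; i < sangerQuality.length(); i++) {
            // Decode the Phred score from the Sanger character.
            int score = sangerQuality.charAt(i) - SANGER_OFFSET;
            if (score < 0 || score > SANGER_MAX_SCORE) {
                throw new IllegalArgumentException(
                    "not a fastq-sanger quality character at index " + i);
            }
            // Assumption: clamp scores that fastq-illumina cannot represent.
            int clamped = Math.min(score, ILLUMINA_MAX_SCORE);
            // Re-encode the score with the Illumina offset.
            out.append((char) (clamped + ILLUMINA_OFFSET));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Scores 0, 20 and 40 in Sanger encoding.
        System.out.println(sangerToIllumina("!5I"));
    }
}
```

The Sanger characters '!', '5' and 'I' carry scores 0, 20 and 40, so the
example prints "@Th".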

I would also be interested in seeing (some of) your input data.  If you've
found a historical edge case, or something that otherwise bends the rules,
we could add it to the repo as another test case.

   michael


On Wed, Sep 16, 2015 at 2:22 AM, Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> Hi Daniel,
>
> I can't help you on the BioJava specifics, but do your input files really
> have Illumina's old FASTQ quality encoding? Can you show us the first
> couple of records? That is usually enough to guess.
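
Such a guess can be sketched from the ASCII range of the quality characters:
fastq-sanger uses characters 33 to 126, fastq-solexa 59 to 126, and
fastq-illumina 64 to 126. The following is a rough plain-Java heuristic
under those ranges, not a biojava API; the class name GuessFastqEncoding and
the returned labels are hypothetical, and a file whose characters are all at
64 or above cannot be told apart by range alone.

```java
// Hypothetical heuristic, not part of biojava: guesses the FASTQ variant
// from the lowest and highest quality characters seen in a few records.
final class GuessFastqEncoding {
    static String guess(String... qualityLines) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (String line : qualityLines) {
            for (int i = 0; i < line.length(); i++) {
                min = Math.min(min, line.charAt(i));
                max = Math.max(max, line.charAt(i));
            }
        }
        if (min == Integer.MAX_VALUE) {
            return "unknown"; // no quality characters were supplied
        }
        if (min < 33 || max > 126) {
            return "invalid"; // outside the printable range used by FASTQ
        }
        if (min < 59) {
            return "fastq-sanger"; // only Phred+33 reaches below ';'
        }
        if (min < 64) {
            return "fastq-solexa"; // ';' to '?' are negative Solexa scores
        }
        return "ambiguous"; // all three variants can produce these characters
    }

    public static void main(String[] args) {
        System.out.println(guess("!''*((((***+))%%%++"));
        System.out.println(guess("hhhhhhhhhhhhhhgfhhh"));
    }
}
```

A low minimum is decisive, while a high one is not: uniformly high-quality
Sanger reads also fall in the ambiguous range, so more records may be needed.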
>
> http://dx.doi.org/10.1093/nar/gkp1137
>
> Peter
>
> On Wed, Sep 16, 2015 at 4:28 AM, Daniel Katzel <dkatzel at gmail.com> wrote:
> > Sorry if this has been asked many times, but I couldn't find it when
> > searching the mailing list or popular forums. When I follow the BioJava
> > cookbook to convert a Sanger FASTQ file into an Illumina FASTQ file, I
> > get validation errors.
> >
> > The cookbook
> >
> > http://biojava.org/wiki/BioJava:CookBook3:FASTQ#Convert_between_FASTQ_variants_using_streaming_API
> >
> > says this will work:
> >
> > FastqReader fastqReader = new IlluminaFastqReader();
> > final FastqWriter fastqWriter = new SangerFastqWriter();
> > final FileWriter fileWriter = new FileWriter(new File("sanger.fastq"));
> > InputStream in = ...
> >
> > fastqReader.stream(in, new StreamListener()
> >   {
> >     @Override
> >     public void fastq(final Fastq fastq)
> >     {
> >       fastqWriter.append(fileWriter, fastq);
> >     }
> >   });
> >
> >
> > But instead it throws this error:
> >
> > Caused by: java.io.IOException: sequence SRR062634.1 HWI-EAS110_103327062:6:1:1092:8469/1 not fastq-illumina format, was fastq-sanger
> >         at org.biojava.nbio.sequencing.io.fastq.IlluminaFastqWriter.validate(IlluminaFastqWriter.java:43)
> >         at org.biojava.nbio.sequencing.io.fastq.AbstractFastqWriter.append(AbstractFastqWriter.java:62)
> >         at org.biojava.nbio.sequencing.io.fastq.AbstractFastqWriter.append(AbstractFastqWriter.java:46)
> >
> >
> > My workaround was to create a new Fastq instance inside the
> > StreamListener#fastq() method to manually convert the quality chars
> >
> >
> > // Re-encode each Sanger quality char as its Illumina equivalent.
> > char[] oldQual = fastq.getQuality().toCharArray();
> > char[] newQual = new char[oldQual.length];
> > for (int i = 0; i < oldQual.length; i++) {
> >     newQual[i] = FastqVariant.FASTQ_ILLUMINA.quality(
> >         FastqVariant.FASTQ_SANGER.qualityScore(oldQual[i]));
> > }
> >
> > // Rebuild the record with the converted qualities and the new variant.
> > Fastq newFastq = new FastqBuilder()
> >     .withDescription(fastq.getDescription())
> >     .withSequence(fastq.getSequence())
> >     .withQuality(new String(newQual))
> >     .withVariant(FastqVariant.FASTQ_ILLUMINA)
> >     .build();
> > try {
> >     fastqWriter.append(writer, newFastq);
> > } catch (IOException e) {
> >     throw new UncheckedIOException(e);
> > }
> >
> >
> > Is that the correct way to do it? Is there a better way?
> >
> > Thanks
> >
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l at mailman.open-bio.org
> > http://mailman.open-bio.org/mailman/listinfo/biojava-l