[EMBOSS] Conservation of FASTQ scores by the EMBOSS tools.

Peter Rice pmr at ebi.ac.uk
Thu Sep 17 09:18:59 UTC 2009


Peter C. wrote:
> On Thu, Sep 17, 2009 at 8:24 AM, Peter Rice <pmr at ebi.ac.uk> wrote:
>>> Also, contrary to what the documentation says, using the fastq
>>> format for the output does not ignore the quality scores. (Not that
>>> that would be particularly useful, but…)
>> This is deliberate. We have to write something in FASTQ format and we
>> default to the fastq-sanger format. On input, fastq-sanger ignores qualities
>> because there is no safe way to decide which format is correct.
> 
> So again, could you reconsider making "fastq" act like "fastq-sanger"?
> The Sanger FASTQ format allows ASCII 33 to 126 for the quality scores,
> a superset of the Solexa/Illumina FASTQ variants - so even if you don't
> know which kind of FASTQ file you have, and you don't care about the
> qualities, parsing it as a Sanger FASTQ file will work.

Yes, but it is dangerous if they could really be Solexa qualities.
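
To illustrate the risk, here is a minimal Python sketch (not EMBOSS code; 
the function name is made up for this example). It assumes the usual 
offsets: 33 for Sanger, 64 for Solexa and Illumina 1.3+. The same quality 
string gives very different scores depending on which offset is applied.

```python
def decode_quality(quality, offset):
    """Turn a FASTQ quality string into integer scores for a given ASCII offset."""
    return [ord(char) - offset for char in quality]


quality = "hhhhMMMM"

# Read as Sanger (offset 33) the scores look implausibly high.
as_sanger = decode_quality(quality, 33)

# Read with offset 64 (Solexa / Illumina 1.3+) they are in the expected range.
as_offset_64 = decode_quality(quality, 64)

print(as_sanger)     # [71, 71, 71, 71, 44, 44, 44, 44]
print(as_offset_64)  # [40, 40, 40, 40, 13, 13, 13, 13]
```

The file parses without error either way, so nothing warns the user that 
every score is off by 31.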

What we could do is provide a utility that reads in fastq-sanger format 
and checks whether the quality scores make most sense as Sanger, Solexa 
or Illumina.
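
Such a check might look roughly like the Python sketch below. This is only 
an illustration, not an existing EMBOSS utility: the function name and the 
returned labels are chosen for this example, and it assumes the usual ASCII 
ranges (Sanger 33 to 126, Solexa 59 to 126, Illumina 1.3+ 64 to 126). It 
can only guess, since a file containing nothing but high qualities fits 
all three ranges.

```python
def guess_fastq_variant(quality_strings):
    """Guess the most plausible FASTQ variant from the lowest quality character.

    This is a heuristic: characters in the range 64 to 126 are valid in all
    three variants, so the answer is a best guess rather than a proof.
    """
    codes = [ord(char) for quality in quality_strings for char in quality]
    if not codes:
        raise ValueError("no quality characters to inspect")

    lowest, highest = min(codes), max(codes)
    if lowest < 33 or highest > 126:
        raise ValueError("quality characters outside the FASTQ range 33 to 126")

    if lowest < 59:
        # Only Sanger allows characters below ASCII 59.
        return "fastq-sanger"
    if lowest < 64:
        # ASCII 59 to 63 correspond to Solexa scores -5 to -1.
        return "fastq-solexa"
    # Nothing below ASCII 64: consistent with Illumina 1.3+, but a Sanger or
    # Solexa file with only high qualities would look the same.
    return "fastq-illumina"


print(guess_fastq_variant(["IIII!!"]))  # fastq-sanger
print(guess_fastq_variant([";;hh"]))    # fastq-solexa
print(guess_fastq_variant(["hhhh@@"]))  # fastq-illumina
```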

I consider reading as fastq-sanger by default to be rather dangerous.

Peter


