[Bioperl-l] Merging separate sequence and quality files to FASTQ ?

Peter biopython at maubp.freeserve.co.uk
Thu Dec 3 12:12:15 UTC 2009


On Thu, Dec 3, 2009 at 11:44 AM, Dan Bolser <dan.bolser at gmail.com> wrote:
> Hi, can someone test the script here on zero length fasta / qual files?
>
> http://www.bioperl.org/wiki/Merging_separate_sequence_and_quality_files_to_FASTQ
>
> It seems the output has an extra newline in the sequence part of the
> output, which throws off scripts that rely on the 'four lines per
> record' structure of the fastq (although I'm not sure if it's illegal
> fastq).

Hi Dan,

The OBF consensus was that FASTQ records with a zero-length
sequence might be useful, and should be output as exactly
four lines (one blank sequence line, one blank quality line).
However, when parsing, any number of blank lines should be
accepted.
http://lists.open-bio.org/pipermail/open-bio-l/2009-July/000522.html
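
As a minimal sketch of that output convention (written in Python
rather than the wiki script's Perl, and using a hypothetical helper
name, format_fastq_record, that is not part of BioPerl or Biopython),
a writer can emit exactly four lines per record even when the
sequence is empty:

```python
def format_fastq_record(title, sequence, quality):
    """Return one FASTQ record as exactly four newline-terminated lines.

    title is the text that follows '@'; sequence and quality are
    already-encoded strings and must have the same length.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # One line each for title, sequence, '+' separator and quality.
    # An empty sequence yields a blank line rather than no line at all.
    return "@%s\n%s\n+\n%s\n" % (title, sequence, quality)


# Zero-length record: still four lines, two of them blank.
record = format_fastq_record("empty_read", "", "")
print(repr(record))
```

The point of building the record from a single template is that the
line count cannot depend on the sequence length, so downstream tools
that assume four lines per record keep working.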

I can confirm the Perl script currently outputs a FASTQ file
with TWO blank lines for the sequence, giving five lines in
total for the zero-length record. That does suggest a bug.
What version of BioPerl are you running?

Peter

P.S. The script is throwing away any description after the
identifier.
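
One way to keep the description, sketched in Python under the
assumption that the identifier is the first whitespace-delimited
word of the FASTA header and everything after it is the description
(split_fasta_header is a hypothetical helper, not an existing
BioPerl or Biopython function):

```python
def split_fasta_header(header_line):
    """Split a FASTA/QUAL header line into (identifier, description)."""
    text = header_line.rstrip("\r\n")
    if text.startswith(">"):
        text = text[1:]
    # Split once on the first run of whitespace; the rest is description.
    parts = text.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


identifier, description = split_fasta_header(">read1 plate 7, well A3\n")
# Rebuild the FASTQ title line with the description kept after the id.
title = identifier if not description else identifier + " " + description
print("@" + title)
```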


