[Bioperl-l] fastq splitter

Fields, Christopher J cjfields at illinois.edu
Wed Feb 29 17:05:57 UTC 2012


No, by default the output leaves off the optional descriptor but still writes the '+' line, so each record stays four lines:

[cjfields at pyrimidine]$ cat test.pl 
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::SeqIO;

my $in = Bio::SeqIO->new(-fh => \*STDIN, -format => 'fastq');
my $out = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'fastq');
while (my $seq = $in->next_seq) {
    $out->write_seq($seq);
}
[cjfields at pyrimidine]$ perl test.pl < example.fastq 
@EAS54_6_R1_2_1_413_324
CCCTTCTTGTCTTCAGCGTTTCTCC
+
;;3;;;;;;;;;;;;7;;;;;;;88
@EAS54_6_R1_2_1_540_792
TTGGCAGGCCAAGGCCGATGGATCA
+
;;;;;;;;;;;7;;;;;-;;;3;83
@EAS54_6_R1_2_1_443_348
GTTGCTTCTGGCGTGGGTGGGGGGG
+
;;;;;;;;;;;9;7;;.7;393333
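
For anyone who wants to confirm that output like the above is still four lines per record, here is a minimal sketch in plain Perl. The check_fastq helper and the short sample records are hypothetical, made up for illustration; they are not part of BioPerl, and the check only covers simple unwrapped FASTQ (one sequence line and one quality line per record).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): returns 1 if the text is made
# of four-line FASTQ records whose '+' line is either bare or repeats the
# header, and 0 otherwise.
sub check_fastq {
    my ($text) = @_;
    my @lines = split /\n/, $text;
    return 0 if @lines % 4;                    # records must be 4 lines each
    while (my ($head, $seq, $plus, $qual) = splice(@lines, 0, 4)) {
        return 0 unless $head =~ /^\@(.+)/;    # header line starts with '@'
        my $id = $1;
        # The descriptor after '+' is optional, but must match if present.
        return 0 unless $plus eq '+' || $plus eq "+$id";
        # Sequence and quality strings must have the same length.
        return 0 unless length($seq) == length($qual);
    }
    return 1;
}

# Made-up sample records, for illustration only.
my $bare     = "\@r1\nACGT\n+\nIIII\n";      # descriptor omitted
my $repeated = "\@r1\nACGT\n+r1\nIIII\n";    # descriptor repeated
my $broken   = "\@r1\nACGT\nIIII\n";         # '+' line missing entirely

print check_fastq($bare)     ? "ok\n" : "bad\n";   # prints "ok"
print check_fastq($repeated) ? "ok\n" : "bad\n";   # prints "ok"
print check_fastq($broken)   ? "ok\n" : "bad\n";   # prints "bad"
```

Both the bare and the repeated form pass, which is the point: dropping the repeated descriptor does not drop the '+' line, so tools that expect four lines per record should not be affected.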


chris

On Feb 29, 2012, at 10:56 AM, Sean O'Keeffe wrote:

> But wouldn't that result in a 3-line FASTQ record, which would screw up other programs that expect 4 lines per record, e.g. bowtie?
> 
> On 29 February 2012 11:39, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Feb 29, 2012 at 4:30 PM, Sean O'Keeffe <limericksean at gmail.com> wrote:
> > Hi Chris,
> > Here's the perldoc for fastq - it does seem to indicate that the optional
> > descriptor (+) must match the first header. (See DESCRIPTION).
> 
> I.e., if present, it must match. But the repeated descriptor can
> (and for space efficiency should) be omitted.
> 
> As Chris mentioned earlier, there are sample files in the test suite
> which omit the repeated descriptor so this should be working OK.
> 
> Peter
> 