[Bioperl-l] randomizing fastq sequences
simon andrews (BI)
simon.andrews at bbsrc.ac.uk
Tue Feb 8 08:41:10 UTC 2011
On 7 Feb 2011, at 22:07, shalu sharma wrote:
> Hi,
> i am trying to test one program for which i need to change order of
> sequences in a fastq file.
> My fastq file contains about 50,000 sequences.
> Is there any script that can do this task?
Since FastQ is supported in SeqIO you could do something like (untested):
#!/usr/bin/perl
use warnings;
use strict;
use List::Util 'shuffle';
use Bio::SeqIO;

# Read every record from the input file into memory
my @seqs;
my $in = Bio::SeqIO->new(-file   => 'your_input.fastq',
                         -format => 'Fastq');
while (my $seq = $in->next_seq()) {
    push @seqs, $seq;
}

# Put the records into a random order
@seqs = shuffle(@seqs);

# Write the shuffled records to a new file
my $out = Bio::SeqIO->new(-file   => '>your_output.fastq',
                          -format => 'Fastq');
foreach my $seq (@seqs) {
    $out->write_seq($seq);
}
## End
This has the disadvantage that it will hold all of the sequences in memory whilst shuffling, but I don't think there's an easy way around that.
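That said, if memory did become a problem, a rough sketch of one workaround would be to keep only the byte offset of each record in memory, shuffle the offsets, and then seek back into the input file to copy the records out in the new order. The version below is an assumption-laden sketch rather than a tested solution: it bypasses Bio::SeqIO completely, it assumes every record is exactly four lines (no wrapped sequence or quality lines), it does no validation of the records, and the name shuffle_fastq_by_offset is just a made-up helper, not part of BioPerl.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use List::Util 'shuffle';

# Hypothetical helper (not part of BioPerl): shuffle a FASTQ file while
# holding only one byte offset per record in memory.
# Assumption: every record is exactly four lines (no wrapped lines).
sub shuffle_fastq_by_offset {
    my ($in_path, $out_path) = @_;

    open(my $in,  '<', $in_path)  or die "Cannot read $in_path: $!";
    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";

    # First pass: remember where each four-line record starts.
    my @offsets;
    while (1) {
        my $pos    = tell($in);
        my $header = <$in>;
        last unless defined $header;
        next if $header =~ /^\s*$/;    # tolerate stray blank lines
        push @offsets, $pos;
        # Skip the sequence, separator and quality lines.
        for (1 .. 3) {
            defined(my $line = <$in>)
                or die "Truncated record at byte $pos";
        }
    }

    # Second pass: visit the records in random order and copy each one.
    foreach my $pos (shuffle(@offsets)) {
        seek($in, $pos, 0) or die "Cannot seek in $in_path: $!";
        for (1 .. 4) {
            my $line = <$in>;
            chomp $line;               # normalise a missing final newline
            print {$out} "$line\n";
        }
    }

    close($in);
    close($out) or die "Cannot close $out_path: $!";
    return scalar @offsets;            # number of records written
}

# Example call with placeholder file names:
# shuffle_fastq_by_offset('your_input.fastq', 'your_output.fastq');
```

For a file of about 50,000 sequences the simpler in-memory version above is probably fine; this variant only trades one extra pass and some random disk access for a smaller memory footprint.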
Simon.