[Bioperl-l] fastq splitter

Sean O'Keeffe limericksean at gmail.com
Wed Feb 29 17:33:01 UTC 2012


Yes. I ran my script on a cluster, and I'm not sure whether BioPerl is
installed there.
Running it locally works.

Thanks all!



On 29 February 2012 12:13, Fields, Christopher J <cjfields at illinois.edu> wrote:

> Sean,
>
> To follow up, just in case it was a bug: I tested with your seq examples and
> they also work, so my guess is that something else is wrong locally.
>
> [cjfields@pyrimidine-laptop sean]$ perl test.pl < example2.fastq
> @HWI-ST156:445:C0EDLACXX:4:1101:1496:1039 1:N:0:ATCACG
> CTGCTGGTAGTGCCCAAAGACCTCGAATACAATGGGCTTGGTTTTGATGT
> +
> BCCFFFFEHHHHHJJJJJHIIJIJJIIGIJJJJJJJIJJJI?FHJJIIJA
> @HWI-ST156:445:C0EDLACXX:4:2308:20877:199811 2:Y:0:ATCACG
> TCATAAAAATAACAAAACCACCACCCCATACAAACTCTACTCATCTCCAC
> +
> ##################################################
>
> chris
>
> On Feb 28, 2012, at 3:11 PM, Sean O'Keeffe wrote:
>
> > Hi,
> > I'm trying to write a quick script to separate one large PE fastq file into
> > 2 separate files, one for each mate pair
> >
> > The file is of the format (mate1)
> > @HWI-ST156:445:C0EDLACXX:4:1101:1496:1039 1:N:0:ATCACG
> > CTGCTGGTAGTGCCCAAAGACCTCGAATACAATGGGCTTGGTTTTGATGT
> > +
> > BCCFFFFEHHHHHJJJJJHIIJIJJIIGIJJJJJJJIJJJI?FHJJIIJA
> >
> > && (mate2)
> >
> > @HWI-ST156:445:C0EDLACXX:4:2308:20877:199811 2:Y:0:ATCACG
> > TCATAAAAATAACAAAACCACCACCCCATACAAACTCTACTCATCTCCAC
> > +
> > ##################################################
> >
> >
> > My idea is to separate using a regex such that / 1:/ would be the first
> > mate pair and / 2:/ would go in the second mate file.
> > I implemented the code below but each output file is empty. Can someone
> > spot my error?
> >
> > Thanks,
> > Sean.
> >
> > my $infile   = shift;
> > my $outfile1 = $infile."_1";
> > my $outfile2 = $infile."_2";
> >
> > my $seqin = Bio::SeqIO->new(
> >                             -file   => "<$infile",
> >                             -format => "fastq",
> >                             );
> > my $seqout1 = Bio::SeqIO->new(
> >                              -file   => ">$outfile1",
> >                              -format => "fastq",
> >                              );
> >
> > my $seqout2 = Bio::SeqIO->new(
> >                              -file   => ">$outfile2",
> >                              -format => "fastq",
> >                              );
> > while (my $inseq = $seqin->next_seq) {
> >    if ($seqin->desc =~ / 1:/){
> >      $seqout1->write_seq($inseq);
> >    } else {
> >      $seqout2->write_seq($inseq);
> >    }
> > }
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
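
The idea described in the thread (route each record to one of two files
according to the mate number in its header line) can also be sketched in plain
Perl, without BioPerl. This is only an illustrative sketch, not the script that
Sean or Chris ran. It assumes strict four-line records and headers of the form
shown in the examples above; the names mate_number and split_fastq are
hypothetical. It matches against the raw header line, so the whitespace before
the mate number is still there to match on.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return 1 or 2 for a header such as
# "@HWI-ST156:445:C0EDLACXX:4:1101:1496:1039 1:N:0:ATCACG",
# or undef if no mate field follows the read name.
sub mate_number {
    my ($header) = @_;
    return $header =~ /^\@\S+\s+([12]):/ ? $1 : undef;
}

# Copy each four-line record from $in to $out1 (mate 1) or $out2 (mate 2).
# Assumes unwrapped FASTQ: header, sequence, '+', qualities.
# Returns the number of records written to each handle.
sub split_fastq {
    my ($in, $out1, $out2) = @_;
    my @count = (0, 0);
    while (defined(my $header = <$in>)) {
        next if $header =~ /^\s*$/;    # tolerate blank lines between records
        my @rest;
        for (1 .. 3) {
            my $line = <$in>;
            die "Truncated record after: $header" unless defined $line;
            push @rest, $line;
        }
        my $mate = mate_number($header)
            or die "No mate field in header: $header";
        print { $mate == 1 ? $out1 : $out2 } $header, @rest;
        $count[$mate - 1]++;
    }
    return @count;
}

# When a file name is given, write <file>_1 and <file>_2
# (the output naming used in the original script).
if (@ARGV) {
    my $infile = shift @ARGV;
    open my $in,   '<', $infile       or die "Cannot read $infile: $!";
    open my $out1, '>', "${infile}_1" or die "Cannot write ${infile}_1: $!";
    open my $out2, '>', "${infile}_2" or die "Cannot write ${infile}_2: $!";
    my ($n1, $n2) = split_fastq($in, $out1, $out2);
    close $_ for $in, $out1, $out2;
    print STDERR "mate 1: $n1 records, mate 2: $n2 records\n";
}
```

Saved under a hypothetical name such as split_fastq.pl, it would be run as
"perl split_fastq.pl reads.fastq". Dying on a header without a mate field,
rather than sending it to the second file by default, makes an unexpected
header format visible instead of silently misrouting reads.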


