[Bioperl-l] fastq splitter

Wed Feb 29 15:32:55 UTC 2012

On Wed, Feb 29, 2012 at 3:27 PM, Fields, Christopher J
<cjfields at illinois.edu> wrote:
> On Feb 29, 2012, at 4:32 AM, Peter Cock wrote:
>
>> On Wed, Feb 29, 2012 at 2:42 AM, Fields, Christopher J
>> <cjfields at illinois.edu> wrote:
>>> Frankly, there never seemed to be a real fixed standard in the way that FASTQ
>>> headers were written (and just when it seems there is some consensus, Illumina
>>> pulls the rug out from under you), hence the reason I leave it alone.  We could
>>> add some ID munging in there if needed, would just need a qr// with a standard
>>> fallback.
>>>
>>> chris
>>
>> Indeed - just like FASTA, it seems every company/tool/database has its own
>> conventions about the FASTQ ID line and how to stuff as much meta-data
>> into it as possible. This is a major reason why I hope unaligned reads in
>> SAM/BAM take off - places like the Sanger and Broad use this in their
>> pipelines.
>>
>> http://blastedbio.blogspot.com/2011/10/fastq-must-die-long-live-sambam.html
>>
>> Peter
>
> Unaligned BAM makes the most sense.  I've also been talking with the
> HDF5 folks here sporadically; they're still keen on promoting BioHDF
> (it is pretty fast), though that has cooled considerably.
>
> Anyone working directly with CRAM in their pipelines?
>
> chris

I understand that Sanger are looking at moving their pipelines from BAM to
CRAM later this year, but CRAM is still quite new and in flux.

Peter
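
For reference, the "ID munging with a standard fallback" idea quoted above
could look roughly like the sketch below. It is written in Python, with
re.compile standing in for Perl's qr//, and it is not existing BioPerl or
Biopython code. The names ID_PATTERNS and munge_id are hypothetical, and the
two patterns are only assumptions based on the commonly documented Illumina
header layouts (pre-1.8 and CASAVA 1.8+); real data may well need more.

```python
import re

# Hypothetical pattern table, tried in order. Each entry is (label, regex).
# Both regexes are assumptions about common Illumina header layouts.
ID_PATTERNS = [
    # CASAVA 1.8+: instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
    ("illumina_casava18", re.compile(
        r"^(?P<id>[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+)"
        r"\s+(?P<read>\d+):[YN]:\d+:\S*$")),
    # Older Illumina: instrument:lane:tile:x:y#index/read
    ("illumina_old", re.compile(
        r"^(?P<id>[^:\s]+:\d+:\d+:\d+:\d+(?:#[^/\s]+)?)/(?P<read>[12])$")),
]


def munge_id(header):
    """Return (label, read_id, read_number) for a FASTQ header line.

    Falls back to the first whitespace-delimited token, with read_number
    set to None, when no known pattern matches.
    """
    text = header[1:] if header.startswith("@") else header
    text = text.strip()
    for label, pattern in ID_PATTERNS:
        match = pattern.match(text)
        if match:
            return label, match.group("id"), int(match.group("read"))
    # Standard fallback: leave the ID essentially alone.
    first_token = text.split(None, 1)[0] if text else ""
    return "fallback", first_token, None


if __name__ == "__main__":
    for line in (
        "@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG",
        "@HWUSI-EAS100R:6:73:941:1973#0/1",
        "@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36",
    ):
        print(munge_id(line))
```

Keeping the patterns in an ordered list means a new vendor convention is a
one-line addition, and anything unrecognised still gets a usable ID from the
fallback rather than an error.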