[Bioperl-l] fastq splitter

Pablo marin-garcia harpactocrates at googlemail.com
Thu Mar 1 17:28:28 UTC 2012


On Thu, Mar 1, 2012 at 4:03 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Thu, Mar 1, 2012 at 2:41 PM, Pablo marin-garcia
> <harpactocrates at googlemail.com> wrote:
>> On Wed, Feb 29, 2012 at 4:32 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>>
>>> I understand that Sanger are looking at moving their pipelines from BAM to
>>> CRAM later this year, but CRAM is still quite new and in flux.
>>>
>>
>> My concern is that, since CRAM is based on delta compression
>> (comparison against a reference), I am not sure how much compression
>> it would achieve with unaligned BAMs.
>
> This can be done with an appropriate dummy reference, for instance
> from a mini-assembly of the unmapped reads.
>
>> The other thing that CRAM does is to
>> remove a lot of extra tags and metadata (even from the header
>> reference info), and here the strong point of BAM over FASTQ is the
>> availability of structured metadata. CRAM is still in development in
>> this area, so we will see where they go.
>
> Did you miss Ewan's reply about CRAM 0.7 which is due soon?
> http://lists.open-bio.org/pipermail/bioperl-l/2012-March/036295.html
>

Yes, I missed it.


> Might this be better continued on the cram-dev list
> http://listserver.ebi.ac.uk/mailman/listinfo/cram-dev
> or on this SEQanswers thread?
> http://seqanswers.com/forums/showthread.php?t=18050
>

> Peter



-- 
   - Pablo Marin-Garcia



