[Biopython-dev] [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond

Kevin Jacobs <jacobs@bioinformed.com> bioinformed at gmail.com
Fri May 25 11:15:00 UTC 2012


On Fri, May 25, 2012 at 2:49 AM, Mic <mictadlo at gmail.com> wrote:

> I think Picard-tools does parallel compression/decompression of BGZF.
>
>
Here is what Picard's documentation says for one command:

MergeSamFiles

Merges multiple SAM/BAM files into one file.

USE_THREADING=Boolean
    Option to create a background thread to encode, compress and write to
    disk the output file. The threaded version uses about 20% more CPU and
    decreases runtime by ~20% when writing out a compressed BAM file.
    Default value: false. This option can be set to 'null' to clear the
    default value. Possible values: {true, false}

So in Picard, BAM output (dominated by zlib compression and/or I/O write
latency) runs in a separate thread, but the blocks are still compressed
sequentially. The recent samtools fork attempts to buffer uncompressed BAM
blocks and allocates multiple threads to compress several of them in
parallel, which is possible because the blocks are independent.
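
A minimal Python sketch of the block-parallel idea, for illustration only.
It is not code from Picard or the samtools fork. The names make_bgzf_block
and bgzf_compress_parallel, the thread count, and the 65280-byte block size
are assumptions chosen for the example. Each chunk is deflated on its own
and wrapped in a gzip member carrying the BGZF "BC" extra field, so the
chunks can be compressed in any order and concatenated in input order.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

# Fixed 16 bytes of a BGZF member header, up to (not including) BSIZE:
# gzip magic, CM=deflate, FLG=FEXTRA, MTIME=0, XFL=0, OS=unknown, XLEN=6,
# then the 'BC' subfield identifier with SLEN=2.
_BGZF_HEADER = b"\x1f\x8b\x08\x04\x00\x00\x00\x00\x00\xff\x06\x00BC\x02\x00"

# Uncompressed bytes per block (an assumption for this sketch). Kept below
# 64 KiB so that incompressible input still fits the 16-bit BSIZE field.
BLOCK_SIZE = 65280


def make_bgzf_block(chunk, level=6):
    """Compress one chunk into a single, self-contained BGZF block."""
    # Negative wbits selects raw deflate (no zlib header or checksum).
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    deflated = compressor.compress(chunk) + compressor.flush()
    # BSIZE is the total block length minus one:
    # 16 header bytes + 2 BSIZE bytes + payload + 8 trailer bytes - 1.
    bsize = len(deflated) + 25
    # gzip trailer: CRC32 and length of the uncompressed chunk.
    trailer = struct.pack("<II", zlib.crc32(chunk) & 0xFFFFFFFF, len(chunk))
    return _BGZF_HEADER + struct.pack("<H", bsize) + deflated + trailer


def bgzf_compress_parallel(data, threads=4, level=6):
    """Compress data as BGZF, deflating the blocks in a thread pool."""
    chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, whichever block finishes first.
        blocks = list(pool.map(lambda c: make_bgzf_block(c, level), chunks))
    # An empty block at the end serves as the end-of-file marker.
    blocks.append(make_bgzf_block(b"", level))
    return b"".join(blocks)
```

Because every block is a complete gzip member, the result can be checked
with any gzip reader, e.g. gzip.decompress(bgzf_compress_parallel(data)).
Whether threads give a real speedup here depends on the interpreter
releasing its lock during compression; a process pool would be the
alternative if it does not.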

-Kevin


