[GSoC] GSoC 2014 queries and inputs
Ujjwal Thaakar
ujjwalthaakar at gmail.com
Wed Mar 19 16:23:32 UTC 2014
Is there a template for the application proposal?
On 19 March 2014 19:56, Fields, Christopher J <cjfields at illinois.edu> wrote:
> On Mar 19, 2014, at 8:28 AM, Artem Tarasov <lomereiter at gmail.com> wrote:
>
> On Tue, Mar 18, 2014 at 11:44 PM, Ujjwal Thaakar <
> ujjwalthaakar at gmail.com> wrote:
>
>> What's the difference between SAM and VCF?
>
>
> SAM: mapping software aligns reads against the reference genome (and its
> reverse complement) and writes information about the best alignment of
> each read to a SAM/BAM file (which strand it aligned to, how it differs
> from the reference, and so on).
>
> VCF: positions on the reference genome are considered rather than reads,
> and each record describes whether there is variability at a position. VCF
> files are produced from SAM files by considering the reads overlapping
> each position: if a statistically significant number of reads have a base
> different from the reference (or an insertion/deletion), this is probably a
> true mutation, which might have biological significance as well.
>
> For JRuby, I'd recommend using Picard. No need to reinvent the wheel.
> Plus, you might also want to support the binary counterpart, BCF format.
>
>
> --
> Artem
>
>
> Yep, if you are planning on going through the JVM, then Picard is nice and
> supports VCF (and BCF, it seems). No CRAM support, but there is this:
>
> https://github.com/enasequence/cramtools
>
> (section on picard integration)
>
> chris
>
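
To make the SAM-to-VCF step described above concrete, here is a toy sketch in
Python. It is not how Picard or any real variant caller works: the function
name call_variants, the (start, sequence) read representation, and the
min_depth / min_alt_fraction thresholds are all made up for illustration, a
simple fraction cutoff stands in for a proper statistical test, and
insertions/deletions are ignored.

```python
from collections import Counter


def call_variants(reference, reads, min_depth=3, min_alt_fraction=0.5):
    """Toy caller. reads are (start, sequence) pairs: 0-based start, no indels.

    Returns a list of (pos, ref, alt, depth) tuples with 1-based positions,
    loosely mirroring the POS/REF/ALT columns of a VCF record.
    """
    # Pile up the observed bases at every reference position.
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    records = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to say anything
        ref_base = reference[pos]
        alt_counts = {b: n for b, n in counts.items() if b != ref_base}
        if not alt_counts:
            continue  # every read agrees with the reference
        alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
        # Hypothetical cutoff standing in for a real significance test.
        if alt_n / depth >= min_alt_fraction:
            records.append((pos + 1, ref_base, alt_base, depth))
    return records


reference = "ACGTACGT"
reads = [(0, "ACGTTCGT"), (2, "GTTCGT"), (1, "CGTTCG"), (3, "TACG")]
variants = call_variants(reference, reads)
print(variants)
```

The reads play the role of SAM records (where each read landed) and the
returned tuples play the role of VCF records (what differs at each position).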
--
Thanks
Ujjwal