[GSoC] GSoC 2014 queries and inputs

Ujjwal Thaakar ujjwalthaakar at gmail.com
Wed Mar 19 16:23:32 UTC 2014


Is there a template for the application proposal?


On 19 March 2014 19:56, Fields, Christopher J <cjfields at illinois.edu> wrote:

>  On Mar 19, 2014, at 8:28 AM, Artem Tarasov <lomereiter at gmail.com> wrote:
>
>   On Tue, Mar 18, 2014 at 11:44 PM, Ujjwal Thaakar <
> ujjwalthaakar at gmail.com> wrote:
>
>> What's the difference between SAM and VCF?
>
>
>  SAM: mapping software aligns reads against the reference genome (and its
> reverse-complement) and writes information about the best alignment of
> each read to a SAM/BAM file (which strand it aligned to, how it differs
> from the reference, and so on).
>
>  VCF: positions on the reference genome are considered rather than reads,
> and each record describes whether there is variability at a position. VCF
> files are produced from SAM files by considering the reads overlapping each
> position: if a statistically significant number of reads have a base
> different from the reference (or an insertion/deletion), this is probably a
> true mutation, which might have biological significance as well.
>
>  For JRuby, I'd recommend using Picard. No need to reinvent the wheel.
> Plus, you might also want to support the binary counterpart, BCF format.
>
>
>  --
> Artem
>
>
> Yep, if you're planning on going through the JVM, then Picard is nice and
> supports VCF (and BCF, it seems).  No CRAM support, but there is this:
>
>     https://github.com/enasequence/cramtools
>
>  (section on Picard integration)
>
>  chris
>
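
To make the SAM/VCF distinction quoted above concrete, here is a toy sketch
of going from per-read alignments (the SAM view) to per-position variant
records (the VCF view). Everything in it is an assumption for illustration:
the function name call_variants, the (start, sequence) tuples standing in for
SAM records, and the min_depth/min_fraction thresholds. It uses a plain
fraction cutoff in place of a real statistical test, ignores insertions and
deletions, and is not how Picard or any real variant caller works.

```python
from collections import Counter, defaultdict


def call_variants(reference, alignments, min_depth=3, min_fraction=0.5):
    """Toy pileup caller.

    alignments: list of (0-based start, read sequence) pairs, assumed to be
    gap-free alignments against `reference` (a stand-in for SAM records).
    Returns (1-based position, ref base, alt base, depth) tuples, loosely
    mirroring the per-position nature of VCF records.
    """
    # Build the pileup: for each reference position, count the read bases.
    pileup = defaultdict(Counter)
    for start, seq in alignments:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < len(reference):
                pileup[pos][base] += 1

    variants = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        ref_base = reference[pos]
        alt_counts = {b: n for b, n in counts.items() if b != ref_base}
        if depth < min_depth or not alt_counts:
            continue
        # Most frequent non-reference base at this position.
        alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
        # Simple fraction cutoff (an assumption, not a statistical test).
        if alt_n / depth >= min_fraction:
            variants.append((pos + 1, ref_base, alt_base, depth))
    return variants


if __name__ == "__main__":
    reference = "ACGTACGT"
    reads = [(0, "ACGTAC"), (2, "GAACGT"), (1, "CGAAC"), (3, "AACG")]
    # Three of the four reads covering position 4 carry A instead of T.
    print(call_variants(reference, reads))  # [(4, 'T', 'A', 4)]
```

The input is organised per read and the output per reference position,
which is the difference between the two formats in a nutshell.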



-- 
Thanks
Ujjwal


