[GSoC] GSOC project: improve SegAnnDB interactive DNA copy number analysis

Eric Talevich eric.talevich at gmail.com
Mon Mar 3 21:44:42 UTC 2014


Hi Toby & all,

Since this idea is related to some very common types of bioinformatic
analysis, I would encourage looking at opportunities to integrate this
project with other software components that have already been developed,
potentially even replacing part of the SegAnnDB codebase with a newly
developed component:

- The proposed idea for GenomeDiagram would make it interactive, and
therefore very suitable for use in SegAnnDB. The current SegAnnDB generates
a static PNG and then draws on this bitmap, while a Bokeh implementation
could be smoother and more aesthetically pleasing. More importantly,
integration with GenomeDiagram or a more robust genome viewer (paging
GMOD?) would make it easier to display multiple tracks together, which
seems to be a very important feature for this sort of manual annotation
(e.g. CNVs are often correlated with loss of heterozygosity in SNPs, so
plotting SNP allele frequencies on another track would aid annotation).

- I could also envision this software as a Galaxy component. Would that
work?

- On the Biopython side, our modules for microarray analysis could benefit
from some attention. The raw probe copy number or copy ratio values need to
be extracted from a data source somehow before they can be used in
SegAnnDB, right? The initial segmentation also needs to be calculated; it
would also be useful to be able to do this in Python independently of
SegAnnDB. Can you see an opportunity to write reusable code that will
perform these operations in Biopython?
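
To make that last point concrete, here is a minimal sketch of the two steps in plain Python (standard library only). Everything in it is an assumption for illustration: the names log2_ratios, segment and min_gain are hypothetical and are not part of Biopython or SegAnnDB, and the naive recursive binary segmentation shown here is only a stand-in, not the segmentation model SegAnnDB actually uses.

```python
import math
from statistics import mean


def log2_ratios(sample, reference):
    """Turn paired probe intensities into log2 copy ratios."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]


def sse(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def segment(values, min_gain=1.0, start=0):
    """Naive recursive binary segmentation (illustrative only).

    Returns a list of (start, end, mean) tuples, `end` exclusive.
    A breakpoint is kept only if it lowers the squared error by at
    least `min_gain`, an arbitrary threshold chosen for this sketch.
    """
    n = len(values)
    if n == 0:
        return []
    best_gain, best_k = 0.0, None
    if n >= 2:
        total = sse(values)
        # Try every split point and remember the one with the largest gain.
        for k in range(1, n):
            gain = total - sse(values[:k]) - sse(values[k:])
            if gain > best_gain:
                best_gain, best_k = gain, k
    if best_k is None or best_gain < min_gain:
        return [(start, start + n, mean(values))]
    left = segment(values[:best_k], min_gain, start)
    right = segment(values[best_k:], min_gain, start + best_k)
    return left + right


# Made-up intensities: the last four probes show a doubling.
ratios = log2_ratios([100, 100, 100, 100, 200, 200, 200, 200], [100] * 8)
print(segment(ratios))  # [(0, 4, 0.0), (4, 8, 1.0)]
```

The split search is quadratic in the number of probes, so this is only meant to show the shape of a reusable function, not something to run on a real array.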

Cheers,
Eric


On Mon, Mar 3, 2014 at 9:08 AM, Toby Hocking <tdhock5 at gmail.com> wrote:

> Thanks for the input and the links to related work, Peter and Raoul.
>
> About BioPython, I have used it in previous projects, but its current
> features do not really help for this SegAnnDB project. As I understand,
> BioPython is best for things like sequence analysis and downloading data
> from GenBank, but for SegAnnDB I was doing something quite different:
> interactive visualization and storing user-specific annotations using a web
> server/database.
>
> About GenomeDiagram, it definitely could be used to plot DNA copy number
> profiles, but it is currently neither interactive nor linked to a database,
> and I really needed both of those features for SegAnnDB. GenomeDiagram
> could have been used to make some of the static PNG plots on SegAnnDB, but
> instead I used PIL directly since that is faster.
>
> About R, thanks for the link to the Java implementation of fastR, but I
> haven't used R at all in SegAnnDB since I wanted to depend on just one
> language on the server side (Python).
>
> Finally about BioJS, I also wrote them, but now I realize that SegAnnDB is
> a better fit for the cross-language nature of OBF. BioJS is focused on
> developing web client-side JavaScript visualizations, which SegAnnDB does
> for DNA copy number profiles, so perhaps I could work with the BioJS guys
> on porting my existing JS code for their uses. However, my SegAnnDB project
> is also tightly integrated with a server-side Python component, which I
> would like a student to develop in GSOC.
>
> So again thanks for the encouraging comments and I will go ahead and post a
> more detailed project proposal on the wiki.
>
>
> On Mon, Mar 3, 2014 at 10:47 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > On Mon, Mar 3, 2014 at 3:07 PM, Toby Hocking <tdhock5 at gmail.com> wrote:
> > > Hey OBF developers, I am a bioinformatics researcher and long-time
> > > developeR (admin and mentor for R's participation in GSOC). Using
> > > JavaScript and Python, I have developed SegAnnDB, a web site for
> > > visualization and interactive annotation of DNA copy number profiles
> > > http://bioinformatics.oxfordjournals.org/content/early/2014/02/03/bioinformatics.btu072.short
> > > and I want to get a GSOC student to implement some improvements. Would it
> > > be possible for me to propose this as an OBF project and possibly be a
> > > mentor for GSOC?
> > >
> > > I think SegAnnDB fits into the main theme of OBF: writing open-source code
> > > for analysis and visualization of biological data. The student would need
> > > to write Python code for the server side and JavaScript code for the web
> > > client side so I think it would fit best into the "cross-project ideas"
> > > section.
> > >
> > > Anyway, if it is OK with you guys, can I please post my project proposal to
> > > the OBF GSOC ideas wiki page?
> >
> > Do you see any natural links to Biopython on the server side (an OBF
> > project which would be good for the GSoC link) or BioJS on the client
> > side (not an OBF project, but also participating under GSoC directly)?
> >
> > See also Leighton's outline Biopython proposal on interactive graphics:
> >
> > http://biopython.org/wiki/Google_Summer_of_Code#Interactive_GenomeDiagram_Module
> >
> > Peter
> >
