[Biopython-dev] [GSoC] Project Proposal for Google Summer of Code

Joshua Klein mobiusklein at gmail.com
Wed Mar 4 03:57:38 UTC 2015


The codebase can be found at https://github.com/mobiusklein/pygly2 where I
am actively committing new code. I had borrowed a large portion of an
implementation of one facet of the library from another project released
under the Apache 2.0 license, which is more restrictive than the BSD
license. I've contacted the author with regard to how he would like us to
treat that component.

Assuming all of that goes smoothly, namespaces may be a complicated
matter now that I have read through more of the Bio codebase. At the risk
of overgeneralizing, the vast majority of the common namespaces in Bio
deal with operations on many different data formats that all produce Seq
or SeqRecord objects. As a result, much of the infrastructure and class
design is built around converting many different resources into those
types of objects. It would be sensible to keep the structures I'm building
in a separate namespace, as they represent distinctly different data
structures that don't share any class hierarchy with Seq or SeqRecord.

I should post to the general user mailing list to gauge interest before
trying to judge whether integration is a worthwhile investment. That's
biopython at mailman.open-bio.org, correct?

Thank you,
Joshua Klein

On Tue, Mar 3, 2015 at 1:25 PM, Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> Thanks Joshua,
>
> Outside GSoC, adding glycoinformatics to Biopython sounds interesting.
> As far as I know none of our existing contributors do much in this area,
> so our feedback may be limited to Python style (PEP8 please) and
> practicalities like unit tests and documentation.
>
> We should probably also ask on the main Biopython mailing list - that
> might reveal some potential users of the code?
>
> Is your code public already? e.g. GitHub or Bitbucket?
>
> Are there any pre-existing licences associated with your code? Can
> you simply adopt the Biopython Licence? This is like the MIT/BSD
> licence in spirit (indeed we're considering moving to the 3-clause
> BSD licence as that is more mainstream).
>
> After a brief debate about appropriate namespace(s), I would suggest
> you fork our repository and work on a branch...
>
> Regards,
>
> Peter
>
> On Tue, Mar 3, 2015 at 5:05 PM, Joshua Klein <mobiusklein at gmail.com>
> wrote:
> > I am sorry to hear that the OBF did not make it in this year. We are
> > still interested in finding out whether the Biopython team is
> > interested in including glycomics tools in the core library, or in some
> > other looser association. Real glycoinformatics is relatively
> > inaccessible outside of two titanic Java libraries, which doesn't jibe
> > with how a lot of modern bioinformatics is done today.
> >
> > Thank you,
> > Joshua Klein
> >
> > On Tue, Mar 3, 2015 at 10:25 AM, Peter Cock <p.j.a.cock at googlemail.com>
> > wrote:
> >>
> >> Hi Joshua,
> >>
> >> Thank you for your interest. Unfortunately we've just heard that
> >> the Open Bioinformatics Foundation (OBF) was not accepted
> >> into the Google Summer of Code 2015 programme:
> >>
> >>
> >> http://news.open-bio.org/news/2015/03/sadly-obf-not-accepted-for-gsoc-2015/
> >>
> >> It may be possible to pursue your project idea with one of the
> >> accepted organisations (perhaps the Python Software Foundation)
> >> by including a mentor from Biopython. We've not approached any
> >> of the other potential partner organisations yet, though.
> >>
> >> Good luck,
> >>
> >> Peter
> >>
> >>
> >> On Tue, Mar 3, 2015 at 2:58 AM, Joshua Klein <mobiusklein at gmail.com>
> >> wrote:
> >> > Hello,
> >> >
> >> > I've been working on a Python library for reading, writing, and
> >> > manipulating glycan structures and glycomics data. One of my
> >> > collaborators suggested I inquire here whether this might be of
> >> > interest to the Biopython team and whether it is not too late to
> >> > apply for a Google Summer of Code sponsorship.
> >> >
> >> > The library can currently do the following:
> >> >
> >> > * Read and write GlycoCT Condensed format carbohydrate structures
> >> >   that are concrete (having no variable or undefined structures)
> >> > * Read GlycoCTXML format carbohydrate structures that are concrete
> >> > * Manipulate the tree structure, adding and removing monosaccharide
> >> >   and substituent nodes, as well as altering existing nodes
> >> > * Calculate elemental compositions for glycan structures
> >> > * Create arbitrary chemical derivatizations (e.g. permethylation)
> >> >   of glycan structures
> >> > * Generate B, C, Y, and Z fragments from structures, as observed
> >> >   when analyzing a structure with tandem mass spectrometry
> >> > * Make API calls against GlycomeDB to download structures and
> >> >   annotations on the fly
> >> > * Perform sub-tree inclusion and maximum common substructure
> >> >   searches with fuzzy matching
> >> > * Plot glycan structures using the Consortium for Functional
> >> >   Glycomics symbol nomenclature with matplotlib
> >> > * Perform error-tolerant name inference on monosaccharides
> >> >
> >> > I am currently working on adding support for A and X cross-ring
> >> > fragment generation and adding other serialization formats such as
> >> > IUPAC and GlycoMinds Linear Codes.
> >> >
> >> > Currently, all of the features are implemented purely in Python.
> >> >
> >> > What other information, if any, can I provide?
> >> >
> >> > Thank you,
> >> > Joshua Klein
> >> >
> >> > _______________________________________________
> >> > GSoC mailing list
> >> > GSoC at mailman.open-bio.org
> >> > http://mailman.open-bio.org/mailman/listinfo/gsoc
> >
> >
>