[GSoC] GSoC weekly status report No.1

Sun May 6 15:00:07 UTC 2012

Hi Marjan,

You should probably incorporate into your test suite all of the test
GFF3 files in the test data directory of the Perl
Bio::GFF3::LowLevel::Parser. They cover some corner cases that are a
little bit tricky.

https://github.com/solgenomics/bio-gff3/tree/master/t/data

Rob

On 05/05/2012 09:07 AM, Marjan Povolni wrote:
> Hello all,
>
> It might be a little early, but there has been so much going on in the last
> 10 days since the results of GSoC were published...
>
> http://blog.mpthecoder.com/post/22380853664/gsoc-weekly-status-report-no-1
>
> A short summary:
>
> It has been 10 days since the GSoC results were published, and a lot has
> happened since then. I got to know the other students and mentors in a
> longish Google Hangout meeting, I had a discussion with my mentor on IRC
> in which we didn’t agree about the parallelization strategy for the
> parser (experiments will show who’s right), and my inbox is full of emails
> from my mentor and other students, in which we exchanged loads of
> interesting ideas. Also, I fixed a bug in the biogems.info website that
> was stopping Pjotr from updating it with new information about biogems.
>
> There is now a GitHub repository for my project:
>
> https://github.com/mamarjan/bioruby-hpc-gff3
>
> The work for the first week of coding is halfway done too.
>
> There seems to be huge interest in a GFF3 parser with more features, like
> indexing, random access and writing output, and also support for linking
> features into trees even when they are not located close to each other in
> the file. A fast sequential parser could be used to generate indexes, and
> the lower-level parts could be used to reorder the file for faster future
> use. Based on that, I think this project is a good start.
>
> *If you’re using the GFF3/GTF file formats in your research, I would like
> to ask you to send me example files and descriptions of how your
> applications use the data. This way I’ll be able to test the parser
> against your files and optimize it for your applications. Currently I have
> GFF files from Ensembl and WormBase, and Pjotr pointed me to the genome
> browser web application at wormbase.org.*
>
> --
> Marjan
>
> _______________________________________________
> GSoC mailing list
> GSoC at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/gsoc
>
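
The quoted report mentions using a fast sequential pass to generate
indexes for random access. As a rough illustration only, here is a minimal
Python sketch of that idea. It is not code from the bioruby-hpc-gff3
project, and the function names build_id_index and fetch_lines are
hypothetical. It assumes a tab-separated GFF3 file opened in binary mode,
reads the ID attribute from column 9, and does not undo percent-escaping.

```python
import io


def build_id_index(handle):
    """Scan a binary GFF3 stream once; map each feature ID to byte offsets.

    An ID maps to a list because one feature may span several lines
    (for example, a CDS split across exons).
    """
    index = {}
    offset = 0
    for raw in iter(handle.readline, b""):
        line_start = offset
        offset += len(raw)
        if raw.startswith(b"##FASTA"):
            break  # the sequence section follows; no more feature lines
        if raw.startswith(b"#") or not raw.strip():
            continue  # skip directives, comments and blank lines
        columns = raw.rstrip(b"\r\n").split(b"\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        for pair in columns[8].split(b";"):
            key, _, value = pair.partition(b"=")
            if key == b"ID" and value:
                index.setdefault(value.decode("utf-8"), []).append(line_start)
    return index


def fetch_lines(handle, index, feature_id):
    """Random access: seek straight to every line recorded for feature_id."""
    lines = []
    for position in index.get(feature_id, []):
        handle.seek(position)
        lines.append(handle.readline().decode("utf-8").rstrip("\r\n"))
    return lines


if __name__ == "__main__":
    sample = io.BytesIO(
        b"##gff-version 3\n"
        b"chr1\t.\tgene\t1\t100\t.\t+\t.\tID=gene1\n"
        b"chr1\t.\tCDS\t1\t40\t.\t+\t0\tID=cds1;Parent=gene1\n"
        b"chr1\t.\tCDS\t60\t100\t.\t+\t2\tID=cds1;Parent=gene1\n"
    )
    idx = build_id_index(sample)
    for line in fetch_lines(sample, idx, "cds1"):
        print(line)
```

Offsets are counted in bytes on a binary handle so that seek() lands
exactly on a line start. An index like this could also be written to disk
and reused, which is one way the reordering idea in the report might build
on it.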