[Biopython-dev] [Wg-phyloinformatics] BioGeography update/BioPython tree module discussion

Nick Matzke matzke at berkeley.edu
Wed Aug 19 08:56:59 UTC 2009


OK, I nailed the bug: it stemmed from HTML links inside GBIF XML 
results, which in some situations broke the parsing.  So I've 
updated the tutorial to add the chunk about downloading an arbitrarily 
large number of records, in user-specified increments, with an 
appropriate time-delay between server requests.
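
For anyone who wants the gist of the incremental download without 
opening the tutorial, here is a minimal sketch in plain Python.  It is 
not the BioGeography code itself: fetch_in_increments, fetch_page and 
delay_seconds are hypothetical names, and fetch_page stands in for 
whatever function actually asks the server for `count` records 
starting at offset `start`.

```python
import time


def fetch_in_increments(fetch_page, total, increment,
                        delay_seconds=1.0, sleep=time.sleep):
    """Collect up to `total` records, `increment` records per request.

    fetch_page(start, count) is a caller-supplied (hypothetical) function
    that returns a list of at most `count` records beginning at `start`.
    """
    if increment <= 0:
        raise ValueError("increment must be positive")
    records = []
    start = 0
    while start < total:
        count = min(increment, total - start)
        page = fetch_page(start, count)
        records.extend(page)
        if len(page) < count:
            # The server returned fewer records than requested,
            # so there is nothing more to download.
            break
        start += count
        if start < total:
            # Pause between requests so the server is not hammered.
            sleep(delay_seconds)
    return records
```

Passing `sleep` as a parameter is just a convenience so the pause can 
be swapped out when testing; in normal use the default time.sleep 
applies.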

Also added a chunk on classifying records into user-specified geographic 
areas based on their latitude/longitude.
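
As a rough illustration of the classification idea (again a sketch, 
not the module's code), the following assumes each area is a simple 
latitude/longitude bounding box and each record is a (name, lat, lon) 
tuple.  classify_by_area and both data layouts are hypothetical, 
chosen only for this example; areas defined by polygons or boxes 
crossing the 180-degree meridian would need more work.

```python
def classify_by_area(records, areas):
    """Assign each record to the first bounding box that contains it.

    records: iterable of (name, latitude, longitude) tuples.
    areas:   dict mapping area name -> (min_lat, max_lat, min_lon, max_lon).
    Returns a dict mapping record name -> area name, or None if the
    record falls outside every area.
    """
    assignments = {}
    for name, lat, lon in records:
        assignments[name] = None
        for area_name, (min_lat, max_lat, min_lon, max_lon) in areas.items():
            if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
                assignments[name] = area_name
                break  # first matching area wins
    return assignments
```

Areas are checked in the order they appear in the dict, so a point on 
a shared boundary goes to whichever area is listed first.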

Also updated the test scripts and test results files, and deleted some 
remaining loose/unnecessary files.

Updated tutorial: http://biopython.org/wiki/BioGeography#Tutorial
Github commits: http://github.com/nmatzke/biopython/commits/Geography

I think I've reached a good stopping point for the moment.  I welcome 
comments on the tutorial and/or on the prospects for turning this into 
an official Biopython module.

Thanks again, and cheers!
Nick




Nick Matzke wrote:
> Pencils down update: I have uploaded the relevant test scripts and data 
> files to git, and deleted old loose files.
> http://github.com/nmatzke/biopython/commits/Geography
> 
> Here is a simple draft tutorial:
> http://biopython.org/wiki/BioGeography#Tutorial
> 
> Strangely, while working on the tutorial I discovered that something I 
> did in the last revision is messing up the parsing of automatically 
> downloaded records from GBIF.  I am tracking this down currently and 
> will upload a fix as soon as I find it.
> 
> I would like to thank everyone for the opportunity to participate in 
> GSoC, and to thank everyone for their help.  For me, this summer turned 
> into more of a "growing from a scripter to a programmer" summer than I 
> expected initially.  As a result I spent more time refactoring and 
> retracing my steps than I figured.  However, I think the resulting main 
> product, a GBIF interface and associated tools, is much better than it 
> would have been without the advice & encouragement of Brad, Hilmar, etc. 
>  I will be using this for my own research and will continue developing it.
> 
> Cheers!
> Nick
> 
> 
> Brad Chapman wrote:
>> Hi Nick;
>>
>>> Summary: Major focus is getting the GBIF access/search/parse module 
>>> into "done"/submittable shape.  This primarily requires getting the 
>>> documentation and testing up to biopython specs.  I have a fair bit 
>>> of documentation and testing, need advice (see below) for specifics 
>>> on what it should look like.
>>
>> Awesome. Thanks for working on the cleanup for this.
>>
>>> OK, I will do this.  Should I try and figure out the unittest stuff?  
>>> I could use a simple example of what this is supposed to look like.
>>
>> In addition to Peter's pointers, here is a simple example from a
>> small thing I wrote:
>>
>> http://github.com/chapmanb/bcbb/blob/master/align/adaptor_trim.py
>>
>> You can copy/paste the unit test part to get a base, and then
>> replace the t_* functions with your own real tests.
>>
>> Simple scripts that generate consistent output are also fine; that's
>> the print and compare approach.
>>
>>>> - What is happening with the Nodes_v2 and Treesv2 files? They look
>>>>   like duplicates of the Nexus Nodes and Trees with some changes.
>>>>   Could we roll those changes into the main Nexus code to avoid
>>>>   duplication?
>>> Yeah, these were just copies with your bug fix, and with a few mods I 
>>> used to track crashes.  Presumably I don't need these after a 
>>> fresh download of biopython.
>>
>> Cool. It would be great if we could weed these out as well.
>>
>>> The API is really just the interface with GBIF.  I think developing a 
>>> cookbook entry is pretty easy, I assume you want something like one 
>>> of the entries in the official biopython cookbook?
>>
>> Yes, that would work great. What I was thinking of are some examples
>> where you provide background and motivation: Describe some useful 
>> information you want to get from GBIF, and then show how to do it.
>> This is definitely the most useful part as it gives people working
>> examples to start with. From there they can usually browse the lower
>> level docs or code to figure out other specific things.
>>
>>> Re: API documentation...are you just talking about the function 
>>> descriptions that are typically in """ """ strings beneath the 
>>> function definitions?  I've got that done.  Again, if there is more, 
>>> an example of what it should look like would be useful.
>>
>> That looks great for API level docs. You are right on here; for this
>> week I'd focus on the cookbook examples and cleanup stuff.
>>
>> My other suggestion would be to rename these to follow Biopython
>> conventions, something like:
>>
>> gbif_xml -> GbifXml
>> shpUtils -> ShapefileUtils
>> geogUtils -> GeographyUtils
>> dbfUtils -> DbfUtils
>>
>> The *Utils might have underscores if they are not intended to be
>> called directly.
>>
>> Thanks for all your hard work,
>> Brad
>>
> 



