[Biopython-dev] [Wg-phyloinformatics] BioGeography update/BioPython tree module discussion

Nick Matzke matzke at berkeley.edu
Tue Aug 4 17:01:34 UTC 2009


Hi all, update:

Major improvements/fixes:
- removed any reliance on lagrange tree module, refactored all phylogeny 
code to use the revised Bio.Nexus.Tree module

- tree functions put in TreeSum (tree summary) class

- added functions for calculating phylodiversity measures, including 
necessary subroutines such as subsetting trees and randomly selecting 
tips from a larger pool

- Code dealing with GBIF xml output completely refactored into the 
following classes:

* ObsRecs (observation records & search results/summary)
* ObsRec (an individual observation record)
* XmlString (functions for cleaning xml returned by GBIF)
* GbifXml (extension of capabilities for ElementTree xml trees, parsed 
from GBIF xml returns)

- another suggestion implemented: dependencies on tempfiles eliminated 
by using cStringIO file_str objects instead (in-memory, file-like 
strings that are never written to disk as temporary files)

- another suggestion implemented: the _open method from biopython's ncbi 
www functionality has been copied & modified so that it is now a method 
of ObsRecs, and doesn't contain NCBI-specific defaults etc. (it does 
still include a 3-second waiting time between GBIF requests, figuring 
that is good practice).

- function to download large numbers of records in increments 
implemented as method of ObsRecs.
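
To illustrate the incremental download with a pause between requests, 
here is a minimal sketch. It is not the ObsRecs method itself: 
fetch_in_increments, fetch_page, page_size and the injectable sleep 
argument are hypothetical names invented for this example, and only the 
3-second delay is taken from the description above.

```python
import time


def fetch_in_increments(fetch_page, total, page_size=1000, delay=3.0,
                        sleep=time.sleep):
    """Collect up to `total` records by calling fetch_page(start, count).

    fetch_page is a hypothetical callable that returns a list of records
    beginning at offset `start`; it stands in for a real web request.
    """
    records = []
    start = 0
    while start < total:
        count = min(page_size, total - start)
        if start > 0:
            # Pause between requests (not before the first one).
            sleep(delay)
        batch = fetch_page(start, count)
        if not batch:
            # The server has no more records; stop early.
            break
        records.extend(batch)
        start += len(batch)
    return records


# Demonstration with a stand-in for the server, and no real waiting.
def fake_fetch_page(start, count):
    return list(range(start, start + count))


demo_records = fetch_in_increments(fake_fetch_page, 2500,
                                   sleep=lambda seconds: None)
print(len(demo_records))
```

Passing the sleep function as an argument keeps the waiting time in 
normal use while letting tests run without delay.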

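The tempfile replacement mentioned above can be sketched as follows. 
This is an illustration only: it uses io.StringIO, the Python 3 
counterpart of the Python 2 cStringIO module, and the XML snippet and 
its element names are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Made-up XML standing in for text returned by a web service.
xml_text = "<records><record id='1'/><record id='2'/></records>"

# Wrap the string in an in-memory file-like object instead of
# writing it to a temporary file on disk.
file_str = io.StringIO(xml_text)

# Anything that expects an open file can read from file_str.
tree = ET.parse(file_str)
ids = [rec.get("id") for rec in tree.getroot().iter("record")]
print(ids)
```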


This week:
- Put GIS functions in a class (easy), allowing each ObsRec to be 
classified into an area (easy)

- Improve extraction of data from GBIF xmltree -- my Utricularia 
"practice XML file" didn't have problems, but when running online 
searches, I am discovering that some fields are not always filled in, 
etc. This shouldn't be too hard, using the GbifXml xmltree searching 
functions, and including defaults for exceptions.

- Function for converting points to KML for Google Earth display.
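
A point-to-KML conversion could look roughly like the sketch below. The 
function name points_to_kml and the (name, latitude, longitude) tuple 
layout are assumptions for illustration, not the planned interface. 
Note that KML writes coordinates as longitude,latitude.

```python
from xml.sax.saxutils import escape


def points_to_kml(points):
    """Build a KML document string from (name, lat, lon) tuples."""
    placemarks = []
    for name, lat, lon in points:
        placemarks.append(
            "  <Placemark>\n"
            "    <name>%s</name>\n"
            # KML coordinate order is longitude first, then latitude.
            "    <Point><coordinates>%s,%s</coordinates></Point>\n"
            "  </Placemark>\n" % (escape(name), lon, lat)
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n" + "".join(placemarks) + "</Document>\n</kml>\n"
    )


# Example with one made-up observation point.
kml_text = points_to_kml([("Utricularia sp. <obs 1>", 37.87, -122.26)])
print(kml_text)
```

The escape call keeps characters such as < and & in a record name from 
breaking the XML.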

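For the fields that are not always filled in, one possible approach is a 
small lookup helper that falls back to a default. The helper name 
field_or_default and the element names below are hypothetical; real 
GBIF returns use their own (namespaced) tags, which are not reproduced 
here.

```python
import xml.etree.ElementTree as ET


def field_or_default(element, tag, default=None):
    """Return the stripped text of a child element, or a default.

    The default is used when the child is missing, empty, or
    contains only whitespace.
    """
    found = element.find(tag)
    if found is None or found.text is None or not found.text.strip():
        return default
    return found.text.strip()


# Made-up records: the second one lacks a longitude and has a
# blank country field.
sample = """<records>
  <record><country>US</country><latitude>37.87</latitude>
          <longitude>-122.26</longitude></record>
  <record><country>  </country><latitude>10.5</latitude></record>
</records>"""

rows = [
    (field_or_default(rec, "country", "unknown"),
     field_or_default(rec, "latitude"),
     field_or_default(rec, "longitude"))
    for rec in ET.fromstring(sample).iter("record")
]
print(rows)
```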

Code uploaded here:
http://github.com/nmatzke/biopython/commits/Geography



-- 
====================================================
Nicholas J. Matzke
Ph.D. Candidate, Graduate Student Researcher
Huelsenbeck Lab
Center for Theoretical Evolutionary Genomics
4151 VLSB (Valley Life Sciences Building)
Department of Integrative Biology
University of California, Berkeley

Lab websites:
http://ib.berkeley.edu/people/lab_detail.php?lab=54
http://fisher.berkeley.edu/cteg/hlab.html
Dept. personal page: 
http://ib.berkeley.edu/people/students/person_detail.php?person=370
Lab personal page: http://fisher.berkeley.edu/cteg/members/matzke.html
Lab phone: 510-643-6299
Dept. fax: 510-643-6264
Cell phone: 510-301-0179
Email: matzke at berkeley.edu

Mailing address:
Department of Integrative Biology
3060 VLSB #3140
Berkeley, CA 94720-3140

-----------------------------------------------------
"[W]hen people thought the earth was flat, they were wrong. When people 
thought the earth was spherical, they were wrong. But if you think that 
thinking the earth is spherical is just as wrong as thinking the earth 
is flat, then your view is wronger than both of them put together."

Isaac Asimov (1989). "The Relativity of Wrong." The Skeptical Inquirer, 
14(1), 35-44. Fall 1989.
http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm
====================================================


