[Biopython-dev] Accessing built-in data files

Tiago Antão tiagoantao at gmail.com
Mon Nov 19 15:41:57 UTC 2007

On Nov 19, 2007 3:25 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> > 2. Accessing the data that was packaged
> > This one is trickier. The first problem is that the data files might
> > be installed in either the platform dependent install directory or in
> > the platform independent one. ...
> I may be naive here, but I would expect the datafiles to be installed in
> the same relative path as the python files.  Is this not always the case?

Yes, but which data files, and where? By "which" I mean platform
dependent or platform independent; again, this is mostly a non-issue.
But "where" is important. "Where" might be the system path, in which
case all is fine and dandy, but it might also be elsewhere. (In my
case, when testing, I don't install to the system path. The same
applies if you don't have admin access to the machine.)

> As I pointed out in bug 2375, there is some code in setup.py (pre-dating
> the support in distutils) which takes care of installing data files for
> EUtils - if that actually works for the special cases you are concerned
> about, maybe we should just use that for your data files too.  See
> http://bugzilla.open-bio.org/show_bug.cgi?id=2375

I checked that. As far as I could tell, the data is not really
read from data files at execution time.
The DTDs (you can check the DTDs directory inside EUtils) were converted
to .py files, and the data directory is actually a Python module
directory (the generated .py code is in CVS, actually).

> > My work around? Go through all the directories on sys.path (PYTHONPATH) and search for
> > a data file that I know is there ...
> That doesn't seem very elegant :(

I know; that is one of the reasons I called it a "work around" and why
I am discussing it in the open.
The changes to setup.py are quite trivial and uncontroversial. But at the
code level I cannot think of a better solution. All that occurs to me is:
1. Use sys.prefix. Clean, but it would only work for installations to the
system directory. I don't think that is acceptable.
2. The current solution.
3. If there were a way to know where the package code is installed, I
could infer the data directory from there (clean, precise and robust),
but I don't know how to do that (if there is a way at all). That would
sort things out.
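
For illustration, options 2 and 3 could be sketched roughly as below.
This is only a sketch under assumptions, not code from this thread: the
function names find_on_sys_path and data_dir_from_module and the "data"
subdirectory name are hypothetical, and option 3 assumes the imported
module exposes a __file__ attribute (usual for modules loaded from .py
files, but not guaranteed in every setup, e.g. some frozen or zipped
installations).

```python
import os
import sys


def find_on_sys_path(relative_name, search_path=None):
    """Option 2 (hypothetical helper): scan sys.path for a known file.

    Returns the absolute path of the first match, or None if no entry
    on the search path contains relative_name.
    """
    if search_path is None:
        search_path = sys.path
    for entry in search_path:
        # An empty sys.path entry means the current directory.
        candidate = os.path.join(entry or os.curdir, relative_name)
        if os.path.exists(candidate):
            return os.path.abspath(candidate)
    return None


def data_dir_from_module(module, subdir="data"):
    """Option 3 (hypothetical helper): infer the data directory from
    the location of an already imported module.

    Assumes module.__file__ is set and that the data files were
    installed in a subdirectory next to the module's code.
    """
    package_dir = os.path.dirname(os.path.abspath(module.__file__))
    return os.path.join(package_dir, subdir)
```

With something like this, the lookup would follow the package wherever
it was installed (system path, home directory, or a test location),
provided the data files are installed alongside the code.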

