[Biopython-dev] [Bug 2480] Local BLAST fails: Spaces in Windows file-path values

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Oct 2 09:41:45 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2480





------- Comment #34 from biopython-bugzilla at maubp.freeserve.co.uk  2008-10-02 05:41 EST -------
(In reply to comment #31)
> > For any documentation I would want to recommend using the
> > win32api.GetShortPathName() function to avoid the spaces,
> > with an example showing how to do this for the database
> > name(s).  To me this seems much simpler than the complex
> > quoting solution.
> 
> To me it seems win32api.GetShortPathName() will not work for
> database paths because the specified values are not really
> files (e.g., BLAST uses the /data/mouse.db value to look for
> /data/mouse.db.nin, etc.), and win32api.GetShortPathName
> works only on files.

I believe win32api.GetShortPathName works on paths (directories) and files. 
But by the nature of the filing system, it can only work on existing
files/directories - the short names cannot be calculated in advance.  As you
have found this means the function cannot be used on a database name (which is
not a full filename).  Thus any example in the documentation would have to use
win32api.GetShortPathName on the folder and then add on the name.
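A minimal sketch of that folder-then-name approach, for illustration only. The helper name short_db_path and its shorten parameter are hypothetical (not part of Biopython), and the sketch assumes the pywin32 package is installed when run on Windows:

```python
import os
import sys


def short_db_path(db_dir, db_name, shorten=None):
    """Shorten the existing folder db_dir, then append the database name.

    db_name is the BLAST database base name (e.g. "mine"), which is not
    itself a file on disk, so only the folder is passed to the shortener.
    """
    if shorten is None:
        if sys.platform == "win32":
            # Assumption: pywin32 is installed; the folder must already exist.
            import win32api
            shorten = win32api.GetShortPathName
        else:
            # Short (8.3) names are a Windows feature; leave the path alone.
            def shorten(path):
                return path
    return os.path.join(shorten(db_dir), db_name)
```

On Windows, calling short_db_path with the folder that holds the database files and the base name "mine" would give a space-free value for the database argument, without needing to know whether the files end in .nin or .pin.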

This alternative approach (from comment 24) would have to know about multiple
extensions (nucleotide and protein databases differ):

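import win32api  # from the pywin32 package

# Shorten the path of an existing index file, then strip the ".nin" extension.
my_blast_db = win32api.GetShortPathName(
    'C:/Documents and Settings/patnaik/My Documents/blast/bin/mine.nin')[:-4]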

> For BLAST's 'd' argument, to specify multiple databases, one uses the space
> separator, and double-quotes the entire argument value ("Db1 Db2"). If a
> database value has spaces within, one backslash-double-quotes that database
> value (\"Db 3\") and BLAST is supplied with "Db1 Db2 \"Db 3\"".

If we extend the Biopython BLAST API to require multiple databases as a list of
strings, this could be possible. Otherwise, how do we know if we are dealing
with two databases (e.g. "Db1 Db2") or a single database whose name contains a
space (e.g. "expressed genes")?  We might also want to cope with the situation
where the user has already quoted their database string.
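To make the list-of-strings idea concrete, here is a rough sketch that builds the quoted value following the convention described in the quoted comment. The function name build_db_argument is hypothetical, the quoting rule is taken from that comment rather than checked against BLAST here, and already-quoted names are not handled:

```python
def build_db_argument(databases):
    """Build a single quoted database argument from a list of names.

    Names containing spaces are wrapped in backslash-escaped double
    quotes, and the whole value is wrapped in plain double quotes.
    """
    if isinstance(databases, str):
        # A bare string is ambiguous: "Db1 Db2" could be one name or two.
        raise TypeError("databases must be a list of strings, not a string")
    parts = []
    for name in databases:
        if " " in name:
            parts.append('\\"%s\\"' % name)
        else:
            parts.append(name)
    return '"%s"' % " ".join(parts)
```

With a list there is no ambiguity: ["Db1", "Db2"] means two databases, while ["expressed genes"] means one database whose name contains a space.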

(In reply to comment 33)
> Re: using subprocess.Popen with shell=True/False (comment #27),
> while 'shell=True' works on Mac OS X, and probably other Unix/like
> systems, ...

Probably, but we need to check this rather than assuming it.

> my_process = subprocess.Popen(my_blast_cmd, stdin=subprocess.PIPE,
>                       stdout=subprocess.PIPE, stderr=subprocess.PIPE,
>                       shell=(True,False)[sys.platform == "win32"])

Using shell=(sys.platform != "win32") would be much simpler ;)
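A small sketch comparing the two ways of choosing the shell setting, to show they agree. Both function names are hypothetical, and the platform is passed in as a parameter only so that the comparison can be tried for several platform strings on one machine:

```python
import sys


def shell_by_comparison(platform=sys.platform):
    """True everywhere except Windows, via a plain comparison."""
    return platform != "win32"


def shell_by_indexing(platform=sys.platform):
    """Same result via tuple indexing: False indexes 0, True indexes 1."""
    return (True, False)[platform == "win32"]
```

Either value could then be passed as the shell argument of subprocess.Popen; whether shell=True is actually the right choice on each Unix-like system still needs checking, as noted above.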

