[Bioperl-l] Installing BioPerl on Windows

Thu Dec 9 19:22:37 EST 2004

Just to provide one commercial user's experience -

I invoke BLAST, ClustalW, and EMBOSS programs on Windows and
Linux by using BioPerl.  I've found that using output files
works best.  This gets around the backticks vs system() issue.
I use the same code on both operating systems.
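
A minimal sketch of the output-file pattern, not production code: the
program is told where to write its results, system() only reports the exit
status, and the results are read back from the file afterwards. The helper
name run_to_file is hypothetical, and perl itself ($^X) stands in for
BLAST, ClustalW, or an EMBOSS program so the example runs anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Run an external command that writes its results to $outfile, then
# return the lines of that file. $outfile must be the same path that
# appears inside @cmd (e.g. as the program's output-file option).
sub run_to_file {
    my ($outfile, @cmd) = @_;

    # system() returns the exit status, not the program's output.
    my $status = system(@cmd);
    die "command failed (status $status): @cmd\n" if $status != 0;

    # Read the results back from the output file.
    open my $fh, '<', $outfile or die "cannot read $outfile: $!\n";
    chomp(my @lines = <$fh>);
    close $fh;
    return @lines;
}

# Create a temporary output file; it is removed when the script exits.
my ($tmp_fh, $outfile) = tempfile(SUFFIX => '.out', UNLINK => 1);
close $tmp_fh;

# Stand-in external program (an assumption for illustration): a Perl
# one-liner that writes two lines to the file named by its first argument.
my $writer = 'open(my $o, ">", $ARGV[0]) or die; print $o "hit1\nhit2\n"; close $o';

my @lines = run_to_file($outfile, $^X, '-e', $writer, $outfile);
print scalar(@lines), " lines read\n";
```

With a real program, @cmd would carry that program's own output-file
option instead of the stand-in one-liner.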

Scott

Brian Osborne wrote:

> Barry,
> 
> Your proposed revisions are the third time someone has attempted to redo the
> Windows installation file. Or perhaps the fourth? The other well-intentioned
> authors made their attempts for the same reasons you did: 1) Windows users
> are a sizable fraction of the users with installation problems. 2) Windows
> users with problems have the same questions, again and again ("where or what
> is GD?", etc). 3) These users have not read the INSTALL.WIN file, or have
> not paid attention.
> 
> So, I'm fairly certain that your proposed changes will make no difference,
> no matter how well-reasoned they are. If people don't read this file,
> changing it makes no difference. So, where would you put a "Windows tips"
> file? Again, I don't think Windows users pay attention to the files in the
> top directory. Check out the first section of the README file, it directs
> them immediately to INSTALL.WIN, very obvious, so these users aren't reading
> the README either. I'm not being snide here, I just think the mode of
> Windows installation doesn't naturally lead to reading these top-level
> documents. Different from Unix.
> 
> Question: when the Windows user downloads the package what do they do with
> it? Given a typical approach, what's the best place to put information on
> Windows installation? On the Web download page perhaps?
> 
> Another effective way to do these kinds of documents is to get all the
> frequently asked questions/problems and address them specifically. So, you'd
> have a "quick start" section first, as you did, then follow it immediately
> with a list of questions/problems and answers. Yes, you might consider
> putting these into the existing FAQ but then each time the user writes
> "where is ...?" you'd have to answer "please check the FAQ 4.2 ...". Less
> than ideal, since the idea is to set things up so that the users don't have
> to write bioperl-l.
> 
> Thank you for your efforts.
> 
> Brian O.
> 
> 
> 
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Barry Moore
> Sent: Wednesday, December 08, 2004 4:15 PM
> To: Jason Stajich
> Cc: Brian Osborne; Bioperl List
> Subject: [Bioperl-l] Installing BioPerl on Windows
> 
> 
> Jason, Brian, Others-
> 
> A recent message to the bioperl list suggests that new Windows users are
> still having problems installing Bioperl on Windows. This is not
> necessary because it's actually quite easy to install Bioperl 1.4. I had
> a look at the INSTALL.WIN document and I think that while it has been
> updated a bit, it is starting to suffer from fragmented editing over a
> long period of time. All the information that you need is there, but it
> doesn't really fit together too well anymore, and there is still some
> outdated and conflicting information present. Since new Windows users
> are often the least likely to be experienced programmers and also likely
> to have little Unix experience it may also need to be written with that
> in mind, providing more explanation for how things are done. I've taken
> a crack at this and rewritten INSTALL.WIN with a longer (perhaps too
> long) introduction to Bioperl, and updated installation instructions for
> Bioperl 1.4. In fact I think that the file name INSTALL.WIN should
> probably be changed, as that is a filename that is intuitive only to someone
> who has done a lot of installing from source.
> Installing_Bioperl_on_Windows.txt may be a more obvious filename to new
> Windows users. If you think it looks useful please feel free to post it
> on the Bioperl web site as a replacement for or in addition to the
> current INSTALL.WIN. I'll be happy to try to keep this document up to
> date, but I'll need one of the developers to put it on the site for me.
> Finally, I didn't touch the Cygwin sections of the previous INSTALL.WIN
> document because I have no experience with it, so I'll have to assume
> that it is accurate and let others contribute any fixes necessary there.
> Let me know if I've made any errors or omissions that need to be corrected.
> 
> Barry
> 
> ==================================================================================
> 
> Installing Bioperl on Windows
> =============================
> 
> 1) Quick Instructions for the impatient
> 2) Bioperl on Windows
> 3) Perl on Windows
> 4) BioPerl on Windows
> 5) Beyond the Core
> 6) BioPerl in Cygwin
> 7) Cygwin tips
> 
> This installation guide was written by Barry Moore and other Bioperl authors,
> based on the original work of Paul Boutros. Please report problems and/or
> fixes to the bioperl mailing list, bioperl-l at bioperl.org.
> 
> 1) Quick instructions for the impatient, lucky, or experienced user.
> =====================================================================
> 
> Download the ActivePerl MSI from
> http://www.activestate.com/Products/ActivePerl/
> Run the ActivePerl Installer (accepting all defaults is fine).
> Open a command prompt (Start->Run, then type cmd) and run the ppm
> shell (C:\>ppm).
> Add two new ppm repositories with the following commands:
> ppm> rep add Bioperl http://bioperl.org/DIST
> ppm> rep add Kobes http://theoryx5.uwinnipeg.ca/ppms
> Install Bioperl-1.4:
> ppm> install Bioperl-1.4
> Go to http://www.bioperl.org and start reading documentation or try the
> example script at the end of this file.
> 
> 
> 2) Bioperl on Windows
> ======================
> 
> Bioperl is a large collection of Perl modules (extensions to the Perl
> language) that aid
> in the task of writing perl code to deal with sequence data in a myriad
> of ways. Bioperl
> provides objects for various types of sequence data and their associated
> features and
> annotations. It provides interfaces for analysis of these sequences with
> a wide variety
> of external programs (BLAST, fasta, clustalw and EMBOSS to name just a
> few). It provides
> interfaces to various types of databases both remote (GenBank, EMBL
> etc.) and local
> (MySQL, flat files, GFF etc.) for storage and retrieval of sequences.
> And finally with
> its associated documentation and mailing list Bioperl represents a
> community of
> bioinformatics professionals working in perl who are committed to
> supporting both
> development of Bioperl and the new users who are drawn to the project.
> 
> While most bioinformatics and computational biology applications are
> developed in
> Unix/Linux environments, more and more programs are being ported to
> other operating
> systems like Windows, and many users (often biologists with little
> background in
> programming) are looking for ways to automate bioinformatics analyses in
> the Windows
> environment. Perl and Bioperl can be installed natively on Windows
> NT/2000/XP. Most of
> the functionality of Bioperl is available with this type of install.
> Much of the heavy
> lifting in bioinformatics is done by programs originally developed in
> lower level
> languages like C and Pascal (e.g. BLAST, clustalw, Staden etc.). Bioperl
> simply acts as a
> wrapper for running and parsing output from these external programs.
> Some of those
> programs (BLAST for example) are ported to Windows. These can be
> installed and work
> quite happily with BioPerl in the native Windows environment. Others,
> such as clustalw,
> have Windows ports, however the BioPerl developer who wrote the
> interface used Unix
> specific system calls to interact with these programs and so these
> wrappers will not work
> in the Windows environment. And finally some external programs such as
> Staden and the
> EMBOSS suite of programs can not be installed on Windows at all, and
> therefore any part
> of Bioperl that interacts with these packages either won’t work or can’t
> be installed at
> all.
> 
> If you have a fairly simple project in mind, want to start using Bioperl
> quickly, only
> have access to a computer running Windows, and/or don’t mind bumping up
> against some
> limitations then Bioperl on Windows may be a good place for you to
> start. For example,
> downloading a bunch of sequences from GenBank and sorting out the ones
> that have a
> particular annotation or feature works great. Running a bunch of your
> sequences against
> remote or local BLAST, parsing the output and storing it in a MySQL
> database would be
> fine also. Be aware that most if not all of the Bioperl developers are
> working in some
> type of a Unix environment (Linux, OSX, Cygwin). If you have problems
> with Bioperl that
> are specific to the Windows environment, you may be blazing new ground
> and your pleas for
> help on the Bioperl mailing list may get few responses – simply because
> no one knows the
> answer to your Windows specific problem. If this is or becomes a problem
> for you then
> you are better off working in some type of Unix like environment. One
> solution to this
> problem that will keep you working on a Windows machine is to install
> Cygwin, a Unix
> emulation environment for Windows. A number of Bioperl users are using
> this approach
> successfully and it is discussed more below.
> 
> 3) Perl on Windows
> ===================
> 
> There are a couple of ways of installing Perl on a Windows machine. The
> most common and
> easiest is to get the most recent build from ActiveState. ActiveState is
> a software
> company (http://www.activestate.com) that provides free builds of Perl
> for Windows
> users. The current (December 2004) build is ActivePerl 5.8.4.810
> (ActivePerl 5.6.1.638
> is also available and should work just fine). To install ActivePerl on
> Windows:
> Download the ActivePerl MSI from
> http://www.activestate.com/Products/ActivePerl/
> Run the ActivePerl Installer (accepting all defaults is fine).
> 
> You can also build Perl yourself (which requires a C compiler) or
> download one of the
> other binary distributions. The Perl source for building it yourself is
> available from
> CPAN (http://www.cpan.org), as are a few other binary distributions that
> are alternatives
> to ActiveState. This approach is not recommended unless you have
> specific reasons for
> doing so and know what you’re doing. If that’s the case you probably
> don’t need to be
> reading this guide.
> 
> Cygwin is a Unix emulation environment for Windows and comes with its
> own copy of Perl.
> Information on Cygwin and Bioperl is found below.
> 
> 4) BioPerl on Windows
> ======================
> 
> Perl is a programming language that has been extended a lot by the
> addition of external
> modules. These modules work with the core language to extend the
> functionality of Perl.
> Bioperl is one such extension to Perl. These modular extensions to Perl
> sometimes depend
> on the functionality of other Perl modules and this creates a
> dependency. You can’t
> install module X unless you have already installed module Y. Some Perl
> modules are so
> fundamentally useful that the Perl developers have included them in the
> core distribution
> of Perl – if you’ve installed Perl then these modules are already
> installed. Other
> modules are freely available from CPAN, but you’ll have to install them
> yourself if you
> want to use them. BioPerl has such dependencies.
> 
> Bioperl is actually a large collection of perl modules (over 1000
> currently) and these
> modules are split into six groups. These six groups are:
> 
> Bioperl Group        Functions
> -----------------------------------------------------------------
> bioperl (the core)   Most of the main functionality of Bioperl.
> bioperl-run          Wrappers to a lot of external programs.
> bioperl-ext          Interaction with some alignment functions
>                      and the Staden package.
> bioperl-db           Using bioperl with BioSQL and local
>                      relational databases.
> bioperl-microarray   Microarray specific functions.
> bioperl-gui          Some preliminary work on a graphical user
>                      interface to some Bioperl functions.
> 
> The Bioperl core is what most new users will want to start with. Bioperl 1.4
> (the core) and the Perl modules that it depends on can be easily installed
> with ppm. PPM (Programmer's Package Manager) is an ActivePerl utility for
> installing Perl modules on systems using ActivePerl. PPM will look online
> (you have to be connected to the internet, of course) for files (these files
> end with .ppd) that tell it how to install the modules you want and what
> other modules your new modules depend on. It will then download and install
> your modules and all dependent modules for you. These .ppd files are stored
> online in ppm repositories. ActiveState maintains the largest ppm repository,
> and when you installed ActivePerl, ppm was installed with directions for
> using the ActiveState repositories. Unfortunately the ActiveState
> repositories are far from complete, and other ActivePerl users maintain their
> own ppm repositories to fill in the gaps. Installing Bioperl will require you
> to direct ppm to look in two new repositories. You do this by opening a
> Windows command prompt, typing ppm to start the ppm shell and then typing the
> following two commands:
> ppm> rep add Bioperl http://bioperl.org/DIST
> ppm> rep add Kobes http://theoryx5.uwinnipeg.ca/ppms
> 
> Once ppm knows where to look for Bioperl and its dependencies you simply tell
> ppm to install it. This is done with the command:
> ppm> install Bioperl-1.4
> 
> 5) Beyond the Core
> ===================
> 
> You may find that you want some of the features of other Bioperl groups like
> bioperl-run or bioperl-db. There are currently no ppm packages for installing
> these parts of Bioperl. You will have to install these manually from source.
> For this you will need a Windows version of the program make called nmake
> (http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe).
> You will also want to have a willingness to experiment. You’ll have to read
> the installation documents for each component that you want to install, and
> use nmake where the instructions call for make. You will have to determine
> from the installation documents what dependencies are required and you will
> have to get them, read their documentation and install them first. The
> details of this are beyond the scope of this guide. Read the documentation.
> Search Google. Try your best, and if you get stuck consult with others on the
> bioperl mailing list.
> 
> 6) BioPerl in Cygwin
> =====================
> 
> Cygwin is a Unix emulator and shell environment available free at
> www.cygwin.com. BioPerl
> runs well within Cygwin. Some users claim that installation of Bioperl
> is easier within
> Cygwin than within Windows, but these may be users with Unix backgrounds.
> 
> One advantage of using Bioperl in Cygwin is that all the external modules are
> available through CPAN, and most if not all external programs can be installed
> and run, so many of the limitations of Bioperl on Windows are circumvented.
> 
> To get Bioperl running first install the basic Cygwin package as well as
> the Cygwin Perl,
> make, and gcc packages. Clicking the "View" button in the upper right of
> the installer
> enables you to see details on the various packages. Then follow the
> BioPerl installation
> instructions for Unix in BioPerl's INSTALL file.
> 
> Note that expat comes with Cygwin (it's used by the module XML::Parser).
> 
> One known issue is that DBD::mysql can be tricky to install in Cygwin, and
> this module is required for the bioperl-db, Biosql, and bioperl-pipeline
> external packages. Fortunately there are some good instructions online:
> http://search.cpan.org/src/JWIED/DBD-mysql-2.1025/INSTALL.html#windows/cygwin
> 
> Also, set the environment variable TMPDIR, because programs like BLAST and
> clustalw need a place to create temporary files, e.g.:
> 
> setenv TMPDIR e:/cygwin/tmp # csh, tcsh
> export TMPDIR=e:/cygwin/tmp # sh, bash
> 
> Note that this is not Cygwin's own path syntax, which would be something like
> "/cygdrive/e/cygwin/tmp". It is the syntax that a Perl module expects on
> Windows.
> 
> If this variable is not set correctly you'll see errors like this when
> you run
> Bio::Tools::Run::StandAloneBlast:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Could not open /tmp/gXkwEbrL0a: No such file or directory
> STACK: Error::throw
> ..........
> 
> 7) Cygwin tips
> ===============
> 
> The easiest way to install MySQL is to use the Windows binaries available at
> www.mysql.com. Note that Windows does not have Unix sockets, so you need to
> force the MySQL connections to use TCP/IP instead. Do this by using the "-h"
> option from the command line:
> 
>  >mysql -h 127.0.0.1 -u blip -pblop biosql
> 
> Or, alias the mysql command in your .tcshrc, .cshrc, or .bashrc so it
> uses a host. For
> example, if your databases are installed locally:
> 
> alias mysql 'mysql -h 127.0.0.1' # csh, tcsh
> alias mysql='mysql -h 127.0.0.1' # sh, bash
> 
> If you're trying to use some application or resource "outside" of Cygwin
> and you're
> having a problem remember that Cygwin's path syntax may not be the
> correct one. Cygwin
> understands '/home/jacky' or '/cygdrive/e/cygwin/home/jacky' (when
> referring to the E:
> drive) but the external resource may want 'E:/cygwin/home/jacky'. So
> your *rc files may
> end up with paths written in these different syntaxes, depending.
> 
> If you can, install Cygwin on a drive or partition that's NTFS-formatted, not
> FAT32-formatted. When you install Cygwin on a FAT32 partition you will not be
> able to set permissions and ownership correctly. In most situations this
> probably won't make any difference but there may be occasions where this is a
> problem.
> 
> If you want to use BLAST we recommend that the Windows binary be
> obtained from NCBI
> (ftp://ftp.ncbi.nih.gov/blast/executables/LATEST-BLAST - the file will
> be named something
> like blast-2.2.6-ia32-win32.exe). Then follow the Windows instructions
> in README.bls.
> 
> Although we've recommended using the BLAST and MySQL binaries you should
> be able to
> compile just about everything else from source code using Cygwin's gcc.
> You'll notice
> when you're installing Cygwin that many different libraries are also
> available (gd, jpeg,
> etc.).
> 

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com