[Bioperl-l] Installing Bioperl on Windows

Barry Moore barry.moore at genetics.utah.edu
Wed Dec 8 16:30:01 EST 2004


Of course, as soon as I sent my last e-mail I found an error in the file 
I attached. It didn't include the example script that I referred to.

Barry

-- 
Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT

-------------- next part --------------
Installing Bioperl on Windows
=============================

1) Quick Instructions for the Impatient 
2) Bioperl on Windows
3) Perl on Windows
4) BioPerl on Windows
5) Beyond the Core
6) BioPerl and Cygwin
7) Cygwin Tips
8) Example Script

This installation guide was written by Barry Moore and other Bioperl authors based on the 
original work of Paul Boutros. Please report problems and/or fixes to the bioperl mailing 
list, bioperl-l at bioperl.org

1) Quick instructions for the impatient, lucky, or experienced user.
=====================================================================

Download the ActivePerl MSI from http://www.activestate.com/Products/ActivePerl/
Run the ActivePerl Installer (accepting all defaults is fine).
Open a command prompt (Menus Start->Run and type cmd) and run the ppm shell (C:\>ppm).
Add two new ppm repositories with the following commands:
	ppm> rep add Bioperl http://bioperl.org/DIST
	ppm> rep add Kobes http://theoryx5.uwinnipeg.ca/ppms
Install Bioperl-1.4:
	ppm> install Bioperl-1.4
Go to http://www.bioperl.org and start reading documentation or try the example script at 
the end of this file.


2) Bioperl on Windows
======================

Bioperl is a large collection of Perl modules (extensions to the Perl language) that aid 
in the task of writing Perl code to deal with sequence data in a myriad of ways.  Bioperl 
provides objects for various types of sequence data and their associated features and 
annotations.  It provides interfaces for analysis of these sequences with a wide variety 
of external programs (BLAST, fasta, clustalw and EMBOSS to name just a few).  It provides 
interfaces to various types of databases both remote (GenBank, EMBL etc.) and local 
(MySQL, flat files, GFF etc.) for storage and retrieval of sequences.  And finally, with 
its associated documentation and mailing list, Bioperl represents a community of 
bioinformatics professionals working in Perl who are committed to supporting both 
development of Bioperl and the new users who are drawn to the project.

While most bioinformatics and computational biology applications are developed in 
Unix/Linux environments, more and more programs are being ported to other operating 
systems like Windows, and many users (often biologists with little background in 
programming) are looking for ways to automate bioinformatics analyses in the Windows 
environment.  Perl and Bioperl can be installed natively on Windows NT/2000/XP.  Most of 
the functionality of Bioperl is available with this type of install.  Much of the heavy 
lifting in bioinformatics is done by programs originally developed in lower-level 
languages like C and Pascal (e.g. BLAST, clustalw, Staden etc.).  Bioperl simply acts as a 
wrapper for running and parsing output from these external programs.  Some of those 
programs (BLAST for example) are ported to Windows.  These can be installed and work 
quite happily with BioPerl in the native Windows environment.  Others, such as clustalw, 
have Windows ports, but the BioPerl developer who wrote the interface used Unix-specific 
system calls to interact with these programs, so these wrappers will not work 
in the Windows environment.  And finally, some external programs such as Staden and the 
EMBOSS suite of programs cannot be installed on Windows at all, and therefore any part 
of Bioperl that interacts with these packages either won’t work or can’t be installed at 
all.

If you have a fairly simple project in mind, want to start using Bioperl quickly, only 
have access to a computer running Windows, and/or don’t mind bumping up against some 
limitations then Bioperl on Windows may be a good place for you to start.  For example, 
downloading a bunch of sequences from GenBank and sorting out the ones that have a 
particular annotation or feature works great.  Running a bunch of your sequences against 
remote or local BLAST, parsing the output and storing it in a MySQL database would be 
fine also.  Be aware that most if not all of the Bioperl developers are working in some 
type of a Unix environment (Linux, OSX, Cygwin).  If you have problems with Bioperl that 
are specific to the Windows environment, you may be blazing new ground and your pleas for 
help on the Bioperl mailing list may get few responses, simply because no one knows the 
answer to your Windows-specific problem.  If this is or becomes a problem for you then 
you are better off working in some type of Unix-like environment.  One solution to this 
problem that will keep you working on a Windows machine is to install Cygwin, a Unix 
emulation environment for Windows.  A number of Bioperl users are using this approach 
successfully and it is discussed more below.

3) Perl on Windows
===================

There are a couple of ways of installing Perl on a Windows machine.  The most common and 
easiest is to get the most recent build from ActiveState.  ActiveState is a software 
company (http://www.activestate.com) that  provides free builds of Perl for Windows 
users.  The current  (December 2004) build is ActivePerl 5.8.4.810 (ActivePerl 5.6.1.638   
is also available and should work just fine).  To install ActivePerl on Windows:
	Download the ActivePerl MSI from http://www.activestate.com/Products/ActivePerl/
	Run the ActivePerl Installer (accepting all defaults is fine).  

You can also build Perl yourself (which requires a C compiler) or download one of the 
other binary distributions.  The Perl source for building it yourself is available from 
CPAN (http://www.cpan.org), as are a few other binary distributions that are alternatives 
to ActiveState.  This approach is not recommended unless you have specific reasons for 
doing so and know what you’re doing.  If that’s the case you probably don’t need to be 
reading this guide.

Cygwin is a Unix emulation environment for Windows and comes with its own copy of Perl.  
Information on Cygwin and Bioperl is found below.

4) BioPerl on Windows
======================

Perl is a programming language that has been extended a lot by the addition of external 
modules.  These modules work with the core language to extend the functionality of Perl.  
Bioperl is one such extension to Perl.  These modular extensions to Perl sometimes depend 
on the functionality of other Perl modules and this creates a dependency.  You can’t 
install module X unless you have already installed module Y.  Some Perl modules are so 
fundamentally useful that the Perl developers have included them in the core distribution 
of Perl – if you’ve installed Perl then these modules are already installed.  Other 
modules are freely available from CPAN, but you’ll have to install them yourself if you 
want to use them.  BioPerl has such dependencies.

Bioperl is actually a large collection of Perl modules (over 1000 currently) and these 
modules are split into six groups.  These six groups are:

      Bioperl Group             Functions
      -----------------------------------------------------------------
      bioperl (the core)        Most of the main functionality of Bioperl.
      bioperl-run               Wrappers to a lot of external programs.
      bioperl-ext               Interaction with some alignment functions
                                and the Staden package.
      bioperl-db                Using bioperl with BioSQL and local
                                relational databases.
      bioperl-microarray        Microarray specific functions.
      bioperl-gui               Some preliminary work on a graphical user
                                interface to some Bioperl functions.

The Bioperl core is what most new users will want to start with.  Bioperl 1.4 (the core) 
and the Perl modules that it depends on can be easily installed with ppm.  PPM 
(Programmer's Package Manager) is an ActivePerl utility for installing Perl modules on 
systems using ActivePerl.  PPM will look online (you have to be connected to the internet 
of course) for files (these files end with .ppd) that tell it how to install the modules 
you want and what other modules your new modules depend on.  It will then download and 
install your modules and all the modules they depend on.  These .ppd files are stored 
online in ppm repositories.  ActiveState maintains the largest ppm repository, and when 
you installed ActivePerl, ppm was installed with directions for using the ActiveState 
repositories.  Unfortunately the ActiveState repositories are far from complete, and other 
ActivePerl users maintain their own ppm repositories to fill in the gaps.  Installing 
Bioperl will require you to direct ppm to look in two new repositories.  You do this by opening a 
Windows command prompt, typing ppm to start the ppm shell, and then typing the following 
two commands:
      ppm> rep add Bioperl http://bioperl.org/DIST
      ppm> rep add Kobes http://theoryx5.uwinnipeg.ca/ppms

Once ppm knows where to look for Bioperl and its dependencies, you simply tell ppm to 
install it.  This is done with the command:
      ppm> install Bioperl-1.4 

5) Beyond the Core
===================

You may find that you want some of the features of other Bioperl groups like bioperl-run 
or bioperl-db.  There are currently no ppm packages for installing these parts of 
Bioperl.  You will have to install these manually from source.  For this you will need a 
Windows version of the program make called nmake 
(http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe).  You will 
also want to have a willingness to experiment.  You’ll have to read the installation 
documents for each component that you want to install, and use nmake where the 
instructions call for make.  You will have to determine from the installation documents 
what dependencies are required, and you will have to get them, read their documentation, 
and install them first.  The details of this are beyond the scope of this guide.  Read 
the documentation.  Search Google.  Try your best, and if you get stuck consult with 
others on the bioperl mailing list.

6) BioPerl and Cygwin
=====================

Cygwin is a Unix emulator and shell environment available free at www.cygwin.com. BioPerl 
runs well within Cygwin. Some users claim that installation of Bioperl is easier within 
Cygwin than within Windows, but these may be users with Unix backgrounds.

One advantage of using Bioperl in Cygwin is that all the external modules are available 
through CPAN and most if not all external programs can be installed and run, so many of the 
limitations of Bioperl on Windows are circumvented.

To get Bioperl running first install the basic Cygwin package as well as the Cygwin Perl, 
make, and gcc packages. Clicking the "View" button in the upper right of the installer 
enables you to see details on the various packages. Then follow the BioPerl installation 
instructions for Unix in BioPerl's INSTALL file.
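
Before starting the Unix installation steps, it can help to confirm that the three 
Cygwin packages named above actually ended up on your PATH. The following is only a 
sketch for a bash prompt; the function name missing_tools is made up for this example 
and is not part of Cygwin or Bioperl.

```shell
#!/bin/bash
# Hypothetical helper (not part of Cygwin or Bioperl): print the name of
# every command given as an argument that cannot be found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # 'command -v' succeeds only if the shell can locate the command.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The guide asks for the Cygwin Perl, make, and gcc packages.
missing=$(missing_tools perl make gcc)
if [ -n "$missing" ]; then
    echo "Re-run the Cygwin installer and add:" $missing
else
    echo "perl, make and gcc were all found."
fi
```

If anything is reported missing, re-running the Cygwin installer and selecting the 
package is usually all that is needed.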

Note that expat comes with Cygwin (it's used by the module XML::Parser).

One known issue is that DBD::mysql can be tricky to install in Cygwin, and this module 
is required for the bioperl-db, Biosql, and bioperl-pipeline external packages. 
Fortunately there are some good instructions online: 
http://search.cpan.org/src/JWIED/DBD-mysql-2.1025/INSTALL.html#windows/cygwin.
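
To find out whether the modules mentioned above are already available to Cygwin's Perl, 
you can simply ask Perl to load them. This is a rough sketch for bash; have_perl_module 
is a hypothetical helper name, and the check only tells you that the module loads, not 
that it can reach a database.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) if Perl can load the named
# module, fail otherwise. '-e 1' runs a trivial program after loading it.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# XML::Parser needs expat; DBD::mysql is needed by bioperl-db.
for module in XML::Parser DBD::mysql; do
    if have_perl_module "$module"; then
        echo "$module is installed"
    else
        echo "$module is missing"
    fi
done
```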

Also, set the environment variable TMPDIR, because programs like BLAST and clustalw need a 
place to create temporary files, e.g.:

setenv TMPDIR e:/cygwin/tmp     # csh, tcsh
export TMPDIR=e:/cygwin/tmp     # sh, bash

Note that this is not Cygwin's own path syntax, which would be something like 
"/cygdrive/e/cygwin/tmp". It is the syntax that a Perl module expects on Windows.

If this variable is not set correctly you'll see errors like this when you run 
Bio::Tools::Run::StandAloneBlast:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Could not open /tmp/gXkwEbrL0a: No such file or directory
STACK: Error::throw
..........
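
A quick way to catch this before running a wrapper is to test the variable from the 
shell. The sketch below is for bash and assumes your shell can resolve whatever path is 
stored in TMPDIR; check_tmpdir is a name invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: report whether TMPDIR is set and names an existing,
# writable directory. Returns 0 when it is usable, 1 otherwise.
check_tmpdir() {
    if [ -z "${TMPDIR:-}" ]; then
        echo "TMPDIR is not set"
        return 1
    fi
    if [ ! -d "$TMPDIR" ] || [ ! -w "$TMPDIR" ]; then
        echo "TMPDIR ($TMPDIR) is not a writable directory"
        return 1
    fi
    echo "TMPDIR is usable: $TMPDIR"
}

check_tmpdir || echo "Set TMPDIR before running the BLAST or clustalw wrappers."
```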

7) Cygwin Tips
===============

The easiest way to install MySQL is to use the Windows binaries available at 
www.mysql.com. Note that Windows does not have Unix domain sockets, so you need to force 
the MySQL connections to use TCP/IP instead. Do this by using the "-h" option on the 
command line:

>mysql -h 127.0.0.1 -u blip -pblop biosql

Or, alias the mysql command in your .tcshrc, .cshrc, or .bashrc so it uses a host. For 
example, if your databases are installed locally:

alias mysql 'mysql -h 127.0.0.1'     # csh, tcsh
alias mysql='mysql -h 127.0.0.1'     # sh, bash

If you're trying to use some application or resource "outside" of Cygwin and you're 
having a problem remember that Cygwin's path syntax may not be the correct one. Cygwin 
understands '/home/jacky' or '/cygdrive/e/cygwin/home/jacky' (when referring to the E: 
drive) but the external resource may want 'E:/cygwin/home/jacky'. So your *rc files may 
end up with paths written in both syntaxes, depending on which program reads them.
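
Cygwin ships a tool for this translation, cygpath; "cygpath -m" prints a path in the 
drive-letter form with forward slashes. Purely to illustrate the mapping between the two 
syntaxes, here is a small bash sketch that handles only the /cygdrive/X/ case. The 
function name to_mixed_path is hypothetical, and paths such as /home/jacky are passed 
through unchanged because the function does not know where Cygwin is installed.

```shell
#!/bin/bash
# Hypothetical helper: rewrite /cygdrive/<letter>/rest as <letter>:/rest.
# Anything else is printed unchanged. On a real Cygwin system, prefer
# 'cygpath -m', which also understands paths like /home/jacky.
to_mixed_path() {
    local path=$1 rest drive
    case $path in
        /cygdrive/[A-Za-z]/*)
            rest=${path#/cygdrive/}     # e.g. e/cygwin/tmp
            drive=${rest%%/*}           # e.g. e
            printf '%s:/%s\n' "$drive" "${rest#*/}"
            ;;
        /cygdrive/[A-Za-z])
            printf '%s:/\n' "${path#/cygdrive/}"
            ;;
        *)
            printf '%s\n' "$path"
            ;;
    esac
}

to_mixed_path /cygdrive/e/cygwin/home/jacky
```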

If you can, install Cygwin on a drive or partition that's NTFS-formatted, not FAT32-
formatted. When you install Cygwin on a FAT32 partition you will not be able to set 
permissions and ownership correctly. In most situations this probably won't make any 
difference but there may be occasions where this is a problem.

If you want to use BLAST we recommend that the Windows binary be obtained from NCBI 
(ftp://ftp.ncbi.nih.gov/blast/executables/LATEST-BLAST - the file will be named something 
like blast-2.2.6-ia32-win32.exe). Then follow the Windows instructions in README.bls.

Although we've recommended using the BLAST and MySQL binaries you should be able to 
compile just about everything else from source code using Cygwin's gcc. You'll notice 
when you're installing Cygwin that many different libraries are also available (gd, jpeg, 
etc.).

8) Example Script
=================

#!/usr/bin/perl

#A short script to demonstrate how to download sequences from GenBank and access
#the sequence and some associated annotations using Bioperl.

use strict;
use warnings;
use Bio::SeqIO;
use Bio::DB::GenBank; #use Bio::DB::GenPept or Bio::DB::RefSeq if needed

#Get some sequence IDs either like below, or read in from a file.  Note that
#this sample script works with the accession numbers below (at least at the time
#it was written).  If you add different accession numbers, and you get errors,
#you may be calling for something that the sequence doesn't have.  You'll have
#to add your own error trapping code to handle that.
my @ids = ('K03160', 'AB039327', 'BC035972');

#Create the GenBank database object to read from the database.
my $gb = Bio::DB::GenBank->new();

#Create a sequence stream to pass the sequences from the database to the program.
my $seqio = $gb->get_Stream_by_id(\@ids);

#Loop over all of the sequences that you requested.
while (my $seq = $seqio->next_seq) {

  #Here is how you get methods directly from the RichSeq object.  Replace
  #'display_name' with any other method in Table 2. that can be called on
  #either the RichSeq object directly, or the PrimarySeq object which it has
  #inherited.
  print "Display Name:  ", $seq->display_name,"\n";
  print "Sequence Date:  ",$seq->get_dates,"\n";

  #Here is how to access the classification data from the species object.
  my $species = $seq->species;
  print "Species:  ", $species->common_name,"\n";
  my @class = $species->classification;
  print "Classification:  @class\n";

  #Here is a general way to call things that are stored as a Bio::SeqFeature::
  #Generic object.  Replace 'source' with any other of the "major" headings in
  #the feature table (e.g. gene, CDS, etc.) and replace 'mol_type' with any of
  #the tags found under that heading (organism, locus_tag, gene, etc.).
  #get_tag_values throws an exception if the tag is missing, so test first.
  my ($source_feat) = grep { $_->primary_tag eq 'source' } $seq->get_SeqFeatures();
  if ($source_feat && $source_feat->has_tag('mol_type')) {
    my @mol_type = $source_feat->get_tag_values('mol_type');
    print "Molecule Type:  @mol_type\n";
  }
  
  #Here is a general way to call things that are stored as some type of a
  #Bio::Annotation object.  This includes reference information and comments.
  #Replace 'reference' with 'comment' to get the comment, and replace
  #$ref->authors with $ref->title (or location, medline, etc.) to get other
  #reference categories.
  my $ann = $seq->annotation();
  my ($ref) = $ann->get_Annotations('reference');
  if (defined $ref) {
    print "Authors:  ", $ref->authors, "\n";
  }
  print "Sequence:  \n", $seq->seq, "\n\n";
}

