[Bioperl-l] operating systems and bio-computing

Mark Dalphin mdalphin at amgen.com
Wed May 14 15:06:58 EDT 2003


Dear Thomas,

Here at Amgen, we are running DEC Alphas, SGI Octanes, HP-UX and a few Suns in 
the Computational Biology group. We are switching to Linux due to the huge 
cost advantage.

I have recently given up my SGI Octane running Irix in exchange for a Linux 
box.  It has saved me a great deal of time downloading and building 
"freeware" which wasn't available (or was out of date) at the SGI web site. I 
find my Linux box to be MUCH faster than the SGI (dual P4 at 1.4 GHz with 
1 GByte RAM vs R10000 with 256 MByte RAM) and easily as stable. And freeware for 
bioinformatics installs quickly and easily.

When I benchmarked Blast (wu-blast) as well as several other programs we used 
(FGenes, GenScan, RepeatMasker, various protein modeling and threading 
tools), I found that a 1 GHz, dual CPU Linux box ran about 1/2 as fast as an 
Alpha DS-10 (the SGI, HP and Suns weren't in the running for creating a 
Compute Cluster) but cost about 1/5 as much. Since our problems scale well by 
CPU (2x CPUs mean 2x faster), the cost advantage of Linux was obvious. We 
built a Linux cluster (not Beowulf, just a cluster of headless workstations 
with a queueing system for distributing jobs) and almost all of our work is 
done there instead of on the older Alpha compute cluster.
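
The price/performance arithmetic above can be sketched as a few lines of 
Python. This is only an illustration: the "1/2 as fast" and "1/5 the cost" 
figures are the rough ratios quoted above, the Alpha DS-10 is taken as the 
baseline (speed 1, cost 1), and perfect linear scaling with the number of 
boxes is assumed. The function and variable names are made up for the example.

```python
from fractions import Fraction

def throughput_per_cost(relative_speed, relative_cost):
    """Work done per unit of money, relative to the baseline machine."""
    return relative_speed / relative_cost

# Rough ratios from the benchmark, with the Alpha DS-10 as baseline (1, 1).
linux_speed = Fraction(1, 2)   # Linux box runs about half as fast
linux_cost = Fraction(1, 5)    # and costs about one fifth as much

# Throughput bought per unit of cost, relative to the Alpha.
ratio = throughput_per_cost(linux_speed, linux_cost)

# Assuming linear scaling: boxes needed to match one Alpha, and what they cost.
boxes_for_parity = 1 / linux_speed
cost_for_parity = boxes_for_parity * linux_cost

print(f"Throughput per unit cost vs Alpha: {float(ratio)}x")
print(f"Linux boxes to match one Alpha: {boxes_for_parity}")
print(f"Cost of those boxes relative to one Alpha: {float(cost_for_parity)}")
```

Under those assumptions, two Linux boxes match one Alpha at about 2/5 of the 
price, which is the same as getting about 2.5 times the throughput for the 
same money. It leaves out administration time, power, and floor space.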

In this heterogeneous environment, people are migrating to Linux (Red Hat 7.3) 
out of choice for almost all tasks except viewing protein models; there SGI 
still prevails. And it sounds from your email as if people will migrate to 
Linux for that work as well.

One final caveat: our SysAdmins find that running a Linux Compute Cluster is 
somewhat more time-consuming than running the Alpha Cluster. Certain tasks 
are well supported, but others are not nearly as clean as for the commercial 
OSs. For example, we had a problem getting NFS to work at close to wire-speed 
from the Linux boxes. Our SysAdmins finally located some patches to the 
network drivers and the throughput increased significantly. A commercial OS 
seller would probably have worked on that for us; instead we needed Google, 
lots of time to experiment and some knowledgeable SysAdmins. A summary of this 
might be: for 'enterprise scale' computing the commercial OSs have some 
advantages that cost alone ignores. For an individual workstation, this is 
probably much less of a problem as Red Hat has already tuned their product to 
a single user.

Cheers,
Mark

PS: This reflects my personal opinion and personal experience; Amgen did buy 
the Linux compute cluster, but my benchmarks were only part of the reason, I 
am sure.

-- 
Mark Dalphin                          email: mdalphin at amgen.com
Mail Stop: 29-2-A                     phone: +1-805-447-4951 (work)
One Amgen Center Drive                       +1-805-375-0680 (home)
Thousand Oaks, CA 91320                 fax: +1-805-499-9955 (work)



