[Biojava-l] Database Management Systems

ZHOU, YIHUA [FND/1000] yihua.zhou@monsanto.com
Tue, 19 Dec 2000 11:59:11 -0600


I have used an older version of a free JDBC drivers for MS SQL Server from
FreeTDS. I think you can still get it from
ftp://freetds.internetcds.com/pub/freetds_jdbc/ if you have not tried yet.

Yihua

-----Original Message-----
From: biojava-l-request@biojava.org
[mailto:biojava-l-request@biojava.org]
Sent: Tuesday, December 19, 2000 11:02 AM
To: biojava-l@biojava.org
Subject: Biojava-l digest, Vol 1 #202 - 9 msgs

Message: 3
From: "McCulloch, Alan" <alan.mcculloch@agresearch.co.nz>
To: biojava-l@biojava.org
Subject: RE: [Biojava-l] Database Management Systems
Date: Tue, 19 Dec 2000 09:49:08 +1300

Apologies in advance if this is a bit off-topic. (There is one biojava
reference though !?, and several
Java ones). 

Briefly

SQL Server .v. Oracle Hypotheses

* DBA and development costs higher upfront with Oracle, but more ongoing
costs further down  the track with  
  SQL Server, to solve growing pains.

* Oracle on Unix performance scales better, for larger databases and more
users, than SQL Server on NT.

* Oracle stored procedures/functions API alot more mature and powerful than
SQL Server.



Details


Re below...

We've  used both Oracle and SQL Server, and are still on the learning curve
with both, with respect to 
bioinformatics databases, however I thought I'd contribute my own positive
and negative experiences for what 
they are worth - they simply reflect my experience, I don't claim they are
authoritative. 

(I would emphasize that I have not thoroughly explored to what extent the
negatives are
caused by the database design or management , as opposed to SQL Server
itself. And I welcome 
any corrections of fact - e.g. I could be wrong about lack of user-defined
stored functions in SS, I haven't explored
that issue thoroughly either.)

This is Oracle 8i (8.1.6)  .v. SQL Server 7.0


SQL Server Negatives


* significant problems with "killer" SQL queries consuming all resources and
locking out other users,
  and then sometimes failing.
  Of course this is a complex issue and the problem/solution could involve
index and query design, 
  amounts of resources available , operating system multi-user performance
etc etc. However I do have a 
  hypothesis that SQL Server running on an NT machine may not be as good at
load balancing as Oracle on 
  a Unix box, based on my experience with both. 

* problems importing large datasets , due to transaction log files using up
too much
  disk space. The import is done by a dedicated Java application program
(via JDBC) as there are a number of foreign 
  keys to bind and also some checks that have to be done, so could not use
import wizards. Commits are 
  done every 50 inserts, however this doesn't help. I suspect its a DBA
issue - however the point is this
  has in fact involved quite a bit of SQL Server DBA time and problems
(still not 100% resolved).

* problems with long sequences overflowing the data type available
(varchar). Oracle 
  has similar limitations with its varchar type, however later versions also
have a CLOB (character large object)
   type , and and associated package of functions (DBMS_LOB). I guess SQL
Server has
  a LOB type of some sort that could be used, however am waiting on DBA
advice there.

* Apparent lack of user-defined stored functions (as opposed to procedures)
in SQL Server. At least I'm darned if 
  I can find any reference to them. This is a nuisance. They are very handy,
in Oracle you can reference them in a 
  SQL statement.

* I am alot more familiar with the Oracle stored PL/SQL procedures/functions
environment than SQL Server.
  Have had a bit of a go with SQL Server stored procedures. My impression is
that the
  Oracle environment of packages, stored PL/SQL procedures and stored PL/SQL
functions is alot
   more mature than what is available in SQL Server, and I doubt I would be
able to do some of the things
   I want to in SQL server, that I can do in Oracle. One exception is
dynamically constructing SQL statements - 
   can't do this in PL/SQL, but can (I think) in SQL Server stored
procedures. However can do in
   Oracle 8i's Java based stored procedures.

* SQL Server thin JDBC drivers seem to be harder to get hold of than Oracle
ones, and I can't find a free one.
  Obviously not an issue if you don't want JDBC connectivity to
  your bioinformatics database , but we do - we have our own Java based
server process and currently
  have to run an instance on an NT machine rather than our main Tru64
Bioinformatics server,
  for SQL Server connectivity. We also want to integrate biojava. (Tru64
apparently offers ODBC 
  however haven't sussed this yet - if it works I could use the JDBC-ODBC
bridge).

* We have had significant problems building full text catalogs on the SQL
Server database for
  our larger bioinformatics tables, and still can't.  This has used up alot
of DBA time. 


Positives with SQL Server
  
* I've used the import wizard to import data from Excel spreadsheets quite
alot, and its great. Would be good
  if I could batch it though

* I have not done any DBA work with SQL Server however have looked over the
shoulder of people doing it and
   it looks very user friendly.

* The availability of "select top whatever from whever" . Small but useful.

* our SQL server based system has on the whole been succesfull up to this
point, and extremely  cost effective - 
   I am reasonably confident that to get  to the same point under Oracle
would have been more expensive. 
   (However the strain is showing as we start to load it up with more data,
meaner queries and more users)

* can dynamically construct and execute SQL statements in SQL Server stored
procedures. (At least I'm
  pretty sure you can). This is not possible in Oracle's PL/SQL stored
procedures - a big nuisance.
  However it can be done in Oracle 8i's Java based stored procedures.


Oracle Negatives

* lack of dynamic SQL in PL/SQL. However it is available in the new Java
based stored procedures, in Oracle 8i.

* My experience with the core database over the years has been very
positive, but not so with some of the
add-ons, such as Oracle Application Server. This is now obsolete and
replaced by something with
apparently very expensive licensing cost - this move caused us a few
problems, on a seperate
bioinfomatics related project.

SQL Server .v. Oracle Hypotheses

* DBA and development costs higher upfront with Oracle, but more ongoing
costs further down  the track with  
  SQL Server, to solve growing pains.

* Oracle on Unix performance scales better, for larger databases and more
users, than SQL Server on NT.

* Oracle stored procedures/functions API alot more mature and powerful than
SQL Server



Cheers

AMcC