[Biobiz] BOF summary

24 Jul 2001 10:09:45 +0200

The First 'Supporting Open Source Software' BOF
-----------------------------------------------

Introduction

The first Supporting Open Source Software BOF took place at BOSC 2001,
held during ISMB 2001 in Copenhagen, Denmark.

The purpose of the meeting was to initiate a conversation about both the
business opportunities created by the growing development and deployment
of open source tools within the bioinformatics community as well as the
obstacles that exist to their wider adoption within their respective
user communities.

Apologies were made for Tania Broveak Hide who was originally scheduled
to facilitate the discussion but could unfortunately not make it due to
personal reasons beyond her control.

Background

It is important to note that the widespread use and adoption of open
source software is a relatively recent phenomenon and, as such, there
exists very little information about the impact that this model of
software development will have upon users and business models.

Traditionally, business models within the software development industry
are based upon the sale of binary distributions of software, generally
priced per user. 

A number of policies consequently arose out of this business model,
amongst which are: 

* Exclusive control of the distribution and modification of, and
contribution to, the source code making up an application.
* Restrictive licensing terms in order to prevent unauthorized
redistribution.
* Pressure to limit interoperability with other products which may
compete for 'per-user' installations.
* Support services frequently being viewed as a loss center within a
company as they rarely contribute directly to more 'per-user'
installations.
* etc. etc.

The promise of open source software is that it seriously challenges
business models founded upon the exclusive control of the right to
distribute binary distributions of software and, thus, will hopefully
liberate end users from the restrictive policies which often accompanied
these business models.

However, despite the impressive initial successes enjoyed by the open
source bioinformatics projects, it is becoming clear that a new set of
problems is arising for end users as well as for commercial enterprises
that wish to develop open source software.

Obstacles

Over the course of the BOF discussion a picture started to emerge of
the frustrations and problems most commonly faced by users and
developers of the OpenBio projects. The list follows.

* There exists a distinct lack of documentation as most OpenBio
development is done by technically proficient volunteers who,
understandably, prefer to focus their efforts on those areas most
closely reflecting their skills and abilities.

* Although some training events have been arranged by specific OpenBio
projects (BioJava boot camp, ENSEMBL training etc.) a question to the
participants established that there exists considerable demand for more
high quality training, especially if that training were geographically
closer to the attendees !

* Although many people felt that support from the OpenBio mailing lists
was timely and of very high quality, there are questions about whether
that support would be sufficient for a larger and less experienced user
community.

* There exists significant frustration with the misperceptions which
exist about open source software, especially at the higher levels of
organizations. These misperceptions include: that open source is an
all-or-nothing proposition requiring you to throw out your existing
infrastructure; that using open source software may harm your
intellectual property; that open source software is of a low technical
standard; and that open source software can constitute a security risk
as anyone could make unauthorized changes to the code with malicious
intent.

* Many larger potential users of open source software have valid
concerns about who they can turn to when things go wrong as, while there
exists a legally binding obligation for a commercial supplier to provide
recompense, no such obligation exists in the case of a product developed
by volunteers.

* The OpenBio community is relatively small, leading to a general
shortage of volunteers with skills in graphic design, copy writing and
editing.

* A traditional software development company has no difficulty hiring
and retaining project members dedicated to quality control,
documentation and support as the cost of their salaries can be
incorporated into the product selling price. Volunteer open source
developers as well as commercial open source developers find it a lot
more difficult to cover this expense. Also, in the case of a commercial
open source developer, the contributed code was often a spin-off from a
proprietary product and, as such, may not generate any direct revenue
for the company, making it hard to justify the expense.

* Many potential users of the OpenBio projects are unaware that many of
these projects can also function under Microsoft Windows, and are thus
scared off by the (very real !) technical challenges inherent in
running a Unix workstation.

* Many potential users of the OpenBio projects are unaware that these
projects even exist ! More technically savvy users may be used to
searching for tools on the world wide web, but this is not a tactic
used by everyone; others may instead rely on print publications,
journal reviews, IT departments and other sources.

* Users who have particular needs that are not addressed by an OpenBio
project may not always know how to go about making a feature request
known. It is hard to fire off an email into the unknown if you are used
to dealing directly with a product manager who is paid to listen to
you !

* There exists a lot of pressure within specific scientific fields to
use the same tools as other researchers in the field, and often these
tools are proprietary. There may be considerable career risk involved
in 'crossing over' to free alternatives, as this may lead to your
results being questioned by peers who have a long history of use of,
and confidence in, these proprietary tools, many of which have a
prestigious list of publications attesting to their worth.

Possible Remedies

It is fortunate that we rarely raise obstacles without also coming up
with ideas to overcome them as the preceding list is almost starting to
look like a very strong case for avoiding open source software !

A number of explicit suggestions were made at the meeting which are
worthy of further investigation:

* A 'BioSlashDot' which could service the community with announcements
of new projects, reviews of software, updates on mailing list activity
etc.

* A centralised documentation repository similar to the FreeBSD
documentation project and the Linux documentation project. Documentation
would be distributed under an open license and written in a common
format such as DocBook, which would facilitate the generation of
specific documentation distributions in a wide variety of formats and
media types.

* Approaching technical publishers about the feasibility of publishing
high quality books about the OpenBio projects.

* Encouraging the adoption of inline source code documentation
practices within those projects which are not yet following this
practice.

* A booth at future ISMBs and other conferences dedicated to the open
source bioinformatics projects.

* Tutorial sessions for specific OpenBio projects at future ISMBs.

* The provision of commercial support, customisation and training
services as many users of OpenBio projects are already willing to pay
for such services from proprietary vendors.

* OpenBio projects may wish to consider approaching their respective
funding sources to investigate the feasibility of additional funding
specifically set aside for quality control, support and documentation.

* Investigating the practices of existing successful open source
businesses within other fields, e.g. Red Hat and MySQL.

* It also became apparent that there exists a bit of a disconnect
between vendors and users as to exactly what constitutes support ! It is
clear that businesses which were, in the past, used to viewing support
as a necessary but unprofitable evil may have to do some significant
re-evaluation of both the content and profitability of support in an
environment in which it may be the primary product.

* Investigating the possibility of opening a channel of communication
between the OpenBio projects and prestigious journals such as Science
and Nature for project announcements and reviews.

* Encouraging the creation of more online tutorial and 'HOWTO'
documentation enabling users to get started with useful applications of
OpenBio projects much more quickly.

Lastly, Chris Dagdigian generously offered to create the
biobiz@open-bio.org mailing list in order that we may continue this
conversation online.

Conclusion

It is becoming clear that, despite the large amounts of high quality
code that exist within the OpenBio projects' CVS servers, there exists
a 'last mile' gap between those servers and the end users.

In order to ensure that this code does not go to waste, it is becoming
increasingly important to identify the specific problems and challenges
that stand in the way of wider adoption of this work and to overcome
them.

Although some of these challenges can only be overcome within the
OpenBio development community, there are also a number of areas where
significant opportunities present themselves to commercial enterprises
that are not averse to risk or scared of change.

Finally, I would like to thank everyone for their invaluable
participation, Chris Dagdigian for setting up the mailing list, the BOSC
organizing committee for providing a forum for the BOFs as well as
Andrew Dalke of Dalke Scientific Software for co-ordinating the BOFs.

- antoine

--
Antoine van Gelder
antoine@egenetics.com