[EMBOSS] Files included in EMBOSS but licensed ...

Chris Fields cjfields at illinois.edu
Fri Jul 29 13:51:53 UTC 2011

On Jul 29, 2011, at 3:39 AM, Peter Rice wrote:

> On 07/29/2011 08:46 AM, Peter Rice wrote:
>> On 28/07/2011 15:38, Charles Plessy wrote:
>>> Dear EMBOSS developers,
>>> (CC Debian Med mailing list)
>>> while working on upgrading Debian's emboss package to version 6.4.0
>>> (congratulations, by the way), I found some files in EMBOSS that are
>>> not considered ‘Free software’ by Debian. 
> While we're on the topic of licensing, some other data files in EMBOSS
> 6.4.0 have licences.
> emboss/data/OBO contains copies of several Open Bio-Ontologies for which
> EMBOSS includes index files - so you need the data file version that
> matches the index files.
> For example, the Gene Ontology terms
> http://www.geneontology.org/GO.cite.shtml are:
> GO Usage Policy
> The GO Consortium gives permission for any of its products to be used
> without license for any purpose under three conditions:
>    That the Gene Ontology Consortium is clearly acknowledged as the
> source of the product;
>    That any GO Consortium file(s) displayed publicly include the
> date(s) and/or version number(s) of the relevant GO file(s) (the GO is
> evolving and changes will occur with time);
>    That neither the content of the GO file(s) nor the logical
> relationships embedded within the GO file(s) be altered in any way.
> which looks rather like the problem you had with Creative Commons.
> Licenses that protect the official database release from derived
> versions are entirely reasonable and standard in bioinformatics.
> Basically, they make sure that when you refer to a UniProt entry, or an
> OBO ontology term, everyone agrees you are referring to one agreed entry
> or term.
> EMBOSS does depend on these files. The database names are hard-coded
> into some of the new (and more to come) applications.
> You could download the databases and indexes from the rsync copies we
> use to keep developers in sync. These are at
> rsync://emboss.open-bio.org/EMBOSS/
> It might make things clearer if someone from Debian could explain:
> (a) why a Creative Commons licence is an issue for you
> (b) why you appear to consider a copy of a whole or part of a public
> biological database as part of an "operating system"
> regards,
> Peter Rice


From the BioPerl perspective, this will very likely be a problem for us as well as for all the other Bio* projects (Biopython, BioJava, BioRuby); we typically include data derived from these sources.  We may have a bit more flexibility in that the vast majority of it is used only for tests, but I believe some data is hard-coded in.  Fallback data like REBASE for restriction analysis and GO (as Peter mentioned above) come to mind.


Christopher Fields
Senior Research Scientist
National Center for Supercomputing Applications
Institute for Genomic Biology
University of Illinois Urbana-Champaign
1206 W. Gregory Dr. , MC-195
Urbana, IL 61801
