[EMBOSS] Files included in EMBOSS but licensed ...
pmr at ebi.ac.uk
Sat Jul 30 08:58:07 UTC 2011
Quoted in full for the benefit of the debian-med list who missed the
On 29/07/2011 21:35, Adam Sjøgren wrote:
> On Fri, 29 Jul 2011 09:39:46 +0100, Peter wrote:
>> It might make things clearer if someone from Debian could explain:
> (I am not from Debian, but here is my take on it anyway:)
>> (a) why a Creative Commons licence is an issue for you
> One of the fundamental software freedoms is the freedom to change the
> The Debian Free Software Guidelines' definition of free software
> includes this freedom².
> So the "No Derivatives" variants of the Creative Commons licenses aren't
> free by the DFSG definition.
> (The GNU Free Documentation License on documents with invariant sections
> is considered non-free by DFSG-standards as well, even if the invariant
> sections are things that nobody would want to change.)
> When a project of volunteers packages 29000+ packages, I think
> making a judgement call on whether it is okay that the license of a
> couple of files does not live up to the guidelines is nigh impossible.
> The answer to "Why would you want to?" is, because you might need to.
> It is more obvious with programs and code than it is with database
> entries, granted - but I guess the equivalent problem would be that the
> licensor didn't want to fix a problem in such a database, and that
> problem made the programs using it malfunction. It would be a pain if
> you weren't allowed to fix the problem and distribute the fixed data
> yourself, say, if "upstream" didn't want to include the fix for some
> reason or another; maybe they happened to turn sour on the world/you -
> stranger things have happened.
> So, nobody is probably ever going to exercise that freedom in this
> specific case, I think, but ignoring some of the freedoms in special
> cases is infeasible for a project such as Debian.
> This is just me trying to explain how I understand it, so take it with a
> grain of salt, and swing by debian-legal³ for the experts.
A specific example might help. About 5 years ago a release of the
UniProt database (as plain text files) broke the Wisconsin (GCG)
sequence analysis package. They introduced extremely long lines in a
data file whose lines everyone assumed were at most 80 characters long.
As GCG was closed source, the fix required a change to the UniProt files
to either wrap or truncate the 'offending' records.
The fix was not to distribute a change to the data of course, but to
write and distribute a simple Perl script that wrapped the long records.
That was not a licensing issue - the content stays the same, the format
is changed, no changed data is distributed. But it does illustrate that
the database licensing does not prevent 'fixing' a database.
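To make the idea concrete, here is a minimal sketch of that kind of wrapper. It is not the original Perl script, which is not reproduced here; it is written in Python, and the 80-character limit, the 5-character line prefix (a two-letter line code plus three spaces) and the function names are all assumptions chosen for illustration.

```python
import sys
import textwrap

MAX_WIDTH = 80   # assumed line limit expected by the downstream program
PREFIX_LEN = 5   # assumed prefix: two-letter line code plus three spaces


def wrap_record_line(line, width=MAX_WIDTH, prefix_len=PREFIX_LEN):
    """Return a list of lines, each at most `width` characters long.

    Short lines are returned unchanged. Long lines are split at
    whitespace where possible, and the original prefix is repeated
    on every continuation line.
    """
    line = line.rstrip("\n")
    if len(line) <= width:
        return [line]
    prefix, body = line[:prefix_len], line[prefix_len:]
    chunks = textwrap.wrap(
        body,
        width=width - prefix_len,
        break_long_words=True,    # split tokens longer than the limit
        break_on_hyphens=False,   # do not split at hyphens inside tokens
    )
    # Fall back to a hard cut if the body contained only whitespace.
    return [prefix + chunk for chunk in chunks] or [line[:width]]


def wrap_stream(lines):
    """Yield wrapped lines for every input line (hypothetical helper)."""
    for line in lines:
        yield from wrap_record_line(line)


if __name__ == "__main__":
    # Filter usage: read a flat file on stdin, write the wrapped file to stdout.
    for out in wrap_stream(sys.stdin):
        print(out)
```

Run as a filter, a script like this rewrites only the layout: the tokens on each line come out in the same order, spread over more lines.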
>> (b) why you appear to consider a copy of a whole or part of a public
>> biological database as part of an "operating system"
> They are part of a package which is included in the Debian GNU/Linux
> free operating system.
I expect there are many problems that arise if data ... and
documentation ... are considered to be software. For EMBOSS we didn't
officially specify a license for the documentation but other packages
probably do. It still worries me that some of our documentation files
officially include GPL licensed (EMBOSS) source code but I did not like
any of the alternative documentation licenses.
> (I personally think it would make sense to change to a Creative Commons
> license that allows derivative works - UniProt and others are going to
> be the canonical source for the data anyway, so nothing will be lost by
> them by doing that, as far as I can see.)
Unlikely. The no-derivatives version is specifically there to prevent
derivatives - for example Debian distributing a modified UniProt without
The ontologies are similar, but do allow for the use case of importing
terms from one ontology into another if the ontology name is changed
(and preferably if cross-references to the original are provided).
Again, the need is to protect the integrity of the original ontology
content so references to a GO term or a UniProt entry are clearly defined.
This is essential for many of the public bioinformatics databases. Data
and software are not the same in this context. I am curious whether
documentation licensing raises any issues.
Just my 2c worth