LGPL (was RE: [Biojava-l] Restriction digest progress)

Wiepert, Mathieu Wiepert.Mathieu@mayo.edu
Tue, 2 Jul 2002 08:30:04 -0500


Hi,

Then this should apply, from a long-ago conversation I had with Simon
Brocklehust,

<snip>

It depends what you mean by 'free'.  Biojava *itself* _is_ free under the
LGPL in the following senses:

o The source code of biojava is available without charge to anyone who wants
it.

o Having made changes to biojava, you cannot distribute your new version of
the software to anyone without making the source code available.  That is,
the software in biojava is free in spirit, and can live on forever etc etc.

o The entire biojava distribution is always available without charge to
anyone who wants it.

The point about LGPL (as opposed to GPL) is that it means that biojava
doesn't contaminate everything it touches e.g. force everyone who uses it to
make all their source code freely available if they don't want to.

Neither LGPL nor GPL say that people can't charge money for things e.g.
RedHat charges money for their Linux distribution under the GPL.

I hope you're able to contribute - LGPL is about being inclusive and
encouraging participation from as wide a circle as possible.
</snip>

The trick is to have your code use biojava; your code is then not
contaminated. If you make any changes to biojava itself, you have to share
them.

That is about as far as my understanding can go, I am sure there are other
opinions...

-Mat

-----Original Message-----
From: Brian Osborne [mailto:brian_osborne@cognia.com]
Sent: Tuesday, July 02, 2002 8:19 AM
To: Wiepert, Mathieu; BioJava List
Subject: RE: LGPL (was RE: [Biojava-l] Restriction digest progress)


Mat,

I did read the licenses themselves, and I believe I understand them, somewhat.
What I don't understand is the idea that Biojava is distributed under the
LGPL so that it's not under the LGPL. Is this what the Biojava authors
actually want to say? It looks like a typo. Is there a third kind of GPL
license?

But to answer one of your questions, we here at Cognia have
contributed to Biojava and we're all happy to do so, including the business
people (and I help with Bioperl myself). What's not as clear is the reverse
case, meaning what happens if we incorporate Biojava code into our products.

Thanks again,

Brian O.

-----Original Message-----
From: biojava-l-admin@biojava.org [mailto:biojava-l-admin@biojava.org]On
Behalf Of Wiepert, Mathieu
Sent: Tuesday, July 02, 2002 8:45 AM
To: 'Brian Osborne'; BioJava List
Subject: LGPL (was RE: [Biojava-l] Restriction digest progress)

Hi,

Since I am not sure why you are asking (i.e. did you want to make your own
software to market, or is someone like a boss concerned about you
contributing), you should read it yourself rather than rely on anyone else's
interpretation, just to be on the safe side.

http://www.gnu.org/licenses/licenses.html

http://www.gnu.org/licenses/licenses.html#LGPL


-Mat

-----Original Message-----
From: Brian Osborne [mailto:brian_osborne@cognia.com]
Sent: Tuesday, July 02, 2002 7:37 AM
To: BioJava List
Subject: RE: [Biojava-l] Restriction digest progess


To Biojava,

I see the following on the first page of the Biojava Web site:

BioJava is distributed under LGPL. This means that you can use the libraries
without your software being forced under either the LGPL or GPL.

Can someone tell me what this means?

Thanks again,

Brian O.


-----Original Message-----
From: biojava-l-admin@biojava.org [mailto:biojava-l-admin@biojava.org]On
Behalf Of Keith James
Sent: Tuesday, July 02, 2002 5:18 AM
To: BioJava List
Subject: [Biojava-l] Restriction digest progess


I've had some free time to get started on this. Here's a summary of
what is currently checked in:

org.biojava.bio.symbol.MotifTools

This is another support class which contains static methods (well,
just one right now). String createRegex(SymbolList motif) will create
a regular expression String from a SymbolList, including ambiguities:

e.g. AANNNTGG returns A{2}[ACTG]{3}TG{2}

This should work for all finite alphabets.
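
To make the run-length idea concrete, here is a minimal sketch. It is not
the MotifTools code: it works on a plain IUPAC String rather than a
SymbolList, and the class name MotifRegexSketch and its ambiguity table are
assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: plain Strings stand in for BioJava SymbolLists.
class MotifRegexSketch {
    // IUPAC nucleotide ambiguity codes mapped to the bases they cover.
    private static final Map<Character, String> IUPAC = new HashMap<>();
    static {
        IUPAC.put('A', "A");   IUPAC.put('C', "C");
        IUPAC.put('G', "G");   IUPAC.put('T', "T");
        IUPAC.put('R', "AG");  IUPAC.put('Y', "CT");
        IUPAC.put('S', "CG");  IUPAC.put('W', "AT");
        IUPAC.put('K', "GT");  IUPAC.put('M', "AC");
        IUPAC.put('B', "CGT"); IUPAC.put('D', "AGT");
        IUPAC.put('H', "ACT"); IUPAC.put('V', "ACG");
        IUPAC.put('N', "ACTG");
    }

    // Collapse each run of identical symbols into "X{n}" or "[XYZ]{n}".
    static String createRegex(String motif) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < motif.length()) {
            char symbol = motif.charAt(i);
            int run = 1;
            while (i + run < motif.length() && motif.charAt(i + run) == symbol) {
                run++;
            }
            String bases = IUPAC.get(symbol);
            if (bases == null) {
                throw new IllegalArgumentException("Unknown symbol: " + symbol);
            }
            // Unambiguous symbols are emitted bare, ambiguous ones as a class.
            regex.append(bases.length() == 1 ? bases : "[" + bases + "]");
            if (run > 1) {
                regex.append('{').append(run).append('}');
            }
            i += run;
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        System.out.println(createRegex("AANNNTGG"));
    }
}
```

Under these assumptions, createRegex("AANNNTGG") gives the same String as
the example above.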

This class is used by RestrictionEnzyme (see below) to create regex
Strings for the forward and reverse strand recognition sites.

There is a new package molbio alongside the proteomics package.

org.biojava.bio.molbio.RestrictionEnzyme

This class specifies restriction enzyme properties (recognition site,
cut site(s), type of end produced) and also returns regex Strings
suitable for finding forward and reverse strand recognition sites.

The constructors are public so that you can create custom enzymes, but
the main way to get instances is through the RestrictionEnzymeManager.

org.biojava.bio.molbio.RestrictionEnzymeManager

This class allows you to get an enzyme by name, get all
isoschizomers of an enzyme by name, get all n-cutters and get a pair
of java.util.regex.Patterns for the forward and reverse strand
sites. There is a properties file
(RestrictionEnzymeManager.properties) which is loaded as a
ResourceBundle and tells the class where to find a REBASE file
(withrefm.### format, same format as used by EMBOSS program
rebaseextract - see REBASE site). I have not checked in a fallback
copy of REBASE - it's quite big and I wanted to get some feedback
first. Do we want the whole of a specific version of REBASE, or just a
subset of common enzymes? Anyone can override this by using their own
copy of REBASE and putting a new properties file in their CLASSPATH.

The part which is only partly implemented is searching. You can now
do searches using

org.biojava.bio.seq.io.SymbolListCharSequence

This class is an implementation of the Java 1.4 interface
CharSequence. It wraps a SymbolList and allows full regex searching of
any SymbolList whose Symbols can be tokenized to chars. It appears
that the regex Matcher does not call the subSequence or toString
methods, only charAt (which translates directly to symbolAt) so no
extra copies of a big sequence get made. You need to use the regex
engine in Java 1.4.
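
For anyone who has not written a CharSequence before, the wrapping idea
looks roughly like the sketch below. It is not the SymbolListCharSequence
source: TokenSource is a hypothetical stand-in for a SymbolList (1-based
access to single-char tokens), and all class and method names here are
made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharSequenceWrapperSketch {
    // Hypothetical stand-in for a SymbolList: 1-based access to char tokens.
    interface TokenSource {
        int length();
        char tokenAt(int index); // index runs from 1 to length()
    }

    // Read-only CharSequence view; charAt delegates without copying.
    static final class TokenCharSequence implements CharSequence {
        private final TokenSource source;
        private final int offset; // 0-based start of this view in the source
        private final int length;

        TokenCharSequence(TokenSource source) {
            this(source, 0, source.length());
        }

        private TokenCharSequence(TokenSource source, int offset, int length) {
            this.source = source;
            this.offset = offset;
            this.length = length;
        }

        @Override public int length() {
            return length;
        }

        @Override public char charAt(int index) {
            if (index < 0 || index >= length) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return source.tokenAt(offset + index + 1); // 0-based to 1-based
        }

        // A sub-view over the same source, so still no copy of the data.
        @Override public CharSequence subSequence(int start, int end) {
            if (start < 0 || end > length || start > end) {
                throw new IndexOutOfBoundsException(start + ".." + end);
            }
            return new TokenCharSequence(source, offset + start, end - start);
        }

        // This is the only method that builds a full copy.
        @Override public String toString() {
            StringBuilder sb = new StringBuilder(length);
            for (int i = 0; i < length; i++) {
                sb.append(charAt(i));
            }
            return sb.toString();
        }
    }

    // Test helper: expose a String through the 1-based TokenSource interface.
    static TokenSource fromString(final String s) {
        return new TokenSource() {
            @Override public int length() { return s.length(); }
            @Override public char tokenAt(int index) { return s.charAt(index - 1); }
        };
    }

    // Returns the 0-based start of the first match, or -1 if there is none.
    static int firstMatchStart(CharSequence seq, String regex) {
        Matcher m = Pattern.compile(regex).matcher(seq);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        CharSequence seq = new TokenCharSequence(fromString("TTGAATTCAA"));
        System.out.println(firstMatchStart(seq, "GA{2}T{2}C"));
    }
}
```

The one thing to watch is the index shift: regex coordinates are 0-based
while the wrapped source in this sketch is 1-based.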

Finally there's stuff to do:

org.biojava.bio.molbio.RestrictionDigest

Is not written. This will do the convenience stuff of spitting out
SymbolList products etc. It should probably be threaded to search
multiple enzymes (or at least both strands for one enzyme)
simultaneously.

One thing I'm not clear on. Do we want "biologically correct"
cutting? That is, if my sequence has two different enzyme sites which
overlap and I do sequential digests, does the second fail to cut
because its site is now partly single-stranded, even though the regex
still matches on one strand? It seems the right way to me, but it may
not to everyone.

In summary, you can currently do full ambiguity searches on both
strands with a bit of work.

1. Get a copy of REBASE format #31
2. Edit the RestrictionEnzymeManager.properties file to point to it

Do something like this:

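// Look up the enzyme by name, then get its recognition site patterns.
RestrictionEnzyme ecoRI = RestrictionEnzymeManager.getEnzyme("EcoRI");
Pattern[] pat = RestrictionEnzymeManager.getPatterns(ecoRI);

// Wrap the SymbolList so the regex engine can read it directly.
CharSequence charSeq = new SymbolListCharSequence(mySymbolList);

// pat[0] is the forward strand site, pat[1] the reverse strand site.
Matcher forward = pat[0].matcher(charSeq);
Matcher reverse = pat[1].matcher(charSeq);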

Then proceed to use the Matcher as normal. Right now the coordinate
you get back will be the start of the recognition site and you will
have to calculate the actual cut(s). There are methods in
RestrictionEnzyme which return the position(s) of the cut site in the
coordinate space of the recognition site SymbolList (there are some
freaky enzymes which cut both sides of their recognition site).
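
The arithmetic for turning a match start into a cut position can be
sketched as below. This uses plain java.util.regex on a String and does
not call RestrictionEnzyme at all: the class CutSiteSketch and the
cutOffset parameter are assumptions for illustration, with cutOffset
meaning the number of residues of the recognition site that lie upstream
of the cut (1 for EcoRI, G^AATTC). Coordinates are 0-based and only one
strand is handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CutSiteSketch {
    // Returns, for every site found, the 0-based index of the first
    // residue downstream of the cut on the searched strand.
    static List<Integer> cutPositions(CharSequence seq, Pattern site, int cutOffset) {
        List<Integer> cuts = new ArrayList<>();
        Matcher m = site.matcher(seq);
        int from = 0;
        while (from <= seq.length() && m.find(from)) {
            cuts.add(m.start() + cutOffset);
            from = m.start() + 1; // restart one residue on, so overlapping sites are found
        }
        return cuts;
    }

    public static void main(String[] args) {
        Pattern ecoRISite = Pattern.compile("GA{2}T{2}C");
        System.out.println(cutPositions("CCGAATTCGGGAATTCTT", ecoRISite, 1));
    }
}
```

Restarting the search one residue after each match start is a deliberate
choice here so that overlapping sites are reported; whether overlapping
sites should all be cut is a separate question.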

Please report bugs (or better still, add a test case which fails
because of the bug). Enjoy.

Keith

--

-= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
Pathogen Sequencing Unit, Wellcome Trust Sanger Institute, Cambridge, UK
_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

