[Biojava-l] Multiple questions
mark.schreiber at novartis.com
Tue Nov 29 20:15:24 EST 2005
Regarding the format guessing function: it was deprecated because it
cannot be guaranteed to work. However, deprecation might be a bit extreme,
especially if many people use it. I would propose that we undeprecate it
and just document a warning saying it may not work. Any objections?
Kalle Näslund <kalle.naslund at genpat.uu.se>
Sent by: biojava-l-bounces at portal.open-bio.org
11/29/2005 09:34 PM
To: Ola Spjuth <ola.spjuth at farmbio.uu.se>
cc: biojava-l at biojava.org, (bcc: Mark Schreiber/GP/Novartis)
Subject: Re: [Biojava-l] Multiple questions
Ola Spjuth wrote:
>I am investigating the usefulness of BioJava as a backend for sequence
>management in Bioclipse (www.bioclipse.net). As a total newbie to
>Biojava, I have read the tutorial, BIA examples, glanced at the API,
>read my first FASTA-sequence and have come up with a few questions:
>1) Is it possible to search the Biojava-l archives without having to
>manually browse by month?
>2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
>automatically detects file formats or is it necessary to distinguish
>sequence formats externally, i.e. with different file-extensions? If so,
>does anyone know of a complete list of file-extensions that could be
>mapped to a format?
There is a deprecated piece of code available that quite a few people
still use in their code. Even though it might not be the greatest thing
to try to auto-guess the file format, it's the desirable thing to do in
many cases. If I just look at people in my lab, they want to open the
file; they don't want to keep track of what file format that particular
sequence was in, and so on.
So, even if file format guessing is bad, people are going to write it,
and IMHO it's better to have one centralised, good, known-to-work file
guesser than different implementations that differ in each person's own
application.
So, my suggestion is to start by using the deprecated version that's in
[...] it gets removed you can easily just copy that small part of the
code into your own application, or as an external little jarfile.
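
To make the idea of a small, copyable format guesser concrete, here is a
minimal sketch that looks at the first non-blank line of the input. It is
not the deprecated BioJava code; the class name FormatGuesser, its Format
enum and the guess methods are hypothetical names made up for this
example. It only relies on the usual file signatures (FASTA records start
with ">", GenBank flat files with "LOCUS", EMBL and Swiss-Prot entries
with "ID"), and like any guesser it can be fooled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper (not part of BioJava): guesses a format from the first non-blank line. */
final class FormatGuesser {

    /** EMBL and Swiss-Prot both start with an "ID" line, so they share one value here. */
    enum Format { FASTA, GENBANK, EMBL_LIKE, UNKNOWN }

    static Format guess(BufferedReader in) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip leading blank lines
            }
            if (line.startsWith(">")) {
                return Format.FASTA;
            }
            if (line.startsWith("LOCUS")) {
                return Format.GENBANK;
            }
            if (line.startsWith("ID   ")) {
                return Format.EMBL_LIKE;
            }
            return Format.UNKNOWN; // first real line matched nothing we know
        }
        return Format.UNKNOWN; // empty input
    }

    /** Convenience overload for text that is already in memory. */
    static Format guess(String text) {
        try {
            return guess(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would map the returned value to whatever reader it uses for that
format and fall back to asking the user when the result is UNKNOWN.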
>3) How robust are the I/O-classes for different formats? The
>test-library provided is rather short in my opinion and my first test
>broke since there was a space in the wrong position...
>4) What are the capabilities for multiple sequence alignment in Biojava?
>Is it limited to parse results into Biojava objects (as in BIA) or does
>it contain any stable MSA-implementations? Due to BioJava's size it is
>not easy to get an overview of the current capabilities and the standard
>of different parts.
There is some support for multiple alignments in biojava. The Alignment
interface and its implementations happily handle multiple alignments.
And you can choose how to interpret it, either as a SymbolList over a
cross-product alphabet, or as sequences accessible by some label.
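
The two ways of looking at an alignment can be illustrated with a small
stand-alone sketch. This is not BioJava's Alignment API; TinyAlignment and
its methods are hypothetical names invented for the example, rows are
plain gapped strings, and columns are 0-based here purely for simplicity.
The row view returns one sequence by label; the column view returns one
symbol from every row, which is roughly what a single symbol over a
cross-product alphabet represents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an alignment; not BioJava's Alignment interface. */
final class TinyAlignment {
    // LinkedHashMap keeps rows in the order they were added.
    private final Map<String, String> rows = new LinkedHashMap<>();
    private int length = -1;

    void addRow(String label, String gapped) {
        if (length >= 0 && gapped.length() != length) {
            throw new IllegalArgumentException("all rows must have the same length");
        }
        length = gapped.length();
        rows.put(label, gapped);
    }

    /** View 1: one gapped sequence, looked up by its label (null if absent). */
    String row(String label) {
        return rows.get(label);
    }

    /** View 2: one column (0-based), i.e. one symbol from every row in insertion order. */
    String column(int index) {
        StringBuilder sb = new StringBuilder();
        for (String r : rows.values()) {
            sb.append(r.charAt(index));
        }
        return sb.toString();
    }

    List<String> labels() {
        return new ArrayList<>(rows.keySet());
    }

    int length() {
        return Math.max(length, 0);
    }
}
```

With rows "ACG-T" (label "a") and "A-GTT" (label "b"), row("a") gives back
the first gapped sequence, while column(1) gives "C-", the pair of symbols
at that alignment position.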
There is a basic framework for handling multiple alignment formats in
the org.biojava.bio.seq.io package. It currently only implements two
formats, FASTA and MSF. Most programs seem to be able to write multiple
alignments in either FASTA or MSF format, so you should be able to get
the results [...]
>5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
>public web-services running for this?
I have been told by greater deities that implementing BLAST in Java is
not a good idea, since the BLAST algorithm makes heavy use of low-level
data structures, pointers (?) and similar things that are very hard to
implement and control in Java. So the result would most likely run
pretty darn slow, and not do what you want.
Depending on what you want to do with BLAST, the biojava SSAHA
implementation might be something you can use instead (it works pretty
OK on quite similar sequences, but it's not really suited for more
divergent sequences).
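
Why a hashing approach copes with similar sequences but not divergent
ones can be seen in a toy k-mer index. This is only an illustration of
the general idea of hashing fixed-length words; it is not BioJava's SSAHA
code and it simplifies the published algorithm (for instance, it indexes
every overlapping k-mer). KmerIndex and its methods are hypothetical
names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical k-mer index illustrating word hashing; not BioJava's SSAHA implementation. */
final class KmerIndex {
    private final int k;
    private final Map<String, List<Integer>> positions = new HashMap<>();

    KmerIndex(String subject, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        this.k = k;
        // Record the 0-based start of every overlapping k-mer in the subject.
        for (int i = 0; i + k <= subject.length(); i++) {
            positions.computeIfAbsent(subject.substring(i, i + k), key -> new ArrayList<>())
                     .add(i);
        }
    }

    /** Start positions of an exact k-mer in the subject (empty list if absent). */
    List<Integer> lookup(String kmer) {
        return positions.getOrDefault(kmer, Collections.emptyList());
    }

    /** Number of query k-mers that occur at least once in the subject. */
    int sharedKmers(String query) {
        int hits = 0;
        for (int i = 0; i + k <= query.length(); i++) {
            if (positions.containsKey(query.substring(i, i + k))) {
                hits++;
            }
        }
        return hits;
    }
}
```

For the subject "ACGTACGTAC" with k = 4, the query "ACGTAC" shares all
three of its k-mers with the subject, while "ACCTAC", which differs by a
single base, shares none. Exact word matches vanish quickly as sequences
diverge, which is the limitation mentioned above.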
When it comes to web services, I just know of a few things. I have not
used any of these to a large extent, so I can't comment on how well they
work for large jobs and so on.
Sadly they all use their own data encoding and service invocation setup,
so it's pretty darn annoying to use.
>6) Is there some example-code on how to use DAS (as a client)?
>7) How can I submit an RFE?
>Sorry for so many questions in one post; I have a lot of catching up to
>do and was hoping for some guidance. Some answers have probably already
>been answered in earlier posts but I have not been able to search the
>Biojava-l mailing list - Biojava-l at biojava.org